Open access peer-reviewed chapter

Preprocessing of Slang Words for Sentiment Analysis on Public Perceptions in Twitter

Written By

Media Anugerah Ayu and Abdul Haris Muhendra

Submitted: 31 July 2023 Reviewed: 12 October 2023 Published: 11 November 2023

DOI: 10.5772/intechopen.113725

From the Edited Volume

Advances in Sentiment Analysis - Techniques, Applications, and Challenges

Edited by Jinfeng Li


Abstract

Nowadays, many people freely express their opinions on various issues via social media, generating huge amounts of data every day. On Twitter, public opinions are diverse, which makes them suitable for sentiment analysis. However, many people casually use slang words when expressing their opinions on Twitter. Slang words in a text can lead to errors in language processing because the "real words" are absent. This research investigated the effect of handling slang words in the preprocessing stage on the performance of the conducted sentiment analysis. The sentiment analysis was performed using the Naïve Bayes classifier as the classification algorithm with term frequency-inverse document frequency (TF-IDF) for feature extraction. The research focused on comparing the performance of sentiment analysis on data preprocessed with a slang dictionary against data preprocessed without one. The case used in this research was texts related to the COVID-19 pandemic in Indonesia, especially those related to the implementation of vaccines. The performance evaluation results indicate that sentiment analysis of data preprocessed using a slang word dictionary achieves better accuracy than data preprocessed without it.

Keywords

  • sentiment analysis
  • slang words
  • social media
  • performance evaluation
  • public opinions
  • Naïve Bayes
  • Twitter

1. Introduction

The rapid growth of the Internet has made huge amounts of information spread through different platforms, such as blog posts, online discussion forums, product websites, and social media. These platforms serve as a basis for people to communicate and share opinions or information in different formats, such as text, images, video, and audio. One popular social medium capable of gathering information and opinions from the general public is Twitter. Twitter is among the 10 most-visited websites and has been used as a platform to collect data; for example, it has been used to collect tweets related to election candidates [1, 2]. Unique features such as hashtags and retweets make data collection easier. The collected data are then analyzed to determine whether an opinion carries positive, negative, or neutral sentiment. Sentiment analysis, or opinion mining, is a text mining method for determining the attitude of a subject toward a certain topic [2, 3]. Many studies have approached it differently: Bouazizi and Ohtsuki [4] proposed multi-class sentiment classification, while [5] compared preprocessing methods for sentiment analysis. Sentiment analysis requires classifying the collected tweets as expressing a positive, negative, or neutral view.

Classification is a process or technique of categorizing different sets of data into different classes [1]. There are two techniques for classifying the data, which are lexicon based and machine learning. The lexicon-based approach works by classifying the sentiment based on the dictionary that has been provided beforehand. The dictionary contains a large amount of data, where each of them is labeled by annotators, either manually or automatically. On the other hand, machine learning uses training and testing data to predict the output in classifying the data. Some of the examples use common algorithms like Naïve Bayes, Maximum Entropy, Support Vector Machine, and K-means for classification.

In machine learning, Naïve Bayes is one of the most commonly used classification techniques. Naïve Bayes works best on a well-formed text corpus, that is, a large collection of documents. The algorithm uses training data to learn from the given input and makes decisions from it. The decisions here fall into three sentiments: positive, negative, and neutral. In this research, the Naïve Bayes algorithm has been assessed in terms of accuracy, precision, recall, and F-measure.
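As a concrete illustration, the four measures can be computed from true and predicted labels with a few lines of pure Python; the function below is a minimal sketch, and the label names and toy data are hypothetical, not taken from the study:

```python
def evaluate(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F-measure for one class.

    Treats `positive` as the class of interest; all other labels are
    counted as negatives (a one-vs-rest view of the confusion matrix).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Toy illustration with three sentiment classes
truth = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]
acc, prec, rec, f1 = evaluate(truth, pred)
```

For multi-class evaluation the same one-vs-rest computation would be repeated per class and averaged.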

Among the many specific kinds of sentiment analysis that have been conducted, assessing sarcasm is regarded as one of the hardest challenges, especially in Indonesia, where research in that area is limited. Sarcasm or irony can also burden the performance of sentiment analysis [6]. Another issue in Indonesia is that tweets are popularly written using slang words or abbreviations. Singh and Kumari [7] stated that slang is one of the major challenges in this area, alongside noise, relevance, emoticons, and folksonomies. Ambiguity caused by ignoring slang sometimes leads to miscalculation of the sentiment. Some researchers have worked on optimizing data cleaning when slang words occur in a document. In Indonesia, researchers such as [6, 8, 9] specifically focus on slang words in their papers; their methods vary from improving the stemming process for slang to generating their own slang lexicons. Using one of the basic stemming algorithms for the Indonesian language, sentiment evaluation can achieve better accuracy; the common method for Indonesian stemming is the Nazief and Adriani stemming algorithm [10]. Other studies, such as Drus and Khalid [1], Jianqiang and Xiaolin [5], Rahayu et al. [6], Nuritha et al. [11], Adarsh and Ravikumar [12], Ferdiana et al. [13], Fitri et al. [14], and Mandloi and Patel [15], show the effectiveness of term frequency-inverse document frequency (TF-IDF) as feature extraction, while Naïve Bayes is the optimum classifier from the machine learning approach.

One interesting case for sentiment analysis in Indonesia is the coronavirus (COVID-19) pandemic. Over a year into the pandemic, Indonesia had become the country with the highest case prevalence and fatality rate among Southeast Asian countries. Trending tweets discussing the virus on Twitter carried related hashtags such as "#covid", "#covid19", "#delta", "#omicron", and "#vaccine". Social and physical distancing to reduce the transmission of the virus had been implemented in several countries, including Indonesia, along with campaigns to limit human-to-human transmission and promote self-hygiene. More than a year after the first case of coronavirus in Indonesia, positive cases had risen to a total of 4,763,252 as of 12 February 2022. The government had rolled out coronavirus vaccines to help reduce the spread of the virus. By 12 February 2022, 135,209,233 people in Indonesia had been fully vaccinated, which is 50.7% of the population.

Observing the sentiment of people talking about the virus can serve as one measure of public emotion in relation to the pandemic. Therefore, one objective of this research is to help draw an interim conclusion about the perception of Indonesian people toward the pandemic. The main objective, however, is to examine whether sentiment analysis produces better results when the slang words and abbreviations commonly used in tweets are handled in the process. The data are collected through the Twitter API. Tweets expressing people's opinions are selected by querying several popular words related to COVID-19 and its vaccination. The collected data are then processed in two different ways: one uses a slang word and abbreviation dictionary in the preprocessing step, while the other does not. Evaluation is then done by comparing the performance measures of both processes, the one with slang words included and the one without.

The remainder of this paper is structured as follows: Section 2 describes related work from previous studies, and Section 3 discusses the method used in this research. Section 4 presents results from the preliminary research and the main experiments and discusses them. Section 5 presents the conclusion.

2. Related work

This section discusses previous studies done that are related to this research study. The discussed studies are grouped into four, that is, studies related to public perception, sentiment analysis, Twitter, and COVID-19.

2.1 Public perception

Public opinion/perception refers to the social and political attitudes held by the public toward the emergence, spread, and change of social events in a certain social space. It can be expressed according to entities, behaviors, and emotional words. Previous research has been conducted on many branches of the topic of assessing public perception.

Assessing public perceptions is usually done through surveys, such as preference or customer satisfaction surveys. Casas and Delmelle [16] discussed how Twitter can be used to assess public perceptions of BRT (bus rapid transit) in the area of Cali, Colombia. The main purpose of their research was to learn what discussions were happening about the transportation system, especially on the topics of user satisfaction and service quality. Moreover, they wanted to verify that the information in tweets in the Latin American context matched existing knowledge about service quality factors in the country. They used the Twitter Search API via the twitterSearch library within a 9-day time frame, filtered by geographic location within a 60 km radius from the center of Cali, and used only two search keywords: MetroCali and MIO.

While the above research used public perception to understand user satisfaction with public transportation in a city, public perception can also be used to crowdsource information in disasters, for example, to gather information on building seismic safety following the Canterbury earthquakes in New Zealand [17]. The purpose of their research is close to that of this study, which concerns a nation-level disaster, the 2019 coronavirus: both seek risk and expert opinion, in their case to relieve public anxiety and build acceptance of standards for buildings' durability against earthquakes.

In terms of social media itself, many researchers discuss public opinion with social media data more specifically. Klašnja et al. [18], as part of Oxford Handbooks Online, discussed social media data and public opinion. They stated three reasons why social media can be used to measure public opinion. First, social media offers a chance to observe the opinions of the public without any prompting or framing effects from analysts: rather than setting up a burdensome survey environment or imposing a topic, the analyst can simply choose what to observe and filter all the related opinions. The second factor is the reach of the data: since social media is used all over the world, it provides enormous amounts of data on a daily or even hourly basis, and Twitter itself is likely already the biggest publicly available time series dataset of individual public opinion. Third is cost and practicality: with a few lines of code executed on a simple device, anyone can capture a selected topic in real time for free. These three factors are the main reasons social media is considered a good choice for examining public opinion.

2.2 Sentiment analysis

Sentiment analysis, or opinion mining, is the study of determining people's opinions, attitudes, and emotions toward entities, individuals, issues, events, or topics [2, 3]. Its focus is to analyze opinions in text documents. It is part of natural language processing (NLP), a field concerned with analyzing and describing natural-language text. The study involves classifying the attitude of texts into three common classes: positive, negative, and neutral. Various algorithms have been developed to perform this classification.

In their paper, Drus and Khalid [1] present a systematic literature review (SLR) of the sentiment analysis topic. Searching five online literature databases, Emerald Insight, Science Direct, Association for Computing Machinery (ACM), Scopus, and IEEE, they identified 407 articles with the keywords "sentiment analysis, social media, Facebook, Twitter" published between 2014 and February 2019. After screening, 24 articles were selected: 7 used lexicon-based methods, 10 used machine learning methods, and 7 combined both. Another paper [2] also conducted an SLR on sentiment analysis focused on Twitter data: out of 42 papers reviewed in depth, 23 used machine learning-based approaches, 10 employed lexicon-based approaches, and 9 used hybrid approaches.

2.2.1 Lexicon-based approach

The lexicon-based approach is one way to perform sentiment analysis; it requires no training data and depends only on a dictionary prepared beforehand, so it counts as an unsupervised method [1]. It determines the overall sentiment tendency of a given text using a pre-established lexicon of words weighted by their sentiment orientation, computing a final polarity score from prepared language resources of positive, negative, and neutral words.
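As a minimal sketch of how a lexicon-based scorer works, the snippet below sums per-word polarity weights from a toy English lexicon; the entries and weights are illustrative assumptions, not any lexicon used in the cited studies:

```python
# A toy positive/negative word lexicon; real studies use much larger,
# annotated lexicons, so these entries are illustrative only.
LEXICON = {"good": 1, "great": 1, "safe": 1, "bad": -1, "scary": -1, "slow": -1}

def lexicon_polarity(text):
    """Sum per-word sentiment weights and map the total score to a label."""
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Words absent from the lexicon contribute zero, which is why texts with no sentiment-bearing words fall back to neutral.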

Many papers have discussed using the lexicon method to obtain the sentiment of people's opinions. Al-Thubaity et al. [19] created their lexicon using the Saudi Dialect Twitter Corpus (SDTC), which consists of 5400 tweets in Saudi dialect; the corpus was chosen to minimize the risk of dataset bias toward a specific topic. Tweet classification is then done using SaudiSenti, a lexicon containing 4431 words. The lexicon was compared with the previously available AraSenTi lexicon, and SaudiSenti outperformed AraSenTi on neutral tweets. Mukhtar et al. [20] work on a lexicon-based approach for the Urdu language: they first create an Urdu sentiment lexicon with the help of annotators and then build an analyzer to perform sentiment analysis. Even though they use the lexicon approach, some techniques commonly paired with machine learning are still used, such as stop word removal, sentence classification, and attribute selection. In their results, the lexicon-based approach outperforms the machine learning approach in many aspects, such as accuracy, precision, recall, F-measure, time taken, and effort; this is possible because the lexicon and the analyzer are well developed.

Besides being used to obtain sentiment, a lexicon can also be used to collect other resources, such as a slang dictionary. Wu et al. [21], Salsabila et al. [22], and Muliady and Widiputra [23] discuss building slang dictionaries: the slang words are crawled from online dictionaries in the respective languages, some entries also carry a sentiment score beside the meaning, and most entries are selected carefully to avoid mistakes.

2.2.2 Machine learning approach

According to Vieira et al. [24], machine learning is "an area of artificial intelligence that is concerned with identifying patterns from data and using these patterns to make a prediction about unseen data." It involves learning patterns in the data, storing the processed patterns, and then using them to make predictions. It differs from traditional statistics in at least four ways: it can make predictions at the individual level; it focuses on maximizing generalizability; it is a data-driven approach; and it takes individual heterogeneity into account. Among the categories of machine learning, supervised learning is by far the most commonly used in research that requires machine learning. Supervised learning is a machine learning setting where prepared correlations between data and expected outcomes are provided as examples [25]; the algorithm learns the function that best captures the relationship between the input and the output variable.

As an analogy, the learning process can be compared with a student learning from a teacher. The teacher knows the correct answers to some questions, and the student tries to answer the questions as closely to the correct answers as possible; if the student gets an answer wrong, the teacher corrects the mistake. In other words, the process predicts results so that the difference between predictions and targets is as small as possible. A supervised method works by training classifiers on combinations of features; in the context of tweets, the features can be hashtags, retweets, emoticons, capitalized words, and so forth [26]. Algorithms are then used to extract and detect sentiment from the data, most commonly Naïve Bayes, Support Vector Machine, and Random Forest.
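The surface features named above (hashtags, retweets, emoticons, capitalized words) can be extracted with simple pattern matching; the sketch below is a hypothetical illustration, not the feature set of any cited paper:

```python
import re

def tweet_features(tweet):
    """Extract simple surface features often fed to supervised classifiers."""
    return {
        "hashtags": re.findall(r"#\w+", tweet),          # e.g. "#vaksin"
        "is_retweet": tweet.startswith("RT "),           # classic retweet prefix
        "capital_words": [w for w in tweet.split() if w.isupper() and len(w) > 1],
        "has_emoticon": bool(re.search(r"[:;]-?[()D]", tweet)),  # :) ;-( :D etc.
    }
```

In a full pipeline, such features would be combined with word-level features (e.g., TF-IDF weights) before training a classifier.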

Work presented in Singh et al. [27] performs Twitter sentiment analysis using the RapidMiner tool. The authors use two common algorithms, Naïve Bayes and k-NN. The dataset was fetched from Twitter on the topic of a government campaign and classified into positive and negative opinions. Both algorithms achieved 100% accuracy in finding positive values but failed to find negative values. The authors suggest using a tool other than RapidMiner, namely the NLTK toolkit in Python, since it includes many inbuilt libraries.

2.3 Twitter

Twitter is one of the famous social media platforms that allows users to post brief text updates, with one tweet (text message) limited to 280 characters. The microblogging service was officially released on 13 July 2006 and can be accessed via the web or mobile [14]. With over 313 million monthly active users and over 500 million tweets per day, Twitter has become one of the most promising platforms to enhance the social, political, or economic side of individuals or organizations [5].

Many interesting features have made Twitter popular as a data source for studies of public opinion. With the 280-character limit, users spend little time creating a tweet. Properties like "Retweet" make spreading information much faster: users only need to click or tap the retweet icon (a double arrow forming a loop) to make a tweet appear on their homepage. Hashtags (marked with "#") also help people find a topic easily. According to Bouazizi and Ohtsuki [28], hashtags are "labels used on social network and microblogging services which make it easier for users to find messages with a specific theme or content." They are useful not only for spreading news and referring to the topics being discussed but also for setting trending topics. Another distinctive aspect of Twitter is that its data can be accessed freely through the Twitter API, which makes the data easier to collect. By registering as a Twitter developer, data can be collected and processed without breaking any rules.

Various studies have used Twitter as the data source for sentiment analysis. Work presented in Drus and Khalid [1] reviewed 24 papers related to sentiment analysis, of which only 6 did not use Twitter, drawing instead on other sources such as YouTube, Facebook, StockTwits, or news blogs. Another work presented in Wang et al. [2] reviewed 42 papers using Twitter as the data source for sentiment analysis. A study by Zimmer and Proferes [29] presents a topology of Twitter research across 380 academic publications, ranging from 2006 to 2012, that used Twitter as their main platform of data collection and analysis. Furthermore, a recent study presented in [30] is also based on Twitter data and develops a sentiment analysis model in relation to stock market prices.

2.4 COVID-19

It is mentioned in Harapan et al. [31] that coronaviruses were first identified as a cause of colds in the 1960s and treated as simple nonfatal viruses. The disease became known as COVID-19 after the first case was identified in Wuhan, China, in December 2019, when a new type of coronavirus, 2019-nCoV, was found from the outbreak in Wuhan. WHO declared it a global pandemic on 11 March 2020, as it had affected 172 out of 195 countries with more than 30,000 reported deaths. The coronavirus generally spreads through airborne droplets; people can get infected if an infected droplet contacts the eyes, nose, or mouth. The infection causes respiratory illness, including pneumonia, colds, sneezing, and coughing [32].

The strategy to reduce the spread of the virus involves simple practices: covering the mouth and nose while coughing or sneezing, maintaining at least 1 m of distance between persons, and frequent handwashing; these only slow the virus from spreading. "Social distancing" was implemented in many countries with positive cases, with strategies of closing educational institutions and workplaces, canceling events requiring mass gatherings, self-quarantining people suspected of contact with the virus, stay-at-home recommendations, and even lockdowns in some cities [33]. Self-quarantine of people with symptoms is applied because the incubation period of the virus is 14 days or less, with an average of 5 days [34]. Hence, facilities that remain open during the outbreak need to check for common symptoms, and every facility needs to be equipped with at least a thermal detector and hand sanitizer.

A study presented in Nicola et al. [35] reviewed the pandemic in terms of socioeconomic aspects. The classification is divided into three sectors: primary sectors (industries providing raw materials), secondary sectors (producing finished products), and tertiary sectors (service providers). An important missing part is the social impact: lockdowns in many countries increased problems of domestic violence and physical, emotional, and sexual abuse. In many instances, it became harder to expose domestic violence since no one could leave the house unless necessary; thus, guidelines for recognizing and reporting domestic abuse were published in several media. Vieira et al. [33] discuss how to maintain well-being during the pandemic. Stress is one unavoidable effect of lockdown due to the limited activities that can be done. The authors suggest that people be aware of this to prevent the risk of stress-related health problems, that updates be taken daily from reliable sources of information, that misinformation be reduced by using more diverse channels such as television, radio, newspapers, and online news, and that information be presented in ways people can act on.

Another study, Chen et al. [36], focused on retrieving public opinion from one of the popular news websites using keywords related to the coronavirus, covering 1 January 2020 to 7 July 2020. Using a skip-gram word-to-vector model and manual screening, filtered trigger words were selected as the dataset. They then construct relationships among the dominant public opinions by analyzing the frequency and probability of keywords in each category.

3. Methodology

As mentioned earlier, this study aims to investigate the effect of adding a slang word dictionary step to the preprocessing phase of the tweet data on the performance of the conducted sentiment analysis. The dataset is retrieved by crawling tweets with related keywords on Twitter. The search query used to get the tweets relates to the topic of COVID-19 in Indonesia, with terms such as "corona," "covid-19," and "vaksin," as well as related hashtags such as "#vaccine," "#vaksin," and "#corona." Tweets were collected every day for 7 days (1 week) from the day of execution. The data were then stored as a CSV file, which is used to obtain the sentiment scores. Sentiment scoring is done automatically using an Indonesian lexicon approach available on GitHub.
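As a hypothetical sketch, the keywords and hashtags named above might be assembled into a single search string as follows; the `build_query` helper and the use of the `lang:` search operator are assumptions for illustration, and the actual crawling in the study goes through the Twitter API with developer credentials:

```python
def build_query(keywords, hashtags, lang="id"):
    """Join keywords and hashtags into an OR-ed Twitter search query.

    The terms mirror those named in the text; "OR" and "lang:" follow
    Twitter's standard search operator syntax.
    """
    terms = list(keywords) + ["#" + h.lstrip("#") for h in hashtags]
    return " OR ".join(terms) + f" lang:{lang}"

query = build_query(["corona", "covid-19", "vaksin"],
                    ["vaccine", "vaksin", "corona"])
```

Restricting by `lang:id` keeps the crawl focused on Indonesian-language tweets, which matters here because the slang dictionaries are Indonesian.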

To access Twitter datasets, we need to create a Twitter developer account. A developer app is also needed to generate the keys and tokens. Four credentials are required for data collection: the access token, access token secret, consumer key, and consumer key secret. These keys are used to crawl tweets legally via the Twitter API. The slang word dictionaries are retrieved from other work: Okky Ibrohim's slang word dictionary [37] on GitHub (link), Louis Owen's on GitHub (link), and Rama Prakoso's on GitHub (link). These dictionaries are later used in the slang word preprocessing step, usually called normalization.

Later on, the dataset from the crawling process is divided into two parts: training data and testing data. The training data are labeled with positive, negative, or neutral sentiment before being fed to the classification process. Once the labeled data have been used to train the classifier, the testing data are applied as the data to be evaluated. This process is then repeated with a different treatment: the first run, without the slang word dictionary, serves as the baseline, while the second uses the combination of the slang dictionaries mentioned above. The details of the research process can be seen in Figure 1.

Figure 1.

Process flow of the sentiment analysis with slang words.

Figure 1 shows the research model of the sentiment analysis; the process is divided into four stages for clarity. First, the data were collected from Twitter using API credentials. The collected data were stored in a corpus-type file (.csv) and then moved to the preprocessing stage, where the tweets are read. The preprocessing stage was run in two variants: the first did not use the slang word and abbreviation dictionary, while the second did. Stemming used the Python library "Sastrawi," which reduces words in the Indonesian language (Bahasa Indonesia) to their base form. The results were labeled using the TextBlob library in Python. The training and testing sets were processed for feature extraction, and the model was then evaluated based on the results; an error was reported whenever the machine learning algorithm failed to predict the sentiment. In the end, by considering both accuracy and error, this study can draw conclusions from the tweets.

3.1 Data preprocessing

Steps done in the preprocessing phase of this research are: case folding, cleansing, converting negation, converting emoticon, tokenization, stop words removal, and stemming. The difference between process one and two is the additional slang word and abbreviation dictionary that is applied before the stemming process. Methods to do the preprocessing are listed in Figure 1 as well.

Case folding is a step where all uppercase letters in a tweet are converted to lowercase; only the characters "a" to "z" are accepted at this stage. Its purpose is to remove redundancy where the only difference between words is letter case. Next, cleansing removes elements that do not correlate with the sentiment classification result. A tweet has various attributes that do not affect the sentiment since most tweets carry them; examples of such unimportant attributes are mentions (symbolized by "@"), hashtags (symbolized by "#"), links (containing "http," "bit.ly," or ".com"), and special characters (~!@#$%^&*()_+{}[]|?<>;':). These attributes are replaced by a space (" ") character to ease classification. The next step converts negation words found in a tweet; since negation changes the sentiment value of the document, each negation word is combined with the word that follows it. Examples of negation words are "bukan," "jangan," and "tidak." This is followed by emoticon conversion, which removes every emoticon from the text.
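The case folding, cleansing, and negation steps described above can be sketched in a few lines of Python; the regular expressions and the underscore-joining of negations are illustrative assumptions, not the study's exact implementation:

```python
import re

NEGATIONS = {"bukan", "jangan", "tidak"}

def preprocess_basic(tweet):
    """Case folding, cleansing, and negation handling for one tweet.

    Mentions, hashtags, links, and non-letter characters are replaced
    with spaces; each negation word is merged with the word after it
    (e.g. "tidak aman" -> "tidak_aman") so the pair keeps one polarity.
    """
    text = tweet.lower()                                     # case folding
    text = re.sub(r"@\w+|#\w+|https?://\S+|\S+\.(com|ly)\S*", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)                    # keep only a-z
    words, out = text.split(), []
    skip = False
    for i, w in enumerate(words):
        if skip:
            skip = False
            continue
        if w in NEGATIONS and i + 1 < len(words):
            out.append(w + "_" + words[i + 1])               # convert negation
            skip = True
        else:
            out.append(w)
    return " ".join(out)
```

Joining the negation to the following word keeps "tidak aman" ("not safe") from being scored as the positive word "aman" alone.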

The next process is tokenization, which splits the text into individual words; words are separated by spaces, and the result is single words ready for weighting. Then, stop word removal deletes words that are not relevant to the document topic and do not affect the accuracy of sentiment classification; the removed words are kept in a stop word database, and any stop word found in the document is replaced by a space character. Next comes the slang word process, the main subject of this research: by comparing runs with and without this additional step, their performance can be analyzed. The step replaces words that do not follow the Indonesian standard spelling (EYD, "Ejaan yang Disempurnakan") according to the slang word dictionary used. After that, stemming converts the words in a document back to their roots by certain rules; Indonesian stemming works by removing suffixes, prefixes, and confixes from the words in the document.
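The slang normalization step can be sketched as a simple dictionary lookup over the token list; the mapping below contains only a few illustrative entries, whereas the study combines three published slang dictionaries:

```python
# A few illustrative slang -> standard (EYD) entries; the study's actual
# dictionaries are far larger, so this mapping is only a toy sample.
SLANG = {"gak": "tidak", "udah": "sudah", "bgt": "banget", "yg": "yang"}

def normalize_slang(tokens):
    """Replace each slang token with its standard form when one is known."""
    return [SLANG.get(t, t) for t in tokens]
```

Running this step before stemming matters: a stemmer keyed to standard Indonesian morphology cannot reduce "gak" or "bgt," but it can handle their normalized forms.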

3.2 Feature extraction with TF-IDF

TF-IDF is one of the most commonly used feature extraction methods, known for being efficient, simple, and accurate. It calculates word weights for information retrieval by computing the TF and IDF values of every token (word) in every document in the corpus.

TF is the number of occurrences of a word in a document: the more a word appears in a document, the more it affects that document, and vice versa. IDF weights a word based on how many documents contain it: the more documents contain a certain word, the less that word affects any single document, and vice versa. The equations for TF-IDF are given below:

$$\mathrm{IDF}(w) = \log\frac{N}{\mathrm{DF}(w)} \tag{1}$$

$$\text{TF-IDF}(w,d) = \mathrm{TF}(w,d) \times \mathrm{IDF}(w) \tag{2}$$

where IDF(w) is the inverse document frequency of word w, N is the total number of documents, DF(w) is the number of documents containing word w, TF-IDF(w,d) is the weight of word w in document d, and TF(w,d) is the frequency of occurrence of word w in document d.
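Equations (1) and (2) can be implemented directly in pure Python; the sketch below assumes base-10 logarithms (the base is not specified in the text) and uses a toy tokenized corpus:

```python
import math

def tf_idf(corpus):
    """Compute TF-IDF weights per Eqs. (1)-(2) for a tokenized corpus."""
    n = len(corpus)
    df = {}                                   # DF(w): documents containing w
    for doc in corpus:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    weights = []
    for doc in corpus:
        tf = {w: doc.count(w) for w in set(doc)}       # TF(w, d)
        weights.append({w: tf[w] * math.log10(n / df[w]) for w in tf})
    return weights

docs = [["vaksin", "aman"], ["vaksin", "bahaya"], ["aman", "aman"]]
w = tf_idf(docs)
```

Note that a word appearing in every document gets weight zero, since log(N/N) = 0; library implementations often smooth the IDF term to avoid discarding such words entirely.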

3.3 Classification with Naïve Bayes algorithm

In this paper, the algorithm used for the classification process is the Naïve Bayes algorithm. It was chosen because it is simple and performs well on small datasets, which is useful for classifying positive and negative words treated as conditionally independent of each other. Given its probability model, this classifier can be trained effectively in a supervised manner. The classifier is based on the probability of class A given a document B. The basic formula used in the Naïve Bayes algorithm is:

P(A|B) = P(A) × P(B|A) / P(B)    (E3)

Where A is the positive or negative class and B is the document whose class is being predicted. The numerator terms P(A) and P(B|A) are obtained during training. Each tweet is represented by attributes (a1, a2, a3, …, an), where a1 is the first word, a2 the second, and so on, and V is the set of classes. During classification, the method selects the category or class with the highest probability (V_MAP) given the attributes (a1, a2, a3, …, an). The equation is given below:

V_MAP = argmax_{vj ∈ V} P(vj | a1, a2, a3, …, an)    (E4)

By using Bayes theorem, Eq. (4) can be written as:

V_MAP = argmax_{vj ∈ V} [P(a1, a2, a3, …, an | vj) × P(vj)] / P(a1, a2, a3, …, an)    (E5)

Since P(a1, a2, a3, …, an) is constant for every vj, the equation reduces to Eq. (6):

V_MAP = argmax_{vj ∈ V} P(a1, a2, a3, …, an | vj) × P(vj)    (E6)

Naïve Bayes Classifier simplifies this by assuming that in every category, each attribute is conditionally independent of each other. Thus:

P(a1, a2, a3, …, an | vj) = ∏i P(ai | vj)    (E7)

Then, substituting Eq. (7) into Eq. (6) gives formula (8):

V_MAP = argmax_{vj ∈ V} P(vj) × ∏i P(ai | vj)    (E8)

P(vj) and the probability of word ai for every category, P(ai | vj), are calculated during the training process based on formulas (9) and (10):

P(vj) = |docsj| / |training|    (E9)

P(ai | vj) = (ni + 1) / (n + |vocabulary|)    (E10)

Where |docsj| is the number of documents in category j, |training| is the number of documents used in the training process, ni is the number of occurrences of word ai in category vj, n is the total number of words appearing in category vj, and |vocabulary| is the number of unique words in the training data.
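Eqs. (8)-(10) can be sketched as a small multinomial Naïve Bayes trainer and classifier. The toy training tweets below are hypothetical, and log probabilities are used for numerical stability:

```python
import math
from collections import Counter

def train(docs: list[tuple[list[str], str]]):
    """Estimate priors (Eq. 9) and per-class word counts from labeled docs."""
    classes = Counter(label for _, label in docs)
    priors = {c: count / len(docs) for c, count in classes.items()}   # Eq. (9)
    word_counts = {c: Counter() for c in classes}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    return priors, word_counts, vocab

def predict(tokens, priors, word_counts, vocab):
    """Return V_MAP (Eq. 8) using Laplace-smoothed likelihoods (Eq. 10)."""
    best, best_score = None, -math.inf
    for c, prior in priors.items():
        n = sum(word_counts[c].values())   # total words in category c
        score = math.log(prior) + sum(
            math.log((word_counts[c][w] + 1) / (n + len(vocab))) for w in tokens
        )
        if score > best_score:
            best, best_score = c, score
    return best

train_docs = [
    (["vaksin", "aman", "bagus"], "positive"),
    (["dukung", "vaksin", "aman"], "positive"),
    (["takut", "vaksin", "bahaya"], "negative"),
]
model = train(train_docs)
label = predict(["vaksin", "aman"], *model)
```

The +1 in the likelihood is Laplace smoothing, which keeps a word unseen in one class from zeroing out the whole product in Eq. (7).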

3.4 Design of experiments

There are two phases of experiments conducted in this research: preliminary works and main experiments. In the preliminary works, we experimented with two variables: the effect of using a slang dictionary and the splitting of training and testing data into different ratios. Table 1 shows the design of experiments (DoEs) for the preliminary research. The data used consisted of 4000 tweets crawled on 14 July 2021. The slang word dictionary used for the preliminary works was Dictionary A (Okky Ibrohim), giving six results in total. The results were then analyzed to choose which data splitting would be used in the main experiments.

Data splitting   Using slang dictionary   Not using slang dictionary
60:40            Experiment 1             Experiment 4
70:30            Experiment 2             Experiment 5
80:20            Experiment 3             Experiment 6

Table 1.

DoEs for preliminary research.

The main experiments were then conducted with a different parameter: the slang word dictionary used. Eight experiments were initially designed, as presented in Table 2.

Main experiment   Slang word dictionary
Experiment 1      No slang dictionary
Experiment 2      Dictionary A
Experiment 3      Dictionary B
Experiment 4      Dictionary C
Experiment 5      Dictionary A and B
Experiment 6      Dictionary A and C
Experiment 7      Dictionary B and C
Experiment 8      Dictionary A, B, and C

Table 2.

DoEs for the main experiment.


4. Result and discussion

This section presents the results and discussions from two phases of the research study, that is, preliminary works and main experiments.

4.1 Preliminary works

The preliminary work used tweets crawled with a Python script, with the search query limited to a 50 km radius around the central geocode of Jakarta, Indonesia. Table 3 shows the first five rows of the raw crawled data.

Created at: 2021-07-14 14:31:21 | Location: Jakarta | Username: ndra_833 | Language: in
Text: @ridwanhr @msaid_didu Innalillahiwainnalilahirajiuun Ada kah data2 org yg minggal covid ini sdh divaskin atau blm? Kalau ada brp org sdh vaksin yg mninggal terhitung dari vaksinasi ini di mulai, klo mau detail merk vaksin nya skalian

Created at: 2021-07-14 14:30:52 | Location: Jakarta | Username: Uki23 | Language: in
Text: @BeBuzzerNKRI Dampaknya juga gak signifikan vaksin GR individu karena jumlahnya sedikit. Tapi ekses kecemburuan sosialnya begitu besar. Ekses ini yang bisa membuat penjaga kedai kopi, semir sepatu dan pedagang kecil lainnya terbakar emosinya. Kalau mau berdampak, perusahaan kepada pekerja.

Created at: 2021-07-14 14:30:06 | Location: Jakarta | Username: kompascom | Language: in
Text: Vaksin Covid-19 baru bisa diberikan untuk anak berusia 12-17 tahun. Meski demikian, ada beberapa cara untuk menjaga imunitas anak yang belum divaksin. https://t.co/70tG0Dhbe4

Created at: 2021-07-14 14:29:49 | Location: Jakarta | Username: ari_aditya | Language: in
Text: @detikinet @detikinet bahas dunk apa boleh yg disuntik vaksin merekam video saat penyuntikan vaksin Karena ada beberapa video yg nakes bilang tidak boleh merekam saat proses vaksin Di satu sisi merekam proses vaksin bisa jadi bukti penyuntikan sesuai SOP &amp; sesuai dosis Cc @KemenkesRI @PBIDI

Created at: 2021-07-14 14:29:29 | Location: Jakarta | Username: Yehezkiel_Sound | Language: in
Text: Pak pres. @jokowi mohon pak dibuat peraturan saja wajib pakai sertifikat vaksin untuk semua layanan transportasi. Pasti org yg anti vaksin itu akhirnya minta divaksin.

Table 3.

The first five results of crawled tweets.

Preprocessing was then applied to the scraped tweets. As explained in the methodology section, preprocessing cleans the data to ease and simplify further processing. Two preprocessing variants were performed: (i) tweets cleaned without the slang word dictionary and (ii) tweets cleaned using the slang word dictionary. Table 4 shows the results from both preprocessing pipelines. It can be observed that some non-standard words are left uncovered when the slang dictionary is not used.

Tweet 1
  With slang word dictionary: innalillahiwainnalilahirajiuun kah data2 orang minggal covid sdh divaskin orang sdh vaksin tinggal hitung vaksinasi detail merk vaksin
  Without slang word dictionary: innalillahiwainnalilahirajiuun kah data2 org yg minggal covid sdh divaskin blm brp org sdh vaksin yg mninggal hitung vaksinasi klo detail merk vaksin skalian

Tweet 2
  With slang word dictionary: dampak signifikan vaksin gede individu ekses cemburu sosial ekses jaga kedai kopi semir sepatu dagang bakar emosi dampak usaha kerja
  Without slang word dictionary: dampak gak signifikan vaksin gr individu ekses cemburu sosial ekses jaga kedai kopi semir sepatu dagang bakar emosi dampak usaha kerja

Tweet 3
  With slang word dictionary: vaksin covid 19 anak usia 12 17 jaga imunitas anak vaksin
  Without slang word dictionary: vaksin covid 19 anak usia 12 17 jaga imunitas anak vaksin

Tweet 4
  With slang word dictionary: bahas dunk suntik vaksin rekam video sunti vaksin video tenaga sehat bilang rekam proses vaksin sisi rekam proses vaksin bukti sunti sesuai sop amp sesuai dosis cc pbidi
  Without slang word dictionary: bahas dunk yg suntik vaksin rekam video sunti vaksin video yg nakes bilang rekam proses vaksin sisi rekam proses vaksin bukti sunti sesuai sop amp sesuai dosis cc pbidi

Tweet 5
  With slang word dictionary: pres mohon atur wajib pakai sertifikat vaksin layan transportasi orang anti vaksin vaksin
  Without slang word dictionary: pres mohon atur wajib pakai ifikat vaksin layan transpo asi org yg anti vaksin vaksin

Table 4.

Results from cleaning process of the first five tweets.

Next, results from the preprocessing stage were labeled using a lexicon-based approach, which scores each tweet from the words carrying sentiment values in a dictionary of positive and negative words. The sentiment score is divided into three classes: positive for a score above 0, negative for a score below 0, and neutral for a score of exactly 0. Tables 5 and 6 show the first five tweets labeled with the lexicon-based approach.

Tweet 1
  Text: innalillahiwainnalilahirajiuun kah data2 orang minggal covid sdh divaskin orang sdh vaksin tinggal hitung vaksinasi detail merk vaksin
  Tokenized words: ["innalillahiwainnalilahirajiuun", "kah", "data2", "orang", "minggal", "covid", "sdh", "divaskin", "orang", "sdh", "vaksin", "tinggal", "hitung", "vaksinasi", "detail", "merk", "vaksin"]
  Polarity score: 1 (Positive)

Tweet 2
  Text: dampak signifikan vaksin gede individu ekses cemburu sosial ekses jaga kedai kopi semir sepatu dagang bakar emosi dampak usaha kerja
  Tokenized words: ["dampak", "signifikan", "vaksin", "gede", "individu", "ekses", "cemburu", "sosial", "ekses", "jaga", "kedai", "kopi", "semir", "sepatu", "dagang", "bakar", "emosi", "dampak", "usaha", "kerja"]
  Polarity score: −9 (Negative)

Tweet 3
  Text: vaksin covid 19 anak usia 12 17 jaga imunitas anak vaksin
  Tokenized words: ["vaksin", "covid", "19", "anak", "usia", "12", "17", "jaga", "imunitas", "anak", "vaksin"]
  Polarity score: −7 (Negative)

Tweet 4
  Text: bahas dunk suntik vaksin rekam video sunti vaksin video tenaga sehat bilang rekam proses vaksin sisi rekam proses vaksin bukti sunti sesuai sop amp sesuai dosis cc pbidi
  Tokenized words: ["bahas", "dunk", "suntik", "vaksin", "rekam", "video", "sunti", "vaksin", "video", "tenaga", "sehat", "bilang", "rekam", "proses", "vaksin", "sisi", "rekam", "proses", "vaksin", "bukti", "sunti", "sesuai", "sop", "amp", "sesuai", "dosis", "cc", "pbidi"]
  Polarity score: 10 (Positive)

Tweet 5
  Text: pres mohon atur wajib pakai sertifikat vaksin layan transportasi orang anti vaksin vaksin
  Tokenized words: ["pres", "mohon", "atur", "wajib", "pakai", "sertifikat", "vaksin", "layan", "transportasi", "orang", "anti", "vaksin", "vaksin"]
  Polarity score: −4 (Negative)

Table 5.

First five tweets tokenized and labeled using slang word dictionary.

Tweet 1
  Text: innalillahiwainnalilahirajiuun kah data2 org yg minggal covid sdh divaskin blm brp org sdh vaksin yg mninggal hitung vaksinasi klo detail merk vaksin skalian
  Tokenized words: ["innalillahiwainnalilahirajiuun", "kah", "data2", "org", "yg", "minggal", "covid", "sdh", "divaskin", "blm", "brp", "org", "sdh", "vaksin", "yg", "mninggal", "hitung", "vaksinasi", "klo", "detail", "merk", "vaksin", "skalian"]
  Polarity score: 3 (Positive)

Tweet 2
  Text: dampak gak signifikan vaksin gr individu ekses cemburu sosial ekses jaga kedai kopi semir sepatu dagang bakar emosi dampak usaha kerja
  Tokenized words: ["dampak", "gak", "signifikan", "vaksin", "gr", "individu", "ekses", "cemburu", "sosial", "ekses", "jaga", "kedai", "kopi", "semir", "sepatu", "dagang", "bakar", "emosi", "dampak", "usaha", "kerja"]
  Polarity score: −9 (Negative)

Tweet 3
  Text: vaksin covid 19 anak usia 12 17 jaga imunitas anak vaksin
  Tokenized words: ["vaksin", "covid", "19", "anak", "usia", "12", "17", "jaga", "imunitas", "anak", "vaksin"]
  Polarity score: −7 (Negative)

Tweet 4
  Text: bahas dunk yg suntik vaksin rekam video sunti vaksin video yg nakes bilang rekam proses vaksin sisi rekam proses vaksin bukti sunti sesuai sop amp sesuai dosis cc pbidi
  Tokenized words: ["bahas", "dunk", "yg", "suntik", "vaksin", "rekam", "video", "sunti", "vaksin", "video", "yg", "nakes", "bilang", "rekam", "proses", "vaksin", "sisi", "rekam", "proses", "vaksin", "bukti", "sunti", "sesuai", "sop", "amp", "sesuai", "dosis", "cc", "pbidi"]
  Polarity score: 4 (Positive)

Tweet 5
  Text: pres mohon atur wajib pakai ifikat vaksin layan transpo asi org yg anti vaksin vaksin
  Tokenized words: ["pres", "mohon", "atur", "wajib", "pakai", "ifikat", "vaksin", "layan", "transpo", "asi", "org", "yg", "anti", "vaksin", "vaksin"]
  Polarity score: −4 (Negative)

Table 6.

First five tweets tokenized and labeled without slang word dictionary.
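The lexicon-based labeling used above can be sketched as follows; the lexicon here is a hypothetical stand-in for the Indonesian positive/negative word dictionary actually used:

```python
# Hypothetical sentiment lexicon mapping words to signed scores; the chapter
# used a full Indonesian dictionary of positive and negative words.
LEXICON = {"aman": 2, "jaga": 1, "bahaya": -3, "anti": -2, "cemburu": -2}

def label_tweet(tokens: list[str]) -> tuple[int, str]:
    """Sum the lexicon scores of the tokens and map the total to a polarity:
    > 0 positive, < 0 negative, exactly 0 neutral."""
    score = sum(LEXICON.get(t, 0) for t in tokens)   # unknown words score 0
    polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return score, polarity

print(label_tweet(["vaksin", "aman", "jaga"]))
```

Because unknown words score 0, unnormalized slang that is missing from the lexicon silently drops out of the score, which is exactly why the slang normalization step can shift the polarity totals.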

The tokenized tweets show differences between the version produced with the slang word dictionary and the version produced without it. This leads to different calculations when feature extraction is applied, and it propagates to differences in polarity scores, even though the polarity labels are all the same.

Figure 2 shows the sentiment distribution of the tweet dataset as numbers of tweets and percentages. There is a small difference in the total number of tweets resulting from preprocessing and labeling with the slang word dictionary (1952 tweets) and without it (1958 tweets).

Figure 2.

Sentiment distribution of dataset after preprocessing and labeling.

After the dataset was cleaned and labeled, it entered the feature extraction process, for which TF-IDF with unigram and bigram features was selected. The dataset was then split into training data and testing data, classified using Naïve Bayes, and assessed for performance. Three data splitting ratios, 60:40, 70:30, and 80:20, were used in the experiments, and the performance evaluation results are displayed in Figure 3. Ratio 3 (80:20) shows the best performance among the three.

Figure 3.

Performance evaluation results of the sentiment classification process.
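The data splitting and accuracy evaluation across the 60:40, 70:30, and 80:20 ratios can be sketched as follows; the dataset, seed, and helper names here are illustrative rather than the chapter's actual code:

```python
import random

def split_data(data, train_ratio):
    """Shuffle and split a labeled dataset; 0.8 gives the 80:20 ratio."""
    shuffled = data[:]
    random.Random(42).shuffle(shuffled)   # fixed seed so runs are reproducible
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, actual):
    """Fraction of test instances whose predicted label matches the true one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical labeled tweets standing in for the preprocessed dataset
data = [(f"tweet_{i}", "positive" if i % 2 else "negative") for i in range(100)]
for ratio in (0.6, 0.7, 0.8):             # the 60:40, 70:30, and 80:20 splits
    train, test = split_data(data, ratio)
```

In practice each split would feed the TF-IDF vectorizer and Naïve Bayes classifier, with accuracy, precision, recall, and F1-score computed on the held-out portion.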

Next, main experiments were performed with the following notes:

  1. The data used consisted of approximately 14,000 tweets crawled with the same keywords used in the preliminary works; the dataset from the preliminary work was also included in the main experiment;

  2. There were four slang word dictionaries (three published plus one self-developed) used in the main experiments: Okky Ibrohim's (Dict. A) with 15,167 words, Louis Owen's (Dict. B) with 1026 words, Rama Prakoso's (Dict. C) with 1319 words, and our own dictionary (Dict. D) with 882 words;

  3. The main experiment covered 16 different experiments, as presented in Table 7. These were all conducted at the 80:20 data splitting ratio, as the preliminary works showed that this ratio gave the best performance.

Experiment no.   Dictionary used
Exp. 1           No dictionary
Exp. 2           Dictionary A
Exp. 3           Dictionary B
Exp. 4           Dictionary C
Exp. 5           Dictionary D
Exp. 6           Dictionary AB
Exp. 7           Dictionary AC
Exp. 8           Dictionary AD
Exp. 9           Dictionary BC
Exp. 10          Dictionary BD
Exp. 11          Dictionary CD
Exp. 12          Dictionary ABC
Exp. 13          Dictionary ABD
Exp. 14          Dictionary ACD
Exp. 15          Dictionary BCD
Exp. 16          Dictionary ABCD

Table 7.

DoE of the main experiment.

Performance evaluation covering accuracy, precision, recall, and F1-score was done for each of the 16 experiments in the main phase. In addition, the computation time of each experiment was monitored. It took approximately 1 hour (59 minutes and 26 seconds) to complete Experiment 1 and less than 1 hour (41 minutes and 55 seconds) to complete Experiment 16. This shows that using a slang word dictionary in the preprocessing of the data can reduce the total computation time required.

The performance evaluation results in Figure 4 show that using a slang word dictionary can improve the accuracy of the sentiment classification process. Experiment 1, which did not use a slang word dictionary, has the lowest accuracy of all experiments. These results are also quite promising compared to a recent study [38], which reported 71.97% as its highest accuracy for sentiment analysis of tweets in social networks.

Figure 4.

Performance evaluation results from the main experiment.

An ANOVA test was then conducted to analyze whether the experimental results show a significant difference between treatments [39]. Since only one factor, the slang word dictionary, was varied in the experiments, single-factor ANOVA was used for the analysis. Results from the ANOVA analysis are presented in Table 8.

Source of variation   SS            df   MS            F            p-value       F-crit
Between groups        0.006469437   15   0.000431296   19.1557314   4.04253E−18   1.82558574
Within groups         0.001440975   64   0.000022515
Total                 0.007910412   79

Table 8.

The ANOVA results from the main experiment.

Data in Table 8 shows that the p-value, 4.04253E−18, is far below the significance level (0.05) used for the ANOVA. It can therefore be concluded that the null hypothesis H0 is rejected and the alternative hypothesis H1 is accepted.

H0: μ1 = μ2 = … = μm    (null hypothesis)
H1: μi ≠ μj for at least one pair i ≠ j    (alternative hypothesis)    (E11)

Where H0: there is no significant difference in treatment of dictionary in sentiment analysis and H1: there is a significant difference in treatment of dictionary in sentiment analysis.
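As a cross-check, the F statistic in Table 8 can be recomputed from its sums of squares and degrees of freedom (F is the ratio of the between-groups to the within-groups mean square); the p-value and F-critical value would normally come from the F distribution with (15, 64) degrees of freedom:

```python
# Sums of squares and degrees of freedom taken from Table 8
ss_between, df_between = 0.006469437, 15
ss_within, df_within = 0.001440975, 64

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_stat = ms_between / ms_within        # F = MS_between / MS_within
```

Since the resulting F (about 19.16) far exceeds the critical value of about 1.83, the between-groups variation cannot be explained by chance at the 0.05 level.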

In the next step, since there is a significant difference between the dictionary groups, the least significant difference (LSD) test can be conducted to identify which groups differ significantly [40]. The test is calculated via the following formula, which applies because the same number of repetitions was performed in each experiment.

LSD = t(v, α) × √(2 × MSE / r)
    = 1.997729654 × √(2 × 0.000022515 / 5)
    = 1.997729654 × 0.003001033
    = 0.005995218    (E12)

Where t(v, α) is the t value at v = 64 degrees of freedom and α = 0.05, MSE is the within-groups mean square from Table 8, and r = 5 is the number of repetitions per experiment.
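Eq. (12) can be reproduced directly; the t value is the tabulated t(64, 0.05), and r = 5 repetitions per experiment follows from 80 observations spread over 16 groups:

```python
import math

t_value = 1.997729654   # t(v = 64, alpha = 0.05), as used in Eq. (12)
mse = 0.000022515       # within-groups mean square from Table 8
r = 5                   # repetitions per experiment: 80 observations / 16 groups

# LSD = t(v, alpha) * sqrt(2 * MSE / r)
lsd = t_value * math.sqrt(2 * mse / r)
```

Two group means are then declared significantly different whenever they differ by more than this LSD value of roughly 0.006.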

Table 9 shows the usage of the LSD as well as the notation labeling for finding the significant difference among the group of experiments.

Groups          Average        Average + LSD   Notation
Experiment 1    0.729273163    0.735268        a
Experiment 2    0.733774834    0.73977         ab
Experiment 3    0.738109573    0.744105        b
Experiment 5    0.746944543    0.75294         c
Experiment 10   0.746944543    0.75294         c
Experiment 15   0.750900742    0.756896        cd
Experiment 11   0.75132462     0.75732         cde
Experiment 9    0.751465913    0.757461        cdef
Experiment 4    0.753903214    0.759898        defg
Experiment 7    0.755351466    0.761347        defgh
Experiment 6    0.756481809    0.762477        defghi
Experiment 13   0.756870364    -               defghi
Experiment 12   0.756976333    -               efghi
Experiment 8    0.757541505    -               ghi
Experiment 14   0.758106676    -               ghi
Experiment 16   0.76156835     -               i

Table 9.

Results of LSD test of the main experiment.

The results in Table 9 show that Experiment 16 differs significantly from the other groups. The notation determines which experiments differ significantly: for example, Experiment 13 carries the notation "defghi," meaning that any other experiment sharing one of those letters does not differ significantly from Experiment 13. As another example, Experiment 1 has the notation "a" while Experiment 16 has the notation "i," so the two experiments are significantly different from each other. Even though Experiment 16 shares the notation "i" with Experiments 6, 13, 12, 8, and 14, it differs significantly from all experiments carrying only the notations "a" through "h," and it is the experiment with the highest accuracy.

Regarding the combination of dictionaries used in the research, the difference in accuracy is clear: Experiment 1 achieved 72.92% accuracy, while Experiment 16 achieved 76.15%. In general, the more dictionary words used, the higher the accuracy of the sentiment result. However, comparing Dictionary A (Okky Ibrohim), with 15,167 words, to our own dictionary of only 882 words raises a question, since, for example, Experiment 12 (Dictionary ABC) reached an accuracy of 75.697% while Experiment 15 (Dictionary BCD) reached 75.090%. Our further analysis showed that Dictionary A is outdated, containing slang terms rarely used nowadays, although it still covers some common slang words that remain in use. For Dictionary D, the slang words were taken manually from the raw crawled data, vetted by annotators, and their meanings translated with the help of the annotators and KBBI (Kamus Besar Bahasa Indonesia). Even though Dictionary D is only about one-tenth the size of Dictionary A, it contains around 270 unique words not found in Dictionary A, which helps make the preprocessing more accurate.

The above discussion shows that preprocessing with slang word dictionaries significantly improved the performance of the sentiment analysis. However, it should be highlighted that the quality of a dictionary's slang word collection affects the size of that improvement. The research was limited to four slang word dictionaries in Bahasa Indonesia, each with a limited word collection; determining the optimum size of the slang word collection to use in the preprocessing stage is a challenge that could significantly contribute to sentiment analysis performance. Another limitation that could be expanded on is the machine learning algorithm used; it would also be interesting to find out how combinations of different algorithms and slang word dictionaries contribute to the performance of the sentiment analysis.


5. Conclusion

This study has shown that sentiment analysis can be performed well using the Naïve Bayes Classifier combined with TF-IDF for feature extraction. Moreover, it has also been shown that the number of instances in the dataset affects the performance of the sentiment analysis. In the preliminary stage, with the same 80:20 data splitting, the accuracy score was 64.796%, while in the main experiment, with a much larger number of instances, the lowest accuracy score was 73.722%, an improvement of about 8.926 percentage points.

Another highlight from this study is how including a slang word dictionary in the preprocessing contributed to the improvement of the sentiment analysis performance. The experiment without any dictionary and the one with all of the dictionaries combined gave different evaluation scores, with an improvement from 73.722% in Experiment 1 to 76.248% in Experiment 16, an increment of 2.526 percentage points in accuracy. In addition, the total time required for the complete sentiment analysis process was significantly reduced, from a computation time of 59 minutes and 26 seconds without the slang word dictionary to 41 minutes and 55 seconds with it.

References

  1. Drus Z, Khalid H. Sentiment analysis in social media and its application: Systematic literature review. Procedia Computer Science. 2019;161:707-714. DOI: 10.1016/j.procs.2019.11.174
  2. Wang Y, Guo J, Yuan C, Li B. Sentiment analysis of Twitter data. Applied Sciences. 2022;12:11775. DOI: 10.3390/app122211775
  3. Heikal M, Torki M, El-Makky N. Sentiment analysis of Arabic tweets using deep learning. Procedia Computer Science. 2018;142:114-122. DOI: 10.1016/j.procs.2018.10.466
  4. Bouazizi M, Ohtsuki T. Multi-class sentiment analysis on Twitter: Classification performance and challenges. Big Data Mining and Analytics. 2019;2(3):181-194. DOI: 10.26599/BDMA.2019.9020002
  5. Jianqiang Z, Xiaolin G. Comparison research on text pre-processing methods on Twitter sentiment analysis. IEEE Access. 2017;5:2870-2879. DOI: 10.1109/access.2017.2672677
  6. Rahayu DA, Kuntur S, Hayatin N. Sarcasm detection on Indonesian twitter feeds. Proceeding of the Electrical Engineering Computer Science and Informatics. 2018;5(5):137-141. DOI: 10.11591/eecsi.v5i5.1724
  7. Singh T, Kumari M. Role of text pre-processing in Twitter sentiment analysis. Procedia Computer Science. 2016;89:549-554. DOI: 10.1016/j.procs.2016.06.095
  8. Maylawati DS, Zulfikar WB, Slamet C. An improved stemming algorithm for mining Indonesian text with slang on social media. In: 6th International Conference on Cyber and IT Service Management (CITSM). 2018
  9. Yunitasari Y, Musdholifah A, Sari AK. Sarcasm detection for sentiment analysis in Indonesian tweets. Indonesian Journal of Computing and Cybernetics Systems. 2019;13:53-62. DOI: 10.22146/ijccs.41136
  10. Adriani M, Asian J, Nazief B, Tahaghoghi SM, Williams HE. Stemming Indonesian: A confix-stripping approach. ACM Transactions on Asian Language Information Processing. 2007;6(4):1-33. DOI: 10.1145/1316457.1316459
  11. Nuritha I, Arifiyanti AA, Widartha VP. Analysis of public perception on organic coffee through text mining approach using Naive Bayes classifier. In: East Indonesia Conference on Computer and Information Technology (EIConCIT). 2018. pp. 153-158
  12. Adarsh MJ, Ravikumar P. Sarcasm detection in text data to bring out genuine sentiments for sentimental analysis. In: 2019 1st International Conference on Advances in Information Technology (ICAIT). 2019. DOI: 10.1109/icait47043.2019.8987393
  13. Ferdiana R, Jatmiko F, Purwanti DD, Ayu AS, Dicka WF. Dataset Indonesia untuk Analisis Sentimen. Jurnal Nasional Teknik Elektro dan Teknologi Informasi (JNTETI). 2019;8(4):334-339. DOI: 10.22146/jnteti.v8i4.533
  14. Fitri VA, Andreswari R, Hasibuan MA. Sentiment analysis of social media Twitter with case of anti-LGBT campaign in Indonesia using Naïve Bayes, decision tree, and random forest algorithm. Procedia Computer Science. 2019;161:765-772
  15. Mandloi L, Patel R. Twitter sentiments analysis using machine learning methods. In: International Conference for Emerging Technology (INCET). 2020. pp. 1-5
  16. Casas I, Delmelle EC. Tweeting about public transit-gleaning public perceptions from a social media microblog. Case Studies on Transport Policy. 2017;5(4):634-642. DOI: 10.1016/j.cstp.2017.08.004
  17. Mora K, Chang J, Beatson A, Morahan C. Public perceptions of building seismic safety following the Canterbury earthquakes: A qualitative analysis using Twitter and focus groups. International Journal of Disaster Risk Reduction. 2015;13:1-9. DOI: 10.1016/j.ijdrr.2015.03.008
  18. Klašnja M, Barberá P, Beauchamp N, Nagler J, Tucker JA. Measuring public opinion with social media data. In: Atkeson LR, Alvarez RM, editors. The Oxford Handbook of Polling and Survey Methods. Oxford Academic; 2018. pp. 555-582. DOI: 10.1093/oxfordhb/9780190213299.013.3
  19. Al-Thubaity A, Alqahtani Q, Aljandal A. Sentiment lexicon for sentiment analysis of Saudi dialect tweets. Procedia Computer Science. 2018;142:301-307. DOI: 10.1016/j.procs.2018.10.494
  20. Mukhtar N, Khan MA, Chiragh N. Lexicon-based approach outperforms supervised machine learning approach for Urdu sentiment analysis in multiple domains. Telematics and Informatics. 2018;35(8):2173-2183. DOI: 10.1016/j.tele.2018.08.003
  21. Wu L, Morstatter F, Liu H. SlangSD: Building and using a sentiment dictionary of slang words for short-text sentiment classification. Language Resources and Evaluation. 2018;52(3):839-852. DOI: 10.1007/s10579-018-9416-0
  22. Salsabila NA, Winatmoko YA, Septiandri AA. Colloquial Indonesian lexicon. In: 2018 International Conference on Asian Language Processing (IALP). 2018. pp. 226-229. DOI: 10.1109/ialp.2018.8629151
  23. Muliady W, Widiputra H. Generating Indonesian slang lexicons from Twitter. In: 2012 2nd International Conference on Uncertainty Reasoning and Knowledge Engineering. 2012. pp. 123-126. DOI: 10.1109/urke.2012.6319524
  24. Vieira S, Pinaya WH, Mechelli A. Introduction to machine learning. In: Mechelli A, Vieira S, editors. Machine Learning. Academic Press; 2020. pp. 1-20. DOI: 10.1016/b978-0-12-815739-8.00001-8
  25. Yeturu K. Machine learning algorithms, applications, and practices in data science. In: Srinivasa Rao ASR, Rao CR, editors. Handbook of Statistics Principles and Methods for Data Science. Elsevier; 2020. pp. 81-206. DOI: 10.1016/bs.host.2020.01.002
  26. Jianqiang Z, Xiaolin G, Xuejun Z. Deep convolution neural networks for Twitter sentiment analysis. IEEE Access. 2018;6:23253-23260. DOI: 10.1109/access.2017.2776930
  27. Singh S, Pareek A, Sharma A. Twitter sentiment analysis using rapid miner tool. International Journal of Computer Applications. 2019;177(16):44-50. DOI: 10.5120/ijca2019919604
  28. Bouazizi M, Ohtsuki T. A pattern-based approach for multi-class sentiment analysis in Twitter. IEEE Access. 2017;5:20617-20639. DOI: 10.1109/access.2017.2740982
  29. Zimmer M, Proferes N. A topology of Twitter research: Disciplines, methods, and ethics. Aslib Journal of Information Management. 2014;66(3):250-261. DOI: 10.1108/ajim-09-2013-0083
  30. Guo X, Li J. A novel twitter sentiment analysis model with baseline correlation for financial market prediction with improved efficiency. In: Proceedings of the Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS), Granada, Spain, 22-25 October 2019. 2019. pp. 472-477
  31. Harapan H, Itoh N, Yufika A, Winardi W, Keam S, Te H, et al. Coronavirus disease 2019 (COVID-19): A literature review. Journal of Infection and Public Health. 2020;13:667-673
  32. Kumar D, Malviya R, Sharm PK. Corona virus: A review of COVID-19. Eurasian Journal of Medicine and Oncology. 2020;4(10):8-25. DOI: 10.14744/ejmo.2020.51418
  33. Vieira CM, Franco OH, Restrepo CG, Abel T. COVID-19: The forgotten priorities of the pandemic. Maturitas. 2020;136:38-41. DOI: 10.1016/j.maturitas.2020.04.004
  34. WHO. 2019 Novel Coronavirus (2019-nCoV) Strategic Preparedness and Response Plan for the South-East Asia Region. 2020. pp. 1-22. Retrieved from World Health Organization
  35. Nicola M, Alsafi Z, Sohrabi C, Kerwan A, Al-Jabir A, Iosifidis C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): A review. International Journal of Surgery. 2020;78:185-193. DOI: 10.1016/j.ijsu.2020.04.018
  36. Chen L, Liu Y, Chang Y, Wang X, Luo X. Public opinion analysis of novel coronavirus from online data. Journal of Safety Science and Resilience. 2020;1(2):120-127. DOI: 10.1016/j.jnlssr.2020.08.002
  37. Ibrohim O, Budi I. Multi label hate speech and abusive language detection in Indonesian Twitter. In: ALW3: 3rd Workshop on Abusive Language Online. 2019. pp. 46-57
  38. AminiMotlagh M, Shahhoseini H, Fatehi N. A reliable sentiment analysis for classification of tweets in social networks. Social Network Analysis and Mining. 2023;13:7. DOI: 10.1007/s13278-022-00998-2
  39. Alassaf M, Qamar AM. Improving sentiment analysis of Arabic tweets by one-way ANOVA. Journal of King Saud University - Computer and Information Sciences. 2020;34(6):2849-2859. DOI: 10.1016/j.jksuci.2020.10.023
  40. Williams LJ, Abdi H. Fisher’s least significant difference (LSD) test. In: Salkind N, editor. Encyclopedia of Research Design. Thousand Oaks: Sage; 2010. DOI: 10.4135/9781412961288.n154
