Twitter Sentiment Analysis With Textblob
ISSN No:-2456-2165
Sneh Vora
Department of Computer Science and Engineering,
Devang Patel Institute of Advance Technology and Research (DEPSTAR),
Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT)
Abstract:- Many social media sites exist in the world, such as Twitter, Instagram, Facebook, and Snapchat, and the data that people post on these sites, including audio, video, text, and images, is increasing quickly. People use these sites to share their thoughts and opinions, sometimes about a company. For this reason, we have chosen Twitter and applied sentiment analysis to it. In this paper, we discuss a method of data extraction through the API, data cleaning, and the use of the TextBlob Python library for sentiment analysis.

Keywords:- Twitter, sentiment, opinion mining, social media, natural language processing, sentiment analysis.

I. INTRODUCTION

Social media websites carry a sea of data. Social media sites have given a voice to every person who can access or use them. Twitter is used by a large number of people to write about their emotions and their opinions on daily life, as well as reviews and opinions about a company or any organization. Twitter is challenging because its users have to express their views in one or two key sentences, and these tweets can be seen as a good indication of reactions to what is happening around the world. Sentiment analysis automates the extraction or classification of sentiment and views using text analysis, natural language processing, and computational approaches. Sentiment analysis benefits many fields, such as customer information, marketing, books, mobile applications, social media, and websites. Many companies hire analysts whose job is to extract the emotions of people behind these posts or tweets. This helps businesses get good feedback on a product or service, which helps them understand public opinion and, in addition, make better products in the future. In this project, we use the Python TextBlob library for text classification. There are two ways to extract tweets: using Twitter's official API or using a data scraper. For this project, we preferred the API for collecting datasets.

II. LITERATURE REVIEW

In October 2017, Kirti Huda, Mrunmayee Deshpande, and Neshat Karim gave information on classification techniques for sentiment analysis of Twitter data. In classification, there are mainly three techniques: Naive Bayes, maximum entropy (ME), and SVM. The authors use a pattern-based technique for feature extraction. For feature extraction, the n-gram algorithm is used, which assigns a priority to each word that needs to be classified. In the last step of classification, they use a support vector machine (SVM). After the classification, they conclude that the E-pattern-based algorithm gives a more accurate result than the pattern-based algorithm and that it also takes less execution time.

In June 2017, Shivam Singh, Sonal Agarwal, and Sakshi Agarwal proposed a real-time Twitter sentiment analysis. In this research, they use Hadoop with natural language processing. The first stage is the ingestion of tweets into HDFS; in this stage, tweets are ingested from the Twitter stream using the Twitter4J API. The second stage is post-processing, construction of n-grams, and spelling correction. The third stage is query processing using Hive once the tweets have been ingested into HDFS. Excel uses the ODBC driver to get the processed data in the form of graphs, geographical locations, and chart-based data, because the culture and diversity of a location matter very much.

In February 2018, Sahar A. El_Rahman, Feddah Alhumaidi AlOtaibi, and Wejdan Abdullah AlShehri proposed a sentiment analysis of Twitter data. The authors used sentiment analysis to classify English tweets about two famous restaurants, McDonald's and KFC. In this method, they use several packages and libraries, including twitteR, ROAuth, and wordcloud. After preparing the tweets, they used an unsupervised learning approach, a lexicon-based model, to classify the tweets. To train the model, they use different supervised algorithms: Naive Bayes, SVM, random forest, decision tree, and maximum entropy. To evaluate accuracy, they use recall, precision, F-score, and cross-validation.

In September 2019, Brahmananda Reddy, D.N. Vasundhara, and P. Subhash proposed sentiment research on Twitter data. This system was completed in seven stages. In this system, they address the drawbacks by classifying emotions into seven categories, for example Strongly Positive, Positive, and Weakly Positive, in order to understand the emotions better. Instead of static data, they use real-time data from the Twitter API: by giving a username or hashtag, they can look at a specific person's tweets or a specific hashtag. In this research, the authors use a Naive Bayes classifier, with video game review data sets for training and testing. Bayes' theorem is used in the Naive Bayes technique, which uses a probabilistic learning function.
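
Several of the surveyed approaches rely on n-grams as features. As a rough illustration only, word-level n-grams can be built as in the sketch below. The function name `word_ngrams` and the lowercase, whitespace-based tokenization are assumptions made for this example; none of the cited papers specifies its tokenizer here.

```python
from typing import List, Tuple


def word_ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """Return the word-level n-grams of `text`, in order of appearance.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not the method of any cited paper.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    tokens = text.lower().split()
    # Slide a window of length n over the token list; a text shorter
    # than n tokens yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    # Bigrams keep short negations such as "not good" together.
    print(word_ngrams("The food was not good", 2))
```

Bigrams such as ("not", "good") preserve a negation that unigrams alone would split into two unrelated words.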
V. OUTPUT DISCUSSION

First, the user has to provide the Twitter user ID or the hashtag whose tweets they want to analyze. If the user provides both a user ID and a hashtag at the same time, the analysis will not work.
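
The input rule above and the polarity-based classification can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code. The helper names `select_target`, `label_from_polarity`, and `tweet_sentiment` are hypothetical, and the zero threshold separating positive, neutral, and negative is an assumption. The only TextBlob call used is `TextBlob(text).sentiment.polarity`, which returns a float between -1.0 and 1.0.

```python
from typing import Optional


def select_target(user_id: Optional[str] = None,
                  hashtag: Optional[str] = None) -> str:
    """Return the search target, requiring exactly one of the two inputs."""
    # Reject the case where both or neither of the inputs are given.
    if bool(user_id) == bool(hashtag):
        raise ValueError("provide either a user ID or a hashtag, not both")
    if user_id:
        return "@" + user_id.lstrip("@")
    return "#" + hashtag.lstrip("#")


def label_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a label.

    The zero threshold is an assumption made for this sketch.
    """
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def tweet_sentiment(text: str) -> str:
    """Label one tweet using TextBlob's polarity score."""
    # Third-party dependency (pip install textblob), imported lazily so
    # that the helpers above work without it.
    from textblob import TextBlob
    return label_from_polarity(TextBlob(text).sentiment.polarity)
```

For example, `select_target(hashtag="python")` returns `"#python"`, while passing both a user ID and a hashtag raises a `ValueError`, which mirrors the restriction described above.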
| No. | Year and authors | Methods | Accuracy | Remarks |
|---|---|---|---|---|
| 1 | 2017, Md Tabrez Nafis, Kirti Huda, Neshat Karim Shaukat | N-gram, SVM | 90% | The proposed and current algorithms are compared in terms of execution time. It has been determined that the upgraded pattern-based algorithm takes less time to execute. |
| 2 | June 2017, Shivam Singh, Sonal Agarwal, and Sakshi Agarwal | Hadoop with NLP | 85% | These analyses can facilitate the process of decision making in various areas such as health care analysis, market analysis, weather forecasting, advertising analysis, fraud detection, traffic flow optimization, etc. |
| 3 | 2018, Ahmad Karim, Ali Hasan, Sana Moin, Shahaboddin Shamshirband | Naive Bayes, SVM, Maximum Entropy, Decision Tree, Random Forest | 85% | The classification accuracy of the feature vector is tested. Naive Bayes is a simple classification algorithm. |
| 4 | 2018, Ahmad Karim, Ali Hasan, Sana Moin, Shahaboddin Shamshirband | TextBlob, WordCloud | 62.67% | According to the experimental results, TextBlob has the highest accuracy in comparison to W-WSD. |
| 5 | Feb 2019, Faizan | Regular expressions, K-nearest neighbour algorithm | 65.33% | The accuracy of the model can be increased by using different deep learning techniques such as neural networks. |
| 6 | March 2019, Hetu Bhavsar, Richa Manglani | SVM, Decision Tree, and AdaBoost Decision Tree based hybrid sentiment classification model | 95% | It must also extract valuable text properties such as bigrams, which are more beneficial in sentiment analysis; extracting properties such as unigrams does not give more accurate sentiment analysis. |
| 7 | 2019, Vishal A. Kharde, S.S. Sonawane | Machine learning and lexicon-based methods, SVM, Naive Bayes | 74.56% | The most accurate learning methods are Naive Bayes and SVM, which can be considered the baseline, whereas lexicon-based algorithms can be quite useful in particular instances. |
| 9 | February 2020, Dr. KB Priya Iyer and Dr. Shakti Kumaresh | Naive Bayes classification, machine learning algorithms | 70% | As the virus is spreading vigorously, the study needs to be carried out every week to have a better understanding of the sentiments of the people. |
| 10 | 2020, Ankita Sharma, Udayan Ghose | R language, RapidMiner | 87% | All sentences are checked at the sentence level for polarity, which may be negative or positive; a mixed opinion may or may not be considered for a sentence. |

Table 1: Methods used by authors for solving similar problems and the accuracy achieved.
VII. LIMITATIONS

The tweets that we collected for this project were in English, which is a limitation of the project because many tweets are written in other languages. To extract the tweets, we use the Twitter API, which only allows collecting tweets from the last 7 days.

VIII. FUTURE WORK

In the future, we will work on how the accuracy of this sentiment analysis can be further increased so that users get a 99.99% accurate result. As the current result is based on only one language, we can increase the number of languages supported for sentiment analysis.

IX. CONCLUSION

Real-time, wide-area applications of sentiment analysis (opinion mining) have many research limitations. With the fast growth of the internet and internet-related applications, sentiment analysis has become one of the most interesting research areas for the natural language processing community. In our project, we analyzed the sentiment of tweets extracted from Twitter and classified them according to their polarity.

Major limitations of sentiment analysis:
 Spam and fake news detection.
 Classification filtering limitation.
 Limited languages available.