Minor Project Report grp 11 (2)
on
Submitted to
BY
ABHISHEK PANI 21051192
SHIRSHAK PATTNAIK 21052360
ABHIJEET PANI 21052552
BARENYA NAYAK 21052577
CHANDRAKANTA MEHER 21052580
CERTIFICATE
This is to certify that the project entitled
is a record of bonafide work carried out by them, in partial fulfillment of the
requirement for the award of the Degree of Bachelor of Engineering (Computer Science
& Engineering) at KIIT Deemed to be University, Bhubaneswar. This work was done
during the year 2023-2024, under our guidance.
Date: 31/03/24
Project Guide:
Mr. Sourav Kumar Giri
Acknowledgements
Overall, this study showcases the utility of Naive Bayes and Logistic
Regression models in Twitter sentiment analysis, highlighting their
potential applications in diverse domains such as marketing, public
opinion research, and brand management.
1 Introduction
2 Basic Concepts/ Literature Review
Individual Contribution
Plagiarism Report
List of Figures
Twitter Sentiment Analysis
Chapter 1
Introduction
Millions of people use Twitter to express emotions such as happiness,
sadness, and anger. Sentiment analysis aims to detect the emotions,
opinions, assessments, and attitudes expressed in text, treating them as a
reflection of how people think. It classifies text into classes such as
positive or negative. Industries are increasingly interested in applying
semantic analysis to textual data to extract people's views about their
products and services. Sentiment analysis helps them gauge customer
satisfaction so that they can improve their services accordingly. To obtain
text data, they turn to social media platforms. Many social media sites,
such as Google Plus, Facebook, and Twitter, allow users to express
opinions, views, and emotions about particular topics and events. Among
these, the microblogging site Twitter is expanding rapidly, with about
200 million users. Twitter was founded in 2006 and is currently the most
famous microblogging platform. In 2017, 2 million users shared 8.3 million
tweets in one hour. Twitter users post their thoughts, emotions, and
messages on their profiles; these posts are called tweets. A single tweet
was originally limited to 140 characters. Twitter sentiment analysis is
based on the field of natural language processing (NLP). For tweet text, we
use NLP techniques such as tokenizing the words and removing stop words
like I, me, my, our, your, is, and was. NLP also plays a part in
preprocessing the data, for example cleaning the text and removing special
characters and punctuation marks. Sentiment analysis is important because
it reveals trends in people's emotions on specific topics through their
tweets.
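The preprocessing steps described above (tokenizing, removing stop words, and stripping special characters and punctuation) can be sketched in a few lines of Python. This is a minimal illustration only, not the project's actual code: the function name preprocess_tweet, the short STOP_WORDS set, and the sample tweet are all assumptions made for the example, and a real system would likely use a fuller stop-word list from an NLP library.

```python
import re

# Illustrative stop-word list (an assumption for this sketch; a real
# project would likely use a fuller list from an NLP library).
STOP_WORDS = {"i", "me", "my", "our", "your", "is", "was", "the", "a", "an", "and", "to", "of"}

def preprocess_tweet(text):
    """Lowercase a tweet, strip URLs, mentions, and punctuation, then drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits, punctuation, special characters
    tokens = text.split()                       # simple whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]

# Hypothetical sample tweet
print(preprocess_tweet("I LOVED my new phone!!! Thanks @shop https://example.com #happy"))
# → ['loved', 'new', 'phone', 'thanks', 'happy']
```

Whitespace splitting is the simplest possible tokenizer; it is used here only to keep the sketch free of external dependencies.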
Chapter 2
Basic Concepts/ Literature Review
This chapter provides an overview of the basic concepts, tools, and techniques
employed in the project on Twitter sentiment analysis. The section encompasses a
literature review to contextualize the project within the existing body of research.
Chapter 3
Problem Statement / Requirement Specifications
3.1 Problem Statement
The project will utilize the following working environment and tools:
The diagram outlines the sequential steps involved in the sentiment analysis
process, including data collection, preprocessing, feature extraction, model
training, and sentiment classification. Each component interacts with the
next in a cohesive pipeline to achieve the overall objective of sentiment
analysis on Twitter data.
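To make the pipeline stages concrete, the following is a minimal, dependency-free sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, one of the two model families named in this report. It is not the project's implementation: the function names, the four hand-labelled training tweets (standing in for collected Twitter data), and the whitespace tokenizer are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Preprocessing and feature extraction: lowercase, split on whitespace."""
    return text.lower().split()

def train_naive_bayes(samples):
    """Model training: count documents and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return class_counts, word_counts, vocab

def classify(model, text):
    """Sentiment classification: return the class with the highest log-posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Data collection stand-in: hypothetical hand-labelled tweets.
TRAIN = [
    ("i love this phone", "positive"),
    ("what a great happy day", "positive"),
    ("i hate this service", "negative"),
    ("terrible awful experience", "negative"),
]

model = train_naive_bayes(TRAIN)
print(classify(model, "love this great phone"))  # → positive
print(classify(model, "awful service"))          # → negative
```

Each function corresponds to one stage of the pipeline, so a stage can be swapped out (for example, a different tokenizer or a Logistic Regression model) without touching the others.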
Chapter 4
Implementation
4.1 Methodology or Proposal
● Data Collection: Twitter data was collected using the Twitter API
based on specific search queries or hashtags related to the topics of interest.
In this section, we outline the testing or verification plan for evaluating the
sentiment analysis models developed as part of the project. The test case
titled "Model Evaluation" focuses on assessing the performance of the
trained models in predicting sentiment for tweets in a designated test
dataset.
● System Behavior: The system will predict sentiment for each tweet
in the test dataset using the loaded models.
During the evaluation process, the sentiment analysis models will analyze
the text of each tweet in the test dataset and classify it into one of the
predefined sentiment categories—positive, negative, or neutral. This
behavior represents the core functionality of the models, which is to
accurately categorize tweets based on their sentiment.
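The "Model Evaluation" test case can be illustrated with a short sketch that compares predicted labels against the expected labels of a test dataset and reports accuracy plus per-class precision and recall. Everything named here is hypothetical: evaluate, toy_predict (a keyword rule standing in for a loaded model), and the five-tweet TEST_SET are assumptions for the example and do not reproduce the project's models or results.

```python
from collections import Counter

def evaluate(predict, test_set, labels=("positive", "negative", "neutral")):
    """Return accuracy and per-class precision/recall for a predict(text) callable."""
    tp, fp, fn = Counter(), Counter(), Counter()
    correct = 0
    for text, expected in test_set:
        predicted = predict(text)
        if predicted == expected:
            correct += 1
            tp[expected] += 1
        else:
            fp[predicted] += 1  # predicted class gained a false positive
            fn[expected] += 1   # expected class gained a false negative
    report = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        report[label] = {
            "precision": tp[label] / p_den if p_den else 0.0,
            "recall": tp[label] / r_den if r_den else 0.0,
        }
    return correct / len(test_set), report

def toy_predict(text):
    """Hypothetical stand-in for a loaded model: a simple keyword rule."""
    words = text.lower().split()
    if "love" in words or "great" in words:
        return "positive"
    if "hate" in words or "bad" in words:
        return "negative"
    return "neutral"

# Hypothetical labelled test tweets
TEST_SET = [
    ("i love it", "positive"),
    ("great game tonight", "positive"),
    ("i hate mondays", "negative"),
    ("not great at all", "negative"),
    ("the meeting is at noon", "neutral"),
]

accuracy, report = evaluate(toy_predict, TEST_SET)
print(accuracy)                      # → 0.8
print(report["negative"]["recall"])  # → 0.5
```

The fourth tweet is misclassified because the keyword rule ignores negation, which shows why per-class recall is worth reporting alongside overall accuracy.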
● Regular code reviews and peer feedback to ensure code quality and
adherence to best practices.
Chapter 5
Standards Adopted
Chapter 6
Furthermore, our evaluation and analysis have not only demonstrated the
performance and robustness of the sentiment analysis system but also shed
light on potential ethical considerations, biases, and real-world applications.
By addressing these issues and continuously seeking improvements, we can
ensure the responsible deployment and utilization of sentiment analysis
technology in diverse domains.