A
Micro-Project
Report
On
Sentiment Analysis On Twitter Review
Semester - V
Foundation of AI and ML (4351601)
[October, 2024]
GOVERNMENT POLYTECHNIC, HIMATNAGAR
DEPARTMENT OF INFORMATION TECHNOLOGY
CERTIFICATE
prescribed in curriculum.
1. Introduction:
1. Data Collection:
Collect tweets using the Twitter API or from a pre-existing dataset. Tweets can be filtered
based on keywords, hashtags, or user handles.
2. Data Preprocessing:
Tokenization: Splitting the text into individual words or tokens.
Stopword Removal: Removing common words that do not contribute to the sentiment (e.g.,
“and”, “the”).
Stemming/Lemmatization: Reducing words to their base or root form.
Cleaning: Removing URLs, mentions, hashtags, and special characters.
3. Feature Extraction:
Convert the cleaned text into numerical features that can be used by machine learning
algorithms. Common methods include Bag of Words (BoW), Term Frequency-Inverse
Document Frequency (TF-IDF), and word embeddings.
4. Model Training:
Train a machine learning model using labeled data (tweets with known sentiments).
Common algorithms include Naive Bayes, Support Vector Machines (SVM), and deep
learning models like LSTM and BERT.
5. Sentiment Classification:
Use the trained model to classify the sentiment of new tweets as positive, negative, or
neutral.
6. Evaluation:
Evaluate the model’s performance using metrics such as accuracy, precision, recall, and F1-score.
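The preprocessing and Bag of Words steps above can be sketched with the standard library alone. This is a minimal illustration, not the report's actual code: the helper names (`clean_text`, `tokenize`, `remove_stopwords`, `preprocess`, `bag_of_words`), the tiny stopword set, and the sample tweet are all assumptions made for the example. Plain whitespace splitting stands in for NLTK's `word_tokenize`, and stemming/lemmatization is left out.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword set for illustration only.
# NLTK's stopwords.words("english") is far more complete.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "i"}


def clean_text(text):
    """Lowercase the text and strip URLs, mentions, hashtags and special characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)                # mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)               # digits, punctuation, emoji
    return re.sub(r"\s+", " ", text).strip()            # collapse whitespace


def tokenize(text):
    """Split cleaned text on whitespace (a simple stand-in for word_tokenize)."""
    return text.split()


def remove_stopwords(tokens):
    """Drop common words that carry little sentiment."""
    return [t for t in tokens if t not in STOPWORDS]


def preprocess(text):
    """Run cleaning, tokenization and stopword removal in order."""
    return remove_stopwords(tokenize(clean_text(text)))


def bag_of_words(tokens):
    """Count how often each token occurs (the Bag of Words representation)."""
    return Counter(tokens)


# Hypothetical sample tweet, made up for this example.
tweet = "I LOVE the new phone!!! Love it @shop #happy https://example.com/x"
tokens = preprocess(tweet)
print(tokens)
print(bag_of_words(tokens))
```

In a real pipeline the token counts would be turned into a fixed-length vector over the whole vocabulary, which is what `CountVectorizer` in scikit-learn does.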
Tools:
1. pandas
2. NumPy
3. NLTK
4. scikit-learn (sklearn)
Foundation of AI and ML (4351601) Group No.:
2. Program code:
import pandas as pd
import numpy as np
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
df['cleaned_text'] = df['text'].apply(preprocess_reel)
# Make predictions
y_pred = model.predict(X_test)
# Example usage
new_tweet = "I love this product!"
print(f'Tweet: {new_tweet} | Sentiment: {predict_sentiment(new_tweet)}')
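The listing above omits the steps between the imports and the prediction call. The following is a minimal sketch of how vectorization, training, and prediction might fit together with `CountVectorizer` and `MultinomialNB`. It is not the report's actual code: the toy training tweets and the body of `predict_sentiment` are assumptions for illustration, and a real run would load the dataset into `df`, apply the preprocessing function, and hold out a test set with `train_test_split`.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data, made up for this example (not the Kaggle dataset).
train_texts = [
    "I love this phone",
    "great product, love it",
    "what a wonderful day",
    "love the new update",
    "I hate this phone",
    "terrible product, awful",
    "what a horrible day",
    "hate the new update",
]
train_labels = ["positive"] * 4 + ["negative"] * 4

# Bag of Words: learn the vocabulary and convert each text to token counts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Train a multinomial Naive Bayes classifier on the count features.
model = MultinomialNB()
model.fit(X_train, train_labels)


def predict_sentiment(text):
    """Vectorize one text with the fitted vocabulary and return its predicted label."""
    features = vectorizer.transform([text])
    return model.predict(features)[0]


new_tweet = "I love this product!"
print(f"Tweet: {new_tweet} | Sentiment: {predict_sentiment(new_tweet)}")
```

Note that `vectorizer.transform` (not `fit_transform`) is used at prediction time, so new tweets are mapped onto the vocabulary learned during training.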
3. Program output:
Accuracy: 0.85
precision recall f1-score support
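The report columns (precision, recall, f1-score, support) are computed per class from the true and predicted labels. The sketch below shows that arithmetic in plain Python. The function names and the label lists are hypothetical and unrelated to the 0.85 accuracy reported above; in practice `accuracy_score` and `classification_report` from scikit-learn produce these figures.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """Per-class precision, recall, F1 and support for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    support = tp + fn  # number of true instances of this label
    return precision, recall, f1, support


# Hypothetical labels, made up for this example.
y_true = ["positive", "positive", "positive", "negative", "negative", "neutral"]
y_pred = ["positive", "positive", "negative", "negative", "positive", "neutral"]

print(f"Accuracy: {accuracy(y_true, y_pred):.2f}")
for label in ("negative", "neutral", "positive"):
    p, r, f1, n = precision_recall_f1(y_true, y_pred, label)
    print(f"{label:>10} {p:.2f} {r:.2f} {f1:.2f} {n}")
```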
4. Covered COs:
5. References:
https://www.kaggle.com/datasets/jp797498e/twitter-entity-sentiment-analysis
https://scikit-learn.org/stable/install.html
https://pandas.pydata.org/docs/getting_started/install.html
https://www.nltk.org/install.html