NLP - Emotion Detection
January 6, 2024
import warnings
warnings.filterwarnings('ignore')
df.head()
[40]: df['Emotion'].value_counts()
[40]: Emotion
joy 5362
sadness 4666
anger 2159
fear 1937
love 1304
surprise 572
Name: count, dtype: int64
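The counts above are imbalanced: joy and sadness together make up well over half of the 16,000 rows, while surprise is under 4%. A minimal sketch of how the per-class shares could be computed, rebuilding the counts by hand as a Series so the example runs on its own (in the notebook, `df['Emotion'].value_counts(normalize=True)` should give the same fractions directly):

```python
import pandas as pd

# Counts copied from the value_counts() output above.
counts = pd.Series(
    {"joy": 5362, "sadness": 4666, "anger": 2159,
     "fear": 1937, "love": 1304, "surprise": 572},
    name="count",
)

total = int(counts.sum())                  # total number of labelled rows
shares = (counts / total * 100).round(1)   # percentage share per emotion

print(total)
print(shares)
```

An imbalance like this is worth keeping in mind when reading the per-class recall figures later on, since the rarest classes have the fewest test examples.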
plt.title('Distribution of Emotions')
plt.show()
[43]: # Histogram plot for emotion distribution with KDE
plt.figure(figsize=(10, 6))
sns.histplot(x='Emotion', data=df, kde=True, palette='Set2', element='bars',
             stat='count', common_norm=False)
plt.title('Distribution of Emotions')
plt.xticks(rotation=45, ha='right')
plt.show()
[44]: # Load spaCy English model
nlp = spacy.load("en_core_web_sm")
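The cell that builds `processed_text` is not shown here. Judging only from that column's contents (lemmas, with stop words and punctuation gone), the preprocessing may look roughly like the sketch below. The helper name `preprocess` and its exact filtering rules are assumptions, not the notebook's code. The demo uses `spacy.blank("en")` so that it runs without downloading a model; a blank pipeline has no lemmatizer, so tokens fall back to their surface form.

```python
import spacy


def preprocess(text, nlp):
    """Hypothetical helper: keep lemmas of tokens that are not stop words,
    punctuation, or whitespace, joined back into a single string."""
    doc = nlp(text)
    kept = [
        (tok.lemma_ or tok.text).lower()   # fall back to the raw token if no lemma is set
        for tok in doc
        if not (tok.is_stop or tok.is_punct or tok.is_space)
    ]
    return " ".join(kept)


# Assumption for the demo only: a blank English pipeline (tokenizer + lexical
# attributes, no lemmatizer). With en_core_web_sm loaded, lemma_ would be filled in.
demo_nlp = spacy.blank("en")
demo = preprocess("I feel so happy today!", demo_nlp)
print(demo)
```

In the notebook, a function like this would typically be applied with `df['Description'].apply(lambda t: preprocess(t, nlp))`, using the `nlp` object loaded above.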
… … …
15995 i just had a very brief time in the beanbag an… sadness
15996 i am now turning and i feel pathetic that i am… sadness
15997 i feel strong and good overall joy
15998 i feel like this was such a rude comment and i… anger
15999 i know a lot but i feel so stupid because i ca… sadness
Emotion_num processed_text
0 1 not feel humiliate
1 1 feel hopeless damned hopeful care awake
2 2 m grab minute post feel greedy wrong
3 4 feel nostalgic fireplace know property
4 2 feel grouchy
… … …
15995 1 brief time beanbag say anna feel like beat
15996 1 turn feel pathetic wait table sub teaching degree
15997 0 feel strong good overall
15998 2 feel like rude comment m glad t
15999 1 know lot feel stupid portray
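The `Emotion_num` column encodes each emotion as an integer. The visible rows confirm joy as 0, sadness as 1, and anger as 2; the codes for fear, love, and surprise are not visible in this output, so the values 3 to 5 below are an assumption (they follow the frequency order of the labels). A minimal sketch of such a mapping on a toy frame:

```python
import pandas as pd

# Hypothetical mapping: 0-2 match rows visible in the output above;
# 3-5 are inferred and not confirmed by the source.
emotion_to_num = {
    "joy": 0, "sadness": 1, "anger": 2,
    "fear": 3, "love": 4, "surprise": 5,
}

# Toy frame standing in for the notebook's df.
toy = pd.DataFrame({"Emotion": ["sadness", "joy", "anger", "sadness"]})
toy["Emotion_num"] = toy["Emotion"].map(emotion_to_num)

print(toy)
```

`Series.map` returns `NaN` for any label missing from the dictionary, which makes a typo in a label name easy to spot.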
2 Train-test split
[46]: from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    df['Description'], df['Emotion_num'], test_size=0.2, random_state=42)
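With `test_size=0.2`, a 16,000-row frame yields 12,800 training rows and 3,200 test rows, and fixing `random_state` makes the shuffle repeatable. A small sketch on made-up data (the `texts` and `labels` lists are placeholders, not the notebook's columns) illustrates both points:

```python
from sklearn.model_selection import train_test_split

# Placeholder data: 100 dummy documents with alternating binary labels.
texts = [f"sample text {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]

# Hold out 20% of the rows for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

# Repeating the call with the same seed reproduces the same partition.
X_tr2, X_te2, _, _ = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

print(len(X_tr), len(X_te))
print(X_te == X_te2)
```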
3 KNN
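[47]: # KNN
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# TF-IDF features feed a k-nearest-neighbours classifier (default settings)
knn = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('classifier', KNeighborsClassifier())
])
knn.fit(X_train, y_train)
knn_y_pred = knn.predict(X_test)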
Classification Report:
              precision    recall  f1-score   support
           2       0.76      0.61      0.68       427
           3       0.77      0.52      0.62       397
           4       0.77      0.35      0.48       296
           5       0.64      0.30      0.41       113
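The f1-score column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick check, the sketch below recomputes it from the four rows visible in the report above. The inputs are already rounded to two decimals, so agreement is not guaranteed in general, but for these rows the recomputed values match the reported ones.

```python
# (precision, recall, reported f1) per class, copied from the report above.
reported = {
    2: (0.76, 0.61, 0.68),
    3: (0.77, 0.52, 0.62),
    4: (0.77, 0.35, 0.48),
    5: (0.64, 0.30, 0.41),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {label: round(f1(p, r), 2) for label, (p, r, _) in reported.items()}
print(recomputed)
```

The low recall for classes 4 and 5 is what drags their F1 down: precision stays reasonable, but many true examples of those classes are assigned to other labels.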
4 Logistic Regression
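[49]: # Logistic Regression
from sklearn.linear_model import LogisticRegression

# Same TF-IDF features, with a logistic regression classifier (default settings)
lr = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('classifier', LogisticRegression())
])
lr.fit(X_train, y_train)
lr_y_pred = lr.predict(X_test)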
Classification Report:
precision recall f1-score support
[52]: print("Classification Report:\n", classification_report(y_test, nb_y_pred))
Classification Report:
precision recall f1-score support
6 Random Forest
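[53]: # Random Forest
from sklearn.ensemble import RandomForestClassifier

# Same TF-IDF features, with a random forest; random_state fixes the tree seeds
rfc = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('classifier', RandomForestClassifier(random_state=42))
])
rfc.fit(X_train, y_train)
rfc_y_pred = rfc.predict(X_test)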
Classification Report:
precision recall f1-score support
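The KNN, logistic regression, and random forest cells all follow the same pattern: a `TfidfVectorizer` step followed by a classifier inside a `Pipeline`. One way to compare them side by side is to loop over the classifiers, as in the sketch below. The tiny corpus is made up for illustration, so the scores it prints say nothing about the notebook's real results.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Toy corpus (made up for illustration; not the notebook's data).
train_texts = [
    "i feel happy", "i feel so glad", "what a joyful day", "i feel great joy",
    "i feel sad", "i feel so lonely", "what a gloomy day", "i feel deep sorrow",
]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]
test_texts = ["such a happy day", "so sad and lonely"]
test_labels = [0, 1]

# Same classifiers and settings as the cells above.
classifiers = {
    "knn": KNeighborsClassifier(),
    "lr": LogisticRegression(),
    "rfc": RandomForestClassifier(random_state=42),
}

scores = {}
for name, clf in classifiers.items():
    # Each pipeline vectorizes raw strings, then classifies the TF-IDF vectors.
    pipe = Pipeline([("tfidf", TfidfVectorizer()), ("classifier", clf)])
    pipe.fit(train_texts, train_labels)
    scores[name] = accuracy_score(test_labels, pipe.predict(test_texts))

print(scores)
```

Because the vectorizer lives inside the pipeline, it is fitted on the training texts only, and the test texts are transformed with that fitted vocabulary.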
7 Confusion Matrix Heatmap for Random Forest
[56]: cm = confusion_matrix(y_test, rfc_y_pred)
cm
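The plotting code for the heatmap is not shown. A common way to draw one is `sns.heatmap` with integer annotations, sketched below on made-up labels; in the notebook, `y_test` and `rfc_y_pred` would take the place of `y_true` and `y_hat`. The colour map, figure size, and axis labels are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line inside a notebook
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Toy labels (made up); stand-ins for y_test and rfc_y_pred.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_hat = [0, 1, 1, 1, 2, 0, 2]

# Rows are true classes, columns are predicted classes.
cm_toy = confusion_matrix(y_true, y_hat)

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(cm_toy, annot=True, fmt="d", cmap="Blues", ax=ax)
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
ax.set_title("Confusion matrix (toy data)")
plt.close(fig)  # in a notebook, call plt.show() instead
```

Diagonal cells count correct predictions; off-diagonal cells show which classes are confused with which.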