Navigating The Dark Web of Hate: Supervised Machine Learning Paradigm and NLP For Detecting Online Hate Speeches
International Journal of Advanced Engineering Research and Science (IJAERS)
Peer-Reviewed Journal
ISSN: 2349-6495(P) | 2456-1908(O)
Vol-11, Issue-3; Mar, 2024
Journal Home Page Available: https://ijaers.com/
Article DOI: https://dx.doi.org/10.22161/ijaers.114.1
Received: 03 Feb 2024; Received in revised form: 08 Mar 2024; Accepted: 19 Mar 2024; Available online: 30 Mar 2024
©2024 The Author(s). Published by AI Publication. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/).

Keywords— Natural Language Processing, Tokenization, Logistic Regression, Hyperparameter

Abstract— Many online platforms' participants are worried about hate speech, which often triggers cyberbullying attitudes that dissuade users' interest in their platforms. The study investigates hate speech on online platforms using Natural Language Processing (NLP) techniques and the supervised machine learning paradigm. It specifically focused on developing a robust model capable of accurately classifying text as 'hateful' or 'non-hateful'. The approaches applied included compiling a large dataset from multiple online textual sources; preprocessing the dataset through normalization, tokenization, stop-word removal, and lemmatization; applying advanced feature extraction techniques such as negation handling, n-gram analysis, and Term Frequency-Inverse Document Frequency (TF-IDF) to capture the intricacies of the textual material; and a model implementation phase using Logistic Regression for its efficiency in binary classification problems. The model's performance was evaluated using metrics such as accuracy, precision, recall, F1-score and the confusion matrix. The baseline model with default hyperparameters achieved a test accuracy of 93%. When optimized with hyperparameter tuning and cross-validation procedures to guarantee more generalizable performance, the model achieved an accuracy of 95%. The study concluded that NLP and the logistic regression technique can effectively identify hate speech.
www.ijaers.com Page | 37
Mbeledogu and Ike-Okonkwo International Journal of Advanced Engineering Research and Science, 11(3)-2024
The UN Strategy and Plan of Action on hate speech defined it as "any kind of communication in speech, writing or behavior that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, on their religion, ethnicity, nationality, race, color, descent, gender or other identity factor" (United Nations, n.d.). Hate speech is characterized by expressions that demean, discriminate, or incite violence, and it poses significant threats to the well-being of individuals and the world at large.

Hate speech has been on the increase not only among peers but also among political and religious leaders. The rapid growth of social media and online commenting has provided users with unprecedented avenues to voice their opinions without restraint, and this democratization of expression has also made hateful expression easier to spread. From targeted harassment campaigns by political elites to abuse of ordinary citizens, its impact on individuals and societies cannot be overemphasized.

There is broad agreement that online platforms have the responsibility to mitigate the exigencies of hate speech while upholding principles of free speech and open dialogue. In light of this, many actions have been taken by online platforms, pressure groups and governments to address occurrences of hate speech, thus creating the need for the supervised machine learning paradigm and natural language processing to mitigate the problem.

Supervised Machine Learning Paradigm

This is the learning approach in which machines learn under supervision, using labeled data in the form of input-output pairs. The major tasks of this type of learning are regression, classification and forecasting (Kotsiantis, 2007).

Natural Language Processing (NLP)

NLP is a branch of Artificial Intelligence that focuses on the interaction between humans and computers using natural language (Johnson, 2023). It leverages computational linguistics and machine learning techniques to analyze and understand human language. By developing sophisticated algorithms and models, researchers and practitioners in NLP can automate machine translation, speech recognition, information retrieval, spam detection, text summarization, intelligent web searching, intelligent spell checking and human-computer communication.

Review of Related Work

Zhang et al. (2018) worked on "Detecting hate speech on Twitter using a convolution-GRU based deep neural network". The paper introduced a new method based on a deep neural network combining convolutional and gated recurrent networks. The authors conducted an extensive evaluation of the method against several baselines and the state of the art on the largest collection of publicly available Twitter datasets to date. The proposed method captured both word sequence and order information in short texts.

Khanday et al. (2022) delved into detecting Twitter hate speech in the COVID-19 era using machine learning and ensemble learning techniques. The Twitter data used were extracted with the publicly available Twitter API using trending hashtags during the COVID-19 pandemic. The tweets were manually annotated into two categories based on different factors. Feature extraction was performed using Term Frequency/Inverse Document Frequency (TF/IDF), Bag of Words and tweet length. The study found the Decision Tree Classifier to be effective when compared to other typical Machine Learning (ML) classifiers, with 98% precision, 97% recall, 97% F1-score, and 97% accuracy.

Rodriguez et al. (2022) developed a framework for the detection and integration of unstructured data of hate speech on Facebook using sentiment and emotion analysis. The aim of the research was to locate and analyze the unstructured data of selected social media posts that intend to spread hate in the comment sections. To address this issue, they proposed a novel framework called FADOHS, which combines data analysis and natural language processing strategies to sensitize all social media providers to the pervasiveness of hate on social media. Specifically, they used sentiment and emotion analysis algorithms to analyze recent posts and comments on these pages. Posts suspected of containing dehumanizing words were processed before being fed to a clustering algorithm for further evaluation. According to the experimental results, the proposed FADOHS framework surpassed the state-of-the-art approach in terms of precision, recall, and F1 scores by approximately 10%.

Pamungkas et al. (2020), in "Do you really want to hurt me? Predicting abusive swearing in social media", explored the phenomenon of swearing in Twitter conversations, taking the possibility of predicting the abusiveness of a swear word in a tweet context as the main investigation perspective. They developed the Twitter English corpus SWAD (Swear Words Abusiveness Dataset), where abusive swearing was manually annotated at the word level. Their collection consists of 1,511 unique swear words from 1,320 tweets. They developed models to automatically predict abusive swearing, providing an intrinsic evaluation of SWAD and confirming the robustness of the resource. They also presented the results of a glass-box ablation study to investigate which lexical, syntactic and affective features are more informative
towards the automatic prediction of the function of swearing.

Zimmerman et al. (2018) researched improving hate speech detection with deep learning ensembles. They utilized a publicly available embedding model and tested it against a hate speech corpus from Twitter. To confirm the robustness of their results, they additionally tested against a popular sentiment dataset. Their method achieved a nearly 5-point improvement in F-measure when compared to the original work on a publicly available hate speech evaluation dataset. The major difficulties they encountered were the reproducibility of deep learning methods and the comparison of findings with other work.

Yun et al. (2023) worked on a BERT-based logits ensemble model for gender bias and hate speech detection. They aimed to solve the problem of gender bias and hate speech detection, and to detect malicious comments in a Korean hate speech dataset constructed in 2020. They explored Bidirectional Encoder Representations from Transformers (BERT)-based deep learning models utilizing hyperparameter tuning, data sampling, and logits ensembles with a label distribution. They evaluated the model in Kaggle competitions for gender bias, general bias, and hate speech detection. For gender bias detection, an F1-score of 0.7711 was achieved using an ensemble of the Soongsil-BERT and KcELECTRA models. The general bias task included the gender bias task, and the ensemble model achieved the best F1-score of 0.7166.

Siino et al. (2021) analyzed the detection of hate speech spreaders using a convolutional neural network. The authors developed a deep learning model based on a convolutional neural network (CNN) for profiling hate speech spreaders (HSSs). Their classification (HSS or not HSS) takes advantage of a CNN with a single convolutional layer. In this binary classification task, they performed tests using 5-fold cross-validation, in which the proposed model reached a maximum accuracy of 0.80 on the multilingual (i.e., English and Spanish) training set, and a minimum loss value of 0.51 on the same set. The trained model was able to reach an overall accuracy of 0.79 on the full test set.

Mozafari et al. (2019) worked on a BERT-based transfer learning approach for hate speech detection in online social media. The study introduced a novel transfer learning approach based on an existing pre-trained language model called Bidirectional Encoder Representations from Transformers (BERT). Transfer learning-based fine-tuning techniques were used to explore BERT's capacity to detect hateful context in social media content. To evaluate the proposed approach, they made use of two publicly available datasets that have been annotated for racism, sexism, hate, or offensive content on Twitter. The results showed that their solution could obtain considerable performance on these datasets in terms of precision and recall in comparison to existing approaches. Also, their model captured some biases in the data annotation and collection process and can potentially lead to a more accurate model.

III. RESEARCH METHODOLOGY

This section outlines the methodology adopted for sentiment analysis on hate speeches using a supervised learning approach. The chosen approach involves training models on labeled datasets, leveraging the rich body of research and techniques in supervised learning for sentiment classification.

Fig.1: Graph of the data used for analysis (0 - Non Hate Speech, 1 - Hate Speech)

Data Collection

The dataset was collected from Twitter and contains a diverse set of tweets from various sources and user backgrounds, spanning over a year of data collection. The `hateDetection_train.csv` dataset utilized consists of 31964 tweets in total, with 93.2% labeled as hateful and 6.8% as non-hateful, making it an imbalanced dataset as seen in Figure 1.

To ensure transparency and reproducibility, it is crucial to provide a detailed account of the dataset's origin, size, and composition. The first step in understanding the dataset was loading it into a Pandas DataFrame. Figure 2 shows the first 5 tweets visualized from the dataset after loading it into the Pandas DataFrame.
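The loading step described above can be sketched as follows. Since the original code (Figure 2) is available only as an image, this is a minimal illustration: the inline sample rows and the `label`/`tweet` column names are assumptions, and in the study the data would instead be read from `hateDetection_train.csv`.

```python
import io
import pandas as pd

# Small inline sample standing in for hateDetection_train.csv;
# the "label"/"tweet" schema is an assumption.
csv_data = io.StringIO(
    "label,tweet\n"
    "0,have a lovely day everyone\n"
    "1,<some hateful tweet>\n"
    "0,great match last night\n"
)
df = pd.read_csv(csv_data)  # in the paper: pd.read_csv("hateDetection_train.csv")

print(df.head())  # first rows, cf. Figure 2
print(df["label"].value_counts(normalize=True))  # class balance, cf. Figure 1
```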
To prepare the text data for modeling, the following preprocessing steps were applied. Figure 4 shows the code block that carried out these preprocessing steps:
a. Removal of URLs, mentions, and hashtags: These elements do not carry significant semantic meaning and can be safely removed.
b. Conversion to lowercase: To ensure consistency in word representation and avoid treating the same word differently due to case variations.
c. Handling special characters and emojis: Special characters and emojis are retained as they may convey sentiment or context.
d. Stop word removal: Common words like "the," "and," "in" are removed as they carry little informative value.
e. Lemmatization: Reducing words to their root forms helps in capturing the core meaning of words.
f. Duplicate tweet removal: Duplicate tweets are removed to prevent bias in the training process.
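A minimal sketch of steps a-e is given below. It is not the code from Figure 4: the tiny stop-word list and toy lemmatizer are stand-ins for full NLTK/spaCy resources, and step f (duplicate removal) would be applied at the dataset level, e.g. with `DataFrame.drop_duplicates`.

```python
import re

# Toy resources; a real pipeline would use e.g. NLTK's stop-word list
# and WordNetLemmatizer.
STOP_WORDS = {"the", "and", "in", "a", "this", "is", "to"}

def lemmatize(word: str) -> str:
    # Placeholder lemmatizer: crude plural stripping only.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def preprocess(tweet: str) -> str:
    text = re.sub(r"https?://\S+", " ", tweet)  # a. remove URLs
    text = re.sub(r"[@#]\w+", " ", text)        # a. remove mentions and hashtags
    text = text.lower()                         # b. lowercase
    tokens = re.findall(r"\w+|[^\w\s]", text)   # c. keep special characters
    tokens = [t for t in tokens if t not in STOP_WORDS]  # d. stop-word removal
    return " ".join(lemmatize(t) for t in tokens)        # e. lemmatization

print(preprocess("Check this http://t.co/x @user #tag The dogs barked!"))
# → check dog barked !
```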
Data Splitting: To evaluate model performance effectively, the dataset was split into training (80%) and testing (20%) sets using a random split with a fixed random state. This ensures reproducibility and allows us to assess the model's generalization ability on unseen data. Figure 5 shows the code block used in splitting the dataset.
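The split, followed by the TF-IDF features and logistic regression named in the abstract, can be sketched as below. The toy corpus and pipeline details are assumptions for illustration; Figure 5 onward show the study's actual code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Invented mini-corpus standing in for the preprocessed tweets;
# labels: 1 = hateful, 0 = non-hateful.
texts = ["i hate you and your kind", "what a lovely sunny day",
         "you people are vermin", "great game last night",
         "go back to where you came from", "congrats on the new job"]
labels = [1, 0, 1, 0, 1, 0]

# 80/20 split with a fixed random state for reproducibility, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

# TF-IDF features (word unigrams and bigrams, per the abstract's n-gram
# analysis) feeding a logistic regression classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(model.predict(X_test))
```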
Fig.9: Code block for calculating the accuracy of the model

Confusion Matrix: To gain deeper insight into model performance, a confusion matrix that visualizes the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) was employed, as shown in Figure 10. This information helped to identify specific patterns of errors made by the model, such as whether it tends to have more false positives or false negatives.

Recall: This measures the completeness of positive predictions, that is, how well a model correctly identifies True Positives. Equ. (3) shows its calculation:

Recall = True Positives / (True Positives + False Negatives)    (3)

F1-Score: A measure of a model's accuracy on a dataset. It is the harmonic mean of the precision and recall of the model. It is determined as:

F1-Score = 2 × (precision × recall) / (precision + recall)    (4)
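These metrics can be computed as sketched below with scikit-learn; the label vectors here are invented for illustration (1 = hateful, 0 = non-hateful), whereas the study's values come from the trained model's predictions.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Confusion matrix in scikit-learn's [[TN, FP], [FN, TP]] layout.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # → 3 1 1 3

# Precision = TP/(TP+FP); Recall = TP/(TP+FN), per Equ. (3);
# F1 = 2*(precision*recall)/(precision+recall), per Equ. (4).
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * (precision * recall) / (precision + recall)
print(accuracy_score(y_true, y_pred), precision, recall, f1)
# → 0.75 0.75 0.75 0.75
```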
Fig.11: Baseline performance without hyperparameter tuning

Tuning hyperparameters enhances the performance of a model. Through grid search cross-validation, the hyperparameters of the logistic regression model were optimized, resulting in an improved accuracy of 95% as seen in Figure 12. The optimal hyperparameters were determined to be C = 0.1 and solver = newton-cg.

Fig.12: Performance Evaluation after hyperparameter tuning

V. CONCLUSION

The research focused on creating a machine learning model for detecting hate speech in online textual content using NLP techniques. Based on the performance evaluation, the Logistic Regression model showed reliable results in classifying text as either hate speech or non-hate speech.

REFERENCES

[1] United Nations (n.d.). What is hate speech? Retrieved from https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech
[2] Johnson, A. (2023). NLP Vs Computational Linguistics: Understanding the Differences. Retrieved from https://medium.com/@andrew_johnson_4/nlp-vs-computational-linguistics-understanding-the-differences-57044aa41ad2
[3] Khanday, A. M. U. D., Rabani, S. T., Khan, Q. R., and Malik, S. H. (2022). Detecting twitter hate speech in COVID-19 era using machine learning and ensemble learning techniques. International Journal of Information Management Data Insights, 2(2), pp. 100-120.
[4] Kotsiantis, S. B. (2007). Supervised Machine Learning: A Review of Classification Techniques. Informatica, 31, pp. 249-268.
[5] Mozafari, M., Farahbakhsh, R. and Crespi, N. (2020). A BERT-based transfer learning approach for hate speech detection in online social media. In Complex Networks and Their Applications VIII: Volume 1, Proceedings of the 8th International Conference on Complex Networks and Their Applications, COMPLEX NETWORKS 2019, pp. 928-940. Springer International Publishing.
[6] Pamungkas, E. W., Basile, V. and Patti, V. (2020, May). Do you really want to hurt me? Predicting abusive swearing in social media. In Proceedings of the 12th Language Resources and Evaluation Conference, pp. 6237-6246.
[7] Rodriguez, A., Chen, Y. L. and Argueta, C. (2022). FADOHS: Framework for detection and integration of unstructured data of hate speech on Facebook using sentiment and emotion analysis. IEEE Access, 10, pp. 22400-22419.
[8] Siino, M., Di Nuovo, E., Tinnirello, I. and La Cascia, M. (2021). Detection of hate speech spreaders using convolutional neural networks. In CLEF (Working Notes), pp. 2126-2136.
[9] Wang, Z. and Cha, Y. J. (2021). Unsupervised deep learning approach using a deep auto-encoder with a one-class support vector machine to detect damage. Structural Health Monitoring, 20(1), pp. 406-425.
[10] Yun, S., Kang, S. and Kim, H. (2023). BERT-Based Logits Ensemble Model for Gender Bias and Hate Speech Detection. Journal of Information Processing Systems, 19(5).
[11] Zhang, D., Mao, R., Song, X., Wang, D., Zhang, H., Xia, H., and Gao, Y. (2023). Humidity sensing properties and respiratory behavior detection based on chitosan/halloysite nanotubes film coated QCM sensor combined with support vector machine. Sensors and Actuators B: Chemical, 374, 132824.
[12] Zhang, Z., Robinson, D. and Tepper, J. (2018). Detecting hate speech on Twitter using a convolution-GRU based deep neural network. In The Semantic Web: 15th International Conference Proceedings, ESWC 2018, Heraklion, Crete, Greece, June 3-7, pp. 745-760. Springer International Publishing.
[13] Zimmerman, S., Kruschwitz, U. and Fox, C. (2018). Improving hate speech detection with deep learning ensembles. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018).