Fake Social Media Profile Detection and Reporting
IJARSCT
International Journal of Advanced Research in Science, Communication and Technology (IJARSCT)
International Open-Access, Double-Blind, Peer-Reviewed, Refereed, Multidisciplinary Online Journal
Impact Factor: 7.53 Volume 4, Issue 5, March 2024
Abstract: Our research focuses on utilizing machine learning techniques, encompassing natural language
processing and computer vision, to create an automated system for the detection and reporting of fake
social media profiles across various platforms. Our approach involves feature extraction from both textual
and visual content, followed by the application of machine learning models to classify profiles as fake or
genuine. This system operates in real-time, monitoring user activity and promptly flagging suspicious
profiles for user-initiated reporting. By combining the power of machine learning with cross-platform
compatibility and user feedback, our solution aims to enhance online safety by swiftly identifying and
addressing fraudulent social media profiles, thus fostering more secure and trustworthy online
communities.
Keywords: Fake profiles, Machine learning, Natural Language Processing, Fraudulent Social media
accounts
I. INTRODUCTION
A social networking site is a website where users can connect with friends, post updates, and find new people
who share their interests. Each user has a profile on the site, and users communicate with one another through
Web 2.0 technologies in these online social networks [1]. The use of social networking sites is expanding
quickly and changing how individuals interact with one another. Online communities bring together people with
similar interests and make it easy for users to find new friends. The main benefit of Internet social networking
is that it allows users to connect with people easily and communicate better. However, it has also opened new
avenues for attacks such as fake identities and disinformation [3]. Researchers are working to determine the
impact these online social networks have on people; there is much more to a platform than how many people use
it, and the number of fake accounts has grown markedly over the past years [4].
Detecting fake profiles on social media is vital for maintaining trust, protecting users from harm, preserving
authenticity, and ensuring effective advertising. Fake profiles can spread misinformation, scam users, and distort user
data, undermining the credibility and security of the platform. By identifying and removing fake profiles, social media
platforms can uphold user trust, safeguard privacy, and maintain the integrity of interactions and advertising efforts.
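Since the paper's own feature set is not reproduced here, the following Python sketch illustrates the kind of feature engineering such a detector relies on. The specific feature names and signals (follower/following ratio, bio length, profile picture presence) are illustrative assumptions commonly used in the fake-profile literature, not the exact features of this system.

```python
# Hypothetical sketch: mapping raw profile attributes to a numeric
# feature vector for a fake-profile classifier. Feature names and
# signals are illustrative assumptions, not this paper's actual set.

def extract_features(profile: dict) -> list:
    """Convert a raw profile record into a numeric feature vector."""
    followers = profile.get("followers", 0)
    following = profile.get("following", 0)
    # Fake accounts often follow many users while attracting few followers;
    # +1 in the denominator avoids division by zero.
    follow_ratio = followers / (following + 1)
    return [
        followers,
        following,
        follow_ratio,
        len(profile.get("bio", "")),                 # empty bios are a weak fake signal
        int(profile.get("has_profile_pic", False)),  # missing avatar is another
        profile.get("posts", 0),
    ]

sample = {"followers": 3, "following": 950, "bio": "",
          "has_profile_pic": False, "posts": 1}
print(extract_features(sample))
```

Vectors like this would then be fed to the classifiers discussed in Section 3.3.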
Fig. 2: Flowchart
III. METHODOLOGY
3.1 System Block Diagram
3.3 Algorithms
A. Random Forest Classifier
A random forest is a powerful ensemble learning technique used for classification and regression tasks. It operates by
constructing multiple decision trees based on random subsets of the training data and then combining their predictions to
improve accuracy and reduce overfitting.
Here's how a random forest typically works:
1. Randomly select a subset of data points (with replacement) from the training set.
2. Build a decision tree using the selected subset.
3. Repeat steps 1 and 2 multiple times to create a forest of decision trees.
4. When making predictions for new data points, each decision tree in the forest provides a prediction.
5. The final prediction is determined by combining the individual predictions through a majority voting process for
classification tasks or averaging for regression tasks.
This approach helps random forests to generalize well to new data and improve predictive performance compared to
individual decision trees.
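The five steps above can be sketched from scratch in Python. For brevity this toy version grows depth-1 "stumps" rather than full decision trees, and unlike production implementations (e.g. scikit-learn's RandomForestClassifier) it does not subsample features at each split; the data, feature meanings, and labels are invented for illustration.

```python
# Minimal from-scratch sketch of the random-forest procedure:
# bootstrap sampling, one simple tree (a depth-1 stump) per sample,
# and majority voting over the ensemble.
import random
from collections import Counter

def train_stump(X, y):
    """Pick the (feature, threshold, flip) split with the best training accuracy."""
    best, best_acc = None, -1.0
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            preds = [1 if row[f] > t else 0 for row in X]
            acc = sum(p == label for p, label in zip(preds, y)) / len(y)
            # A stump may also work "inverted"; keep the better direction.
            for flip in (False, True):
                a = 1 - acc if flip else acc
                if a > best_acc:
                    best_acc, best = a, (f, t, flip)
    return best

def stump_predict(stump, row):
    f, t, flip = stump
    pred = 1 if row[f] > t else 0
    return 1 - pred if flip else pred

def random_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step 1: draw a bootstrap sample (with replacement).
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        # Step 2: fit one tree on that sample. Steps 1-2 repeat n_trees times (step 3).
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def forest_predict(forest, row):
    # Steps 4-5: every tree predicts, and the majority vote decides.
    votes = Counter(stump_predict(s, row) for s in forest)
    return votes.most_common(1)[0][0]

# Toy data: [following_count, bio_length]; label 1 = fake, 0 = genuine.
X = [[900, 0], [850, 5], [40, 120], [60, 80], [1000, 2], [30, 200]]
y = [1, 1, 0, 0, 1, 0]
forest = random_forest(X, y)
print(forest_predict(forest, [950, 1]))  # a high-following, empty-bio profile
```

Because each stump sees a different bootstrap sample, individual trees disagree on borderline profiles, but the majority vote smooths out those errors, which is precisely why the ensemble overfits less than any single tree.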
Copyright to IJARSCT DOI: 10.48175/IJARSCT-16695 467
www.ijarsct.co.in
ISSN (Online) 2581-9429
IV. RESULTS
V. CONCLUSION
In our project, we delved into detecting and reporting fake social media profiles using machine learning, with a focus on
decision tree and random forest classifiers. Through feature engineering and model training, we developed robust
classifiers adept at spotting patterns indicative of fraudulent behavior. Decision trees offered valuable insights into feature
hierarchies, aiding interpretability, while random forests excelled in performance by aggregating insights from multiple
trees, enhancing accuracy, and curbing overfitting. Our study highlights the promise of machine learning in combating the
proliferation of fake profiles on social media. By leveraging these algorithms, users and platform administrators gain tools
for early detection and prompt reporting of suspicious accounts, crucial for upholding the integrity of online communities.
Future research avenues include exploring advanced ensemble methods, deep learning architectures, and the integration of
contextual data to bolster detection capabilities. Moreover, there is a need for scalable, automated systems for real-time
monitoring and response to the evolving tactics of malicious actors. Overall, our work contributes to the discourse on
employing machine learning to counter online deception, emphasizing the significance of interdisciplinary collaboration
among data scientists, social scientists, and platform stakeholders in tackling this pervasive issue.
REFERENCES
[1] E. Karunakar, V. D. R. Pavani, T. N. I. Priya, M. V. Sri, and K. Tiruvalluru, “Ensemble fake profile detection using
machine learning (ML),” J. Inf. Comput. Sci., vol. 10, pp. 1071–1077, 2020.
[2] P. Wanda and H. J. Jie, “DeepProfile: Finding fake profile in online social network using dynamic CNN,” J. Inf.
Secur. Appl., vol. 52, pp. 1–13, 2020.
[3] P. K. Roy, J. P. Singh, and S. Banerjee, “Deep learning to filter SMS spam,” Future Gener. Comput. Syst., vol. 102, pp.
524–533, 2020.
[4] R. Kaur, S. Singh, and H. Kumar, “A modern overview of several countermeasures for the rise of spam and
compromised accounts in online social networks,” J. Netw. Comput. Appl., vol. 112, pp. 53–88, 2018.
[5] G. Suarez-Tangil, M. Edwards, C. Peersman, G. Stringhini, A. Rashid, and M. Whitty, “Automatically dismantling
online dating fraud,” IEEE Trans. Inf. Forensics Secur., vol. 15, pp. 1128–1137, 2020.
[6] K. Thomas, C. Grier, D. Song, and V. Paxson, “Suspended accounts in retrospect: An analysis of Twitter spam,” in
Proc. ACM SIGCOMM Internet Meas. Conf., 2011, pp. 243–258.
[7] S. Abu-Nimeh, T. M. Chen, and O. Alzubi, “Malicious and spam posts in online social networks,” Computer,
vol. 44, no. 9, pp. 23–28, 2011.
[8] B. Viswanath et al., “Towards detecting anomalous user behavior in online social networks,” in Proc. USENIX
Secur. Symp., vol. 14, 2014, pp. 223–238.
[9] R. Kaur and S. Singh, “A survey of data mining and social network analysis based anomaly detection techniques,”
Egyptian Informatics Journal, vol. 17, no. 2, pp. 199–216, 2016.
[10] S.-T. Sun, Y. Boshmaf, K. Hawkey, and K. Beznosov, “A billion keys, but few locks: the crisis of web single sign-
on,” in Proceedings of the 2010 New Security Paradigms Workshop. ACM, 2010, pp. 61–7