Fake Profile Identification in Social Network using Machine Learning and NLP

The document discusses the identification of fake profiles on social networks using Machine Learning and Natural Language Processing techniques, specifically Support Vector Machine (SVM) and Naïve Bayes algorithms, to enhance detection accuracy. It highlights existing challenges in detecting fake profiles, particularly on platforms like LinkedIn, due to limited publicly available data. The proposed system aims to improve the classification of profiles by utilizing both static and dynamic information while addressing security issues such as identity theft.

Uploaded by

Krishna Koushik
Copyright
© All Rights Reserved

Fake Profile Identification in Social Network using Machine Learning and NLP

ABSTRACT

At present, social network sites are part of daily life for most people. Every day,
many people create profiles on social network platforms and interact with others
regardless of location and time. Social network sites not only provide advantages to
their users but also pose security risks to the users and their information. To
analyze who is encouraging threats in social networks, we need to classify the
users' social network profiles. From the classification, we can separate genuine
profiles from fake profiles. Traditionally, different classification methods have
been used for detecting fake profiles on social networks, but the accuracy of fake
profile detection still needs to improve. In this paper we propose Machine Learning
and Natural Language Processing (NLP) techniques to improve the accuracy of fake
profile detection. We use the Support Vector Machine (SVM) and Naïve Bayes
algorithms.

EXISTING SYSTEM

Chai et al. presented a proof-of-concept study. Even though their prototype system
employed only basic techniques in natural language processing and human-computer
interaction, the results learned from the user testing are significant. By comparing
this simple prototype system with a fully deployed menu system, they found that
users, especially novice users, strongly prefer the natural language dialog-based
system. They also learned that in an ecommerce environment, sophistication in dialog
management is more important than the ability to handle complex natural language
sentences.

In addition, to provide easy access to information on ecommerce web sites, natural
language dialog-based navigation and menu-driven navigation should be intelligently
combined to meet users' different needs. More recently, they completed development
of a new iteration of the system that includes significant enhancements in language
processing, dialog management and information management. They believed that natural
language conversational interfaces offer powerful personalized alternatives to
conventional menu-driven or search-based interfaces to web sites.

LinkedIn is widely preferred by people in professional occupations. With the rapid
growth of social networks, people are likely to misuse them for unethical and
illegal conduct. The creation of a fake profile is one such adverse outcome, and it
is difficult to identify without appropriate research. The current solutions that
have been developed in practice or proposed in theory to address this problem mainly
consider the characteristics and the social network ties of the user's social
profile. However, in the case of LinkedIn, such behavioral observations are highly
restricted, because the privacy policies limit the publicly available profile data
of the users. The limited publicly available profile data of LinkedIn makes it
unsuitable for applying the existing approaches to fake profile identification. For
that reason, there is a need to conduct specific research on identifying approaches
for fake profile identification in LinkedIn. Shalinda Adikari and Kaushik Dutta
researched and identified the minimal set of profile data that are necessary for
identifying fake profiles in LinkedIn, and identified the appropriate data mining
approach for such a task.

Z. Halim et al. proposed spatio-temporal mining on social networks to identify the
circle of users involved in malicious events with the help of latent semantic
analysis. They then compared the results produced from spatio-temporal co-occurrence
with the original organization/ties within the social network. The comparison was
very encouraging, as the organization generated by spatio-temporal co-occurrence and
the actual one are very close to each other. Once they set the threshold to the
right level, they increased the number of nodes, i.e. actors, so that they could get
a better picture. Overall, the experiments indicate that Latent Semantic Indexing
performs very well for identifying malicious content if the feature set is properly
chosen. One obvious shortcoming of this technique is its dependence on how users
choose their feature set and how rich it is. If the feature set is very small, most
of the malicious content will not be traced. However, the bigger the feature set,
the better the performance gained.

Disadvantages
• The system does not implement learning algorithms such as SVM or Naive Bayes.
• The system does not address problems involving social networking such as
privacy, online bullying, misuse, trolling and many others.

PROPOSED SYSTEM

• In this paper we present a machine learning and natural language processing
system to detect fake profiles in online social networks. Moreover, we add the SVM
classifier and the Naïve Bayes algorithm to increase the detection accuracy rate for
fake profiles.

An SVM classifies data by finding the best hyperplane that separates all data points
of one class from those of the other class. The best hyperplane for an SVM is the
one with the largest margin between the two classes. The support vectors are the
data points that are closest to the separating hyperplane.
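
The margin idea above can be illustrated with a minimal sketch, assuming a linear soft-margin SVM trained by plain subgradient descent on the hinge loss. The two features, the toy data and the hyperparameters (`lam`, `eta`, `epochs`) are hypothetical and are not taken from the proposed system; a real deployment would more likely rely on an established library.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=200):
    """Fit w, b by subgradient descent on the L2-regularized hinge loss.

    X: list of feature vectors, y: labels in {-1, +1}.
    lam, eta and epochs are illustrative defaults, not tuned values.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            margin = y_i * (dot(w, x_i) + b)
            # Regularization step: shrinking w favors a wider margin.
            w = [(1.0 - eta * lam) * w_j for w_j in w]
            if margin < 1.0:
                # The point is inside the margin or misclassified:
                # move the hyperplane away from it.
                w = [w_j + eta * y_i * x_j for w_j, x_j in zip(w, x_i)]
                b += eta * y_i
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if dot(w, x) + b >= 0 else -1


def support_vectors(X, y, w, b, tol=0.15):
    """Training points whose margin is close to 1, i.e. nearest the hyperplane."""
    return [x for x, y_i in zip(X, y) if y_i * (dot(w, x) + b) <= 1.0 + tol]


# Hypothetical toy data: two standardized profile features per account.
# Label +1 = genuine, -1 = fake.
X = [[2, 2], [3, 3], [2, 3], [3, 2], [-2, -2], [-3, -3], [-2, -3], [-3, -2]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("approximate support vectors:", support_vectors(X, y, w, b))
print("prediction for [2.5, 2.5]:", predict(w, b, [2.5, 2.5]))
```

The points reported by `support_vectors` are the ones that determine the position of the hyperplane; the remaining points could be removed without changing the decision boundary much.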

The Naive Bayes algorithm learns the probability that an object with certain
features belongs to a particular group/class. In short, it is a probabilistic
classifier. The Naive Bayes algorithm is called "naive" because it makes the
assumption that the occurrence of a certain feature is independent of the occurrence
of other features. For example, suppose we are trying to identify fake profiles
based on the time and date of publication of posts, the language and the
geoposition. Even if these features depend on each other or on the presence of the
other features, all of these properties individually contribute to the probability
that the profile is fake.
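
As a rough illustration of the independence assumption, the sketch below implements a small categorical Naive Bayes classifier with Laplace smoothing. The three features loosely follow the example in the text (posting time, language, geoposition), but the category names, the toy rows and the class `CategoricalNaiveBayes` are hypothetical and not part of the proposed system.

```python
import math
from collections import Counter


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace smoothing (alpha)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # counts[c][j][v] = number of class-c rows whose feature j equals v
        self.counts = {
            c: [Counter() for _ in range(n_features)] for c in self.class_counts
        }
        # values[j] = set of distinct values seen for feature j
        self.values = [set() for _ in range(n_features)]
        for row, c in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[c][j][v] += 1
                self.values[j].add(v)
        return self

    def log_posteriors(self, row):
        """Unnormalized log P(class | row): log prior plus a sum of
        per-feature log likelihoods, one term per feature (the naive part)."""
        total = sum(self.class_counts.values())
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            for j, v in enumerate(row):
                num = self.counts[c][j][v] + self.alpha
                den = n_c + self.alpha * len(self.values[j])
                score += math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Hypothetical toy rows: (posting time, language use, geoposition behavior).
rows = [
    ("night", "mixed", "jumping"),
    ("night", "same", "jumping"),
    ("night", "mixed", "stable"),
    ("day", "mixed", "jumping"),
    ("day", "same", "stable"),
    ("day", "same", "stable"),
    ("night", "same", "stable"),
    ("day", "mixed", "stable"),
]
labels = ["fake"] * 4 + ["genuine"] * 4

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("night", "mixed", "jumping")))
print(model.predict(("day", "same", "stable")))
```

Each feature contributes its own term to the score, so the features are treated as independent given the class, even when they are correlated in practice.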

Advantages
• In the proposed system, profile information in online networks can be static or
dynamic. The details supplied by the user at the time of profile creation are
called static data, whereas the details recorded by the system within the network
are called dynamic data.

• Social networking services have facilitated identity theft and impersonation
attacks by serious as well as naïve attackers; the proposed system targets these
attacks.
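
The split between static and dynamic profile information described in the first advantage could be represented as in the sketch below. All field names (`has_photo`, `bio_length`, `posts_per_day` and so on) are hypothetical placeholders chosen for illustration; the source does not list the actual attributes used.

```python
from dataclasses import dataclass


@dataclass
class StaticInfo:
    """Supplied by the user at profile creation (hypothetical fields)."""
    display_name: str
    has_photo: bool
    bio_length: int


@dataclass
class DynamicInfo:
    """Recorded by the system while the account is in use (hypothetical fields)."""
    posts_per_day: float
    connections: int
    days_since_signup: int


def to_feature_vector(static, dynamic):
    """Combine both kinds of information into one numeric vector
    that a classifier such as an SVM or Naive Bayes model could consume."""
    return [
        float(static.has_photo),
        float(static.bio_length),
        float(dynamic.posts_per_day),
        float(dynamic.connections),
        float(dynamic.days_since_signup),
    ]


example_vector = to_feature_vector(
    StaticInfo(display_name="example user", has_photo=True, bio_length=120),
    DynamicInfo(posts_per_day=2.5, connections=40, days_since_signup=300),
)
print(example_vector)
```

Keeping the two groups in separate types makes it explicit which values are fixed at sign-up and which must be refreshed as the system observes the account.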

SYSTEM REQUIREMENTS

H/W System Configuration:

➢ Processor - Pentium IV
➢ RAM - 4 GB (min)
➢ Hard Disk - 20 GB
➢ Keyboard - Standard Windows Keyboard
➢ Mouse - Two or Three Button Mouse
➢ Monitor - SVGA

SOFTWARE REQUIREMENTS:

➢ Operating System : Windows 7 Ultimate
➢ Coding Language : Python
➢ Front-End : Python
➢ Back-End : Django-ORM
➢ Designing : HTML, CSS, JavaScript
➢ Database : MySQL (WAMP Server)
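
A Django back end with a MySQL database, as listed above, is normally wired together through the `DATABASES` setting. The fragment below is a minimal sketch of such a configuration; the database name, user, password, host and port are placeholder assumptions for a local development server and do not come from the source. Django's MySQL backend also needs a MySQL driver package installed.

```python
# settings.py fragment (all values below are illustrative placeholders)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": "fake_profile_db",  # hypothetical database name
        "USER": "root",             # assumed local development user
        "PASSWORD": "",             # assumed empty for local development only
        "HOST": "127.0.0.1",
        "PORT": "3306",             # MySQL's default port
    }
}
```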
