Project 2
BACHELOR OF TECHNOLOGY
in
of Dr.T.Rajesh
(Professor)
September – 2024
MALLA REDDY ENGINEERING COLLEGE
Maisammaguda, Secunderabad, Telangana, India 500100
CERTIFICATE
This is to certify that the “Credit Card Fraud Detection Using Random Forest
and CART Algorithm” submitted by B VAISHNAVI [23J41A0204], B MANOJ
KIRAN [23J41A0205], B SIVA KRISHNA [23J41A0206] is work done by
them and submitted during the 2024 – 2025 academic year, in partial fulfillment
of the requirements for the award of the degree of BACHELOR OF
TECHNOLOGY in ELECTRICAL AND ELECTRONICS
ENGINEERING, at MALLA REDDY ENGINEERING COLLEGE, An
Autonomous Institution, Maisammaguda, Secunderabad, Telangana, India
500100.
SIGNATURE
Dr.T.Rajesh
INTERNSHIP COORDINATOR
(Professor)
Department of EEE
Malla Reddy Engineering College
Secunderabad, 500 100

SIGNATURE
Dr.M.Kondalu
HOD
(Department of EEE)
Malla Reddy Engineering College
Secunderabad, 500 100
ACKNOWLEDGEMENT
First, I would like to thank Dr.T.Rajesh (Professor) for giving me the opportunity to do
an internship within the organization. I am highly indebted to Dr.A.Ramaswamy Reddy
(Principal) for the facilities provided to accomplish this internship. I would like to thank
Dr.M.Kondalu (HOD) for his constructive criticism throughout my internship.
It is indeed with a great sense of pleasure and immense gratitude that I acknowledge
the help of these individuals.
I am extremely grateful to my department staff members and friends who helped me in
the successful completion of this internship.
B VAISHNAVI [23J41A0204]
ABSTRACT
The project focuses on detecting credit card fraud in real-world transactions. The
phenomenal growth in the number of credit card transactions has recently led to a
considerable rise in fraudulent activity, where the purpose is to obtain goods without
paying or to obtain unauthorized funds from an account. Implementing efficient fraud
detection systems has therefore become imperative for all credit card issuing banks to
minimize their losses. One of the most crucial challenges is that neither the card nor
the cardholder needs to be present when a purchase is made, which makes it impossible
for the merchant to verify whether the customer making the purchase is the authentic
cardholder. The proposed scheme uses the random forest algorithm to improve the
accuracy of fraud detection. The classification process of the random forest algorithm
is used to analyse the data set and the user's current dataset, and the accuracy of the
resulting predictions is then optimized. Processing some of the provided attributes
identifies fraudulent transactions, and the results are presented as a graphical
visualization. The performance of the techniques is evaluated based on accuracy,
sensitivity, specificity, and precision.
INDEX
ABSTRACT
4 Appendices
5 Conclusion and Future Scope
6 References
1. OBJECTIVES OF THE INTERNSHIP
2. TECHNICAL OBSERVATIONS AND LEARNINGS FROM THE INTERNSHIP PROGRAM
Process Overview: The credit card fraud detection system involves several
stages, from data collection through to model deployment.
REQUIREMENT ANALYSIS
The project involved analyzing the design of a few applications so as to make the
application more user-friendly. To do so, it was important to keep the navigation
from one screen to the next well ordered while reducing the amount of typing the
user needs to do. To make the application more accessible, the browser version had
to be chosen so that it is compatible with most browsers.
REQUIREMENT SPECIFICATION
Functional Requirements
Software Requirements
For developing the application, the following is the software requirement: Python.
HARDWARE REQUIREMENTS
For developing the application the following are the Hardware Requirements:
SYSTEM SPECIFICATION:
HARDWARE REQUIREMENTS:
System : Pentium IV 2.4 GHz.
SOFTWARE REQUIREMENTS:
Front-End : Python.
Designing : HTML, CSS, JavaScript.
Credit Card Fraud Detection Using Random Forest and CART
Introduction
Credit card fraud is a significant concern for financial institutions and customers
alike, with billions lost annually due to fraudulent activities. Detecting fraud is
challenging due to the evolving nature of fraudulent tactics and the need to
distinguish between legitimate and fraudulent transactions. Machine learning
algorithms, particularly Random Forest and CART (Classification and Regression
Trees), have been shown to be effective in identifying suspicious transactions based on
historical data patterns.
Advantages of Random Forest for Fraud Detection
predictive accuracy.
• Robust to Overfitting: The use of random samples and features makes Random Forest
robust to overfitting, especially in high-dimensional datasets typical in fraud detection.
• Feature Importance: It provides insights into which features (like transaction amount,
frequency, location, etc.) are most important in predicting fraud.
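The feature importance idea can be illustrated with a minimal sketch. It assumes scikit-learn's RandomForestClassifier and a purely synthetic dataset; the feature names (amount, hour, noise) and the rule that generates the fraud label are hypothetical and are not taken from the project's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic, hypothetical data: only `amount` drives the fraud label.
rng = np.random.default_rng(0)
n = 2000
amount = rng.exponential(scale=100.0, size=n)
hour = rng.integers(0, 24, size=n).astype(float)
noise = rng.normal(size=n)
X = np.column_stack([amount, hour, noise])
y = (amount > 250).astype(int)  # assumed labelling rule, for illustration only

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Rank features by the impurity-based importance reported by the forest.
feature_names = ["amount", "hour", "noise"]
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because the synthetic label depends only on the amount, that feature should rank first here; on real transaction data the ranking would have to be read from the fitted model.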
What is CART?
CART (Classification and Regression Trees) is a decision tree algorithm used for
creating a model that predicts the value of a target variable by learning simple decision
rules inferred from the data features. CART builds binary decision trees where each
node represents a decision on a single feature.
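As an illustration of such decision rules, the sketch below fits a small binary tree and prints it as text. It assumes scikit-learn's DecisionTreeClassifier (which its documentation describes as an optimized version of CART); the toy data and the feature names amount and tx_last_hour are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy, hypothetical data: [amount, transactions in the last hour].
# Label 1 = fraud, 0 = normal.
X = [[20, 1], [35, 1], [50, 2], [900, 1], [1200, 6], [40, 7], [30, 8], [1500, 2]]
y = [0, 0, 0, 1, 1, 1, 1, 1]

# Each internal node tests a single feature against a threshold.
tree = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned rules, one line per node.
rules = export_text(tree, feature_names=["amount", "tx_last_hour"])
print(rules)
```

Each printed line is one rule of the form "feature <= threshold", which is the single-feature decision described above.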
1. Splitting Criteria: The CART algorithm splits the dataset into two subsets based on the
value of a specific feature that results in the maximum information gain (or minimum
Gini impurity).
2. Recursive Binary Splits: This process is repeated recursively for each subset, creating
a binary tree structure. Each node represents a feature and a decision rule, while each
branch represents an outcome of that rule.
3. Stopping Criteria: The tree continues to split until it reaches a stopping criterion (e.g.,
maximum tree depth, minimum number of samples per leaf, or purity of nodes).
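The splitting step can be sketched in plain Python. This is a minimal illustration for one numeric feature, assuming the Gini impurity G = 1 - sum of p_k squared over the classes k; the helper names gini and best_split and the sample amounts are hypothetical, not part of the project code.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split: impurity of the parent node
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Weighted average impurity of the two child nodes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical transaction amounts; label 1 = fraud, 0 = normal.
amounts = [20, 35, 50, 60, 900, 1200]
labels = [0, 0, 0, 0, 1, 1]
threshold, impurity = best_split(amounts, labels)
print(threshold, impurity)
```

A full tree would apply best_split to every feature, keep the best one, and recurse on the two subsets until a stopping criterion is met.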
How Random Forest and CART Work for Credit Card Fraud Detection
Credit card fraud detection ...
Quality Planning and Control Activities
Role: My team was responsible for implementing and optimizing the Random
Forest algorithm for detecting fraudulent transactions. Additionally, I contributed
to the integration of the CART algorithm to enhance anomaly detection.
Experiences: Through this internship, I gained hands-on experience with
machine learning model development, data preprocessing, and the practical
challenges of deploying models in a real-world environment.
Comparison of Theory and Practice
UML DIAGRAMS
Class diagram
Class: User
Operations:
UploadCreditCardDataset()
GenerateTrainAndTestModel()
RunRandomForestTree()
DetectFraudFromTestData()
CleanAndFraudTransactionGraph()
Exit()
Sequence Diagram:
6.Exit
Collaboration Diagram
3. OUTCOME OF THE INTERNSHIP
During the internship, I acquired a range of technical and professional skills that
have significantly enhanced my qualifications.
B. Responsibilities Undertaken
C. Influence on Future Career Plans
Career Direction: The hands-on experience in credit card fraud detection has
reinforced my interest in pursuing a career in data science, particularly within the
financial technology sector. I now feel more confident in my ability to contribute
meaningfully to this field.
Professional Growth: The skills and experience gained during this internship will be
instrumental in advancing my career. I am now better prepared to take on more
challenging roles in data science and machine learning.
Networking Opportunities: The connections and relationships I built during the
internship will be valuable as I navigate my career path, opening doors to
mentorship and future job opportunities.
4. APPENDICES
To run the project, double-click the ‘run.bat’ file to get the screen below.
In the above screen, click the ‘Upload Credit Card Dataset’ button to upload the dataset.
After uploading the dataset, the screen below appears.
Now click ‘Generate Train & Test Model’ to generate the training model for the Random
Forest Classifier.
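What this step does can be sketched as follows. The sketch assumes scikit-learn's train_test_split, a synthetic dataset with roughly 5% fraud, and a 75/25 split; the report does not state the actual split ratio or dataset, so these are illustrative choices only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the uploaded dataset (hypothetical): 1000 records,
# 5 numeric features, and roughly 5% of records labelled as fraud (1).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.05).astype(int)

# Assumed 75/25 split; `stratify=y` keeps the fraud ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

print("Total records:", len(X))
print("Training records:", len(X_train))
print("Testing records:", len(X_test))
```

Stratifying on the label is a common choice for fraud data, since a purely random split can leave very few fraud cases in the test set.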
In the above screen, after generating the model, we can see the total number of records in
the dataset and how many records the application uses for training and how many for
testing. Now click the ‘Run Random Forest Algorithm’ button to build the Random Forest
model on the train and test data.
In the above screen, we can see that Random Forest achieves 99.78% accuracy while
building the model on the train and test data. Now click the ‘Detect Fraud From Test Data’
button to upload test data and predict whether it contains normal or fraud transactions.
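The training and evaluation step can be sketched as below. It assumes scikit-learn and a synthetic imbalanced dataset from make_classification, so the numbers it prints will not match the 99.78% figure reported by the application. It also computes sensitivity, specificity, and precision from the confusion matrix, the measures named in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, hypothetical data with about 5% positive (fraud) records.
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Confusion matrix entries: true/false negatives and positives.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on fraud
specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on normal
precision = tp / (tp + fp) if (tp + fp) else 0.0

print(f"accuracy={accuracy:.4f} sensitivity={sensitivity:.4f} "
      f"specificity={specificity:.4f} precision={precision:.4f}")
```

On heavily imbalanced data, accuracy alone can look high even when many fraud cases are missed, which is why sensitivity and precision are worth reporting alongside it.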
In the above screen, I am uploading the test dataset; after uploading it, the prediction
details below are shown.
In the above screen, beside each test record the application displays whether the
transaction contains a clean or fraud signature. Now click the ‘Clean & Fraud Transaction
Detection Graph’ button to see the total test transactions with clean and fraud signatures
in graphical format. See the screen below.
In the above graph, we can see the total test data and the number of normal and fraud
transactions detected. The x-axis represents the transaction type and the y-axis represents
the count of clean and fraud transactions.
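A graph of this kind can be sketched with matplotlib. The prediction list below is hypothetical (0 = clean, 1 = fraud), and the plotting choices are assumptions; the application's actual plotting code is not shown in this report.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical model output for ten test transactions: 0 = clean, 1 = fraud.
predictions = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
counts = Counter(predictions)

labels = ["Clean", "Fraud"]
heights = [counts.get(0, 0), counts.get(1, 0)]

# x-axis: transaction type, y-axis: number of transactions of that type.
fig, ax = plt.subplots()
bars = ax.bar(labels, heights)
ax.set_xlabel("Transaction type")
ax.set_ylabel("Count")
ax.set_title("Clean and fraud transactions detected in test data")
# In an interactive session, switch backend and call plt.show() to view it.
```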
5. CONCLUSION AND FUTURE SCOPE
The Random Forest algorithm will perform better with a larger amount of training data,
but speed during testing and application will suffer. Applying more pre-processing
techniques would also help. The SVM algorithm still suffers from the imbalanced dataset
problem and requires more preprocessing to give better results; the results shown by
SVM are good, but they could have been better if more preprocessing had been done on
the data.
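One possible form of the additional preprocessing mentioned above is sketched here for the SVM case. It assumes scikit-learn, feature scaling with StandardScaler, and class weighting via class_weight="balanced" on a synthetic imbalanced dataset. These are common options for imbalanced data, not the steps actually used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, hypothetical data with about 5% fraud records.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=1,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Scale features, then weight classes inversely to their frequency so that
# the rare fraud class is not ignored during training.
svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", class_weight="balanced"),
)
svm.fit(X_train, y_train)

fraud_recall = recall_score(y_test, svm.predict(X_test))
print(f"Recall on the fraud class: {fraud_recall:.3f}")
```

Whether this improves results on the real dataset would need to be checked against the same accuracy, sensitivity, specificity, and precision measures.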
6. REFERENCES