Final Project Report - Kelompok 4
The solution our team proposed consists of two major processes. The first is building a robust machine learning model with high accuracy, and the second is deploying the model as a web application. The detailed solution workflow can be seen in figure 1.
Our solution starts with data set research. We found a customer credit data set on Kaggle that is quite complex and contains a large number of data points [1]. Once the data set is loaded, we go through each column to make sure we understand what it represents. The data set itself consists of 12 columns and 32,581 rows.
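As an illustration, a minimal sketch of this loading and inspection step could look as follows (the file name credit_risk_dataset.csv is an assumption):

```python
import pandas as pd

# Load the Kaggle customer credit data set (file name is an assumption)
df = pd.read_csv("credit_risk_dataset.csv")

# Inspect the shape and the columns: 12 columns, 32,581 rows
print(df.shape)    # (32581, 12)
print(df.dtypes)   # data type of each column
print(df.head())   # sample rows to understand what each column represents
```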
During this process, we found that the data set has missing values in a few columns. The columns with missing values are the customer employment length and the loan interest rate, with 2.74% and 9.5% missing values respectively. For the column with less than 5% missing values, we drop the affected rows, as this is common practice in the data science community [2]. For the column with more than 5% missing values, we impute the missing values with the column mean, since the column is not evenly distributed.
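A sketch of this missing-value handling, assuming the two columns are named person_emp_length and loan_int_rate (hypothetical names matching the description above):

```python
# Check the percentage of missing values per column
missing_pct = df.isna().mean() * 100
print(missing_pct)  # person_emp_length ~2.74%, loan_int_rate ~9.5%

# Below 5% missing: drop the affected rows
df = df.dropna(subset=["person_emp_length"])

# Above 5% missing: impute with the column mean
df["loan_int_rate"] = df["loan_int_rate"].fillna(df["loan_int_rate"].mean())
```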
2.1. ML Model Build Up
After the data cleaning process has been completed, we preprocessed the categorical values in the data set. In total there are 4 categorical columns, which are processed using label encoding (for columns with more than 2 unique values) and binary encoding (for columns with exactly 2 unique values). Right after preprocessing the categorical values, Exploratory Data Analysis (EDA) is done to detect outliers in the data set. We found that some columns have extreme outliers, which were handled using the capping method [3].
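A minimal sketch of the encoding and capping steps; the 1.5×IQR capping fences and the specific capped columns are assumptions for illustration:

```python
from sklearn.preprocessing import LabelEncoder

# Encode categorical columns: label encoding when a column has more than
# two unique values, a simple 0/1 mapping when it has exactly two
for col in df.select_dtypes(include="object").columns:
    if df[col].nunique() > 2:
        df[col] = LabelEncoder().fit_transform(df[col])
    else:
        values = df[col].unique()
        df[col] = df[col].map({values[0]: 0, values[1]: 1})

# Cap extreme outliers at the 1.5*IQR fences (capping method)
def cap_outliers(series):
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return series.clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

for col in ["person_age", "person_income"]:  # hypothetical outlier columns
    df[col] = cap_outliers(df[col])
```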
Feature selection is also crucial to developing the ML model. Some articles warn to check the correlation between each pair of columns and to make sure none of them are highly correlated (multicollinearity check) [4], since highly correlated features would affect the accuracy of the model if not handled correctly. Luckily, in our data set there is no pair of features with a high correlation value.
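The multicollinearity check can be sketched as below; the 0.8 threshold is an assumption, as the report does not state the cutoff used:

```python
import numpy as np

# Absolute pairwise correlation between numeric features; keep only the
# upper triangle so each pair is checked once
corr = df.corr(numeric_only=True).abs()
upper = corr.where(np.triu(np.ones(corr.shape), k=1).astype(bool))

high_corr = [(r, c) for r in upper.index for c in upper.columns
             if upper.loc[r, c] > 0.8]
print(high_corr)  # empty list -> no highly correlated feature pair
```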
After selecting the features and cleaning the data set, the next step is to fit various machine learning algorithms and evaluate which algorithm gives the most accurate results. This step is discussed in detail in section 2.2, and the deployment is discussed in section 2.3. For the tools, we used Google Colaboratory for the ML model build up (red dashed line), utilizing the pandas, numpy, matplotlib, and sklearn libraries. HTML, CSS, and Visual Studio Code with Python's Flask library are used for the deployment (blue dashed line).
2.2. Machine Learning Model Evaluation
To deploy the correct machine learning algorithm, model selection is performed by fitting a few algorithms to the data set and evaluating each with the AUC score, since this is a classification problem. The algorithms we fit to the data set are Random Forest Classifier, Decision Tree Classifier, KNN Classifier, and Logistic Regression.
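A sketch of this model selection step, assuming the target column is named loan_status (1 = bad loan, 0 = good loan):

```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Target column name is an assumption based on the data set description
X = df.drop(columns=["loan_status"])
y = df["loan_status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "KNN": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```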
From figure 2, it can be concluded that Random Forest Classifier is the algorithm that produces the best score. Thus, we evaluate the model using a confusion matrix (figure 3) to see how well it predicts customer creditworthiness. Overall, the model gives good accuracy, with a precision score of 0.97 for class 1 (bad loan customer) and 0.92 for class 0 (good loan customer). Thus, this model is used for the deployment.
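Continuing the sketch above, the confusion matrix and per-class precision can be obtained as follows:

```python
from sklearn.metrics import confusion_matrix, classification_report

# Evaluate the best model from the selection step
best_model = models["Random Forest"]
y_pred = best_model.predict(X_test)

# Confusion matrix and per-class precision (1 = bad loan, 0 = good loan)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```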
2.3. Model Deployment
To deploy the model, we used a reference we found on YouTube, considering the limited time for the project [5]. The deployment flow is quite simple: it starts with saving the model into a pickle file, which is then loaded into the web page using Python's Flask library. Figure 4 shows the user interface of the web page, while figures 5 and 6 show the user interface after the predictor values have been inputted.
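A minimal sketch of this deployment flow; the template name and form fields are assumptions, not the exact pages shown in figures 4 to 6:

```python
import pickle
from flask import Flask, request, render_template

# Save the trained model to a pickle file
with open("model.pkl", "wb") as f:
    pickle.dump(best_model, f)

# Minimal Flask app that loads the pickle and serves predictions
app = Flask(__name__)
model = pickle.load(open("model.pkl", "rb"))

@app.route("/")
def home():
    return render_template("index.html")  # assumed template name

@app.route("/predict", methods=["POST"])
def predict():
    # Assumes the form submits the predictor values as numbers
    features = [float(v) for v in request.form.values()]
    prediction = model.predict([features])[0]
    label = "Bad loan customer" if prediction == 1 else "Good loan customer"
    return render_template("index.html", prediction_text=label)

if __name__ == "__main__":
    app.run(debug=True)
```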
The predicted benefit of changing the loan review system is that the review time needed for each customer can be reduced significantly, while at the same time reducing the workforce needed for manual review. A reduction in manual labor expenses and an increase in income are expected after using the model, as the time needed to review is shortened.
Besides more efficient operations, predicting the right customers will also bring more revenue to the company. From the data set we know that the average bad loan amount is $10,760. Thus, with a precision score of 0.97 for the bad loan class, roughly 97 of every 100 customers flagged as bad are correctly rejected, so the company could save more than $1 million (about 97 × $10,760 ≈ $1.04 million) per 100 flagged customers, and instead gain more revenue by lending that amount to more promising customers.
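This estimate can be reproduced with simple arithmetic from the figures above:

```python
# Back-of-the-envelope savings estimate from the report's figures
avg_bad_loan = 10_760   # average bad-loan amount from the data set ($)
precision = 0.97        # precision for the "bad loan" class

flagged = 100                        # customers flagged as bad
true_bad = precision * flagged       # ~97 correctly rejected customers
savings = true_bad * avg_bad_loan
print(f"Estimated savings per 100 flagged customers: ${savings:,.0f}")
# -> roughly $1,043,720, i.e. more than $1 million
```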
GitHub: https://github.com/AndoniFikri/Credit-Risk-Prediction-with-Deployment
3. References