AI-Powered Credit Scoring System
Problem Statement
Assessing the creditworthiness of individuals in low-income communities is
difficult because traditional financial data such as credit history, credit
scores, and formal employment records are often unavailable. This lack of
formal financial information limits access to credit, perpetuating cycles of
poverty and financial exclusion. To address this, the project aims to develop
a machine learning-based framework that evaluates creditworthiness using
alternative data sources. By leveraging transactional data from informal
financial activities, the approach seeks to provide a more inclusive and
accurate assessment of creditworthiness, enabling financial institutions to
extend credit to underserved populations effectively.
Proposed Solution
Gather alternative data, such as transaction records and
transaction history from informal financial activities.
Train one autoencoder on data from known defaulting
individuals to capture patterns specific to financial risk.
Train a second autoencoder on data from individuals who
have a history of financial reliability.
Utilize the reconstruction error from both autoencoders to
assess the deviation of a new individual’s data from typical
profiles of defaulters and non-defaulters.
Develop a scoring formula based on the reconstruction errors
that quantifies creditworthiness. The score can range from
high risk (closer to defaulter patterns) to low risk (closer
to non-defaulter patterns).
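The last two steps can be illustrated with a minimal sketch. It assumes that
both autoencoders are already trained and that each returns a reconstruction
of the applicant's feature vector. The function names, the use of mean squared
error, the ratio-based formula, and the sample numbers are all illustrative
assumptions, not the project's actual scoring formula.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def credit_score(err_defaulter, err_non_defaulter):
    """Map the two reconstruction errors to a score in [0, 1].

    Assumed formula: the defaulter-model error divided by the sum of both
    errors. A score near 1 means the applicant is reconstructed well by the
    non-defaulter autoencoder (low risk); a score near 0 means the applicant
    is reconstructed well by the defaulter autoencoder (high risk).
    """
    if err_defaulter < 0 or err_non_defaulter < 0:
        raise ValueError("reconstruction errors must be non-negative")
    total = err_defaulter + err_non_defaulter
    if total == 0:
        return 0.5  # both models reconstruct perfectly: no evidence either way
    return err_defaulter / total


# Hypothetical applicant features (already scaled to [0, 1]) and the
# reconstructions that the two trained autoencoders might return.
applicant = [0.2, 0.5, 0.9]
recon_by_defaulter_model = [0.6, 0.1, 0.4]
recon_by_non_defaulter_model = [0.25, 0.45, 0.85]

err_defaulter = reconstruction_error(applicant, recon_by_defaulter_model)
err_non_defaulter = reconstruction_error(applicant, recon_by_non_defaulter_model)
score = credit_score(err_defaulter, err_non_defaulter)
print(f"defaulter error={err_defaulter:.4f}, "
      f"non-defaulter error={err_non_defaulter:.4f}, score={score:.3f}")
```

In this made-up case the non-defaulter model reconstructs the applicant far
more closely than the defaulter model, so the score lands near the low-risk
end. A ratio is only one option; a difference of errors or a calibrated
classifier on top of both errors would serve the same purpose.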
Technical Approach
Tech Stack
Scoring Formula
Literature Review

Sr. No: 1
Title: Prediction of bank credit worthiness through credit risk analysis: an
explainable machine learning study
Journal Name & Year: Annals of Operations Research, 2024
Abstract: This study applies machine learning to predict bank credit
worthiness, focusing on enhancing AI transparency. It evaluates models like
Random Forest and Gradient Boosting, ranking feature importance to address
concerns about AI's "black-box" nature. The research advances ethical,
transparent credit risk models in financial services.
Methodology: The study employs machine learning algorithms, specifically
Random Forest and Gradient Boosting, to predict bank credit risk. It focuses
on improving AI transparency by ranking the importance of features influencing
model decisions. The models are trained and evaluated on a large dataset of
bank loan defaults to ensure accuracy and interpretability.
Conclusion: Through a comparative analysis, it was found that machine learning
models, especially the Gradient Boosting classifier, outperform traditional
statistical methods in predicting creditworthiness, showing superior
resistance to overfitting and better overall performance.

Sr. No: 2
Title: Machine learning predictivity applied to consumer creditworthiness
Journal Name & Year: Journal of the Brazilian Computer Society, 2020
Abstract: This study investigates the effectiveness of various machine
learning models in predicting creditworthiness using data from a Brazilian
bank. The focus is on comparing the performance of several models. The study
aims to identify the most accurate model for assessing credit risk in the
Brazilian market, providing insights for financial institutions to improve
their credit risk management practices.
Methodology: The study utilizes a dataset containing loan records from a
Brazilian bank. Various machine learning models including random forest and
decision tree were trained and tested. The models were evaluated using
performance metrics.
Conclusion: The study found that ensemble methods like Random Forest and
AdaBoost outperformed other models, including SVM and Decision Trees. These
models demonstrated better predictive accuracy, making them more suitable for
credit risk assessment in this context.