A comprehensive end-to-end data science project to analyze and predict credit risk using the Lending Club loan dataset.
This project performs an in-depth credit risk analysis on the Lending Club dataset sourced from Kaggle. The goal is to identify key drivers of loan default and build a predictive model that classifies borrowers as high or low credit risk.
The project covers the full data science pipeline: raw data cleaning, exploratory data analysis, statistical testing, model building, and deployment as an interactive web application.
- Identified the most significant predictors of credit default using statistical techniques including KS Test, Mutual Information, and Cramér's V.
- Borrowers with higher debt-to-income (DTI) ratios and lower annual income showed significantly higher default rates.
- Loan grade and sub-grade assigned by Lending Club proved to be strong indicators of credit risk.
- The final predictive model achieved strong performance metrics, with features engineered from raw financial variables contributing meaningfully to accuracy.
- Findings were documented in a structured HTML report covering domain context, methodology, and actionable insights.
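
The association measures named in the findings above (KS test, mutual information, Cramér's V) can be illustrated with a small sketch. It is not the notebook's code: it uses only the standard library and made-up toy values, the mutual information version handles discrete variables only, and Cramér's V is computed without bias correction. In practice, `scipy.stats.ks_2samp`, `scipy.stats.chi2_contingency`, and `sklearn.feature_selection.mutual_info_classif` are the usual tools.

```python
import math
from bisect import bisect_right
from collections import Counter


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def cramers_v(xs, ys):
    """Cramér's V for two categorical sequences (no bias correction)."""
    n = len(xs)
    joint, row, col = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


def mutual_information(xs, ys):
    """Mutual information in nats between two discrete sequences."""
    n = len(xs)
    joint, row, col = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (count / n) * math.log(count * n / (row[x] * col[y]))
        for (x, y), count in joint.items()
    )


# Toy values for illustration only; they are not taken from the dataset.
dti_defaulted = [22.0, 28.0, 31.5, 35.2]
dti_repaid = [8.5, 12.0, 15.3, 25.0]
grades = ["A", "A", "B", "B", "C", "C", "D", "D"]
defaulted = [0, 0, 0, 1, 1, 0, 1, 1]

ks = ks_statistic(dti_defaulted, dti_repaid)
v = cramers_v(grades, defaulted)
mi = mutual_information(grades, defaulted)
print(f"KS={ks:.3f}  Cramér's V={v:.3f}  MI={mi:.3f} nats")
```

The KS statistic suits a numeric feature such as DTI split by outcome, while Cramér's V and mutual information suit a categorical feature such as loan grade against the default flag.
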
- Source: Lending Club Loan Dataset — Kaggle
- Size: ~1.1 GB (not included in this repo due to GitHub file size limits)
- Description: Contains historical loan data from Lending Club including borrower information, loan attributes, and repayment status.
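
Because the file is large, it can help to stream it row by row when deriving a binary target from the repayment status. The sketch below is one way to do that with the standard library; the column name `loan_status` and the status labels are assumptions that should be checked against the downloaded file, and the in-memory sample stands in for the real CSV.

```python
import csv
import io
from collections import Counter

# Assumed status labels; verify them against the actual file.
BAD_STATUSES = {"Charged Off", "Default"}
GOOD_STATUSES = {"Fully Paid"}


def label_row(status):
    """Map a repayment status to 1 (default), 0 (repaid), or None (unresolved)."""
    if status in BAD_STATUSES:
        return 1
    if status in GOOD_STATUSES:
        return 0
    return None


def count_labels(lines, status_column="loan_status"):
    """Read rows one at a time so the full file never sits in memory."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[label_row(row[status_column].strip())] += 1
    return counts


# Tiny stand-in for the real CSV; the values are made up.
sample = io.StringIO(
    "loan_amnt,loan_status\n"
    "10000,Fully Paid\n"
    "5000,Charged Off\n"
    "12000,Current\n"
    "8000,Fully Paid\n"
)
counts = count_labels(sample)
print(dict(counts))
```

For the real data, pass an open file handle instead of `sample`, for example `open(path, newline="")` with `path` pointing at the downloaded CSV. Loans that are still in progress map to `None` here so they can be excluded from training.
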
    git clone https://github.com/mahatarc/Lending_club_credit_risk.git
    cd Lending_club_credit_risk
    pip install -r requirements.txt
    jupyter notebook Lending_club_credit_risk_analysis.ipynb
    streamlit run app.py

Then open your browser at http://localhost:8501

    uvicorn main:app --reload

Then open your browser at http://localhost:8000

Interactive API docs are available at http://localhost:8000/docs
The credit risk prediction model is deployed as an interactive web app built with Streamlit and FastAPI.
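
To see which routes the FastAPI service exposes without opening the browser, one option is to read the OpenAPI schema that FastAPI serves at `/openapi.json` by default (the same schema that backs `/docs`). The helper names below are illustrative, and the sketch assumes the server was started with the `uvicorn` command above and keeps FastAPI's default schema URL.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_endpoints(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    pairs = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key in HTTP_METHODS:
                pairs.append((key.upper(), path))
    return sorted(pairs)


def fetch_schema(base_url="http://localhost:8000"):
    """Download the schema; requires the API server to be running."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)
```

With the server running, `list_endpoints(fetch_schema())` returns the available routes, which can then be called with the request bodies documented at `/docs`.
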
