Credit Risk Analysis Capstone Project
1. Introduction
1.1 Project Overview
The objective of this project is to analyze credit risk using historical loan data from
Lending Club. Using SAS for data manipulation and SQL for data querying and analysis, the
project aims to predict loan default risk from borrower attributes and loan characteristics.
The insights gained are intended to help financial institutions make informed decisions and
manage credit risk effectively.
2. Data Collection
2.1 Data Source
The Lending Club Loan Data was imported into SAS for data manipulation and
preprocessing. SQL queries were used to extract relevant subsets of data and perform initial
exploratory analysis.
PROC IMPORT DATAFILE="..." /* path to the Lending Club CSV file */
OUT=work.credit_risk_data
DBMS=CSV REPLACE;
RUN;
PROC SQL;
QUIT;
3. Exploratory Data Analysis
3.1 Summary Statistics
Generate summary statistics using SAS procedures (PROC MEANS, PROC FREQ) and SQL
queries to understand the distribution of variables.
PROC MEANS DATA=work.credit_risk_data; /* no VAR statement: summarizes all numeric variables */
RUN;
PROC SQL;
/* Assumed select list: number of loans per status */
SELECT loan_status, COUNT(*) AS loan_count
FROM work.credit_risk_data
GROUP BY loan_status;
QUIT;
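The categorical side of this step might be sketched with PROC FREQ, which the text names alongside PROC MEANS. This is a minimal sketch, assuming work.credit_risk_data has already been imported and contains the loan_status column used in the grouped query; the ORDER=FREQ, MISSING, and NOCUM options are illustrative choices, not taken from the project.

```sas
/* Frequency table of loan outcomes, most common status first.        */
/* Assumes work.credit_risk_data exists and has a loan_status column. */
PROC FREQ DATA=work.credit_risk_data ORDER=FREQ;
  TABLES loan_status / MISSING NOCUM;  /* MISSING treats blank statuses as a level */
RUN;
```

Unlike the grouped count, the PROC FREQ table also reports each status as a percentage of all loans, which gives a quick read on class imbalance.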
3.2 Data Visualization
Create basic visualizations with PROC SGPLOT, supported by PROC SQL queries, to explore
relationships and distributions.
PROC SQL;
/* Assumed select list: preview all columns for the first 10 rows */
SELECT *
FROM work.credit_risk_data(OBS=10);
QUIT;
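The PROC SGPLOT step named in the text could look roughly like the following. This is a sketch under stated assumptions: work.credit_risk_data is the imported table, loan_status is the column referenced elsewhere in this report, and loan_amnt is an assumed name for a numeric loan-amount column that may differ in the actual data.

```sas
/* Bar chart of loan counts per status. */
PROC SGPLOT DATA=work.credit_risk_data;
  VBAR loan_status;
RUN;

/* Histogram of a numeric column; loan_amnt is an assumed column name. */
PROC SGPLOT DATA=work.credit_risk_data;
  HISTOGRAM loan_amnt;
  DENSITY loan_amnt / TYPE=KERNEL;  /* overlay a kernel density estimate */
RUN;
```

The bar chart shows how outcomes are distributed, while the histogram with a density overlay helps spot skew or outliers before any transformation is chosen.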
4. Data Preprocessing
4.1 Data Cleaning
DATA work.credit_risk_cleaned;
SET work.credit_risk_data;
RUN;
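PROC SQL;
/* Assumed cleaning rule (loan_status taken to be character): uppercase its values */
UPDATE work.credit_risk_cleaned
SET loan_status = UPCASE(loan_status);
QUIT;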
4.2 Feature Engineering
Select and transform features using SAS data steps and SQL queries based on domain
knowledge and initial EDA findings.
DATA work.credit_risk_features;
SET work.credit_risk_cleaned;
RUN;
PROC SQL;
/* Assumed select list: preview all columns for the first 10 rows */
SELECT *
FROM work.credit_risk_cleaned(OBS=10);
QUIT;
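To make the idea of selecting and transforming features concrete, the sketch below derives a binary default target and two transformed predictors. It runs on a small toy table rather than the project data: work.toy_cleaned, work.toy_features, and the columns loan_amnt, annual_inc, default_flag, loan_to_income, and log_loan_amnt are all hypothetical, and the set of statuses treated as a default is an assumption.

```sas
/* Toy input with hypothetical columns; stands in for work.credit_risk_cleaned. */
DATA work.toy_cleaned;
  INFILE DATALINES DSD;
  LENGTH loan_status $20;
  INPUT loan_amnt annual_inc loan_status $;
  DATALINES;
10000,50000,Fully Paid
25000,40000,Charged Off
8000,0,Default
;
RUN;

DATA work.toy_features;
  SET work.toy_cleaned;
  /* Binary target: 1 for loans that ended badly, 0 otherwise (assumed rule). */
  default_flag = (UPCASE(loan_status) IN ('CHARGED OFF', 'DEFAULT'));
  /* Ratio feature; left missing when income is zero or missing. */
  IF annual_inc > 0 THEN loan_to_income = loan_amnt / annual_inc;
  /* Log transform to reduce right skew in loan amounts. */
  log_loan_amnt = LOG(loan_amnt);
RUN;

/* The same target expressed as a SQL CASE expression. */
PROC SQL;
  SELECT loan_status,
         CASE WHEN UPCASE(loan_status) IN ('CHARGED OFF', 'DEFAULT') THEN 1 ELSE 0 END
           AS default_flag
  FROM work.toy_cleaned;
QUIT;
```

Comparing on UPCASE(loan_status) keeps the rule robust to inconsistent capitalization, and guarding the ratio avoids division-by-zero notes in the log.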
5. Model Development
5.1 Model Selection
Choose a suitable modeling approach in SAS (PROC LOGISTIC, PROC GENMOD) based on
the project requirements and dataset characteristics.
RUN;
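The choice between the two procedures can be illustrated by fitting the same binary model in each. This is a sketch on simulated data, not the project's model: work.toy_loans and the variables int_rate, dti, and default_flag are hypothetical, as are the coefficients used to generate the data.

```sas
/* Simulated toy data; every column name here is hypothetical. */
DATA work.toy_loans;
  CALL STREAMINIT(2024);
  DO loan_id = 1 TO 1000;
    int_rate = 5 + 20 * RAND('UNIFORM');       /* interest rate, percent */
    dti      = 40 * RAND('UNIFORM');           /* debt-to-income ratio   */
    p_true   = 1 / (1 + EXP(-(-4 + 0.15 * int_rate + 0.03 * dti)));
    default_flag = RAND('BERNOULLI', p_true);  /* 1 = default */
    OUTPUT;
  END;
  DROP p_true;
RUN;

/* Option 1: PROC LOGISTIC, a common choice for a binary default target. */
PROC LOGISTIC DATA=work.toy_loans;
  MODEL default_flag(EVENT='1') = int_rate dti;
RUN;

/* Option 2: PROC GENMOD fitting the same model as a binomial GLM with a logit link. */
PROC GENMOD DATA=work.toy_loans;
  MODEL default_flag(EVENT='1') = int_rate dti / DIST=BINOMIAL LINK=LOGIT;
RUN;
```

Both calls should yield essentially the same coefficient estimates. PROC LOGISTIC adds classification-oriented output such as odds ratios and association statistics, while PROC GENMOD is the more general tool when other distributions or link functions may be needed.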
5.2 Model Training and Evaluation
Train the selected model and evaluate its performance using SAS procedures (PROC
LOGISTIC, PROC SCORE) and SQL queries.
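PROC LOGISTIC DATA=work.credit_risk_features;
/* Assumed response and predictors (hypothetical variable names) */
MODEL default_flag(EVENT='1') = loan_amnt int_rate annual_inc dti;
SCORE DATA=work.credit_risk_features
OUT=work.credit_risk_predictions;
RUN;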
RUN;
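The SQL side of the evaluation might be sketched as a confusion-matrix query over scored records. The example below uses a small hand-made table so it can run on its own: work.toy_predictions and the columns default_flag and p_default are hypothetical, and the 0.5 cutoff is an illustrative assumption. In output written by the SCORE statement of PROC LOGISTIC, the predicted probability for event level 1 is typically named P_1.

```sas
/* Toy scored data: actual outcome and predicted default probability. */
/* Column names and values are hypothetical.                          */
DATA work.toy_predictions;
  INPUT default_flag p_default;
  DATALINES;
1 0.81
0 0.12
0 0.55
1 0.40
0 0.07
1 0.66
;
RUN;

PROC SQL;
  /* Confusion-matrix counts and accuracy at a 0.5 cutoff. */
  SELECT SUM(default_flag = 1 AND p_default >= 0.5) AS true_pos,
         SUM(default_flag = 0 AND p_default >= 0.5) AS false_pos,
         SUM(default_flag = 1 AND p_default <  0.5) AS false_neg,
         SUM(default_flag = 0 AND p_default <  0.5) AS true_neg,
         MEAN((p_default >= 0.5) = default_flag)    AS accuracy
  FROM work.toy_predictions;
QUIT;
```

On these six rows the query counts 2 true positives, 1 false positive, 1 false negative, and 2 true negatives, for an accuracy of 4/6. Scoring the same data used for training tends to overstate performance, so a held-out sample is usually preferable for this check.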
6. Findings and Recommendations
6.1 Summary of Findings
DATA _NULL_;
FILE 'credit_risk_analysis_summary.txt';
PUT "Summary of Findings:";
PUT "---------------------";
RUN;
6.2 Recommendations
Provide actionable recommendations based on the analysis results to optimize credit risk
management strategies.
DATA _NULL_;
FILE 'credit_risk_analysis_recommendations.txt';
PUT "Recommendations:";
PUT "-----------------";
RUN;