Enterprise Credit Risk Evaluation Based On Neural Network Algorithm
Cognitive Systems Research 52 (2018) 317–324
www.elsevier.com/locate/cogsys
Received 29 May 2018; received in revised form 2 July 2018; accepted 17 July 2018
Available online 24 July 2018
Abstract
To explore enterprise credit risk evaluation, the performance of several common neural network models on data sets of Chinese small and medium-sized enterprises was compared, and the optimal parameters for each model were determined. In addition, the classification accuracy and applicability of the models were compared, and a common problem of population-based neural network optimization algorithms, the need to fix the dimension in advance, was solved. The experimental results showed that the probabilistic neural network (PNN) had the lowest total error rate and the lowest second type of error, and that the PNN model also had the highest AUC value and was robust. In sum, the algorithm contributes to solving the financing problem of small and medium-sized enterprises in China.
© 2018 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.cogsys.2018.07.023
1389-0417/© 2018 Elsevier B.V. All rights reserved.
318 X. Huang et al. / Cognitive Systems Research 52 (2018) 317–324
method. Unlike statistical models, AI models do not require assumptions about the distribution of variables and can acquire knowledge directly from training data sets. In the field of credit risk assessment, especially when the problem is a nonlinear pattern classification task, AI models perform better than statistical models.
2. State of the art
Chang, Chang, Chu, and Tong (2016) proposed a short-term credit risk assessment model based on decision trees. The goal is to use a decision tree to screen short-term defaults and produce a highly accurate model that can distinguish defaulting loans. In that paper, the credit risk model is established by combining bootstrap aggregation with a minority-class sampling technique to improve the stability of the decision tree and its performance on unbalanced data. Zhang, Zhao, and Wu (2017) studied a cross-border e-commerce credit risk assessment model using a neural network optimized by a combined particle swarm optimization and genetic algorithm (PSO-GA), and put forward the construction process of a credit risk assessment model based on PSO-GA in a BP neural network. The results showed that the model could effectively meet the requirements of cross-border e-commerce credit risk assessment. Bao (2016) used BP neural network simulation to obtain the credit rating of individual borrowers on P2P lending platforms, and the simulation was carried out in the absence of data. Compared with the website rating, the simulation results were more accurate, and the credit risk of individual borrowers could be effectively evaluated. On the basis of the analysis, some suggestions and countermeasures for the network platform were given. Bekhet and Eletter (2014) proposed two credit scoring models using data mining technology to support the loan decisions of commercial banks in Jordan. Such loan application assessment improves the effectiveness of loan decisions, controls the tasks of loan officers, and saves analysis time and cost. Accepted and rejected loan applications from different commercial banks in Jordan were used to build the credit scoring models. The results showed that the logistic regression model was superior to the radial basis function (RBF) model in terms of overall accuracy. However, the RBF model was better at identifying those who may default.
Yang (2014) first proposed an improved quantization method, IDM, based on statistical independence. Then data mining techniques, namely the C4.5 decision tree, naive Bayes and SVM classifiers, were used to classify and predict the quantized credit data, and the impact of quantization methods on the classification of credit approval data was studied. The experimental results showed that this method significantly improved the average classification accuracy compared with other known quantization methods. This showed that the proposed method could effectively explain and illustrate the design ability of a new type of intelligent aid credit approval data system. Zhang, Hu, and Zhang (2015) established a credit risk assessment index system from a supply chain perspective that considers the credit status of the enterprise and the relationships within the supply chain. In addition, a credit risk assessment model based on the support vector machine (SVM) and an implementation based on the BP neural network were developed. The index system, which includes the credit status of the leading enterprises in the supply chain and the cooperative relationship between small and medium-sized enterprises (SMEs) and the leading enterprises, could help banks predict SME defaults more accurately. As a result, more SMEs can obtain loans from banks through supply chain finance (SCF). Fatemi and Fooladi (2014) believed that the SCF credit risk assessment model based on the support vector machine (SVM) had good generalization ability and robustness, and was more effective than the BP neural network evaluation model. Therefore, applying the support vector machine model can improve the accuracy of credit risk assessment for small and medium-sized enterprises, thus alleviating the problem of credit rationing for these enterprises.
We use the financial data of 46 unlisted SMEs in the Yangtze River Delta region to formulate indexes, and make an in-depth comparative study of four kinds of neural network models and a decision tree method used for risk assessment. The credit risk assessment model in this study can provide tools and technical support for effective early warning of bank credit risk, and a scientific and reasonable quantitative basis for loan approval. The risk management level and comprehensive competitiveness of banks can therefore be improved. At the same time, it can also play a role in promoting the development of enterprises.
3. Methodology
Back propagation (BP) is the most widely applied neural network structure, mainly because it can learn very complex mappings. The BP neural network uses a supervised learning model and a back-propagating network structure, as shown in Fig. 1.
The topology consists of the input layer, the hidden layer, and the output layer. BP describes the relationship between each layer's input and output with a differentiable activation function, and the S-shaped (sigmoid) function is commonly used. The input unit receives an external input sample x, which is adjusted by the weight coefficients w of the network in the training unit, and the result is then emitted by the output unit. In this process, the desired output signal serves as a teacher signal, and the error obtained by comparing the teacher signal with the actual output controls the modification of the weight coefficients w. The input sample signal acts through the weight coefficient and produces the output results in X.
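The training loop described above can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the class name TinyBP, the squared-error loss, the learning rate and the weight initialization range are all assumptions chosen for illustration. It only shows a single hidden layer with sigmoid units whose weights are corrected by the error between the teacher signal and the actual output.

```python
import math
import random


def sigmoid(z):
    """S-shaped (logistic) activation; its derivative is s * (1 - s)."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyBP:
    """Hypothetical one-hidden-layer network with a single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in every row acts as the bias (threshold) term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        """Return the hidden activations and the network output for sample x."""
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return hidden, y

    def train_step(self, x, target, lr=0.5):
        """One weight update; `target` plays the role of the teacher signal."""
        hidden, y = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Output error scaled by the sigmoid derivative.
        delta_out = (y - target) * y * (1.0 - y)
        # Propagate the error backward to each hidden unit (before updating w_out).
        delta_hidden = [delta_out * self.w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        for j in range(len(hb)):
            self.w_out[j] -= lr * delta_out * hb[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(xb)):
                row[i] -= lr * delta_hidden[j] * xb[i]
        return 0.5 * (y - target) ** 2


def mean_loss(net, samples):
    """Average squared-error loss over (input, target) pairs."""
    return sum(0.5 * (net.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)
```

Calling train_step repeatedly over a set of (x, target) pairs should lower mean_loss; the number of hidden neurons passed to the constructor is the parameter tuned in Section 4.3.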
Table 2
Credit risk assessment confusion matrix.
                              Actual condition
Test result                   Positive (non-default)    Negative (default)
Positive (non-default)        True positive (TP)        False positive (FP)
Negative (default)            False negative (FN)       True negative (TN)
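The error rates used later in the paper can be read off the four cells of Table 2. The sketch below is one possible reading, not a formula stated by the authors. It follows the paper's definitions: a first type of error is a good (non-default) applicant judged bad, which is the FN cell, and a second type of error is a bad (default) applicant judged good, which is the FP cell. Because the two error columns of Table 3 sum to the total error rate, the sketch assumes that both rates are expressed as a share of all test samples. The function name and the example counts are hypothetical.

```python
def error_rates(tp, fp, fn, tn):
    """Total, first-type and second-type error rates as shares of all samples.

    Positive means non-default (good applicant), as in Table 2.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    type_1 = fn / total  # good applicant wrongly rejected
    type_2 = fp / total  # bad applicant wrongly accepted
    return {"total": type_1 + type_2, "type_1": type_1, "type_2": type_2}
```

For a hypothetical test set of 16 enterprises with tp=9, fp=2, fn=1 and tn=4, the function reports one first-type and two second-type errors out of 16 samples.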
dimensional graph, which plots the rate of classifying bad applicants as bad applicants (the ordinate, known as "sensitivity") against the rate of wrongly judging good applicants as bad applicants (the abscissa, known as "1 - specificity"). The ROC curve is a measure of the overall performance of the model under different threshold values. In fact, sensitivity equals 1 minus the second type of error, and specificity equals 1 minus the first type of error. The AUC (Area Under Curve) value is an important index of the ROC curve: it is the area between the ROC curve and the abscissa. The larger the AUC value, the better the credit risk assessment model; the maximum value is 1. When the ROC curve coincides with the 45-degree line, the AUC equals 0.5 and the corresponding credit risk assessment model has no discriminating ability. Evaluating the predictive ability of a model with the AUC value is more comprehensive and objective, and the conclusion is more reliable when comparing the predictive ability of various models.
4. Results and discussion
4.1. Credit risk evaluation
In the language of microeconomics, a bank is the most suitable (Pareto optimal) organization for bringing together individual participants. It performs three main intermediary functions: liquidity intermediation, risk intermediation and information intermediation. The significance of a bank is that it can provide these intermediary services well and fill gaps in the financial market. Bank risk has become one of the most important research topics in the financial field, especially in the banking industry. Among bank risks, credit risk is the one that threatens the survival of the bank. It is the main cause of bankruptcy and the most obvious risk in the management of the banking industry.
It is necessary to reduce the credit risk faced by banks. To reduce the adverse effects of credit risk, banks must evaluate customers' ability to meet their repayment obligations under the agreements signed by both sides, and thereby evaluate the probability of default. Both qualitative tools and quantitative methods are needed to assess default risk. Credit rating is one of the most familiar forms of qualitative measurement. It is carried out by rating agencies, which protect the interests of investors active in the bond market and supervise the debt sector. The goal of a rating agency is to issue an independent credit opinion based on a set of precise standards.
At present, banks are making increasing efforts to replicate the rating process of the rating agencies in order to rate their large customers. However, it is impossible for banks to assign an analyst to each of the large number of small-scale risky loans on their balance sheets. For retail and small and medium-sized loans, banks need to assess borrowers' creditworthiness with statistical methods, so as to distinguish "good borrowers" from "bad borrowers" automatically. This statistical approach is called credit risk evaluation.
4.2. Credit rating
Credit ratings can be divided into two types, one for debt or financial issues and the other for bond issuers. The first is the most common and is often called a "bond rating" or "credit rating". It indicates the likelihood that an investor will obtain the expected returns from an issued bond. The latter is an assessment of the financial obligations of the bond issuer, which conveys information about the issuer's basic creditworthiness. The assessment focuses on the ability and willingness of the issuer to fulfil these obligations on time. The results can be called "the credit rating of the counterparty", the "default rating" or "the issuer's credit rating". Both types of rating are very important in the investment world.
Companies obtain a credit rating for a particular bond or debt issue by contacting a professional rating agency. The documents that enterprises usually need to submit include annual reports for recent years, recent quarterly reports, income statements, balance sheets, recent debt issues, and other specific information and statistical reports. The rating agency then has analysts carry out a basic analysis of the information submitted by the enterprise. After the analysis is completed, the analyst submits an analysis report to the rating committee together with a rating recommendation. The rating committee discusses the report with the analysts after reviewing it. Finally, the rating agency gives the final result and is responsible for it.
It is generally believed that credit ratings involve highly subjective qualitative and quantitative factors, as well as variables at the industry level and the market level. Rating agencies and some researchers have stressed the importance of subjective judgment in bond ratings and in some statistical and artificial intelligence models. However, in the following part, we will explain that some credit rating prediction models based on statistics and artificial intelligence can achieve very good prediction results and capture important characteristics of the bond rating process.
4.3. Determination of the parameters of BP neural network
Fig. 2 describes the influence of the number of hidden layer neurons on the performance of the BP neural network. To reduce the impact of the choice of training and test sets and of the initial weights and thresholds on the results, the evaluation index is the mean error rate over 50 runs of the algorithm (30 training samples and 16 test samples are randomly selected each time). The error loss estimate distinguishes the two different losses incurred by accepting bad loan applicants and by rejecting good loan applicants. The evaluation index is also based on the confusion matrix. For banks, misjudging bad applicants as good applicants leads to greater losses. The losses from the first type of error (good applicants wrongly judged as bad applicants) and the second type of error (bad applicants misjudged as good applicants) are significantly different, and the loss brought by the second type of error is much greater than that brought by the first.
It can be seen from the figure that when the number of neurons in the hidden layer is 7, the total error rate on the test set and the second type of error are the smallest, at 0.248 and 0.128, respectively. Therefore, in the subsequent experiments, the number of neurons in the hidden layer of the BP neural network is set to 7.
4.4. Determination of the parameters of RBF neural network
In the RBF neural network, the number of neurons in the hidden layer is the same as the number of samples in the training set, and the weights and thresholds are given directly by solving linear equations. Generally speaking, the performance of the RBF neural network is greatly influenced by the spread (expansion speed) of the radial basis function.
Fig. 3 shows the effects of different spread values on the performance of the RBF neural network. As before, to reduce the influence of the choice of training and test sets and of the initial weights and thresholds on the result, the evaluation index is the mean error rate over 10 runs of the program. It is found that the spread value has little effect on the performance of the RBF neural network. When the spread is 0.5, the network performance is slightly better. Therefore, in the subsequent experiments, the spread value of the RBF neural network is set to 0.5.
4.5. Determination of GRNN and PNN parameters
The generalized regression neural network (GRNN) consists of the input layer, the pattern layer, the summation layer and the output layer. Compared with the BP neural network, GRNN has the following advantages: the network is trained in a single pass and needs no iteration; the number of hidden neurons is determined adaptively by the training samples; the weights between the layers are determined solely by the training samples, which avoids the iterative weight revision of the BP neural network; and the activation function of the hidden layer nodes is the Gaussian function, which responds locally to the characteristics of the input, so that inputs close to a neuron's local characteristics elicit a strong response.
The probabilistic neural network (PNN) is a kind of feed-forward neural network proposed by Specht in 1989. He adopted the Gaussian function proposed by Parzen to form an estimate of the joint probability distribution together with the Bayesian optimization rule, which yields a neural network that performs probability density estimation with parallel processing. Therefore, PNN not only has the characteristics of general neural networks, but also has good generalization ability and fast learning ability.
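Because GRNN and PNN both rest on Gaussian pattern neurons centred on the training samples, the two can be sketched side by side. This is an illustration under stated assumptions rather than the toolbox implementation used in the experiments: the kernel is written as exp(-d^2 / (2 * spread^2)), which may be parameterized differently from the MATLAB neural network toolbox, and the PNN sketch assumes equal prior probabilities by averaging the kernel responses within each class. All function names are hypothetical.

```python
import math


def gaussian_kernel(x, center, spread):
    """Gaussian response of one pattern neuron centred on a training sample."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * spread ** 2))


def grnn_predict(x, train_x, train_y, spread):
    """GRNN output: kernel-weighted average of the training targets."""
    weights = [gaussian_kernel(x, c, spread) for c in train_x]
    denom = sum(weights)
    if denom == 0.0:
        # Every kernel underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / denom


def pnn_predict(x, train_x, train_labels, spread):
    """PNN output: the class with the largest average kernel response."""
    sums, counts = {}, {}
    for c, label in zip(train_x, train_labels):
        sums[label] = sums.get(label, 0.0) + gaussian_kernel(x, c, spread)
        counts[label] = counts.get(label, 0) + 1
    # Competitive output: only the best-scoring class is returned.
    return max(sums, key=lambda label: sums[label] / counts[label])
```

Neither function has a training loop, which reflects the single-pass property noted above: the training samples themselves act as the hidden layer, and spread is the only parameter left to tune.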
[Fig. 2. Influence of the number of hidden layer neurons on the BP neural network. Horizontal axis: the number of neurons in the hidden layer (2 to 11); vertical axis: error rate; curves: total error rate, first type of error, second type of error.]
[Fig. 3. Effect of SPREAD value on the performance of RBF neural network. Horizontal axis: the value of spread (0.0 to 1.0); vertical axis: total error rate.]
The structure of PNN is similar to that of GRNN and consists of the input layer, the hidden layer and the output layer. Unlike GRNN, the output layer of PNN uses a competitive output instead of a linear output. Each neuron estimates the probability of a different class using only the Parzen method, and the competition layer receives these class responses as input. Finally, only one neuron wins the competition, and the winning neuron represents the class of the input pattern. The learning algorithm of PNN is close to that of GRNN, with only a slight difference in the output layer.
Similarly, as shown in Figs. 4 and 5, to determine the optimal spread value for GRNN and PNN, the experiment compares the mean error rate over 10 runs of the program. The optimal spread value is 0.7 for GRNN and 0.5 for PNN.
[Fig. 5. Impact of SPREAD value on the performance of PNN neural network. Horizontal axis: the value of spread (0.0 to 1.1); vertical axis: error rate; curves: total error rate, first type of error, second type of error.]
4.6. Comparison of the error rate of different models
To compare the effectiveness of different neural networks on the SME credit risk evaluation problem, the experiment uses the financial and default data of 46 small and medium-sized enterprises in the Yangtze River Delta region. The BP neural network is programmed on the MATLAB platform. We use the neural network toolbox to implement the RBF neural network and PNN, and we also implement the ID3 decision tree algorithm. Assume that there is a set of loan applicants in the database, each of whom belongs to one of two groups, "good credit" or "bad credit". The task of credit risk assessment is to find a classification model that can distinguish good credit samples from bad credit samples. A decision tree consists of a series of Boolean splits of the data. The algorithm begins with a root node that contains both good credit and bad credit samples. Next, the algorithm recursively searches for the best split and divides the data into leaf nodes and internal nodes. The attributes in the ID3 algorithm are discrete, so continuous attributes must be discretized. The experiment compares the error rate of each algorithm on the data set (the mean over 10 runs of each algorithm), and the results are shown in Table 3.
Table 3
Error rate of different models.
Model   Total error rate   First type of errors   Second type of errors
BP      0.29               0.13                   0.16
RBF     0.47               0.26                   0.21
PNN     0.17               0.07                   0.10
GRNN    0.25               0.07                   0.18
ID3     0.31               0.15                   0.16
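The search for the best split in ID3, described in Section 4.6, is conventionally done by choosing the discrete attribute with the highest information gain. The paper does not give its implementation, so the following is only a sketch of that standard criterion; the function names and the attribute names in the usage note are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one discrete attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_split(rows, labels, attributes):
    """Attribute whose split leaves the purest good/bad credit groups."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

With rows given as dictionaries of discretized indicators, for example {"leverage": "high", "size": "small"}, best_split returns the attribute that separates good credit from bad credit samples most cleanly at the current node.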
[Fig. 6. Second type of errors in different models. Models shown: BP, RBF, PNN, GRNN, ID3.]
As shown in Fig. 6, the credit risk assessment model based on the probabilistic neural network (PNN) has the lowest second type of error rate, taken as the mean over 10 runs of each model. In the experiment, we calculate the misclassification rate of the predictions of the five benchmark models on the data set of 46 small and medium-sized enterprises in the Yangtze River Delta region. We first of all compare
Table 4
AUC value and mean value of the running results of different models.
Test    1      2      3      4      5      6      7      8      9      10     AVG
BP      0.75   0.82   0.79   0.67   0.55   0.81   0.75   0.61   0.81   0.71   0.73
RBF     0.72   0.71   0.69   0.33   0.51   0.33   0.68   0.54   0.76   0.64   0.59
GRNN    0.61   0.69   0.73   0.69   0.73   0.55   0.61   0.73   0.64   0.90   0.69
PNN     0.66   0.87   0.88   0.82   1.00   0.82   0.94   0.88   0.68   0.72   0.83
ID3     0.90   0.75   0.70   0.75   0.75   0.93   0.87   0.88   0.87   0.80   0.82
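AUC values such as those in Table 4 can be computed without drawing the ROC curve, because the area under the curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below uses this pairwise formulation. It is an illustration, not the procedure the authors report: the function name is hypothetical, and labels are assumed to be 1 for the class being detected and 0 for the other class.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A model that ranks every positive above every negative reaches the maximum value of 1, while a model that cannot separate the classes scores 0.5, which matches the 45-degree line discussed earlier.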
We can see that the probabilistic neural network (PNN) model has the highest average AUC value and is robust. Table 5 compares the number of times that each model achieves the maximum AUC value in the 10 tests. The PNN model achieves the best result most often (5 times).
5. Conclusion
Based on a data set of Chinese private SMEs, we compare the classification accuracy and applicability of several common neural network models and propose corresponding suggestions for the application of credit risk assessment models. In addition, we verify the error rates of several common credit risk assessment models. The experimental results showed that the probabilistic neural network (PNN) had the lowest total error rate and the lowest second type of error, and that the PNN model had the highest AUC value and was robust.
The purpose is to contribute to solving the financing problem of small and medium-sized enterprises in China. However, because of a variety of factors
Acknowledgements
The authors acknowledge the National Natural Science Foundation of China (Grant: 71663003).
References
Bao, Y. L. (2016). P2P Personal Credit Risk Simulation Model Based on BP Neural Network 5(2), pp. 192–207.
Bekhet, H. A., & Eletter, S. F. K. (2014). Credit risk assessment model for Jordanian commercial banks: Neural scoring approach. Review of Development Finance, 4(1), 20–28.
Chang, Y. C., Chang, K. H., Chu, H. H., & Tong, L. I. (2016). Establishing decision tree-based short-term default credit risk assessment models. Communications in Statistics, 45(23), 6803–6815.
Fatemi, A., & Fooladi, I. (2014). Credit risk management: A survey of practices. Managerial Finance, 32(3), 227–233.
Yang, Z. (2014). Utilization of quantization method on credit risk assessment. Applied Mechanics & Materials, 472(6), 432–436.
Zhang, L., Hu, H., & Zhang, D. (2015). A credit risk assessment model based on SVM for small and medium enterprises in supply chain finance. Financial Innovation, 1(1), 14.
Zhang, X., Zhao, X., & Wu, N. (2017). Credit risk assessment model for cross-border e-commerce in a BP neural network based on PSO-GA. Agro Food Industry Hi Tech, 28(1), 411–414.