Mathematical Theory and Modeling
ISSN 2224-5804 (Paper) ISSN 2225-0522 (Online)
Vol.3, No.11, 2013

www.iiste.org

An Application of Artificial Intelligent Neural Network and
Discriminant Analyses On Credit Scoring
Alabi, M.A.1, Issa, S.2, Afolayan, R.B.3
1. Department of Mathematics and Statistics, Akanu Ibiam Federal Polytechnic Unwana, Afikpo, Ebonyi, Nigeria.
2. Usmanu Danfodiyo University, Sokoto, Nigeria.
3. Department of Statistics, University of Ilorin, Nigeria.
* E-mail: alabi_ma@yahoo.com

Abstract
This paper deals with credit scoring in the banking system and compares two of the most commonly used statistical predictive models for credit scoring: the Artificial Intelligent Neural Network (ANN) and Discriminant Analysis. The classification outcomes of this research show that the Neural Network compares well with the Linear Discriminant model and gives slightly better results. It is noteworthy that a "Bad accepted" generates much higher costs than a "Good rejected", and the Neural Network produces fewer "Bad accepted" cases than the Discriminant predictive model; it therefore achieves a lower misclassification cost for the data set used in this research. Furthermore, in the final section of this research, an optimization algorithm (a Genetic Algorithm) is proposed in order to obtain better classification accuracy through the configuration of the neural network architecture. Finally, it is important to note that the success of a predictive model largely depends on the selection of the predictor variables used as inputs to the model.
Keywords: Scoring, Artificial Neural Network, Discriminant Analysis.
1.0 Introduction
The objective of credit scoring models is to help banks identify credit applicants who are likely to meet their obligations, based on attributes such as age, credit limit, income and marital status. Many different credit
scoring models have been developed by the banks and researchers in order to solve the classification problems,
such as linear discriminant analysis (LDA), logistic regression (LR), multivariate adaptive regression splines
(MARS), classification and regression tree (CART), case based reasoning (CBR), and artificial neural networks
(ANNs).
LDA and ANNs are generally used as methods to construct credit scoring models. LDA is the earliest one used
for credit scoring models. However, the use of LDA has often been criticized because it assumes a linear relationship between the input and output variables, which seldom holds, and because it is sensitive to deviations from the multivariate normality assumption (West, 2000).
1.1 Aim and Objectives
The aim is to compare the predictive ability of Artificial Neural network with the linear discriminant models for
credit scoring. This aim can be achieved through the following objectives:
1) To build a linear discriminant model capable of predicting an applicant's credit status.
2) To build an artificial neural network model capable of identifying an applicant's credit status.
3) To compare and contrast the predictive powers of the Artificial Neural Network and Linear Discriminant models.
1.2 Predictor Variables Selection
Credit scoring is performed through “Credit Risk Assessment” which has mainly three purposes (Colquitt, 2007).
First and most importantly, it assesses the borrower's probability of repaying the debt by appraising his or her income, character, capacity and capital adequacy. In addition, it tries to identify the borrower's primary source of repayment, especially in the case of extended debt. Finally, it evaluates the borrower's secondary source of repayment in case the primary source becomes unavailable.
Although credit risk assessment is one of the most successful applications of applied statistics, even the best statistical models do not guarantee credit scoring success; success also depends on experienced risk management practices, the way models are developed and applied, and proper use of management information systems (Mays, 1998). At the same time, selection of the independent variables is essential in the model development phase, because they determine the attributes that decide the value of the credit score (see Equation 1); the values of the independent variables are normally collected from the application form. It is therefore very important to identify which variables will be selected and included in the final scoring model.

Y = β0 + β1x1 + β2x2 + ... + βnxn        (1)

where Y is the dependent variable and the xi's are the independent variables.
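As a minimal illustration of Equation (1), the sketch below computes a linear credit score as a weighted sum of applicant attributes; the coefficient values and attribute names are hypothetical placeholders, not estimates from this study.

```python
# Minimal sketch of Equation (1): a linear credit score as a weighted sum.
# The coefficients and attribute values below are hypothetical placeholders.

def linear_credit_score(betas, x, intercept=0.0):
    """Return Y = beta0 + beta1*x1 + ... + betan*xn."""
    return intercept + sum(b * xi for b, xi in zip(betas, x))

# Hypothetical applicant: [age, length_of_service, salary_in_thousands]
applicant = [45, 20, 150]
coefficients = [0.05, 0.10, 0.002]   # assumed, for illustration only
score = linear_credit_score(coefficients, applicant, intercept=-3.0)
print(f"Credit score: {score:.3f}")
```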
Selection of the predictor variables should not be based on statistical analysis alone; other considerations also apply. For example, variables that are expensive or time-consuming to obtain, such as "Debt Burden", should be excluded. Moreover, variables that are influenced by the organization itself, such as "Utilization of Advanced Cash", should also be excluded (Mays, 1998).
2.0 Material and Methods
The data for this study were collected from a sample of 200 credit applicants, extracted from the application forms of First Bank of Nigeria Plc, Kontagora. The methods used to assess creditworthiness are Linear Discriminant Analysis and the Artificial Neural Network.
2.1 Linear Discriminant Analysis
LDA was first proposed by Fisher (1936) as a classification technique. It has been reported so far as the most
commonly used technique in handling classification problems (Lee et al., 1999). In the simplest type of LDA,
two-group LDA, a linear discriminant function (LDF) that passes through the centroids (geometric centres) of
the two groups can be used to discriminate between the two groups. The LDF is represented by Equation (2):
D = a + b1x1 + b2x2 + ... + bpxp        (2)

where a is a constant and b1 to bp are the coefficients for the p predictor variables. LDA can also be applied in
other areas, such as business investment, bankruptcy prediction and market segmentation (Lee et al., 1997; Kim et al., 2000).
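As an illustrative sketch (not the SPSS procedure used in this study), a two-group LDA of the kind described above can be fitted with scikit-learn; the feature matrix X and labels y below are synthetic stand-ins for the credit dataset.

```python
# Minimal two-group LDA sketch with scikit-learn (illustrative only;
# the study itself used SPSS). X and y are synthetic stand-ins.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# 200 synthetic applicants with 3 numeric attributes (e.g. age, service, salary)
X_good = rng.normal(loc=[45, 20, 1.1], scale=[7, 8, 0.4], size=(163, 3))
X_bad = rng.normal(loc=[54, 30, 1.8], scale=[7, 9, 0.7], size=(37, 3))
X = np.vstack([X_good, X_bad])
y = np.array([1] * 163 + [2] * 37)        # 1 = creditworthy, 2 = non-creditworthy

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print("Coefficients (b1..bp):", lda.coef_)   # analogous to Equation (2)
print("Constant (a):", lda.intercept_)
print("Training accuracy:", lda.score(X, y))
```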
2.2 Neural Network
A neural network (NNW) is a mathematical representation inspired by the human brain and its ability to adapt on
the basis of the inflow of new information. Mathematically, NNW is a non-linear optimization tool. Many
different types of NNW have been described in the literature.
The NNW design called multilayer perceptron (MLP) is especially suitable for classification and is widely used
in practice. The network consists of one input layer, one or more hidden layers and one output layer, each
consisting of several neurons. Each neuron processes its inputs and generates one output value that is transmitted
to the neurons in the subsequent layer. Each neuron in the input layer (indexed i = 1,…,n) delivers the value of
one predictor (or the characteristics) from vector x. When considering default/non-default discrimination, one
output neuron is satisfactory. In each layer, the signal propagation is accomplished as follows. First, a weighted
sum of inputs is calculated at each neuron: the output value of each neuron in the preceding network layer times
the respective weight of the connection with that neuron. There are two stages of optimization. First, weights
have to be initialized, and second, a nonlinear optimization scheme is implemented. In the first stage, the weights
are usually initialized with some small random number. The second stage is called learning or training of NNW.
The most popular algorithm for training multilayer perceptrons is the back-propagation algorithm. As the name
suggests, the error computed from the output layer is back-propagated through the network, and the weights are
modified according to their contribution to the error function. Essentially, back-propagation performs a local
gradient search; hence, although it is not computationally demanding, it does not guarantee
reaching a global minimum. For each individual, weights are modified in such a way that the error computed
from the output layer is minimized. The BP network used herein has an input layer, an intermediate hidden layer,
and an output layer. The BP-based credit scoring method is succinctly illustrated in Fig. 1. The input nodes
represent the applicant’s characteristics, and output node represents the identified class (say 1 for rejected and 2
for accepted). The BP learning involves three stages: the feed-forward of the input training pattern, the
calculation of the associated error, and the adjustment of the weights. After the network reaches a satisfactory
level of performance, it learns the relationships between independent variables (applicant’s attributes) and
dependent variable (credit class). The trained BP network can then be adopted as a scoring model to classify the
credit as either good or bad by inserting the values of applicant’s attributes.
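To make the feed-forward and weight-update steps concrete, the following sketch implements a one-hidden-layer perceptron trained with back-propagation on synthetic stand-in data. It illustrates the algorithm described above rather than the SPSS procedure used in the study, and all data, sizes and learning settings are assumptions.

```python
# Minimal one-hidden-layer MLP trained with back-propagation (illustrative
# sketch only; the study used SPSS's Multilayer Perceptron procedure).
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in data: 200 applicants, 4 standardized attributes, binary label.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)

n_in, n_hidden, n_out = 4, 6, 1
# Stage 1: initialize weights with small random numbers.
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out)); b2 = np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for epoch in range(500):                      # Stage 2: learning (training)
    # Feed-forward: weighted sums, then activations, layer by layer.
    h = np.tanh(X @ W1 + b1)                  # hidden layer activations
    out = sigmoid(h @ W2 + b2)                # output layer activation
    # Back-propagate the output error and adjust each weight according to
    # its contribution to the (squared) error.
    err = out - y
    grad_out = err * out * (1 - out)          # delta at the output layer
    grad_h = (grad_out @ W2.T) * (1 - h**2)   # delta at the hidden layer
    W2 -= lr * h.T @ grad_out / len(X); b2 -= lr * grad_out.mean(axis=0)
    W1 -= lr * X.T @ grad_h / len(X);  b1 -= lr * grad_h.mean(axis=0)

accuracy = ((out > 0.5) == y).mean()
print(f"Training accuracy after back-propagation: {accuracy:.3f}")
```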

2.3 Wilks' Lambda Test for Significance of the Canonical Correlation
Hypotheses for the canonical correlation:

H0: There is no linear relationship between the two sets of variables.
H1: There is a linear relationship between the two sets of variables.

Test statistic:

λ = W / (W + H)

where W is the residual (within-groups) variance, H is the variance due to the linear relationship, and W + H is the total variance.

Decision rule: Reject H0 if P < 0.05; otherwise accept H0 at the 5% level of significance.
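For illustration, the sketch below computes λ = W/(W + H) for a single discriminant score and two groups, together with Bartlett's chi-square approximation for its significance. The scores are synthetic and only the group sizes (163 and 37) mirror the dataset, so the output is not the study's result.

```python
# Illustrative computation of Wilks' lambda = W / (W + H) for a single
# discriminant score and two groups (synthetic data; not the study's output).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
score_good = rng.normal(loc=0.4, scale=1.0, size=163)   # hypothetical scores
score_bad = rng.normal(loc=-1.6, scale=1.0, size=37)
scores = np.concatenate([score_good, score_bad])
groups = np.array([0] * 163 + [1] * 37)

grand_mean = scores.mean()
W = sum(((scores[groups == g] - scores[groups == g].mean()) ** 2).sum()
        for g in (0, 1))                                  # residual (within) variation
H = sum(len(scores[groups == g]) * (scores[groups == g].mean() - grand_mean) ** 2
        for g in (0, 1))                                  # variation due to the relationship
wilks_lambda = W / (W + H)

# Bartlett's chi-square approximation (p predictors, g groups); here p = 1, g = 2.
n, p, g = len(scores), 1, 2
chi2 = -(n - 1 - (p + g) / 2) * np.log(wilks_lambda)
df = p * (g - 1)
p_value = stats.chi2.sf(chi2, df)
print(f"Wilks' lambda = {wilks_lambda:.3f}, chi-square = {chi2:.2f}, p = {p_value:.4f}")
```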
2.4 Chi-square Test
Hypotheses for the Chi-square test:

H0: The two variables are independent.
H1: The two variables are not independent.

Test statistic:

χ² = Σ (i=1..r) Σ (j=1..c) (Oij − eij)² / eij

where Oij is the observed frequency and eij is the expected frequency in cell (i, j).

Decision rule: Reject H0 if P < 0.05; otherwise accept H0 at the 5% level of significance.
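For illustration, a chi-square test of independence between two categorical applicant attributes can be carried out as sketched below; the contingency table is hypothetical and is not taken from the study's data.

```python
# Chi-square test of independence on a hypothetical contingency table
# (e.g. employment status vs. credit status); illustrative only.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[90, 10],    # rows: employment categories (assumed)
                     [73, 27]])   # columns: creditworthy / non-creditworthy
chi2, p_value, df, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
print("Expected frequencies:\n", expected)
# Decision rule: reject H0 (independence) if p < 0.05.
```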

3.0 Data Collection and Analysis
The dataset contains 200 cases; 163 applicants are considered "Creditworthy" and the remaining 37 are treated as "Non-creditworthy". Data preparation allows unusual cases, invalid cases, erroneous variables and incorrect data values in the dataset to be identified.
A real-world credit dataset is used in this research, extracted from the application forms of First Bank of Nigeria Plc, Kontagora; it is referred to as the "Credit Dataset". After preparation, the dataset is used in the subsequent sections for the analyses with the Neural Network and Discriminant Analysis.

Table 3.1: Credit Dataset Description
No.  Variable     Type             Scale    Description
1    Attribute1   Input Variable   Scale    Age of the Applicant
2    Attribute2   Input Variable   Nominal  Sex of the Applicant
3    Attribute3   Input Variable   Nominal  Ownership of residence
4    Attribute4   Input Variable   Nominal  Marital status
5    Attribute5   Input Variable   Nominal  Qualification
6    Attribute6   Input Variable   Nominal  Employment status
7    Attribute7   Input Variable   Nominal  Employment classification
8    Attribute8   Input Variable   Scale    Length of service
9    Attribute9   Input Variable   Scale    Salary
10   Attribute10  Input Variable   Nominal  Application Request
11   Attribute11  Input Variable   Scale    Amount Request
12   Attribute12  Input Variable   Scale    Credit Amount
13   Attribute13  Input Variable   Scale    Proposed tenor in months
14   Attribute14  Input Variable   Nominal  Other borrowing
15   Attribute15  Output Variable  Nominal  Status of the Credit Applicant
The dataset holds 15 variables altogether; 9 of the variables are categorical and the remaining 6 are numerical. Moreover, there are 14 independent variables (input variables) and 1 dependent variable (output variable) in the dataset.
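As an illustrative sketch, the dataset described above could be loaded and screened with pandas as shown below; the file name "credit_dataset.csv" and the column names are assumptions, since the study worked directly from SPSS.

```python
# Illustrative sketch of loading and summarizing the credit dataset with pandas.
# The file name "credit_dataset.csv" and column names are assumptions.
import pandas as pd

df = pd.read_csv("credit_dataset.csv")          # hypothetical export of the 200 cases
print(df.shape)                                 # expected: (200, 15)
print(df["Attribute15"].value_counts())         # e.g. Good: 163, Bad: 37
print(df.dtypes)                                # 9 categorical and 6 numerical variables
print(df.describe(include="all").T.head())      # quick screen for unusual values
```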
3.1 Measurement of Model Performance in Discriminant Analysis
Discriminant analysis is a statistical technique to classify the target population (in this study, credit card
applicants) into the specific categories or groups (here, either creditworthy applicant or non-creditworthy
applicant) based on certain attributes (predictor variables or independent variables) (Plewa and Friedlob
1995). Discriminant analysis requires fulfilling definite assumptions, for example, assumption of normality,
assumption of linearity, assumption of homoscedasticity, absence of multicollinearity and outlier, but this
method is fairly robust to the violation of these assumptions (Meyers, Gamst et al. 2005). Here, in this study, it is
assumed that all required assumptions are fulfilled to use the predictive power of the discriminant analysis for
classification of the applicants. At this point, “creditworthiness” is the dependent variable (or, grouping variable)
and the remaining 14 variables are the independent variables (or input variables).
A discriminant model is identified as "useful" if it achieves at least a 25% improvement over the by-chance accuracy rate. "By-chance accuracy" means that, even if there is no relationship between the dependent variable and the independent variables, it is still possible to achieve some percentage of correct group membership. Here, the by-chance accuracy rate is 70% (0.815² + 0.185² ≈ 0.698), a 25% increase of this value equals 87.5% (1.25 × 70%), and the cross-validated accuracy rate is 88.5%. Since the cross-validated accuracy rate is greater than or equal to the proportional by-chance accuracy rate, it is possible to declare that the discriminant model is useful for the classification goal. Moreover, Wilks' lambda is a measure of the usefulness of the model; a smaller significance value indicates that the discriminant function does better than chance at separating the groups.
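A quick check of the proportional by-chance accuracy criterion used above is sketched below; the group proportions (163/200 and 37/200) come from the dataset and the 1.25 multiplier follows the rule of thumb cited in the text.

```python
# Proportional by-chance accuracy criterion for the two-group problem.
p_good, p_bad = 163 / 200, 37 / 200
by_chance = p_good**2 + p_bad**2            # ≈ 0.698, i.e. about 70%
threshold = 1.25 * by_chance                # 25% improvement criterion ≈ 0.873
cross_validated_accuracy = 0.885            # reported in Table 3.9
print(f"by-chance = {by_chance:.3f}, threshold = {threshold:.3f}")
print("Model useful?", cross_validated_accuracy >= threshold)
```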
Table 3.2: SPSS Output: Model Test, Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .501            131.995      14   .000

Here, the Wilks' lambda test has a reported significance of .000, which is less than the 0.05 level of significance; this means that the predictors (i.e. independent variables such as age of applicant, sex, salary, length of service, etc.) significantly discriminate between the groups. The lambda value itself gives the proportion of total variability not explained by the model, i.e. 50.1% unexplained.
Table 3.3: Canonical Correlation
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          0.996(a)     100.0           100.0          .706
a. First 1 canonical discriminant functions were used in the analysis.
A canonical correlation of 0.706 suggests that the model explains 49.84% (i.e. 0.706²) of the variation in the grouping variable, i.e. whether or not an applicant is creditworthy.
3.2 Importance of Independent Variables in Discriminant Analysis
One of the most important tasks is to identify the independent variables that are important in the predictive
model development.
Table 3.4: Discriminant Model (Functions)
Variable                    Structure matrix   Standardized discriminant coefficients   Canonical discriminant coefficients
Age                          .473               .434                                     .062
Sex                         -.041              -.025                                    -.055
Ownership                   -.058              -.033                                    -.067
Marital Status              -.095              -.131                                    -.173
Qualification                .051               .015                                     .011
Employment Status           -.010               .012                                     .029
Employment Classification   -.119              -.267                                    -.320
Length of Service            .467               .158                                     .020
Salary                      -.148              -.412                                     .000
Application Request         -.061              -.163                                    -.380
Amount Request               .040              1.253                                     .000
Credit Amount               -.153             -1.031                                     .000
Proposed Tenure             -.092              -.137                                    -.023
Other Borrowing              .554               .688                                    1.494
(Constant)                                                                              -3.173
It can be identified from the structure matrix that the predictor variables strongly associated with the discriminant function are "Age of the Applicant", "Length of Service" and "Other Borrowing". The structure matrix shows the correlation of each variable with the discriminant function. These Pearson coefficients are structure coefficients, or discriminant loadings, and they serve like factor loadings in factor analysis; generally, just as for factor loadings, 0.30 is seen as the cut-off between important and less important variables. The interpretation of the standardized discriminant function coefficients is as in multiple regression: they provide an index of the importance of each predictor, just as standardized regression coefficients do, and the sign indicates the direction of the relationship. By this measure, Other Borrowing is the strongest predictor, followed by Age of applicant, while Length of Service is the least important of the three. The canonical discriminant function coefficients (unstandardized coefficients) are used to create the discriminant function (equation), which operates just like a regression equation.
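As an illustrative sketch, the unstandardized (canonical) coefficients from Table 3.4 can be applied to a new applicant as below; the applicant's attribute values and the numeric codings of the categorical attributes are assumptions made for the example, and coefficients reported as .000 in the table are retained as zero.

```python
# Applying the discriminant function built from the canonical (unstandardized)
# coefficients in Table 3.4. The applicant values and category codings below
# are hypothetical; coefficients printed as .000 by SPSS are kept as 0.0.
coefficients = {
    "Age": 0.062, "Sex": -0.055, "Ownership": -0.067, "Marital Status": -0.173,
    "Qualification": 0.011, "Employment Status": 0.029,
    "Employment Classification": -0.320, "Length of Service": 0.020,
    "Salary": 0.000, "Application Request": -0.380, "Amount Request": 0.000,
    "Credit Amount": 0.000, "Proposed Tenure": -0.023, "Other Borrowing": 1.494,
}
constant = -3.173

applicant = {  # hypothetical applicant with assumed category codes
    "Age": 45, "Sex": 1, "Ownership": 1, "Marital Status": 2, "Qualification": 3,
    "Employment Status": 1, "Employment Classification": 2, "Length of Service": 20,
    "Salary": 150000, "Application Request": 1, "Amount Request": 500000,
    "Credit Amount": 400000, "Proposed Tenure": 24, "Other Borrowing": 1,
}
score = constant + sum(coefficients[k] * applicant[k] for k in coefficients)
print(f"Discriminant score D = {score:.3f}")   # compared against the group centroids
```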
There are specific characteristics determined by the discriminant model for the two groups (creditworthy and non-creditworthy applicants). Based on these characteristics, one applicant is classified as "Good" and another as "Bad", and the characteristics differ between the two groups. For example, on average, a good customer is about 45 years of age, has spent about 20 years in service and has no borrowing from a cooperative, employer or other bank. The most important variables, identified in the structure matrix above, are shown below.
Table 3.5: SPSS Output: Group Statistics (Good Group Only)
Status of the Credit Applicant: Good
                                                Valid N (listwise)
Variable             Mean      Std. Deviation   Unweighted   Weighted
Age of applicant     45.9141   6.90267          163          163.000
Length of service    20.1166   7.80186          163          163.000
Other Borrowing      1.1043    0.37865          163          163.000
Characteristics of the Creditworthy Applicants
On the other hand, the bad group holds certain characteristics that contrast with those of the good group. For example, on average, non-creditworthy applicants are about 54 years of age, have spent about 29 years in service and have borrowing from an employer. Only the most important characteristics are shown below.

Table 3.6: SPSS Output: Group Statistics (Bad Group Only)
Status of the Credit Applicant: Bad
                                                Valid N (listwise)
Variable             Mean      Std. Deviation   Unweighted   Weighted
Age of applicant     54.4054   7.51456          37           37.000
Length of service    29.6757   8.85078          37           37.000
Other Borrowing      1.7568    0.72286          37           37.000
Characteristics of the Non-creditworthy Applicant.
3.3 Neural Network Structure (Architecture)
The neural network model is constructed with the multilayer perceptron algorithm. From the architectural point of view it is a 14-10-1 neural network, meaning that there are 14 independent (input) variables, 10 neurons in the hidden layer and 1 dependent (output) variable. SPSS software is used. The SPSS procedure can choose the best architecture automatically, and it builds the network with one hidden layer. It is also possible to specify the minimum (by default 1) and maximum (by default 50) number of units allowed in the hidden layer, and the automatic architecture selection procedure then finds the "best" number of units (6 units were selected for this analysis) in the hidden layer. Automatic architecture selection uses the default activation functions for the hidden layer (hyperbolic tangent) and the output layer (softmax). The predictor variables consist of "factors" and "covariates": factors are the categorical independent variables (8 nominal variables) and covariates are the scale independent variables (6 continuous variables). Moreover, the standardized method is chosen for rescaling the scale covariates, to improve network training. Further, 70% of the data is allocated for training the network (the training sample) and obtaining a model, and 30% is assigned as the testing sample to keep track of the error and to protect against overtraining. Different training methods are available, such as batch, online and mini-batch; here, batch training is chosen because it directly minimizes the total error and is most useful for "smaller" datasets. An optimization algorithm is used to estimate the synaptic weights, and the "scaled conjugate gradient" algorithm is assigned because of the selection of the batch training method, which supports only this algorithm. Additionally, stopping rules determine the stopping criteria for the network training; according to the rule definitions, a step corresponds to an iteration for the batch training method, and here a maximum of one (1) further step is allowed if the error does not decrease. It is important to note that, to replicate the neural network results exactly, the data analyst needs to use the same initialization value for the random number generator, the same data order and the same variable order, in addition to using the same procedure settings.
3.3.1 Measurement of Model Performance in the Neural Network Structure (Architecture)
The following model summary table displays information about the results of the neural network training. Here, the cross entropy error is displayed because the output layer uses the softmax activation function; this is the error function that the network tries to minimize during training. Moreover, the percentage of incorrect predictions is 5.6% in the training sample, so the percentage of correct predictions is about 94.4%, which is quite high. If any dependent variable has a scale measurement level, the average overall relative error (relative to the mean model) is displayed; on the other hand, if the defined dependent variables are categorical, the average percentage of incorrect predictions is displayed.
Table 3.7: SPSS Output: Model Summary
Training   Cross Entropy Error             18.882
           Percent Incorrect Predictions   5.6%
           Stopping Rule Used              1 consecutive step(s) with no decrease in error
           Training Time                   0:00:00.265
Testing    Cross Entropy Error             5.383
           Percent Incorrect Predictions   1.8%
Dependent Variable: applicant Status
a. Error computations are based on the testing sample.
3.3.2 Importance of Independent Variables
The following table presents an analysis that computes the importance and the normalized importance of each predictor in the neural network. The analysis is based on the training and testing samples. The importance of an independent variable is a measure of how much the network's model-predicted value changes for different values of the independent variable. Moreover, the normalized importance is simply each importance value divided by the largest importance value, expressed as a percentage. From the following table, it is evident that "Amount Request" contributes most to the neural network model construction, followed by "Credit Amount", "Annual Estimated Earnings of the Applicant" (salary), "Age in Years of the Applicant", "Other Borrowing", "Length of Service", "Proposed Tenor in Months", etc.
Table 3.8: SPSS Output: Independent Variable Importance
Variable                    Importance   Normalized Importance
Gender of applicant         .018         8.1%
Ownership of residence      .004         2.0%
Marital status              .027         12.6%
Qualification of applicant  .031         14.2%
Employment status           .019         8.9%
Employment classification   .021         9.4%
Application request         .029         13.5%
Other Borrowing             .088         40.4%
Age of applicant            .115         53.1%
Length of service           .079         36.4%
Applicant salary            .128         58.7%
Amount Request              .218         100.0%
Credit amount               .151         69.2%
Proposed tenor in month     .072         33.0%
Important Variables Identified By the Neural Network Model
The above table shows how important each variable is to the way the network classifies prospective applicants. However, it is not possible to identify the direction of the relationship between these variables and the predicted probability of default; this is one of the most prominent limitations of the neural network. Statistical models can help in this situation.
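For comparison, a similar variable-importance analysis can be approximated outside SPSS with permutation importance; the sketch below assumes that the fitted classifier mlp, the scaler and the test split from the earlier MLP sketch are available, so it is illustrative only.

```python
# Permutation-based variable importance, analogous in spirit to SPSS's
# independent-variable importance table. Assumes mlp, scaler, X_test and
# y_test from the earlier MLP sketch are in scope.
import numpy as np
from sklearn.inspection import permutation_importance

result = permutation_importance(mlp, scaler.transform(X_test), y_test,
                                n_repeats=30, random_state=0)
importances = result.importances_mean
normalized = 100 * importances / importances.max()   # normalized importance (%)
for idx in np.argsort(importances)[::-1]:
    print(f"Attribute{idx + 1}: importance={importances[idx]:.3f}, "
          f"normalized={normalized[idx]:.1f}%")
```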
3.4 Comparison of the Models' Predictive Ability
3.4.1 Discriminant Analysis
In the discriminant analysis model development phase, a statistically significant model is derived which possesses very good classification accuracy. The following table shows that the discriminant model is able to classify 152 of the 163 good applicants into the "Good Group", giving 93.3% classification accuracy for the good group. On the other hand, the same discriminant model is able to classify 30 of the 37 bad applicants into the "Bad Group", giving 81.1% classification accuracy for the bad group. Overall, the model achieves 91.0% classification accuracy across the combined groups.
Table 3.9: SPSS Output: Classification Results (Predictive Ability of the Discriminant Model)
                                          Predicted Group Membership
                          Attribute 15    1.00    2.00    Total
Original         Count    1.00            152     11      163
                          2.00            7       30      37
                 %        1.00            93.3    6.7     100.0
                          2.00            18.9    81.1    100.0
Cross-validated  Count    1.00            150     13      163
                          2.00            10      27      37
                 %        1.00            92.0    8.0     100.0
                          2.00            27.0    73.0    100.0
b. 91.0% of original grouped cases correctly classified.
c. 88.5% of cross-validated grouped cases correctly classified.
3.4.2 Artificial Neural Network
In the artificial neural network model development stage, a predictive model is derived which possesses very good classification accuracy. The following table shows that, in the training sample, the neural network model is able to classify 115 of the 116 good applicants into the "Good Group", giving 99.1% classification accuracy for the good group. On the other hand, the same neural network model is able to classify 21 of the 28 bad applicants into the "Bad Group", giving 75.0% classification accuracy for the bad group. Overall, the model achieves 94.4% classification accuracy across both groups. Here, the training sample is used for the comparison, because the statistical model does not use a testing sample.
Table 3.10: SPSS Output: Classification Results (Predictive Ability of the Artificial Neural Network)
                                   Predicted
Sample     Observed          1       2       Percent Correct
Training   1                 115     1       99.1%
           2                 7       21      75.0%
           Overall Percent   84.7%   15.3%   94.4%
Testing    1                 47      0       100.0%
           2                 1       8       88.9%
           Overall Percent   85.7%   14.3%   98.2%
Dependent Variable: applicant Status
4.0 Findings and Conclusions
Appropriate selection of predictor variables is one of the conditions for the successful development of credit scoring models, and this study reviews several considerations regarding the selection of the predictor variables. Moreover, using the Multilayer Perceptron algorithm, a neural network architecture is constructed for predicting the probability that a given customer will default on a loan. The neural network results are compared with those obtained using the commonly used Discriminant Analysis technique, as described in the following table:
Table 4.1: Predictive Models Comparison (Credit Dataset)
Models                  Good Accepted   Good Rejected   Bad Accepted   Bad Rejected   Success Rate
Discriminant Analysis   152             11              30             7              91.0%
Neural Network          115             1               7              21             94.4%
There are two noteworthy points about this table. First, it shows the predictive ability of each model. Columns 2 and 5 ("Good Accepted" and "Bad Rejected") are the applicants that are classified correctly, while columns 3 and 4 ("Good Rejected" and "Bad Accepted") are the applicants that are classified incorrectly. The table shows that the neural network gives slightly better results than discriminant analysis. It should be noted, however, that it is not possible to draw the general conclusion that the neural network has better predictive ability than discriminant analysis, because this study covers only one dataset. On the other hand, statistical models can be used to further explore the nature of the relationship between the dependent variable and each independent variable.
Secondly, Table 4.1 gives an idea of the cost of misclassification, under the assumption that a "Bad Accepted" generates much higher costs than a "Good Rejected": there is a chance of losing the whole amount of the credit when accepting a "Bad", whereas only the interest payments are lost when rejecting a "Good". In this analysis, the neural network acquires fewer "Bad Accepted" cases (7) than discriminant analysis (30), so the neural network achieves a lower cost of misclassification.
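A simple way to make this cost argument explicit is sketched below; the cost values per "Bad Accepted" and per "Good Rejected" are hypothetical placeholders that would need to be set from the bank's actual loss figures, while the error counts are those quoted above and in Table 4.1.

```python
# Illustrative misclassification cost comparison. The per-error costs are
# hypothetical placeholders; the error counts are those quoted in the text.
COST_BAD_ACCEPTED = 10.0    # assumed relative cost of accepting a bad applicant
COST_GOOD_REJECTED = 1.0    # assumed relative cost of rejecting a good applicant

models = {
    "Discriminant Analysis": {"bad_accepted": 30, "good_rejected": 11},
    "Neural Network": {"bad_accepted": 7, "good_rejected": 1},
}
for name, errors in models.items():
    cost = (COST_BAD_ACCEPTED * errors["bad_accepted"]
            + COST_GOOD_REJECTED * errors["good_rejected"])
    print(f"{name}: total misclassification cost = {cost:.1f}")
```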
References
Colquitt, J. (2007). Credit Risk Management. McGraw-Hill.
Fisher, R.A. (1936). The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics, No. 7, p. 179-188.
Kim, J.C., Kim, D.H., Kim, J.J., Ye, J.S., and Lee, H.S. (2000). Segmenting the Korean Housing Market Using Multiple Discriminant Analysis. Construction Management and Economics, p. 45-54.
Lee, T.H., and Jung, S.H. (1999/2000). Forecasting Creditworthiness: Logistic vs Artificial Neural Network. The Journal of Business Forecasting Methods & Systems, 18(4), 28-30.
Mays, E. (1998). Credit Risk Modeling: Design and Application. CRC.
McCulloch, W. and Pitts, W. (1943). A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biology, Vol. 5, No. 4, p. 115-133.
Meyers, L.S., Gamst, G.C. and Guarino, A.J. (2005). Applied Multivariate Research: Design and Interpretation. Sage Publications.
Plewa, F.J. and Friedlob, G.T. (1995). Understanding Cash Flow. Wiley.
West, D. (2000). Neural Network Credit Scoring Models. Computers & Operations Research, No. 27, p. 1131-1152.
This academic article was published by The International Institute for Science,
Technology and Education (IISTE). The IISTE is a pioneer in the Open Access
Publishing service based in the U.S. and Europe. The aim of the institute is
Accelerating Global Knowledge Sharing.
More information about the publisher can be found in the IISTE’s homepage:
http://www.iiste.org
CALL FOR JOURNAL PAPERS
The IISTE is currently hosting more than 30 peer-reviewed academic journals and
collaborating with academic institutions around the world. There’s no deadline for
submission. Prospective authors of IISTE journals can find the submission
instruction on the following page: http://www.iiste.org/journals/
The IISTE
editorial team promises to the review and publish all the qualified submissions in a
fast manner. All the journals articles are available online to the readers all over the
world without financial, legal, or technical barriers other than those inseparable from
gaining access to the internet itself. Printed version of the journals is also available
upon request of readers and authors.
MORE RESOURCES
Book publication information: http://www.iiste.org/book/
Recent conferences: http://www.iiste.org/conference/
IISTE Knowledge Sharing Partners
EBSCO, Index Copernicus, Ulrich's Periodicals Directory, JournalTOCS, PKP Open
Archives Harvester, Bielefeld Academic Search Engine, Elektronische
Zeitschriftenbibliothek EZB, Open J-Gate, OCLC WorldCat, Universe Digtial
Library , NewJour, Google Scholar

Más contenido relacionado

La actualidad más candente

IRJET- Credit Card Fraud Detection using Isolation Forest
IRJET- Credit Card Fraud Detection using Isolation ForestIRJET- Credit Card Fraud Detection using Isolation Forest
IRJET- Credit Card Fraud Detection using Isolation ForestIRJET Journal
 
PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...
PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...
PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...ijaia
 
Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...Pratibha Singh
 
The use of genetic algorithm, clustering and feature selection techniques in ...
The use of genetic algorithm, clustering and feature selection techniques in ...The use of genetic algorithm, clustering and feature selection techniques in ...
The use of genetic algorithm, clustering and feature selection techniques in ...IJMIT JOURNAL
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics DomainDrjabez
 
Performance Analysis of Various Data Mining Techniques on Banknote Authentica...
Performance Analysis of Various Data Mining Techniques on Banknote Authentica...Performance Analysis of Various Data Mining Techniques on Banknote Authentica...
Performance Analysis of Various Data Mining Techniques on Banknote Authentica...inventionjournals
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithmsankit panigrahy
 
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...Ferhat Ozgur Catak
 
Prediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural NetworkPrediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural Networkrahulmonikasharma
 
Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization
Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization
Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization IJECEIAES
 
Deployment of ID3 decision tree algorithm for placement prediction
Deployment of ID3 decision tree algorithm for placement predictionDeployment of ID3 decision tree algorithm for placement prediction
Deployment of ID3 decision tree algorithm for placement predictionijtsrd
 
Independent Component Analysis of Edge Information for Face Recognition
Independent Component Analysis of Edge Information for Face RecognitionIndependent Component Analysis of Edge Information for Face Recognition
Independent Component Analysis of Edge Information for Face RecognitionCSCJournals
 
Improving the credit scoring model of microfinance
Improving the credit scoring model of microfinanceImproving the credit scoring model of microfinance
Improving the credit scoring model of microfinanceeSAT Publishing House
 
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...ijaia
 
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...ijscmcj
 
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...IJECEIAES
 

La actualidad más candente (18)

IRJET- Credit Card Fraud Detection using Isolation Forest
IRJET- Credit Card Fraud Detection using Isolation ForestIRJET- Credit Card Fraud Detection using Isolation Forest
IRJET- Credit Card Fraud Detection using Isolation Forest
 
PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...
PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...
PREDICTING STUDENT ACADEMIC PERFORMANCE IN BLENDED LEARNING USING ARTIFICIAL ...
 
Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...
 
The use of genetic algorithm, clustering and feature selection techniques in ...
The use of genetic algorithm, clustering and feature selection techniques in ...The use of genetic algorithm, clustering and feature selection techniques in ...
The use of genetic algorithm, clustering and feature selection techniques in ...
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics Domain
 
Performance Analysis of Various Data Mining Techniques on Banknote Authentica...
Performance Analysis of Various Data Mining Techniques on Banknote Authentica...Performance Analysis of Various Data Mining Techniques on Banknote Authentica...
Performance Analysis of Various Data Mining Techniques on Banknote Authentica...
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithms
 
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
 
Prediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural NetworkPrediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural Network
 
Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization
Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization
Credit Scoring Using CART Algorithm and Binary Particle Swarm Optimization
 
Deployment of ID3 decision tree algorithm for placement prediction
Deployment of ID3 decision tree algorithm for placement predictionDeployment of ID3 decision tree algorithm for placement prediction
Deployment of ID3 decision tree algorithm for placement prediction
 
Credit scoring i financial sector
Credit scoring i financial  sector Credit scoring i financial  sector
Credit scoring i financial sector
 
Independent Component Analysis of Edge Information for Face Recognition
Independent Component Analysis of Edge Information for Face RecognitionIndependent Component Analysis of Edge Information for Face Recognition
Independent Component Analysis of Edge Information for Face Recognition
 
Improving the credit scoring model of microfinance
Improving the credit scoring model of microfinanceImproving the credit scoring model of microfinance
Improving the credit scoring model of microfinance
 
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
 
I0704047054
I0704047054I0704047054
I0704047054
 
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
 
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...
 

Destacado

An Introduction to Digital Credit: Resources to Plan a Deployment
An Introduction to Digital Credit: Resources to Plan a DeploymentAn Introduction to Digital Credit: Resources to Plan a Deployment
An Introduction to Digital Credit: Resources to Plan a DeploymentCGAP
 
Ai and neural networks
Ai and neural networksAi and neural networks
Ai and neural networksNikhil Kansari
 
Introduction to Recurrent Neural Network with Application to Sentiment Analys...
Introduction to Recurrent Neural Network with Application to Sentiment Analys...Introduction to Recurrent Neural Network with Application to Sentiment Analys...
Introduction to Recurrent Neural Network with Application to Sentiment Analys...Artifacia
 
neural network
neural networkneural network
neural networkSTUDENT
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications Ahmed_hashmi
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 

Destacado (8)

Credit Score Millionaire: University of Idaho Extension
Credit Score Millionaire: University of Idaho Extension Credit Score Millionaire: University of Idaho Extension
Credit Score Millionaire: University of Idaho Extension
 
An Introduction to Digital Credit: Resources to Plan a Deployment
An Introduction to Digital Credit: Resources to Plan a DeploymentAn Introduction to Digital Credit: Resources to Plan a Deployment
An Introduction to Digital Credit: Resources to Plan a Deployment
 
Ai and neural networks
Ai and neural networksAi and neural networks
Ai and neural networks
 
Introduction to Recurrent Neural Network with Application to Sentiment Analys...
Introduction to Recurrent Neural Network with Application to Sentiment Analys...Introduction to Recurrent Neural Network with Application to Sentiment Analys...
Introduction to Recurrent Neural Network with Application to Sentiment Analys...
 
Neural
NeuralNeural
Neural
 
neural network
neural networkneural network
neural network
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 

Similar a Artificial Neural Network and Discriminant Analysis for Credit Scoring

Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...inventionjournals
 
KIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfKIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfDr. Radhey Shyam
 
Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922stone55
 
Pt2520 Unit 6 Data Mining Project
Pt2520 Unit 6 Data Mining ProjectPt2520 Unit 6 Data Mining Project
Pt2520 Unit 6 Data Mining ProjectJoyce Williams
 
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...IJDKP
 
B510519.pdf
B510519.pdfB510519.pdf
B510519.pdfaijbm
 
Proficiency comparison ofladtree
Proficiency comparison ofladtreeProficiency comparison ofladtree
Proficiency comparison ofladtreeijcsa
 
Q UANTUM C LUSTERING -B ASED F EATURE SUBSET S ELECTION FOR MAMMOGRAPHIC I...
Q UANTUM  C LUSTERING -B ASED  F EATURE SUBSET  S ELECTION FOR MAMMOGRAPHIC I...Q UANTUM  C LUSTERING -B ASED  F EATURE SUBSET  S ELECTION FOR MAMMOGRAPHIC I...
Q UANTUM C LUSTERING -B ASED F EATURE SUBSET S ELECTION FOR MAMMOGRAPHIC I...ijcsit
 
A Hybrid Theory Of Power Theft Detection
A Hybrid Theory Of Power Theft DetectionA Hybrid Theory Of Power Theft Detection
A Hybrid Theory Of Power Theft DetectionCamella Taylor
 
A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...IJDKP
 
A Review on Credit Card Default Modelling using Data Science
A Review on Credit Card Default Modelling using Data ScienceA Review on Credit Card Default Modelling using Data Science
A Review on Credit Card Default Modelling using Data ScienceYogeshIJTSRD
 
CREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES
CREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUESCREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES
CREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUESijaia
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionIRJET Journal
 
An Explanation Framework for Interpretable Credit Scoring
An Explanation Framework for Interpretable Credit Scoring An Explanation Framework for Interpretable Credit Scoring
An Explanation Framework for Interpretable Credit Scoring gerogepatton
 
AN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORING
AN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORINGAN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORING
AN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORINGijaia
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET Journal
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
 

Similar a Artificial Neural Network and Discriminant Analysis for Credit Scoring (20)

Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
 
KIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfKIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdf
 
Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922
 
Pt2520 Unit 6 Data Mining Project
Pt2520 Unit 6 Data Mining ProjectPt2520 Unit 6 Data Mining Project
Pt2520 Unit 6 Data Mining Project
 
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST...
 
B510519.pdf
B510519.pdfB510519.pdf
B510519.pdf
 
Proficiency comparison ofladtree
Proficiency comparison ofladtreeProficiency comparison ofladtree
Proficiency comparison ofladtree
 
Q UANTUM C LUSTERING -B ASED F EATURE SUBSET S ELECTION FOR MAMMOGRAPHIC I...
Q UANTUM  C LUSTERING -B ASED  F EATURE SUBSET  S ELECTION FOR MAMMOGRAPHIC I...Q UANTUM  C LUSTERING -B ASED  F EATURE SUBSET  S ELECTION FOR MAMMOGRAPHIC I...
Q UANTUM C LUSTERING -B ASED F EATURE SUBSET S ELECTION FOR MAMMOGRAPHIC I...
 
A Hybrid Theory Of Power Theft Detection
A Hybrid Theory Of Power Theft DetectionA Hybrid Theory Of Power Theft Detection
A Hybrid Theory Of Power Theft Detection
 
Credit iconip
Credit iconipCredit iconip
Credit iconip
 
A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...
 
A Review on Credit Card Default Modelling using Data Science
A Review on Credit Card Default Modelling using Data ScienceA Review on Credit Card Default Modelling using Data Science
A Review on Credit Card Default Modelling using Data Science
 
Di35605610
Di35605610Di35605610
Di35605610
 
CREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES
CREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUESCREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES
CREDIT RISK MANAGEMENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & Prediction
 
An Explanation Framework for Interpretable Credit Scoring
An Explanation Framework for Interpretable Credit Scoring An Explanation Framework for Interpretable Credit Scoring
An Explanation Framework for Interpretable Credit Scoring
 
AN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORING
AN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORINGAN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORING
AN EXPLANATION FRAMEWORK FOR INTERPRETABLE CREDIT SCORING
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection Analysis
 
03_AJMS_298_21.pdf
03_AJMS_298_21.pdf03_AJMS_298_21.pdf
03_AJMS_298_21.pdf
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
 

Más de Alexander Decker

Abnormalities of hormones and inflammatory cytokines in women affected with p...
Abnormalities of hormones and inflammatory cytokines in women affected with p...Abnormalities of hormones and inflammatory cytokines in women affected with p...
Abnormalities of hormones and inflammatory cytokines in women affected with p...Alexander Decker
 
A validation of the adverse childhood experiences scale in
A validation of the adverse childhood experiences scale inA validation of the adverse childhood experiences scale in
A validation of the adverse childhood experiences scale inAlexander Decker
 
A usability evaluation framework for b2 c e commerce websites
A usability evaluation framework for b2 c e commerce websitesA usability evaluation framework for b2 c e commerce websites
A usability evaluation framework for b2 c e commerce websitesAlexander Decker
 
A universal model for managing the marketing executives in nigerian banks
A universal model for managing the marketing executives in nigerian banksA universal model for managing the marketing executives in nigerian banks
A universal model for managing the marketing executives in nigerian banksAlexander Decker
 
A unique common fixed point theorems in generalized d
A unique common fixed point theorems in generalized dA unique common fixed point theorems in generalized d
A unique common fixed point theorems in generalized dAlexander Decker
 
A trends of salmonella and antibiotic resistance
A trends of salmonella and antibiotic resistanceA trends of salmonella and antibiotic resistance
A trends of salmonella and antibiotic resistanceAlexander Decker
 
A transformational generative approach towards understanding al-istifham
A transformational  generative approach towards understanding al-istifhamA transformational  generative approach towards understanding al-istifham
A transformational generative approach towards understanding al-istifhamAlexander Decker
 
A time series analysis of the determinants of savings in namibia
A time series analysis of the determinants of savings in namibiaA time series analysis of the determinants of savings in namibia
A time series analysis of the determinants of savings in namibiaAlexander Decker
 
A therapy for physical and mental fitness of school children
A therapy for physical and mental fitness of school childrenA therapy for physical and mental fitness of school children
A therapy for physical and mental fitness of school childrenAlexander Decker
 
A theory of efficiency for managing the marketing executives in nigerian banks
A theory of efficiency for managing the marketing executives in nigerian banksA theory of efficiency for managing the marketing executives in nigerian banks
A theory of efficiency for managing the marketing executives in nigerian banksAlexander Decker
 
A systematic evaluation of link budget for
A systematic evaluation of link budget forA systematic evaluation of link budget for
A systematic evaluation of link budget forAlexander Decker
 
A synthetic review of contraceptive supplies in punjab
A synthetic review of contraceptive supplies in punjabA synthetic review of contraceptive supplies in punjab
A synthetic review of contraceptive supplies in punjabAlexander Decker
 
A synthesis of taylor’s and fayol’s management approaches for managing market...
A synthesis of taylor’s and fayol’s management approaches for managing market...A synthesis of taylor’s and fayol’s management approaches for managing market...
A synthesis of taylor’s and fayol’s management approaches for managing market...Alexander Decker
 
A survey paper on sequence pattern mining with incremental
A survey paper on sequence pattern mining with incrementalA survey paper on sequence pattern mining with incremental
A survey paper on sequence pattern mining with incrementalAlexander Decker
 
A survey on live virtual machine migrations and its techniques
A survey on live virtual machine migrations and its techniquesA survey on live virtual machine migrations and its techniques
A survey on live virtual machine migrations and its techniquesAlexander Decker
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbAlexander Decker
 
A survey on challenges to the media cloud
A survey on challenges to the media cloudA survey on challenges to the media cloud
A survey on challenges to the media cloudAlexander Decker
 
A survey of provenance leveraged
A survey of provenance leveragedA survey of provenance leveraged
A survey of provenance leveragedAlexander Decker
 
A survey of private equity investments in kenya
A survey of private equity investments in kenyaA survey of private equity investments in kenya
A survey of private equity investments in kenyaAlexander Decker
 
A study to measures the financial health of
A study to measures the financial health ofA study to measures the financial health of
A study to measures the financial health ofAlexander Decker
 

In addition, it tries to identify the borrower's primary source of repayment, especially in the case of extended debt. Finally, it tries to evaluate the borrower's secondary source of repayment in case the primary source becomes unavailable.
Although credit risk assessment is one of the most successful applications of applied statistics, even the best statistical models do not guarantee credit scoring success; it also depends on experienced risk management practice, the way the models are developed and applied, and the proper use of management information systems (Mays, 1998). At the same time, the selection of the independent variables is essential in the model development phase because they determine the attributes that decide the value of the credit score (see Equation 1), and their values are normally collected from the application form. It is therefore very important to identify which variables will be selected and included in the final scoring model.

Y = β₀ + β₁X₁ + β₂X₂ + … + βₙXₙ          (1)

where Y is the dependent variable and the Xᵢ are the independent variables.
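To make Equation (1) concrete, the short sketch below evaluates a credit score as a weighted sum of applicant attributes. The coefficient and attribute values are hypothetical placeholders chosen only for illustration, not fitted values from this study.

```python
# Minimal sketch of Equation (1): a credit score as a weighted sum of
# applicant attributes. All numbers below are hypothetical placeholders.

def credit_score(attributes, intercept, coefficients):
    """Return Y = b0 + b1*x1 + ... + bn*xn for one applicant."""
    if len(attributes) != len(coefficients):
        raise ValueError("one coefficient is required per attribute")
    return intercept + sum(b * x for b, x in zip(coefficients, attributes))

# Example: three illustrative attributes (age, salary in thousands, tenor in months).
applicant = [45, 120, 24]
score = credit_score(applicant, intercept=0.5, coefficients=[0.02, 0.004, -0.01])
print(round(score, 3))
```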
Selection of the predictor variables should not be based on statistical analysis alone; other considerations also matter. For example, variables that are expensive or time consuming to obtain, such as "Debt Burden", should be excluded. Moreover, variables that are influenced by the organization itself, such as "Utilization of Advanced Cash", should also be excluded (Mays, 1998).

2.0 Material and Methods
The data for this study were collected from a sample of 200 credit applicants, extracted from the application forms of First Bank of Nigeria plc, Kontagora. The methods used for assessing creditworthiness are Linear Discriminant Analysis and the Artificial Neural Network.

2.1 Linear Discriminant Analysis
LDA was first proposed by Fisher (1936) as a classification technique. It has been reported as the most commonly used technique in handling classification problems (Lee et al., 1999). In the simplest type of LDA, two-group LDA, a linear discriminant function (LDF) that passes through the centroids (geometric centres) of the two groups can be used to discriminate between the two groups. The LDF is represented by Equation (2):

D = a + b₁X₁ + b₂X₂ + … + bₚXₚ          (2)

where a is a constant and b₁ to bₚ are the regression coefficients for the p variables. LDA can also be applied in other areas, such as business investment, bankruptcy prediction and market segmentation (Lee et al., 1997; Kim et al., 2000).

2.2 Neural Network
A neural network (NNW) is a mathematical representation inspired by the human brain and its ability to adapt on the basis of the inflow of new information. Mathematically, an NNW is a non-linear optimization tool. Many different types of NNW have been described in the literature. The NNW design called the multilayer perceptron (MLP) is especially suitable for classification and is widely used in practice. The network consists of one input layer, one or more hidden layers and one output layer, each consisting of several neurons. Each neuron processes its inputs and generates one output value that is transmitted to the neurons in the subsequent layer. Each neuron in the input layer (indexed i = 1, …, n) delivers the value of one predictor (or characteristic) from the vector x. When considering default/non-default discrimination, one output neuron is sufficient. In each layer, signal propagation is accomplished as follows: first, a weighted sum of inputs is calculated at each neuron, i.e. the output value of each neuron in the preceding network layer times the respective weight of the connection with that neuron.
There are two stages of optimization. First, the weights have to be initialized, and second, a non-linear optimization scheme is implemented. In the first stage, the weights are usually initialized with small random numbers. The second stage is called learning or training of the NNW. The most popular algorithm for training multilayer perceptrons is the back-propagation algorithm. As the name suggests, the error computed at the output layer is back-propagated through the network, and the weights are modified according to their contribution to the error function. Essentially, back-propagation performs a local gradient search; hence its implementation, although not computationally demanding, does not guarantee reaching a global minimum. For each individual, the weights are modified in such a way that the error computed at the output layer is minimized.
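The following sketch illustrates the two-group LDF of Equation (2) on synthetic data (the real credit dataset is not reproduced here). The group means and spreads are loosely inspired by the group statistics reported later in the paper, but all numbers and variable choices are assumptions made only for the example.

```python
# Illustrative two-group Fisher LDF of Equation (2) computed with NumPy on
# synthetic data. The weights are w = S_w^{-1} (m1 - m2), where S_w is the
# pooled within-group scatter and m1, m2 are the group mean vectors.
import numpy as np

rng = np.random.default_rng(0)
good = rng.normal(loc=[45, 20], scale=[7, 8], size=(163, 2))   # e.g. age, length of service
bad = rng.normal(loc=[54, 30], scale=[7, 9], size=(37, 2))

m1, m2 = good.mean(axis=0), bad.mean(axis=0)
S_w = (np.cov(good, rowvar=False) * (len(good) - 1)
       + np.cov(bad, rowvar=False) * (len(bad) - 1))
w = np.linalg.solve(S_w, m1 - m2)        # discriminant coefficients b1..bp
a = -0.5 * w @ (m1 + m2)                 # constant placing the cut-off midway between centroids

def classify(x):
    """Assign to the 'good' group when the LDF score is positive."""
    return "good" if a + w @ x >= 0 else "bad"

print(classify(np.array([44, 18])), classify(np.array([56, 31])))
```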
The BP network used herein has an input layer, an intermediate hidden layer, and an output layer. The BP-based credit scoring method is succinctly illustrated in Fig. 1. The input nodes represent the applicant's characteristics, and the output node represents the identified class (say 1 for rejected and 2 for accepted). BP learning involves three stages: the feed-forward of the input training pattern, the calculation of the associated error, and the adjustment of the weights. After the network reaches a satisfactory level of performance, it has learned the relationships between the independent variables (the applicant's attributes) and the dependent variable (the credit class). The trained BP network can then be adopted as a scoring model to classify a credit as either good or bad by inserting the values of the applicant's attributes.
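The three BP stages can be sketched in a few lines of NumPy. The network size, data and learning rate below are illustrative assumptions, not the SPSS configuration used in this study.

```python
# Minimal sketch of one back-propagation update for a single-hidden-layer
# network, mirroring the three stages described above: feed-forward,
# error calculation, weight adjustment. Sizes and data are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_hidden = 14, 6
W1 = rng.normal(scale=0.1, size=(n_hidden, n_inputs))   # small random initial weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(1, n_hidden))
b2 = np.zeros(1)
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = rng.normal(size=n_inputs)          # one applicant's (standardised) attributes
t = 1.0                                # target class: 1 = good, 0 = bad

# Stage 1: feed-forward
h = np.tanh(W1 @ x + b1)
y = sigmoid(W2 @ h + b2)               # predicted probability of "good"

# Stage 2: error at the output layer
delta_out = y - t                      # gradient of the cross-entropy loss w.r.t. the output pre-activation

# Stage 3: back-propagate and adjust the weights (one gradient-descent step)
delta_hidden = (W2.T @ delta_out) * (1.0 - h ** 2)      # tanh derivative
W2 -= lr * np.outer(delta_out, h); b2 -= lr * delta_out
W1 -= lr * np.outer(delta_hidden, x); b1 -= lr * delta_hidden
```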
2.3 Wilks' Lambda Test for the Significance of the Canonical Correlation
Hypotheses for the canonical correlation:
H₀: There is no linear relationship between the two sets of variables.
H₁: There is a linear relationship between the two sets of variables.
Test statistic:

Λ = W / (W + H),

where W is the residual variance, H is the variance due to the linear relationship, and W + H is the total variance.
Decision rule: Reject H₀ if p < 0.05; otherwise accept H₀ at the 5% level of significance.

2.4 Chi-square Test
Hypotheses for the chi-square test:
H₀: The two variables are independent.
H₁: The two variables are not independent.
Test statistic:

χ² = Σᵢ Σⱼ (Oᵢⱼ − eᵢⱼ)² / eᵢⱼ,

where Oᵢⱼ is the observed value and eᵢⱼ is the expected value.
Decision rule: Reject H₀ if p < 0.05; otherwise accept H₀ at the 5% level of significance.
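The snippet below illustrates both tests: a chi-square test of independence on a hypothetical contingency table (using SciPy) and Wilks' lambda computed from illustrative variance components. None of the input numbers are taken from the credit dataset.

```python
# Illustration of the two tests above. The contingency table is hypothetical
# (e.g. credit status by sex); the W and H components for Wilks' lambda are
# also illustrative rather than estimated from the paper's data.
import numpy as np
from scipy.stats import chi2_contingency

# Chi-square test of independence on a hypothetical 2x2 table.
observed = np.array([[90, 73],    # good applicants: male, female
                     [15, 22]])   # bad applicants:  male, female
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, df = {dof}, p = {p_value:.4f}")

# Wilks' lambda as the ratio of residual variance to total variance.
W, H = 0.501, 0.499
wilks_lambda = W / (W + H)
print(f"Wilks' lambda = {wilks_lambda:.3f}")  # small values favour rejecting H0
```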
3.0 Data Collection and Analysis
The dataset contains 200 cases: 163 applicants are considered "Creditworthy" and the remaining 37 are treated as "Non-creditworthy". Data preparation allows the identification of unusual cases, invalid cases, erroneous variables and incorrect data values in the dataset. A real-world credit dataset is used in this research, extracted from the application forms of First Bank of Nigeria plc, Kontagora; it is referred to as the "Credit Dataset". After preparation, the dataset is used in the subsequent sections for conducting the analysis with the Neural Network and Discriminant Analyses.

Table 3.1: Credit Dataset Description
No.  Variable      Type             Scale    Description
1    Attribute1    Input Variable   Scale    Age of the applicant
2    Attribute2    Input Variable   Nominal  Sex of the applicant
3    Attribute3    Input Variable   Nominal  Ownership of residence
4    Attribute4    Input Variable   Nominal  Marital status
5    Attribute5    Input Variable   Nominal  Qualification
6    Attribute6    Input Variable   Nominal  Employment status
7    Attribute7    Input Variable   Nominal  Employment classification
8    Attribute8    Input Variable   Scale    Length of service
9    Attribute9    Input Variable   Scale    Salary
10   Attribute10   Input Variable   Nominal  Application request
11   Attribute11   Input Variable   Scale    Amount requested
12   Attribute12   Input Variable   Scale    Credit amount
13   Attribute13   Input Variable   Scale    Proposed tenor in months
14   Attribute14   Input Variable   Nominal  Other borrowing
15   Attribute15   Output Variable  Nominal  Status of the credit applicant

The dataset holds 15 variables altogether. Among them, 9 variables are categorical and the remaining 6 are numerical; there are 14 independent variables (input variables) and 1 dependent variable (output variable).

3.1 Measurement of Model Performance in Discriminant Analysis
Discriminant analysis is a statistical technique for classifying the target population (in this study, credit applicants) into specific categories or groups (here, either creditworthy or non-creditworthy) on the basis of certain attributes (the predictor or independent variables) (Plewa and Friedlob, 1995). Discriminant analysis requires certain assumptions to be fulfilled, such as normality, linearity, homoscedasticity, and the absence of multicollinearity and outliers, but the method is fairly robust to violations of these assumptions (Meyers, Gamst et al., 2005). In this study it is assumed that all required assumptions are fulfilled, so that the predictive power of discriminant analysis can be used to classify the applicants. Here, "creditworthiness" is the dependent (grouping) variable and the remaining 14 variables are the independent (input) variables.
A discriminant model is identified as "useful" if it achieves at least a 25% improvement over the by-chance accuracy rate. "By-chance accuracy" means that even if there is no relationship between the dependent variable and the independent variables, it is still possible to achieve some percentage of correct group membership. Here, the by-chance accuracy rate is about 70% (0.815² + 0.185²) and a 25% increase over this value equals 87.5% (1.25 × 70%), while the cross-validated accuracy rate is 88.5%. Since the cross-validated accuracy rate is greater than or equal to the proportional by-chance accuracy rate, it is possible to declare that the discriminant model is useful for the classification goal.
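The by-chance accuracy criterion can be verified directly from the group sizes reported above (163 creditworthy and 37 non-creditworthy applicants out of 200):

```python
# Checking the "by chance accuracy" criterion described above. The group
# proportions come from the paper; the 25% improvement rule is applied to
# that baseline.
p_good, p_bad = 163 / 200, 37 / 200
by_chance = p_good ** 2 + p_bad ** 2            # ~0.698, i.e. about 70%
criterion = 1.25 * by_chance                    # ~0.873, i.e. about 87.5%
cross_validated_accuracy = 0.885                # reported for the discriminant model

print(f"by-chance accuracy  : {by_chance:.1%}")
print(f"25% improvement bar : {criterion:.1%}")
print("model is useful:", cross_validated_accuracy >= criterion)
```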
Moreover, Wilks' lambda is a measure of the usefulness of the model: the smaller the significance value, the better the discriminant function does than chance at separating the groups.

Table 3.2: SPSS Output: Wilks' Lambda Model Test
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .501            131.995      14   .000

Here, the Wilks' lambda test has a p-value of less than 0.001, which is below the .05 level of significance, meaning that the predictors (i.e. independent variables such as age of applicant, sex, salary, length of service, etc.) significantly discriminate between the groups. The lambda value also gives the proportion of total variability not explained by group differences, i.e. 50.1% is unexplained.

Table 3.3: Canonical Correlation
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          0.996ᵃ       100.0           100.0          .706
a. The first 1 canonical discriminant functions were used in the analysis.

A canonical correlation of 0.706 suggests that the model explains 49.84% (i.e. 0.706²) of the variation in the grouping variable, i.e. whether or not an applicant is creditworthy.
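The quantities in Tables 3.2 and 3.3 are linked by standard identities: the canonical correlation equals the square root of eigenvalue / (1 + eigenvalue), its square is the proportion of variation explained, and one minus that proportion reproduces Wilks' lambda. The sketch below checks this with the reported eigenvalue.

```python
# Relating the SPSS outputs above: canonical correlation from the eigenvalue,
# the share of variation explained, and the implied Wilks' lambda.
from math import sqrt

eigenvalue = 0.996                      # from Table 3.3
canonical_corr = sqrt(eigenvalue / (1 + eigenvalue))
explained = canonical_corr ** 2

print(f"canonical correlation = {canonical_corr:.3f}")    # ~0.706
print(f"variation explained   = {explained:.1%}")         # ~49.9%
print(f"Wilks' lambda         = {1 - explained:.3f}")     # consistent with ~0.501
```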
3.2 Importance of Independent Variables in Discriminant Analysis
One of the most important tasks is to identify the independent variables that matter most in the predictive model development.

Table 3.4: Discriminant Model Functions
Variable                    Structure matrix   Standardized coefficients   Canonical coefficients
Age                         .473               .434                        .062
Sex                         -.041              -.025                       -.055
Ownership of residence      -.058              -.033                       -.067
Marital status              -.095              -.131                       -.173
Qualification               .051               .015                        .011
Employment status           -.010              .012                        .029
Employment classification   -.119              -.267                       -.320
Length of service           .467               .158                        .020
Salary                      -.148              -.412                       .000
Application request         -.061              -.163                       -.380
Amount requested            .040               1.253                       .000
Credit amount               -.153              -1.031                      .000
Proposed tenure             -.092              -.137                       -.023
Other borrowing             .554               .688                        1.494
(Constant)                                                                 -3.173

It can be identified from the structure matrix that the predictor variables strongly associated with the discriminant model are the "Age of the Applicant", "Length of Service" and "Other Borrowing". The structure matrix shows the correlation of each variable with the discriminant function. These Pearson coefficients are structure coefficients or discriminant loadings, and they serve like factor loadings in factor analysis. Generally, just as with factor loadings, 0.30 is taken as the cut-off between important and less important variables.
The interpretation of the standardized discriminant function coefficients is like that in multiple regression: they provide an index of the importance of each predictor, just as standardized regression coefficients do, and the sign indicates the direction of the relationship. Among the important predictors, Other Borrowing has the highest score, followed by Age of the applicant, while Length of Service is the least important.
The canonical discriminant function coefficients (unstandardized coefficients) are used to create the discriminant function (equation), which operates just like a regression equation. The discriminant model determines specific characteristics for the two groups (creditworthy and non-creditworthy applicants); based on these characteristics, one applicant is classed as "Good" and another as "Bad". These characteristics differ between the two groups. For example, on average a good customer is at most 45 years of age, has spent at most 20 years in service and has no borrowing from cooperatives, employers or other banks. The most important variables, identified in the structure matrix above, are shown below.

Table 3.5: SPSS Output: Group Statistics, Good Group Only
Status of the Credit Applicant: Good   Mean      Std. Deviation   Unweighted N   Weighted N
Age of applicant                       45.9141   6.90267          163            163.000
Length of service                      20.1166   7.80186          163            163.000
Other borrowing                        1.1043    0.37865          163            163.000
Characteristics of the creditworthy applicants.

On the other hand, the bad group holds certain characteristics that contrast with the good group. For example, on average, non-creditworthy applicants are at least 54 years of age, have spent at least 29 years in service and have borrowing from their employer. Only the most important characteristics are shown below.

Table 3.6: SPSS Output: Group Statistics, Bad Group Only
Status of the Credit Applicant: Bad    Mean      Std. Deviation   Unweighted N   Weighted N
Age of applicant                       54.4054   7.51456          37             37.000
Length of service                      29.6757   8.85078          37             37.000
Other borrowing                        1.7568    0.72286          37             37.000
Characteristics of the non-creditworthy applicants.
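As an illustration of how the unstandardized (canonical) coefficients produce a discriminant score, the sketch below applies the discriminant function to one hypothetical applicant. The coefficient values are those reported in Table 3.4, but the applicant profile and the numeric coding of the nominal attributes are assumptions made purely for the example.

```python
# Sketch of applying the discriminant function with the unstandardized
# (canonical) coefficients from Table 3.4. The applicant profile below and
# the coding of the nominal attributes are hypothetical.
coefficients = {
    "age": 0.062, "sex": -0.055, "ownership": -0.067, "marital_status": -0.173,
    "qualification": 0.011, "employment_status": 0.029,
    "employment_classification": -0.320, "length_of_service": 0.020,
    "salary": 0.000, "application_request": -0.380, "amount_requested": 0.000,
    "credit_amount": 0.000, "proposed_tenure": -0.023, "other_borrowing": 1.494,
}
CONSTANT = -3.173

applicant = {   # hypothetical values, assumed to be coded the way the model expects
    "age": 44, "sex": 1, "ownership": 1, "marital_status": 2, "qualification": 3,
    "employment_status": 1, "employment_classification": 2, "length_of_service": 18,
    "salary": 1_200_000, "application_request": 1, "amount_requested": 500_000,
    "credit_amount": 400_000, "proposed_tenure": 12, "other_borrowing": 1,
}

score = CONSTANT + sum(coefficients[k] * applicant[k] for k in coefficients)
print(f"discriminant score D = {score:.3f}")
# The score would then be compared with the group centroids (or a cut-off
# between them) to assign the applicant to one of the two groups.
```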
3.3 Neural Network Structure (Architecture)
The neural network model is constructed with the multilayer perceptron algorithm. From an architectural point of view it is a 14-6-1 network: 14 independent (input) variables, one hidden layer whose number of neurons is chosen automatically (6 units are selected for this analysis), and 1 dependent (output) variable. SPSS software is used; the SPSS procedure can choose the best architecture automatically and builds the network with one hidden layer. It is also possible to specify the minimum (by default 1) and maximum (by default 50) number of units allowed in the hidden layer, and the automatic architecture selection procedure finds the "best" number of units in the hidden layer. Automatic architecture selection uses the default activation functions for the hidden layer (hyperbolic tangent) and the output layer (softmax).
The predictor variables consist of "factors" and "covariates": factors are the categorical predictors (8 nominal variables) and covariates are the scale predictors (6 continuous variables). The standardized method is chosen for rescaling the covariates to improve the network training. Further, 70% of the data is allocated for training the network (the training sample) to obtain a model, and 30% is assigned as the testing sample to keep track of the errors and to protect against overtraining. Different training methods are available, such as batch, online and mini-batch; here, batch training is chosen because it directly minimizes the total error and is most useful for "smaller" datasets. An optimization algorithm is used to estimate the synaptic weights; the "Scaled Conjugate Gradient" algorithm is assigned because it is the only algorithm supported by the batch training method. Additionally, stopping rules determine the stopping criteria for the network training; for batch training a step corresponds to an iteration, and here training stops after one (1) consecutive step with no further decrease in error. It is important to note that, to replicate the neural network results exactly, the analyst needs to use the same initialization value for the random number generator, the same data order and the same variable order, in addition to the same procedure settings.
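A rough open-source analogue of this SPSS setup is sketched below with scikit-learn: 14 inputs, one hidden layer of 6 hyperbolic-tangent units, standardized covariates and a 70/30 split. scikit-learn does not provide the scaled conjugate gradient optimizer, so L-BFGS (another batch method) is used as a stand-in, and the data are synthetic placeholders rather than the credit dataset.

```python
# Rough scikit-learn analogue of the SPSS MLP configuration described above.
# The data are synthetic; the solver is L-BFGS because scaled conjugate
# gradient is not available in scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 14))                      # placeholder for the 14 attributes
y = (X[:, 0] + 0.5 * X[:, 7] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

scaler = StandardScaler().fit(X_train)              # standardised covariates
clf = MLPClassifier(hidden_layer_sizes=(6,), activation="tanh",
                    solver="lbfgs", max_iter=1000, random_state=0)
clf.fit(scaler.transform(X_train), y_train)

print("training accuracy:", clf.score(scaler.transform(X_train), y_train))
print("testing accuracy :", clf.score(scaler.transform(X_test), y_test))
```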
3.3.1 Measurement of Model Performance of the Neural Network Structure (Architecture)
The following model summary table displays information about the results of the neural network training. The cross-entropy error is displayed because the output layer uses the softmax activation function; this is the error function that the network tries to minimize during training. The percentage of incorrect predictions is 5.6% in the training sample, so the percentage of correct predictions is about 94.4%, which is quite high. If a dependent variable has a scale measurement level, the average overall relative error (relative to the mean model) is displayed; on the other hand, if the dependent variables are categorical, the average percentage of incorrect predictions is displayed.

Table 3.7: SPSS Output: Model Summary
Training   Cross Entropy Error             18.882
           Percent Incorrect Predictions   5.6%
           Stopping Rule Used              1 consecutive step(s) with no decrease in error
           Training Time                   0:00:00.265
Testing    Cross Entropy Error             5.383
           Percent Incorrect Predictions   1.8%
Dependent Variable: Status of the Credit Applicant
a. Error computations are based on the testing sample.

3.3.2 Importance of Independent Variables
The following table reports the importance and the normalized importance of each predictor in determining the neural network; the analysis is based on the training and testing samples. The importance of an independent variable is a measure of how much the network's model-predicted value changes for different values of that independent variable. The normalized importance is simply each importance value divided by the largest importance value and expressed as a percentage.
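The normalized importance column can be reproduced from the raw importance values by dividing each by the largest value, as the sketch below does for the top predictors reported in Table 3.8 below (small rounding differences from the SPSS output are possible).

```python
# How the normalised importance column in Table 3.8 is obtained: each raw
# importance is divided by the largest importance and expressed as a
# percentage. The raw values are the ones reported by SPSS.
importance = {
    "Amount requested": 0.218, "Credit amount": 0.151, "Applicant salary": 0.128,
    "Age of applicant": 0.115, "Other borrowing": 0.088,
    "Length of service": 0.079, "Proposed tenor in month": 0.072,
}
largest = max(importance.values())
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:<25s} {value:.3f}  {100 * value / largest:5.1f}%")
```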
From the table it is evident that "Amount Request" contributes most to the neural network model, followed by "Credit Amount", the applicant's salary (annual estimated earnings), the applicant's age in years, "Other Borrowing", "Length of Service", "Proposed Tenor in Months", and so on.

Table 3.8: SPSS Output: Independent Variable Importance
Variable                      Importance   Normalized Importance
Gender of applicant           .018         8.1%
Ownership of residence        .004         2.0%
Marital status                .027         12.6%
Qualification of applicant    .031         14.2%
Employment status             .019         8.9%
Employment classification     .021         9.4%
Application request           .029         13.5%
Other borrowing               .088         40.4%
Age of applicant              .115         53.1%
Length of service             .079         36.4%
Applicant salary              .128         58.7%
Amount requested              .218         100.0%
Credit amount                 .151         69.2%
Proposed tenor in month       .072         33.0%
Important variables identified by the neural network model.

The table shows how important each variable is to the way the network classifies prospective applicants, but it is not possible to identify the direction of the relationship between these variables and the predicted probability of default. This is one of the most prominent limitations of the neural network, and statistical models help in this situation.

3.4 Comparison of the Models' Predictive Ability
3.4.1 Discriminant Analysis
In the discriminant analysis model development phase, a statistically significant model is derived which possesses very good classification accuracy. The following table shows that the discriminant model is able to classify 152 of the 163 good applicants into the "Good" group, a classification accuracy of 93.3% for the good group. The same discriminant model is able to classify 30 of the 37 bad applicants into the "Bad" group, a classification accuracy of 81.1% for the bad group. Overall, the model achieves 91.0% classification accuracy across the combined groups.

Table 3.9: SPSS Output: Classification Results (Predictive Ability of the Discriminant Model)
                                       Predicted Group Membership
                          Status       1.00 (Good)   2.00 (Bad)   Total
Original         Count    1.00         152           11           163
                          2.00         7             30           37
                 %        1.00         93.3          6.7          100.0
                          2.00         18.9          81.1         100.0
Cross-validated  Count    1.00         150           13           163
                          2.00         10            27           37
                 %        1.00         92.0          8.0          100.0
                          2.00         27.0          73.0         100.0
b. 91.0% of original grouped cases correctly classified.
c. 88.5% of cross-validated grouped cases correctly classified.
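A quick check of the quoted overall accuracy rates from the counts in Table 3.9:

```python
# Verifying the overall classification rates of the discriminant model
# from the counts in Table 3.9.
original = {"good": (152, 11), "bad": (7, 30)}          # (predicted good, predicted bad)
cross_validated = {"good": (150, 13), "bad": (10, 27)}

def overall_accuracy(table):
    correct = table["good"][0] + table["bad"][1]         # good accepted + bad rejected
    total = sum(sum(row) for row in table.values())
    return correct / total

print(f"original accuracy        : {overall_accuracy(original):.1%}")         # 91.0%
print(f"cross-validated accuracy : {overall_accuracy(cross_validated):.1%}")  # 88.5%
```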
3.4.2 Artificial Neural Network
In the artificial neural network model development stage, a predictive model is derived which enjoys very good classification accuracy. The following table shows that the neural network model is able to classify 115 of the 116 good applicants in the training sample into the "Good" group, a classification accuracy of 99.1% for the good group. The same neural network model is able to classify 21 of the 28 bad applicants into the "Bad" group, a classification accuracy of 75.0% for the bad group. Overall, the model achieves 94.4% classification accuracy across both groups. Here, the training sample is taken into account because the statistical model does not use a testing sample.

Table 3.10: SPSS Output: Classification Results (Predictive Ability of the Artificial Neural Network)
Sample     Observed          Predicted 1   Predicted 2   Percent Correct
Training   1 (Good)          115           1             99.1%
           2 (Bad)           7             21            75.0%
           Overall Percent   84.7%         15.3%         94.4%
Testing    1 (Good)          47            0             100.0%
           2 (Bad)           1             8             88.9%
           Overall Percent   85.7%         14.3%         98.2%
Dependent Variable: Status of the Credit Applicant

4.0 Findings and Conclusions
Appropriate predictor variable selection is one of the conditions for the successful development of credit scoring models, and this study reviews several considerations regarding the selection of the predictor variables. Using the multilayer perceptron algorithm, a neural network architecture is constructed for predicting the probability that a given customer will default on a loan, and its results are compared with those obtained using the commonly used Discriminant Analysis technique, as summarized in the following table:

Table 4.1: Predictive Models Comparison
Model                   Good Accepted   Good Rejected   Bad Accepted   Bad Rejected   Success Rate
Discriminant Analysis   152             11              30             7              91.0%
Neural Network          115             1               7              21             94.4%

There are two noteworthy points about this table. First, it shows the predictive ability of each model: columns 2 and 5 ("Good Accepted" and "Bad Rejected") are the applicants that are classified correctly, while columns 3 and 4 ("Good Rejected" and "Bad Accepted") are the applicants that are classified incorrectly. It also shows that the neural network gives slightly better results than discriminant analysis. It should be noted that it is not possible to draw the general conclusion that the neural network has better predictive ability than discriminant analysis, because this study covers only one dataset. On the other hand, statistical models can be used to further explore the nature of the relationship between the dependent variable and each independent variable.
Secondly, Table 4.1 gives an idea of the cost of misclassification, under the assumption that a "Bad Accepted" generates much higher costs than a "Good Rejected": there is a chance of losing the whole amount of credit when accepting a "Bad", but only the interest payments when rejecting a "Good". In this analysis, the neural network (7) acquired fewer "Bad Accepted" cases than discriminant analysis (30), so the neural network achieves a lower cost of misclassification.
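The cost argument can be made explicit with an assumed cost ratio, as in the sketch below. The unit costs are hypothetical (the paper does not quantify them); only the error counts come from Table 4.1.

```python
# Sketch of the misclassification-cost comparison above. The unit costs are
# assumed: accepting a bad applicant is taken to cost ten times as much as
# rejecting a good one. Error counts are those reported in Table 4.1.
COST_BAD_ACCEPTED = 10.0    # assumed: potential loss of the whole credit amount
COST_GOOD_REJECTED = 1.0    # assumed: loss of interest payments only

models = {
    "Discriminant Analysis": {"good_rejected": 11, "bad_accepted": 30},
    "Neural Network":        {"good_rejected": 1,  "bad_accepted": 7},
}

for name, errors in models.items():
    cost = (errors["good_rejected"] * COST_GOOD_REJECTED
            + errors["bad_accepted"] * COST_BAD_ACCEPTED)
    print(f"{name:<22s} total misclassification cost = {cost:.0f}")
```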
References
Colquitt, J. (2007). Credit Risk Management. McGraw-Hill.
Fisher, R. A. (1936). The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics, 7, 179-188.
Kim, J. C., Kim, D. H., Kim, J. J., Ye, J. S., and Lee, H. S. (2000). Segmenting the Korean Housing Market Using Multiple Discriminant Analysis. Construction Management and Economics, 45-54.
Lee, T. H., and Jung, S. H. (1999/2000). Forecasting Creditworthiness: Logistic vs. Artificial Neural Network. The Journal of Business Forecasting Methods & Systems, 18(4), 28-30.
Mays, E. (1998). Credit Risk Modeling: Design and Application. CRC.
McCulloch, W., and Pitts, W. (1943). A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biology, 5(4), 115-133.
Meyers, L. S., Gamst, G. C., and Guarino, A. J. (2005). Applied Multivariate Research: Design and Interpretation. Sage Publications.
Plewa, F. J., and Friedlob, G. T. (1995). Understanding Cash Flow. Wiley.
West, D. (2000). Neural Network Credit Scoring Models. Computers & Operations Research, 27, 1131-1152.