Journal Review
T. Jayalakshmi
Computer Science Department
CMS College of Science and Commerce
Coimbatore, INDIA
[email protected]

Dr. A. Santhakumaran
Statistics Department
Salem Sowdeswari College
Salem, INDIA
[email protected]
Abstract- Many real-world problems can be solved with Artificial Neural Networks in the areas of pattern recognition, signal processing and medical diagnosis. Most medical data sets are seldom complete, yet Artificial Neural Networks require a complete set of data for accurate classification. This paper examines various missing value techniques for improving classification accuracy. The proposed system also investigates the impact of pre-processing on classification. A classifier was applied to the Pima Indian Diabetes Dataset, and the results improved substantially when certain combinations of pre-processing techniques were used. The experimental system achieves a classification accuracy of 99%, which is better than before.

Keywords- Artificial Neural Networks; Diabetes Mellitus; Missing Value Analysis; Pre-Processing Methods.

I. INTRODUCTION

Medical information systems in modern hospitals and medical institutions are growing ever larger, which causes great difficulties in extracting useful information for decision support. Traditional manual data analysis has become inefficient, and methods for efficient computer-based analysis are essential. Introducing machine learning into medical analysis has been shown to increase diagnostic accuracy, reduce costs and reduce the human resources required. Artificial Neural Networks (ANNs) are currently a promising area of interest. They have already been applied successfully to various areas of medicine such as diagnostic systems, biochemical analysis, image analysis and drug development. A benefit of using ANNs is that they are not affected by factors such as fatigue, working conditions and emotional state.

Diabetes mellitus is a lifelong disease resulting from the underproduction or reduced action of the hormone insulin. This dysfunction of insulin results in blood glucose levels outside the normal range, leading to many short- and long-term complications [13]. Depending on the cause of this insulin insufficiency, two different types of the disease are distinguished: Type I diabetes mellitus and Type II diabetes mellitus. This paper deals with the classification of Type II diabetes.

A variety of techniques have been applied to classification problems. Many previous research works show that neural network classifiers have better performance, a lower classification error rate and more robustness to noise than other methods [1]. One type of neural network commonly used for classification is the Multi Layer Perceptron (MLP), a feed-forward net with one or more layers of nodes that is trained with back propagation [12]. Increasing the number of hidden layers and neurons makes the network more flexible with respect to the mapping to be implemented [6]. However, increasing the number of hidden neurons also increases the risk of overfitting.

To overcome these problems, this paper presents two different approaches. The first approach constructs data sets in which the missing values are reconstructed. It can achieve higher correct classification rates than the standard back propagation method. Reconstruction of missing values may, however, shift input patterns, and the network may have to settle on a complicated solution to satisfy all reconstructed patterns. In contrast, the second approach attempts to reduce the actual learning time by using data pre-processing. Pre-processing can shorten training time by starting the training process with every feature on the same scale. It is especially useful for modeling applications where the inputs are generally on widely different scales. In this study, various missing value analysis and pre-processing methods are analyzed. This paper also investigates reconstructed values in combination with pre-processing and observes the results, which improve substantially when these concepts are applied.

This paper is organized as follows: Section II briefly describes Artificial Neural Networks, Section III describes diabetes mellitus, Section IV provides details of the back propagation method, Section V proposes the methodology and Section VI concludes the paper.

II. ARTIFICIAL NEURAL NETWORKS

A Neural Network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It is a very sophisticated modeling technique capable of modeling extremely complex functions.
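The introduction's remark that pre-processing starts training with every feature on the same scale can be illustrated with a minimal sketch. The two scaling functions and the toy data below are assumptions chosen for illustration; the paper does not state which scaling formula it uses at this point.

```python
import numpy as np

def zscore(X):
    """Scale each feature (column) to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns are only centred, not divided by zero
    return (X - mean) / std

def minmax(X):
    """Scale each feature (column) linearly into the range [0, 1]."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    return (X - lo) / span

# Hypothetical toy data: two features on widely different scales.
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 5000.0]])

Z = zscore(X)   # both columns now have mean 0 and standard deviation 1
M = minmax(X)   # both columns now run from 0 to 1
print(Z)
print(M)
```

After either transformation the second feature no longer dominates the first purely because of its units, which is the effect the introduction attributes to pre-processing.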
physiological measurements were taken but not recorded by the person, perhaps due to time constraints [2]. ANNs cannot interpret missing values, and when a database is highly skewed, ANNs have difficulty in identifying the factors leading to a rare outcome. This poses difficulty for the analysis and decision-making processes that depend on these data, and requires estimation methods that are accurate and efficient. Various techniques exist as a solution to this problem, ranging from deletion to methods that employ statistical and Artificial Intelligence techniques to impute missing values. The following techniques can deal with this issue. The first and easiest way to deal with missing values is simply to delete all the cases with missing values for the variable under consideration. This technique, however, may lead to the loss of potentially valuable information about patients whose values are missing. The second approach is to replace all missing values with the mean: all missing values of an attribute are replaced by the average of all available values of the same attribute in the training set. Replacing the missing values with the means might bias the database towards the sicker patients. The third technique is to replace all the missing values with zeros. If the missing values are important for clinical management, this assessment of missing values leads to poor classification. The fourth technique, the K-nearest neighbour (KNN) method, replaces missing values in the data with the corresponding value from the nearest-neighbour column. The nearest-neighbour column is the closest column in Euclidean distance. If the corresponding value from the nearest-neighbour column is also missing, the next nearest column is used.

C. Preprocessing of Input Data

Neural network training can be made more efficient by performing certain pre-processing steps on the network inputs and targets. Network input processing functions transform the inputs into a form that is better suited to the network. The normalization of the raw inputs has a great effect on preparing the data for training. It can be used to scale the data into the same range of values for each input feature in order to minimize the bias within the neural network from one feature to another. Data normalization can also shorten training time by starting the training process with every feature on the same scale. It is especially useful for modeling applications where the inputs are generally on widely different scales. Principal Component Analysis (PCA) is one of the most powerful pre-processing techniques. Principal component normalization is based on the premise that the salient information in a given set of features lies in those features that have the largest variance. The PCA algorithm normalizes the components so that they have zero mean and unit variance [7]. This is accomplished by eigenvector analysis of either the covariance matrix or the correlation matrix of a set of data.

D. Classification structure

The classification structure used in the proposed method is a four-layer feed-forward network, i.e. one input layer, two hidden layers and one output layer. The ANN has eight input nodes, eight hidden nodes and one output node. The network was trained by the Levenberg-Marquardt back propagation algorithm with a tan-sigmoid activation function. The network parameters learning rate, momentum constant, training error and number of epochs were set to 0.9, 0.9, 1e-008 and 100 respectively. Before training, the weights are initialized to random values. The reason for initializing the weights with small values is to prevent saturation. To evaluate the performance of the network, the entire sample was randomly divided into a training sample and a test sample. The model is tested using the standard 80/20 rule, where 80% of the samples are used for training and 20% are used for testing. In this classification method, the training process is considered successful when the Mean Square Error (MSE) reaches the value 1e-008. On the other hand, the training process fails to converge when it reaches the maximum training time before reaching the desired MSE. The training time of an algorithm is defined as the number of epochs required to meet the stopping criterion.

E. Experimental results

A computer simulation has been developed to study the impact of pre-processing and missing value techniques. The simulations have been carried out using MATLAB. Various networks were developed and tested with random initial weights. The network is trained ten times, and the performance goal is achieved at different epochs, which are shown in Figure 2.

[Figure 2 panels: (a) Omit the samples; (b) Replace with Zero]
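The four missing-value techniques described earlier in this section (deletion, mean, zero and nearest neighbour) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the MATLAB code used in the experiments: samples are stored as rows, missing entries are marked with NaN, the function names are hypothetical, and distances between samples are computed only over the features that both samples have observed.

```python
import numpy as np

def omit_samples(X):
    """Technique 1: delete every sample (row) that has a missing value."""
    return X[~np.isnan(X).any(axis=1)]

def replace_with_mean(X):
    """Technique 2: fill each gap with the mean of the available values
    of the same attribute (column)."""
    return np.where(np.isnan(X), np.nanmean(X, axis=0), X)

def replace_with_zero(X):
    """Technique 3: fill each gap with zero."""
    return np.where(np.isnan(X), 0.0, X)

def replace_with_nearest(X):
    """Technique 4: fill each gap with the value from the nearest sample
    in Euclidean distance; if that sample also lacks the value, fall
    back to the next nearest one."""
    missing = np.isnan(X)
    filled = X.copy()
    for i in np.where(missing.any(axis=1))[0]:
        # Assumption: compare samples only on features observed in both.
        shared = ~missing[i] & ~missing
        diff = np.where(shared, X - X[i], 0.0)
        dist = np.sqrt((diff ** 2).sum(axis=1))
        dist[~shared.any(axis=1)] = np.inf  # nothing in common to compare
        dist[i] = np.inf                    # a sample is not its own neighbour
        for j in np.where(missing[i])[0]:
            for k in np.argsort(dist):
                if np.isfinite(dist[k]) and not missing[k, j]:
                    filled[i, j] = X[k, j]
                    break  # entry stays NaN if no neighbour has the value
    return filled

# Hypothetical toy data: four samples, three attributes, two gaps.
X = np.array([[1.0, 2.0,    3.0],
              [1.1, np.nan, 3.1],
              [9.0, 8.0,    7.0],
              [np.nan, 8.2, 7.1]])

print(omit_samples(X))
print(replace_with_mean(X))
print(replace_with_zero(X))
print(replace_with_nearest(X))
```

In this sketch the mean is taken over all rows of the array passed in; to follow the text, that array would be the training set only. The nearest-neighbour variant uses rows where the text speaks of columns, which is purely a storage-layout choice.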
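The PCA normalization described in Section C (eigenvector analysis of the covariance or correlation matrix, giving components with zero mean and unit variance) can be sketched as below. The function name `pca_normalize`, its `use_correlation` flag and the random toy data are assumptions made for illustration, and the sketch assumes the data matrix has full rank so that no eigenvalue is zero.

```python
import numpy as np

def pca_normalize(X, use_correlation=False):
    """Project X (samples in rows) onto its principal components and
    rescale each component to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    if use_correlation:
        # Dividing by the standard deviation makes the covariance of the
        # result equal to the correlation matrix of X.
        centered = centered / X.std(axis=0, ddof=1)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                 # principal components
    return scores / np.sqrt(eigvals)            # unit variance per component

# Hypothetical toy data: three correlated features on different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([1.0, 10.0, 100.0])
X[:, 1] += 0.5 * X[:, 0]

Z = pca_normalize(X)
print(np.round(np.cov(Z, rowvar=False), 6))  # close to the identity matrix
```

The components come out uncorrelated as well as normalized, so the printed covariance matrix is approximately the identity. Dropping the trailing columns of `Z` would keep only the highest-variance components, which matches the premise that the salient information lies in the directions of largest variance.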
[Figure 2 panels: (c) Replace with Mean; (d) Replace with KNN]
Figure 2. Simulation Results

The impact of the missing values is assessed by averaging ten runs and is measured in terms of classification accuracy (Table I). The accuracy improved substantially when K-nearest neighbour and mean replacement were used with the PCA pre-processing method. The confusion matrix shows the correct classification rate against the incorrect classification rate (Table II).

TABLE I. CLASSIFICATION ACCURACY

TABLE II. CONFUSION MATRIX (A, B, C, D)

VI. CONCLUSION

This paper demonstrates the impact of pre-processing and missing value techniques. The experimental results show that maximum accuracy is achieved with minimum training time, and that with some combinations of missing value techniques and pre-processing the accuracy improved substantially. The novel classification method can be applied with different training methods that provide high accuracy.

REFERENCES

[1] Arit Thammano and Asavin Meengen, “A New Evolutionary Neural Network Classifier”, T.B. Ho, D. Cheung, and H. Liu (Eds.): PAKDD 2005, LNAI 3518, Springer-Verlag Berlin, pp. 249-255, 2005.
[2] Colleen M. Ennett, Monique Frize, C. Robin Walker, “Influence of Missing Values on Artificial Neural Network Performance”, Proceedings of Medinfo, 2001, pp. 449-453.
[3] Edgar Teufel, Marco Kletting, Werner G. Teich, Hans-Jorg Pfleiderer, and Cristina Tarin-Sauer, “Modelling the Glucose Metabolism with Backpropagation Through Time Trained Elman Nets”, IEEE 13th Workshop on Neural Networks for Signal Processing, NNSP'03, 17-19 Sept. 2003, pp. 789-798.
[4] Fulufhelo V. Nelwamondo, Shakir Mohammed and Tshilidzi Marwala, “Missing Data: A comparison of neural network and expectation maximization techniques”, Current Science, Vol. 93, No. 11, 2007.
[5] Humberto M. Fonseca, Victor H. Ortiz, Agustin L. Cabrera, “Stochastic Neural Networks Applied to Dynamic Glucose Model for Diabetic Patients”, 1st ICEEE, 2004, pp. 522-525.
[6] Igor Aizenberg, Claudio Moraga, “Multilayer Feedforward Neural Network Based on Multi-Valued Neurons (MLMVN) and a back propagation learning algorithm”, Soft Computing, 20 April 2006, pp. 169-183.
[7] Junita Mohammad-Saleh, Brian S. Hoyle, “Improved Neural Network performance using Principal Component Analysis on Matlab”, Int. Journal of the Computer, the Internet and Management, Vol. 16, No. 2, 2008, pp. 1-8.
[8] Md. Shahjahan, M.A.H. Akhand, and K. Murase, “A Pruning Algorithm for Training Neural Network Ensembles”, SICE 2003 Annual Conference, Volume 1, 2003, pp. 628-633.
[9] M. Nawi, R.S. Ransing and M.R. Ransing, “An Improved Conjugate Gradient Based Learning Algorithm for Back Propagation Neural Networks”, International Journal of Computational Intelligence, 2008.
[10] Rajeeb Dey, Vaibhav Bajpai, Gagan Gandhi and Barnali Dey, “Application of Artificial Neural Network (ANN) technique for Diagnosing Diabetes Mellitus”, IEEE Region 10 Colloquium and the 3rd ICIIS, Dec 2008, PID 155.
[11] P.K. Sharpe and R.J. Solly, “Dealing with Missing Values in Neural Network based Diagnostic Systems”, Neural Computing and Applications, Springer-Verlag London Ltd, 1995, pp. 73-77.
[12] Roelof K. Brouwer, PEng, PhD, “An Integer Recurrent Artificial Neural Network for Classifying Feature Vectors”, ESANN, Proceedings - Eur. Symp. on Artificial Neural Networks, April 1999, D-Facto public, pp. 307-312.
[13] S.G. Mougiakakou, K. Prountzou, K.S. Nikita, “A Real Time Simulation Model of Glucose-Insulin Metabolism for Type 1 Diabetes Patients”, 27th Annual Int. Conf. of the Engineering in Medicine and Biology Society, IEEE-EMBS, 17-18 Jan. 2006, pp. 298-301.
[14] Siti Farhanah Bt Jafan and Darmawaty Mohd Ali, “Diabetes Mellitus Forecast using Artificial Neural Networks (ANN)”, Asian Conference on Sensors and the International Conference on New Techniques in Pharmaceutical and Medical Research Proceedings (IEEE), Sep 2005, pp. 135-138.
[15] Syed Muhammad Aqil Burney, Tahseen Ahmed Jilani, Cemal Ardil, “Levenberg-Marquardt Algorithm for Karachi Stock Exchange Share Rates Forecasting”, Proc. of World Academy of Science, Eng. and Tech., Vol. 3, January 2005.