
Chapter 13

Breast Cancer Classification Model Using Principal Component Analysis and Deep Neural Network

M. Sindhuja , S. Poonkuzhali , and P. Vigneshwaran

Abstract Life-threatening cancer is prevalent across the globe. According to statistics, most people are diagnosed with cancer in the later stages, even though cancer can be prevented and cured in its early stages. The goal of this research is to diagnose whether a breast cancer is benign or malignant, as well as to forecast the likelihood of a cancer recurrence even after a course of therapy has been completed. Although many machine learning algorithms have produced strong predictions, their accuracy in early-stage classification is not up to the expected level. Deep learning (DL), a higher level of machine learning, can forecast breast cancer types and recurrences. Classifiers were built using a deep neural network (DNN) with features selected by Principal Component Analysis (PCA), and the proposed system's accuracy is compared with that of different machine learning techniques. Early-stage breast cancer prediction is more accurate with the DNN-based method. The clinical management system will benefit from the proposed system, since it will aid in identifying cancer at an early stage and providing appropriate therapy.

13.1 Introduction

Uncontrolled cell division and tissue destruction are hallmarks of the cancer condi-
tion. Any part of the body might be affected by cancer, which can begin anywhere

M. Sindhuja (B)
Department of Artificial Intelligence and Machine Learning, Rajalakshmi Engineering College,
Thandalam, Chennai, Tamil Nadu 602105, India
e-mail: [email protected]
S. Poonkuzhali
Department of Computer Science and Engineering, Rajalakshmi Engineering College,
Thandalam, Chennai, Tamil Nadu 602105, India
e-mail: [email protected]
P. Vigneshwaran
Department of Networking and Communications, SRM Institute of Science and Technology,
Kattankulathur, Chengalpattu, Tamil Nadu 603203, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 137
C. So-In et al. (eds.), Information Systems for Intelligent Systems, Smart Innovation,
Systems and Technologies 324, https://doi.org/10.1007/978-981-19-7447-2_13

and spread throughout the body. As per the WHO's International Agency for Research on Cancer (IARC), 8.2 million people died from cancer in 2012, and 27 million more are expected by 2030. The most common malignancies in women are those of the breast, cervix, colorectum, ovary, oral cavity, and lip; the oral cavity, lung, lip, stomach, colorectum, and throat are the most prevalent in men. Women are more likely than men to be diagnosed with breast cancer (BC), which is the second-most common malignancy in women (excluding skin cancer). Treatment options for breast cancer include both local and systemic approaches. Surgical procedures and radiation therapy are examples of local treatments; chemotherapy and hormone therapy are systemic ones. To get the best outcomes, these two kinds of treatment are often combined. The majority of recurrences occur in the first five years following breast cancer therapy. Recurrences of breast cancer can occur locally, regionally, or as distant metastases. The lymph nodes, bones, liver, lungs, and brain are some of the most common locations of recurrence outside the breast. Because of this, predicting the likelihood of breast cancer recurrence is critical. Breast cancer detection technology has progressed in recent years, with new methods of imaging the breast, new ways of distinguishing malignant from benign tumor cells, and novel detection methodologies. Age, diet, marital status, disease stage, and treatment all play a role in a woman's risk for breast cancer.
There are varying degrees of correlation and uncertainty when it comes to the stage
of the breast cancer. This section, therefore, focuses on predicting a patient’s suscep-
tibility to breast cancer. Deep learning was used to investigate this issue, employing
CNNs in the MATLAB environment to forecast the intensity range of breast cancer.
In recent research on nuclei segmentation, diverse algorithms were compared and tested on a dataset of 500 images, and the accuracy ranged from 96% to 100%. A recent study on the diagnosis of BC used analysis of cytological images of needle biopsies to distinguish between benign and malignant samples; it claims 98% accuracy on 737 images using four different classifiers trained with a 25-dimensional feature vector. Using nuclei segmentation together with a support vector machine and a neural network, another study provided an accurate diagnosis technique for breast cancer, with accuracy ranging from 76% to 94%. A traditional feature extraction strategy often results in a solution that is unique to the situation at hand, as it requires significant time and effort as well as extensive expertise in a related field. The CNN architecture has been utilized to handle high-resolution texture images.
Computers may “learn” from prior examples and find hard-to-diagnose patterns
from huge, noisy, or complex datasets using machine learning, a subset of artificial
intelligence. These qualities are ideal for medical applications that rely on compli-
cated proteomic and genomic studies. Support vector machines, Bayesian belief
networks, and artificial neural networks are commonly utilized in cancer diagnosis
and detection. Recently, machine learning has been used to predict cancer outcomes.
The survey revealed numerous high-performing algorithms for analyzing dataset properties. Deep learning belongs to a broader family of machine learning methods based on learned representations rather than task-specific algorithms, and we use it here to implement neural networks.

13.2 Related Works

This section includes related papers that are relevant to the proposed work. Some of this literature is discussed below.
Naive Bayes (NB) and k-nearest neighbor (KNN) classifiers have been proposed by Amrane et al. [1] for breast cancer classification. Breast cancer classification techniques are provided in [2–4]. The Breast Cancer Dataset (BCD) was donated to the University of California, Irvine (UCI) repository. There are eleven attributes, nine of which are examined in detail; the final attribute provides a binary value that is used to decide whether a tumor is benign or malignant (benign tumor = 2, malignant tumor = 4). A total of 699 cases are included in the dataset, but only 683 samples could be used because 16 observations contain missing values. Cross-validation was used to compare the suggested classifiers' accuracy: KNN achieved higher accuracy (97.51%) and a lower error rate than the NB classifier (96.19%). Various machine learning techniques, including support vector machine (SVM), decision tree (C4.5), Naive Bayes, and k-nearest neighbors (KNN), were also compared on the Wisconsin Breast Cancer (original) dataset. The primary goal of that work was to evaluate the accuracy, precision, sensitivity, and specificity of each algorithm in terms of efficiency and efficacy; SVM achieved the highest accuracy (97.13%) and the lowest error rate in experiments. Two different convolutional neural networks (CNNs) were used by Kumar et al. [5], one for detecting individual nuclei even in densely populated
areas, and the other for classifying them. The first CNN uses the input H&E image to predict the distance transform of the underlying (but unknown) multi-nuclear map. The second CNN classifies patches around nuclear centers into cases and controls. The chance of recurrence for a patient can then be calculated by voting over the patches generated from that patient's images. For a sample of 30 recurrent cases and 30 non-recurrent controls, the proposed method yielded an AUC of 0.81 after training on an independent set of 80 case-control pairs. If further validated, this technique could aid in the selection among treatment alternatives such as active surveillance, radical prostatectomy, or radiation and hormone therapy; it could also be used to predict treatment outcomes in other tumors.
Bejnordi et al. [6] proposed a convolutional neural network system for classifying hematoxylin-and-eosin (H&E) stained breast specimens for the diagnosis of breast cancer. An evaluation of 646 breast tissue biopsies using the suggested approach achieved an area under the ROC curve of 0.92, revealing the diagnostic value of hitherto overlooked tumor-associated stroma. Risk factors for cancerous and non-cancerous diseases were examined by Atashi et al. [7]. The risk factors were grouped into three priority levels, then fuzzified, and the subtractive clustering approach was applied to them in the same order. To train and test the new model, the dataset was randomly partitioned into 70% and 30% parts. Following training, the system was tested using data from the Wisconsin dataset and from a real clinic, with promising results.
The variables were given the necessary fuzzy functions, and the model was then
trained using the combined dataset. First, the model was tested on 30% of the dataset,
and then on real data from a real clinic (BCRC). The model's precision for the above phases was 81% and 84.5%, respectively, with a sensitivity of 85.1% and a specificity of 74.5%.
Breast cancer classification has been studied extensively; while mammograms can miss about 15% of breast cancer cases, alternative methods use the genome or phenotypes to categorize the illness [8, 9]. The softmax discriminant classifier (SDC), linear discriminant analysis (LDA), and fuzzy C-means clustering are some of the approaches used to classify breast cancer. One of the most commonly used machine learning methods is the k-nearest neighbors (KNN) algorithm [10, 11]; a similarity measure must be defined before a new element can be classified [12]. KNN can be used to assess the rate of false positives in cancer classification [13, 14].
Biological, chemical, and physiologic features are commonly predicted using naive Bayesian classifiers (NBC). Combining NBC with other classifiers, such as decision trees, can help build prognosis or classification models. The WBC database was used to examine the accuracy of various breast cancer diagnosis classification algorithms [15]. The optimized learning vector approach achieved 96.7%, the large LVQ method obtained 97.13%, and the SVM for cancer diagnosis achieved the greatest accuracy of 97.13% [16]. Kannan et al. [17] proposed a random forest model to predict whether a Type 2 diabetic patient is vulnerable to cancer.

13.3 Methodology

The system aims to differentiate between benign and malignant breast cancers at an early stage, and then to forecast recurrence of the disease from textual data. Principal component analysis (PCA) was used to select the subset of characteristics from the supplied collection of features for breast cancer categorization. As a data classifier, a deep neural network (DNN) uses metrics taken directly from tumor images to categorize breast cancer tumors. Using this information, doctors can determine the cancer's risk level and provide the appropriate treatment.
Because of its great accuracy, deep learning has become widely used in medicine
in the last few years. Because of this, a variety of techniques have been developed,
including CNN, DNN, RNN, Auto-encoder-based methods, and sparse coding tech-
niques. As compared to current methodologies, convolutional neural network-based
approaches exhibited substantial gains. To categorize breast cancer tumors based on
measures received directly from the tumors, deep neural network (DNN) is being
used as a data classifier.
As part of this strategy, a DNN model is employed in classification, and PCAs are
used in feature selection. The proposed method’s phases are outlined in the following
steps and illustrated in Fig. 13.1.

Fig. 13.1 Architecture of the proposed system

1. Remove all instances with missing values from the WBC dataset.
2. Use PCA to identify the features with the greatest impact on the dataset.
3. Calculate the eigenvalues and cumulative variances; retain the best features and eliminate the others through PCA.
4. Use a deep neural network to classify the dataset.
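The four steps above can be sketched in a few lines of scikit-learn. This is an illustrative sketch, not the chapter's implementation: scikit-learn ships the related WDBC (diagnostic) dataset, which stands in here for the 9-attribute WBC (original) dataset, and the hidden-layer widths are assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data: scikit-learn ships the related WDBC (diagnostic) dataset;
# the chapter itself uses the 9-attribute WBC (original) dataset from UCI.
X, y = load_breast_cancer(return_X_y=True)

# Steps 2-3: PCA keeps the leading components (the count comes from the
# selection rules described in Sect. 13.3.3; 6 is used here as an example).
X_pca = PCA(n_components=6).fit_transform(StandardScaler().fit_transform(X))

# Step 4: classify with a small feed-forward network on an 80/20 split.
X_tr, X_te, y_tr, y_te = train_test_split(X_pca, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(16,) * 6, activation="relu",
                    max_iter=1000, random_state=0).fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 3))
```

The same pipeline applies unchanged to the WBC (original) file once its missing values are removed.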

13.3.1 Dataset Description

To conduct the analysis, the Wisconsin Breast Cancer (WBC) Dataset from the UCI
Repository is used, which contains 699 cases with 9 feature attributes. Because it
has two categories, benign and malignant, this dataset can be classified as binary.
Essentially, benign tumors don’t spread to other sections of the body, but they are
still tumors. The term “malignant” refers to cancerous cells that have metastasized
to other organs or tissues in the body. Table 13.1 provides a list of each of the nine
characteristics.

Table 13.1 Attribute information for WBC dataset

S. No  Attribute                    Domain
1      Clump thickness              1–10
2      Uniformity of cell size      1–10
3      Uniformity of cell shape     1–10
4      Marginal adhesion            1–10
5      Single epithelial cell size  1–10
6      Bare nuclei                  1–10
7      Bland chromatin              1–10
8      Normal nucleoli              1–10
9      Mitoses                      1–10

13.3.2 Preprocessing

Two factors that improve a model's ability to learn efficiently are the quality and the usefulness of the data. To achieve data quality, preprocessing is necessary to standardize the data along with feature reduction.
Sixteen of the 699 cases have missing values. Prior to feature selection, all 16 such instances were deleted from the dataset. To increase performance and precision, this system then selects the best features from among the feature variables. The feature values are also scaled down to a smaller range for easier interpretation.
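In the UCI distribution of the dataset, the missing entries appear as the character '?' in the Bare nuclei column. A minimal pandas sketch of the removal step (the rows below are toy data, not the real file):

```python
import pandas as pd

# Toy rows mimicking the WBC file; in the UCI distribution, the "Bare nuclei"
# column marks its 16 missing entries with the character '?'.
df = pd.DataFrame({
    "clump_thickness": [5, 3, 8, 1],
    "bare_nuclei": ["1", "?", "10", "?"],
    "class": [2, 2, 4, 2],
})
df["bare_nuclei"] = pd.to_numeric(df["bare_nuclei"], errors="coerce")  # '?' -> NaN
clean = df.dropna()  # drop instances with missing values before feature selection
print(len(clean))
```

Applied to the full file, this is exactly the 699 → 683 reduction described above.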

13.3.3 Feature Selection

Feature selection is an essential part of any machine learning model. It removes uncertainty from the data, making it easier to use; training a model is easier and faster when there is less data, and over-fitting is avoided. Choosing the optimal subset of features from the complete feature set can enhance accuracy. Wrapper methods, filter methods, and embedded methods are common approaches to feature selection.
Extraction and selection of features were done using principal component analysis, which reduces the dimensionality of the original dataset. The loading of a variable determines its influence on a component. The first PC captures the largest share of the variance, and the most important information is stored in the first few PCs.
To lower the dataset's dimensionality, three simple rules of thumb are used:
• Scree test
• Cumulative variance
• Kaiser-Guttman rule.

Scree Test

A common practice is to retain only the PCs indicated by the scree plot, a long-established plot of the eigenvalues in decreasing order. The point to look for is the "elbow" where the curve flattens out; the number of PCs before the elbow is the number that should be retained.
Cumulative Variance

The number of PCs can also be determined from the eigenvalues, i.e., the percentage of variance explained by each PC [9]. PCs are retained until their cumulative proportion of explained variance exceeds a chosen threshold; the permissible range of thresholds is 80–100%.
Kaiser-Guttman Rule

Based on the work of Henry Kaiser and Louis Guttman on selecting fewer components than the number required for perfect reconstruction, this rule is widely regarded as accurate and is commonly used in common factor analysis and PCA. The number of factors m is the count of eigenvalues greater than 1; consequently, every component with an eigenvalue larger than 1 is retained.
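Two of the three rules can be applied directly to a vector of eigenvalues. The sketch below uses the eigenvalues later reported in Table 13.2, purely for illustration:

```python
import numpy as np

# Eigenvalues as reported in Table 13.2 (reproduced here for illustration).
eig = np.array([48.54, 5.17, 4.28, 3.10, 2.73, 2.44, 1.77, 1.59, 0.80])

# Cumulative-variance rule: keep PCs until >= 90% of variance is explained.
cum = np.cumsum(eig) / eig.sum()
k_cumvar = int(np.argmax(cum >= 0.90)) + 1

# Kaiser-Guttman rule: keep every PC with an eigenvalue > 1.
k_kaiser = int((eig > 1.0).sum())

# The scree test is inspected visually (the "elbow" of the eigenvalue plot),
# so no single formula applies; the chapter reads k = 6 off Fig. 13.2.
print(k_cumvar, k_kaiser)
```

On these values the cumulative-variance rule keeps 5 PCs and the Kaiser-Guttman rule keeps 8, matching the counts reported in Sect. 13.4.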

13.3.4 Classification

An 80/20 training-test split is used in the proposed method, created by randomly partitioning the dataset. After partitioning, the training set is used to train the classifier. The deep neural network classifier has eight input nodes, six hidden layers, and one output node. Because of its complexity, this network is computationally intensive, but it provides promising results after training.
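A minimal forward pass with this topology can be sketched in NumPy. The hidden-layer widths and the sigmoid output node are assumptions; the section fixes only the layer counts and (later) the bias value:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer widths: 8 input nodes, six hidden layers, one output node.
# The hidden widths (16) are an assumption; the chapter fixes only the counts.
sizes = [8, 16, 16, 16, 16, 16, 16, 1]
weights = [rng.normal(size=(m, n)) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.ones(n) for n in sizes[1:]]  # bias value 1, as stated in the text

def forward(x):
    # ReLU through the six hidden layers, sigmoid at the single output node.
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ W + b)
    return 1.0 / (1.0 + np.exp(-(x @ weights[-1] + biases[-1])))

out = forward(rng.normal(size=(5, 8)))  # five dummy 8-feature instances
print(out.shape)
```

Each output lies in (0, 1) and is thresholded at 0.5 to produce the benign/malignant label.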

13.3.5 Deep Neural Network (DNN)

In structure, deep neural networks do not differ from conventional artificial neural networks; they simply make complex model hierarchies easier to build. A DNN contains 'n' hidden layers that process data from the preceding layer, starting at the input layer, and the weight of each node is updated as the network back-propagates. Several inputs can be mapped to a single input node, and DNNs commonly include more hidden nodes than input nodes to speed up learning. Each output node of the output layer can be customized independently. Configurable parameters include the input-output layer bias, the learning rate and initial weights, the number of hidden layers and nodes per layer, and a stop condition that ends training after a set number of epochs. To avoid nullified network outputs, this model's bias value is 1. The default learning rate of 0.15 can also be altered to achieve alternative outcomes. The weight of a node is updated after each epoch based on the error rate measured during back-propagation. A large number of inputs and a large amount of data call for a large number of hidden layers. The network stops when the number of epochs has been reached or the learning model's desired outcome has been achieved. A model with more layers and nodes demands more time and resources to train.

13.3.6 Algorithmic Steps for PCA-DNN

• Step 1. The Wisconsin Breast Cancer dataset is preprocessed to remove the missing
values from the instances.
• Step 2. Extract the features using PCA from the dataset.
– Step 2.1: Calculate the eigen values and cumulative values
– Step 2.2: Select the Principal Components
– Step 2.3: Reduce the data dimensions.
• Step 3. Extract the ranked attributes and eliminate the other features.
• Step 4. Apply DNN to classify the dataset.
– Step 4.1: Define input layers with input nodes.
– Step 4.2: Prepare the data for training by establishing the hidden layers.
– Step 4.3: For each node, set the bias value to 1 and the learning rate to 0.15
to alter the network's weights.
– Step 4.4: Activation Function: Rectified linear unit (ReLU): f (x) = max(0, x)
– Step 4.5: Specify the number of epochs you wish an output node’s value to
travel back over when back propagation is desired.
– Step 4.6: The network should be trained using the specified set of training data
before being used.
– Step 4.7: To discover the model’s classification rate, feed the test data to the
trained network.
– Step 4.8: Train the network until all epochs have been finished.
– Step 4.9: Analyze the model's precision by determining its accuracy with
respect to evaluation measures.
• Step 5. Test data should be used to verify the model.
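The training loop of Steps 4.3–4.8 can be sketched in NumPy. This is a simplified illustration, not the chapter's implementation: it uses one hidden layer instead of six and synthetic data standing in for the PCA-reduced features, but keeps the stated learning rate (0.15), bias initialization (1), and ReLU activation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the PCA-reduced WBC features (Step 2 output);
# the labels follow a simple linear rule so the sketch converges quickly.
X = rng.normal(size=(200, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)

# One hidden ReLU layer for brevity (the chapter's network uses six, Step 4.2);
# bias initialised to 1 and learning rate 0.15 as in Step 4.3.
W1, b1 = rng.normal(size=(6, 12)) * 0.3, np.ones(12)
W2, b2 = rng.normal(size=(12, 1)) * 0.3, np.ones(1)
lr = 0.15

for epoch in range(300):                      # fixed epoch budget (Steps 4.5, 4.8)
    h = np.maximum(0.0, X @ W1 + b1)          # ReLU: f(x) = max(0, x) (Step 4.4)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output probability
    # Back-propagate the cross-entropy gradient and update the weights.
    d2 = (p - y) / len(X)
    dW2, db2 = h.T @ d2, d2.sum(axis=0)
    d1 = (d2 @ W2.T) * (h > 0)
    dW1, db1 = X.T @ d1, d1.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# Training accuracy after the final epoch (Step 4.9 evaluates on held-out data).
h = np.maximum(0.0, X @ W1 + b1)
p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
acc = float(((p > 0.5) == y.astype(bool)).mean())
print(round(acc, 2))
```

Step 5 would then feed a held-out test split through the trained `forward` pass rather than the training data used here.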

13.4 Results and Discussion

The WBC dataset contains 699 cases with 9 feature attributes. Binary classification models are applicable since the class label takes only two values: in the raw dataset, 2 denotes benign and 4 denotes malignant, and these are converted to 0 and 1, respectively. Sixteen of the 699 cases contain missing values; to cut down on errors in the system, these instances are removed, leaving 683 instances for feature selection. The WBC dataset attributes are listed in Table 13.1.
The WBC dataset is first subjected to PCA, and the three methods for determining how many PCs should be retained are then applied. Based on the scree test, the PCs to keep are those before the point where the eigenvalue curve stabilizes; according to the results shown in Fig. 13.2, it is reasonable to keep six PCs.
Next, the number of PCs is determined from the eigenvalues, i.e., the proportion of variance explained by each PC. The cumulative variance of the eigenvalues is shown in Fig. 13.3. PCs that meet the 90% requirement for retention are indicated by the solid and dotted lines in the graph, which suggest that the first five PCs should be kept.
Table 13.2 summarizes the dataset's eigenvalues and cumulative variances. About 90.61% of the information is contained in the top five PCs.
Finally, the third rule, the KG rule, recommends retaining PCs whose eigenvalues are greater than 1. According to Table 13.2, the eigenvalues of the first eight PCs are all greater than 1, which means eight PCs should be preserved.

Fig. 13.2 Level of stabilization at k = 6



Fig. 13.3 Cumulative variance of eigen values

Table 13.2 Eigen values of PCs

S. No  Eigenvalue  Cumulative variance (%)
1      48.54       68.9
2      5.17        76.2
3      4.28        82.3
4      3.10        86.7
5      2.73        90.6
6      2.44        94.0
7      1.77        96.6
8      1.59        98.8
9      0.80        100.0

Three inputs are used and evaluated as feature vectors for the DNN classifier: Dataset 1 (6 PCs, scree test), Dataset 2 (5 PCs, cumulative variance), and Dataset 3 (8 PCs, KG rule). Table 13.3 gives the classification results for both the training and testing stages.

Table 13.3 Classification results

                       Scree test  Cumulative variance  KG rule
No. of PCs             6           5                    8
Training accuracy (%)  98.4        97.3                 97.3
Testing accuracy (%)   95.1        96.6                 94.9

Fig. 13.4 Classification accuracy

Table 13.4 Performance comparison of proposed system with existing systems

Classification model  Accuracy (%)  Model evaluation
NB                    71.67         tenfold cross-validation
EM-PCA-CART           93.2          tenfold cross-validation
SVM                   95            70–30 split
SVM                   78            tenfold cross-validation
DNN (proposed)        98.4          80–20 split

Figure 13.4 shows the training and testing accuracy for Dataset 1, which comprises the 6 attributes selected through the scree test. The proposed system achieved a better accuracy of 98.4% for the 80–20% train-test split.
Table 13.4 gives the performance comparison of the proposed system with existing systems. The model behaves as expected and provides a promising result of 98.4%.

13.5 Conclusion and Future Work

Many people today are struggling with modern-day ailments. Breast cancer is one of the most common and fatal diseases, and its incidence is rising globally. Death rates rise due to lack of awareness and late diagnosis. Computer-aided diagnosis (CAD) offers an accurate diagnostic aid for everyone. The CAD system will not replace expert doctors, but it will help them make better decisions by evaluating patient information, since practitioners can make mistakes owing to inexperience or inadequate report analysis. The system can make accurate decisions only if the model used to train it is sound. This system outperforms earlier models and requires only minor tweaking. Because it uses a deep neural network, the algorithm takes a long time to train; it will run faster on GPU-based platforms than on commodity hardware. As a result, the user's data can be tested and processed using a more accurate computational model.
In the proposed system, PCA and a deep neural network were used to classify the data, and the PCA-DNN model achieved an accuracy of 98.4%. In this system, input is carried and processed across many layers, each containing numerous neurons. Through back-propagation, the node weights in each layer are fine-tuned, gradually lowering the system's error rate. Due to the network's complexity, training time inevitably increases. In the future, researchers can use particle swarm optimization or genetic algorithms to select features and improve the overall model's accuracy. Deep learning methods demand high-end computational resources for training and testing when implemented on local devices. Computing power can be increased by using cloud-based virtual machines or parallel processing, which reduces training time and makes the system computationally cheaper.

References

1. Amrane, M., Oukid, S., Gagaoua, I., Ensarİ, T.: Breast cancer classification using machine
learning. In: International Conference on Electric Electronics, Computer Science, Biomedical
Engineerings’ Meeting (EBBT), 18th & 19th April 2018, Istanbul, Turkey. https://doi.org/10.
1109/EBBT.2018.8391453
2. Asri, H., Mousannif, H., Al Moatassime, H., Noel, T.: Using machine learning algorithms
for breast cancer risk prediction and diagnosis. In: 6th International Symposium on Frontiers in
Ambient and Mobile Systems (FAMS 2016), Procedia Computer Science, vol. 83, pp. 1064–
1069 (2016)
3. Alarabeyyat, A., Alhanahnah, M.: Breast cancer detection using K-nearest neighbor machine
learning algorithm. In: 9th International conference on IEEE, v.i.e.E.(DeSE), pp. 35–39 (2016)
4. Akay, M.F.: Support vector machines combined with feature selection for breast cancer
diagnosis. Expert Syst. Appl. 36(2), 3240–3247 (2009)
5. Kumar, N., Verma, R., Arora, A., Kumar, A., Gupta, S., Sethi, A., Gann, P.H.: Convolutional
neural networks for prostate cancer recurrence prediction. In: Proceedings of SPIE, vol. 10140,
Medical Imaging 2017: Digital Pathology, 101400H (1st March 2017). https://doi.org/10.1117/
12.2255774
6. Bejnordi, B.E., Lin, J., Glass, B., Mullooly, M., Gierach, G.L., Sherman, M.E., Karssemeijer,
N., Van Der Laak, J., Beck, A.H.: Deep learning-based assessment of tumor-associated stroma
for diagnosing breast cancer in histopathology images. In: IEEE 14th International Symposium
on Biomedical Imaging, ISBI 2017, Apr 18–21 2017, Melbourne, Australia, pp. 929–
932
7. Atashi, A., Nazeri, N., Abbasi, E., Dorri, S., Alijani_Z, M.: Breast Cancer Risk Assessment
Using adaptive neuro-fuzzy inference system (ANFIS) and Subtractive Clustering Algorithm.
Multi. Cancer Invest. 1(2), pp. 20–26 (2017)

8. Bhatia, N.: Survey of nearest neighbor techniques. Int. J. Comput. Sci. Inf. Secur. 8(2) (2010)
9. Francillon, A., Rohatgi, P.: Smart card research and advanced applications. In: 12th Interna-
tional conference CARDIS 2013. Springer International Publishing, Berlin, Germany
10. Prabhakar, S.K., Rajaguru, H., Maglaveras, N., Chouvarda, I., de Carvalho, P.: Performance
analysis of breast cancer classification with softmax discriminant classifier and linear discrim-
inant analysis. In: Precision Medicine Powered by pHealth and Connected Health. IFMBE
Proceedings, vol. 66. Springer, Singapore (2018)
11. Sánchez, J.S., Mollineda, R.A., Sotoca, J.M.: An analysis of how training data complexity affects
the nearest neighbor classifiers. Pattern Anal. Appl. 10(3), 189–201 (2007)
12. Baldi, P., Brunak, S.R.B., Baldi, P.: Bioinformatics: The Machine Learning Approach (2001)
13. Raniszewski, M.: Sequential reduction algorithm for nearest neighbor rule. Comput. Vis. Graph.
(2010)
14. Bhuvaneswari, P., Therese, B.: Detection of cancer in lung with K-NN classification using
genetic algorithm. Proc. Mater. Sci. 10, 433–440 (2015)
15. Zhou, Z., Jiang, Y., Yang, Y., Chen, S.F.: Lung cancer cell identification based on artificial
neural network ensembles artificial intelligence. Med. Elsevier 24, 25–36 (2002)
16. Pradesh, A.: A.o.F.S.w.C.B.C.D. Indian J. Comput. Sci. Eng. 2(5), 756–763 (2011)
17. Kannan, A., Vigneshwaran, P., Sindhuja, R., Gopikanjali, D.: Classification of cancer for type
2 diabetes using machine learning. In: ICT Systems and Sustainability, Springer Advances in
Intelligent Systems and Computing, vol. 1, 1077, pp. 133–141 (2020)
