technologies-12-00151
Article
A Collaborative Federated Learning Framework for Lung and
Colon Cancer Classifications
Md. Munawar Hossain 1,* , Md. Robiul Islam 1, Md. Faysal Ahamed 1 , Mominul Ahsan 2 and Julfikar Haider 3,*
1 Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology,
Rajshahi 6204, Bangladesh; [email protected] (M.R.I.); [email protected] (M.F.A.)
2 Department of Computer Science, University of York, Deramore Lane, Heslington YO10 5GH, UK;
[email protected]
3 Department of Engineering, Manchester Metropolitan University, Chester Street, Manchester M1 5GD, UK
* Correspondence: [email protected] (M.M.H.); [email protected] (J.H.)
Abstract: Lung and colon cancers are common types of cancer with significant fatality rates. Early
identification considerably improves the odds of survival for those suffering from these diseases.
Histopathological image analysis is crucial for detecting cancer by identifying morphological anoma-
lies in tissue samples. Regulations such as the HIPAA and GDPR impose considerable restrictions
on the sharing of sensitive patient data, mostly because of privacy concerns. Federated learning
(FL) is a promising technique that allows the training of strong models while maintaining data
privacy. The use of a federated learning strategy has been suggested in this study to address privacy
concerns in cancer categorization. To classify histopathological images of lung and colon cancers, this
methodology uses local models with an Inception-V3 backbone. The global model is then updated
on the basis of the local weights. The images were obtained from the LC25000 dataset, which consists
of five separate classes. Separate analyses were performed for lung cancer, colon cancer, and their
combined classification. The implemented model successfully classified lung cancer images into three
separate classes with a classification accuracy of 99.867%. The classification of colon cancer images
was achieved with 100% accuracy. More significantly, for the lung and colon cancers combined, the
accuracy reached an impressive 99.720%. Compared with other current approaches, the proposed
framework showed an improved performance. A heatmap, visual saliency map, and GradCAM were
generated to pinpoint the crucial areas in the histopathology pictures of the test set where the models focused in particular during cancer class predictions. This approach demonstrates the potential of federated learning to enhance collaborative efforts in automated disease diagnosis through medical image analysis while ensuring patient data privacy.

Keywords: lung cancer; colon cancer; histopathological image analysis; image classification; decentralized machine learning; federated learning; privacy preservation; explainability

Citation: Hossain, M.M.; Islam, M.R.; Ahamed, M.F.; Ahsan, M.; Haider, J. A Collaborative Federated Learning Framework for Lung and Colon Cancer Classifications. Technologies 2024, 12, 151. https://doi.org/10.3390/technologies12090151
Academic Editor: Sikha Bagui
Figure 1. Total number and proportion of cancer (a) incidences and (b) mortalities in 2022 (data taken from [1]).
On the basis of the analysis, it can be concluded that colorectal cancer ranks as the second leading cause of mortality and the third highest in terms of incidence globally. In 2022, it ranked second in terms of new cases, accounting for 9.6% of all cases, just after breast cancer. Additionally, it constituted 9.3% of the total number of deaths, ranking second only to lung cancer [1]. The global impact of this phenomenon resulted in approximately 1.93 million new cases and 935,000 deaths, representing approximately 10% of all new cancer cases and fatalities worldwide. With these prevalence rates rising sharply annually, it is projected that by 2040, there will be 3.2 million new instances of colorectal cancer (CRC), posing a serious threat to global public health [3]. Hence, it is crucial to employ expeditious and efficient protocols for diagnostic decision making to formulate a tailored treatment strategy that maximizes patient survival rates in every individual case.
Multiple factors contribute to the development of cancer. These include exposure to physical carcinogens, such as radiation and ultraviolet rays, and certain behaviors, such as having a high body mass index and using alcohol and tobacco, in addition to specific biological and genetic factors [4]. While symptoms may vary among individuals and even different types of cancers, none of these signs are exclusive to cancer, and not all patients will encounter them. In light of this, cancer detection may be difficult in the absence of specialized diagnostic tools such as ultrasound, positron emission tomography (PET), computed tomography (CT), MRI, or biopsy. Prompt identification is crucial for the detection of both lung and colon cancers. In the field of clinical medicine, symptoms of these particular types of cancers typically manifest during the advanced stages of the disease. Physicians face difficulties in the early diagnosis of lung cancer through exclusive reliance on visual assessment of CT images. The application of computer-aided diagnosis (CAD) in the examination of histopathological images continues to be a prominent area of emphasis in the field of cancer detection.
Privacy is a significant concern for healthcare facilities. As a result, regulations such as HIPAA [5] and GDPR [6] pose significant challenges for institutions with respect to disclosing patient data, including anonymous information. To overcome these difficulties, the adoption of a decentralized method for machine learning called federated learning has been suggested. Currently, several deep learning techniques have shown excellent accuracy when predicting the lung cell class. Nevertheless, none of these entities have adopted a thorough strategy to address data privacy regulations.
Technologies 2024, 12, 151 3 of 28
representation of the dataset used in this study and the methodology proposed for the federated learning model. The experimental conditions under which the proposed method is compared are presented in Section 4. Section 5 provides a comprehensive analysis and evaluation of the experiment's performance, highlighting its findings. Section 6 concludes the proposed research and provides recommendations for further enhancement. Finally, the article concludes by providing a concise overview of the results.
Figure 2. Federated learning conceptual framework applicable in the healthcare sector.
2. Literature Survey

2.1. Lung and Colon Cancer Diagnoses
Lung and colon cancers are prevalent and highly fatal diseases that have a significant global impact, affecting many individuals annually. Despite their organ-specific origins, there are several shared features and important distinctions between the two in terms of prognosis, diagnostic criteria, and therapeutic options. Numerous research groups have achieved substantial advancements in the detection of lung and colon cancers in recent years. These developments include the application of deep learning methods on the basis of histopathological image analysis. The works are organized into three categories for classifying lung cancer (adenocarcinoma, squamous cell carcinoma, and benign), two categories for classifying colon cancer subtypes (adenocarcinoma and benign), and five categories for classifying both lung and colon cancer categories.
In a previous study [7], a deep learning-based supervised learning technique was developed to classify lung and colon cancer tissues into five distinct categories. The approach implemented utilized two methods for feature extraction: 2D Fourier features and 2D wavelet features. The final accuracy of the work was 96.33%. Another study [8] utilized feature extraction from histopathology images and various machine learning classifiers, such as random forest (RF) and XGBoost, to classify lung and colon cancers. The study achieved impressive accuracies of 99%. A CNN pretrained diagnostic network was specifically designed for the detection of lung and colon cancers [9]. The network demonstrated a high level of accuracy in diagnosing colon and lung cancers, achieving accuracies of 96% and 97%, respectively. Convolutional neural networks (CNNs) using the VGG16 model and contrast-limited adaptive histogram equalization (CLAHE) were used by other researchers [10] to classify 25,000 histopathology images. Transformers have advanced medical image analysis but struggle with feature capture, information loss, and segmentation accuracy; CASTformer addresses these issues with multi-scale representations, a class-aware transformer module, and adversarial training [11]. Furthermore, incremental transfer learning (ITL) offers an efficient solution for multi-site medical image segmentation by sequentially training a model across datasets, mitigating catastrophic forgetting, and achieving superior generalization to new domains with minimal computational resources and domain-specific expertise [12].

One study [13] discussed the use of histogram equalization as a preprocessing step, followed by the application of pretrained AlexNet, to improve the classification of lung and colon cancers. Toğaçar et al. [14] utilized a pretrained DarkNet-19 in conjunction with support vector machine classifiers to attain a 99.69% accuracy rate in their study. Using DenseNet-121 and RF classifiers, Kumar et al. [15] achieved a 98.6% accuracy rate in their classification. Another study utilized feature extraction and ensemble learning techniques, along with the incorporation of high-performance filtering, to attain an impressive accuracy rate of 99.3% when using LC25000 data [16]. The use of artificial neural networks (ANNs) with merged features from the VGG-19 and GoogLeNet models was covered in [17]. The ANN achieved an accuracy of 99.64% when the fusion features of VGG-19 and the handcrafted features were combined. In a separate study, the authors employed a convolutional neural network (CNN) with a SoftMax classifier, which they named AdenoCanNet [18]. The accuracy of the entire LC25000 dataset was 99.00%.
In addition to the previously discussed methods, recent studies have made significant advancements in learning-based methods for medical image segmentation. Contrastive learning and distillation techniques have shown promise in addressing the challenges of limited labeled data and segmentation accuracy in medical image analysis, with methods like contrastive voxel-wise representation learning (CVRL) [19] and SimCVD [20] advancing state-of-the-art voxel-wise representation learning by capturing 3D spatial context, leveraging bi-level contrastive optimization, and utilizing simple dropout-based augmentation to achieve competitive performance even with less labeled data. Additionally, ACTION (Anatomical-aware ConTrastive dIstillatiON) [21] tackles multi-class label imbalance by using soft negative labeling and anatomical contrast, improving segmentation accuracy and outperforming state-of-the-art semi-supervised methods on benchmark datasets. Finally, ARCO enhances semi-supervised medical image segmentation by introducing stratified group theory and variance-reduction techniques, addressing tail-class misclassification and model collapse, and demonstrating superior performance across eight benchmarks [22].
When the disadvantages of the current models used in lung and colon cancer classifications are analyzed, several challenges and limitations can be identified:
(1) Data Privacy Concerns: Many existing models require centralized data collection,
where medical images from different institutions are pooled together in a single
repository. This raises serious privacy concerns, especially in healthcare, where patient
data are highly sensitive. Centralized models can be susceptible to data breaches and
may not comply with regulations such as HIPAA or GDPR.
(2) Limited Generalization: Centralized models are often trained on data from a limited
number of sources or geographic locations, which can result in poor generalizability
to other patient populations. This lack of diversity in the training data can lead to
biases and reduced effectiveness when applied to new datasets, limiting the model’s
ability to handle variations in medical imaging from different institutions or regions.
(3) Computational Requirements: Modern models for cancer classification, such as
deep convolutional neural networks (CNNs), demand significant computational
resources. This can be a barrier for smaller institutions with limited access to high-
performance computing infrastructure. Moreover, training large-scale models can be
time-consuming and energy-intensive.
(4) Imbalance in Class Distribution: Medical datasets, including lung and colon cancer
imaging datasets, often suffer from class imbalance, where the number of images of
cancerous tissues is much lower than that of non-cancerous ones. This imbalance can
bias the model, making it more likely to misclassify cancer cases, which is especially
problematic in clinical settings where false negatives can be life-threatening. Work
reported by You et al. [23] introduced adaptive anatomical contrast with a dynamic
contrastive loss, which better handles class imbalances in long-tail distributions.
(5) Difficulty in Handling Heterogeneous Data: Medical imaging data can be highly
heterogeneous due to differences in imaging equipment, protocols, and settings across
institutions. Current models may struggle to handle this heterogeneity, leading to
reduced performance when applied to data from sources other than the training data.
2.2. Federated Learning Applications
The integration of massive amounts of data can benefit machine learning models,
as stated previously. Access to data in the medical field is highly limited because of
the strict considerations of user privacy and data security. In this context, decentralized
collaborative machine learning algorithms that protect privacy are appropriate for creating
intelligent medical diagnostic systems. The notion of federated learning, which was initially
introduced by Google in 2016 [24,25], has since been expanded to encompass scenarios
involving knowledge integration and collaborative learning between organizations.
A client server-based method called federated averaging (FedAvg) was used for breast
density classification in [26]. This method incorporates local stochastic gradient descent
(SGD) on each client with a server that performs model averaging. In their publication [27],
the authors proposed a federated learning approach utilizing pretrained deep learning
models for the purpose of COVID-19 detection. The clients aimed to collaborate to achieve
a global model without the need to share individual samples from the dataset. Another
federated learning framework [28] for lung cancer classification utilizing histopathological
images demonstrated 99.867% accuracy, while imposing significant limitations on data
sharing between institutions. Zhang et al. [29] introduced a dynamic fusion-based approach
for COVID-19 detection. An image fusion technique was employed to diagnose patients
with COVID-19 on the basis of their medical data. The evaluation parameters yielded
favorable outcomes. However, the lack of consideration for patient data privacy was a
significant oversight in the proposed medical image analysis.
In the healthcare industry 5.0 domain, researchers have proposed that the GoogLeNet deep machine learning model be utilized for precise disease prediction in the smart healthcare industry 5.0 [30]. The proposed methodology for secure IoMT-based transfer
learning achieved a 98.8% accuracy rate, surpassing previous state-of-the-art methodologies
used in cancer disease prediction within the smart healthcare industry 5.0 on the LC25000
dataset. In a parallel investigation of Society 5.0 [31], researchers presented data as a service
(DaaS) along with a suggested framework that uses the blockchain network to provide
safe and decentralized transmission and distribution of data and machine learning systems
on the cloud. The main contributions and shortcomings of previous federated learning
research can be found in Table 1.
The major gaps in the literature concerning lung and colon cancer classifications that
inspired the current study framework are briefly summarized below.
• There is a noticeable absence of sufficient measures to guarantee the privacy and
security of patient data.
• There are instances where the computational cost becomes considerably higher owing
to the substantial increase in the data scale, making it challenging to maintain efficiency
and performance.
Figure 3. Diagram illustrating the entire path of the federated learning study, including steps from data collection to model evaluation and explainable AI integration.
The LC25000 dataset consisted primarily of 1250 pathology slide images of lung and colon tissues. Borkowski et al. [32] used an Augmentor library to apply preprocessing techniques to the images and increased the size of the dataset to a total of 25,000 images. This was achieved through the implementation of various augmentations, including left and right rotations with a maximum angle of 25 degrees and a probability of 1.0. Additionally, horizontal and vertical flips were applied with a probability of 0.5. Consequently, the dataset was expanded to a total of 25,000 images, which were further categorized into five distinct categories. Each category contained 5000 images. The images were resized to dimensions of 768 × 768 prior to the application of augmentation techniques. To guarantee privacy and unrestricted usage, these images underwent validation and adhered to the regulations set forth by the Health Insurance Portability and Accountability Act (HIPAA). Table 2 displays the designated names and IDs assigned to each class of images within the dataset and an overview of the dataset split. To reduce computational complexity, we downsized the images in our training and test directories from our pre-existing dataset to 100 × 100 pixels. The utilization of the training and test directories is justified by the fact that the test directory's images are utilized to test the global model, whereas the training directory's images are disseminated to end devices/clients for local data training. The dataset containing 25,000 lung and colon cancer images is organized into training, testing, and validation sets at an 80:10:10 ratio.

The data are subsequently distributed among clients via an independent and identically distributed (IID) approach. Local models are first developed through training on the data, and then the clients send the model parameters to the server. After training the local models, the results are aggregated in a secure, centralized server to update a global model, which represents the combined knowledge of all institutions. This process involves training data on individual client devices and subsequently merging the local models on a central server. The workflow for federated learning via Inception-V3 is illustrated in Figure 4.
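The IID distribution of the training images across clients can be sketched in a few lines (a minimal illustration; the function name and shard logic are our own, not taken from the paper's code):

```python
import random

def iid_partition(num_images, num_clients, seed=0):
    """Shuffle image indices and deal them out evenly, giving each client
    an independent and identically distributed (IID) shard of the data."""
    idx = list(range(num_images))
    random.Random(seed).shuffle(idx)
    shard = num_images // num_clients
    return [idx[i * shard:(i + 1) * shard] for i in range(num_clients)]

# 80% of the 25,000 images form the training set, split across 5 clients
shards = iid_partition(20000, 5)  # each client receives 4000 image indices
```

With the 80:10:10 split described above, each of the five clients ends up holding 4000 of the 20,000 training images.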
(Figure 4 schematic: each client trains the model locally and sends its parameters back to the server over repeated communication rounds; after 50 communication rounds, the final evaluation is performed.)
Figure 4. Federated learning with Inception-V3 methodology.
3.1. Dataset, Preprocessing, and Splitting

The dataset LC25000, published in 2020 by A. Borkowski and colleagues [32], was utilized in this study. The collection contains images of lung and colon tissues, which are categorized into five distinct classes. There are three distinct types of lung tissue images: adenocarcinoma, squamous cell carcinoma, and benign. Some sample images of the classes can be seen in Figure 5. The production of this content was made possible through the provision of resources and utilization of facilities at James A. Haley Veterans' Hospital. It was collected from patients through keen observation by physiologists.

Table 2. Description of the employed dataset.

Image Type | Folder Title | Total Images | Training Set | Testing Set | Validation Set
Lung Adenocarcinoma | lung_aca | 5000 | 4000 | 500 | 500
Lung Benign | lung_bnt | 5000 | 4000 | 500 | 500
Lung Squamous Cell Carcinoma | lung_scc | 5000 | 4000 | 500 | 500
Colon Adenocarcinoma | colon_aca | 5000 | 4000 | 500 | 500
Colon Benign | colon_bnt | 5000 | 4000 | 500 | 500
Figure 5. Histopathological images from LC25000 dataset where (a) lung adenocarcinoma; (b) lung benign; (c) lung squamous cell carcinoma; (d) colon adenocarcinoma; and (e) colon benign.
3.2. Description of the Classes

3.2.1. Lung Adenocarcinoma

Lung adenocarcinoma represents the prevailing form of primary lung cancer observed inside the United States. This particular condition is classified within the category of non-small cell lung cancer (NSCLC) and is closely linked to a history of tobacco smoking. Although there has been a decrease in incidence and mortality rates, cancer continues to be the primary cause of death related to this disease in the United States. Adenocarcinoma of the lung typically arises from the mucosal glands and accounts for approximately 40% of the total cases of lung cancer [33].

3.2.2. Lung Benign

The lung and bronchus encompass a diverse collection of benign tumors, which typically manifest as single, peripheral lung nodules or, less frequently, as endobronchial lesions that result in obstructive symptoms. These tumors commonly occur without any noticeable symptoms. Surgical removal of all endobronchial lesions is recommended to ease symptoms and prevent potential damage to distal lung tissue.

3.2.3. Lung Squamous Cell Carcinoma

Squamous cell carcinoma (SCC) of the lung, alternatively referred to as squamous cell lung cancer, represents a subtype of non-small cell lung cancer (NSCLC). Squamous cell lung cancers frequently manifest in the central region of the lung or the primary airway, specifically the left or right bronchus. Tobacco smoke is well recognized as the primary causal agent responsible for cellular change. The prevalence of smoking-related lung cancer is estimated to be approximately 80% in males and 90% in females [34].
ω_{t+1} ← Σ_{k=1}^{K} (n_k / n) · ω_{t+1}^{k}    (1)

where
k = participants;
n_k = samples of participant k;
n = samples of all participants;
ω_{t+1}^{k} = local parameter of participant k.

This approach enables participants to collectively train a global model without the need to share their individual private data, as illustrated in Figure 6. This diagram illustrates a federated learning architecture where multiple decentralized data sources (represented by servers) locally train machine learning models on their own data. The locally trained models are then encrypted and sent to a central server, which aggregates them into a global model without accessing the raw data. This decentralized approach preserves data privacy while improving the global model through collaborative learning. The regional data need to undergo preprocessing for each contributor. This involves making modifications, digitizing, and standardizing the data to transform it into a standardized format while ensuring privacy. The distribution of images from our dataset among various clients is achieved by dividing the total size of the image set by the number of clients. The dataset is divided uniformly, resulting in the generation of independent and identically distributed (IID) data.
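The sample-size-weighted averaging of Equation (1) can be sketched as follows (a minimal NumPy illustration with made-up parameter vectors; the `fedavg` name is our own):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client parameters, Equation (1):
    w_{t+1} = sum_k (n_k / n) * w^k_{t+1}."""
    n = sum(client_sizes)
    return sum((nk / n) * w for w, nk in zip(client_weights, client_sizes))

# Three hypothetical clients; the third holds twice as many samples,
# so its parameters receive twice the weight in the global update.
w = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [100, 100, 200]
global_w = fedavg(w, sizes)  # -> array([3.5, 4.5])
```

In practice the same averaging is applied per layer to every weight tensor of the local Inception-V3 models before the global model is redistributed.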
Figure 6. Workflow of federated learning within a federated server to a local client.
After receiving the model parameters from the clients, the server will summarize the information on the basis of the structure of the central server. It updates the parameters of the existing model and stores it for the subsequent round of training parameter upload and collection from the participants before it is redistributed. The initial FL iterative procedure is subsequently followed by the iterative process of our comprehensive model. In this process, a CNN was integrated by tailoring for cancer tissue data samples and modifying the model to enable continuous iterations in the context of dispersed machine learning.

A federated round is the name given to each iteration of this process, and it consists of concurrent training, update aggregation, and parameter distribution. The following are the primary control parameters that are utilized in the process of computing FL:
C = customers or contributors who took part in an update cycle;
E = number of local epochs that each contributor has been responsible for;
B = smallest batch size that can be utilized for a local update.
β1 and β2 are considered hyperparameters.
enable continuous iterations. In the context of dispersed machine
During the regional model optimization step, the clients execute a specific number of
local epochs. The Adam optimizer is utilized for the first- and second-order moments to
overcome local minima [35]:

ω_{i,t} ← ω_{i,t} − η × (√(1 − β2^n) / (1 − β1^n)) × m_{i,n} / (√v_{i,n} + σ)    (2)
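A single update in the bias-corrected form of Equation (2) can be sketched as follows (a NumPy illustration; the function name and the β1, β2, σ defaults, taken from common Adam settings together with the paper's learning rate of 0.00001, are assumptions):

```python
import numpy as np

def adam_step(w, g, m, v, n, eta=1e-5, beta1=0.9, beta2=0.999, sigma=1e-8):
    """One Adam update: w <- w - eta * sqrt(1 - beta2^n) / (1 - beta1^n)
    * m_n / (sqrt(v_n) + sigma), with moment estimates m_n, v_n."""
    m = beta1 * m + (1 - beta1) * g        # first-order moment estimate
    v = beta2 * v + (1 - beta2) * g ** 2   # second-order moment estimate
    scale = np.sqrt(1 - beta2 ** n) / (1 - beta1 ** n)  # bias correction
    w = w - eta * scale * m / (np.sqrt(v) + sigma)
    return w, m, v

w, m, v = np.zeros(2), np.zeros(2), np.zeros(2)
w, m, v = adam_step(w, np.array([0.5, -0.5]), m, v, n=1)
```

On the very first step the bias-corrected update magnitude is approximately η per coordinate, regardless of the gradient scale, which is what helps the clients escape local minima during the local epochs.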
convolutional layers, pooling layers, and inception modules that combine various filter
sizes (e.g., 1 × 1, 3 × 3, and 5 × 5) to capture different types of image features. The
model extensively utilizes batch normalization, which is applied to activation inputs.
The computation of loss involves the utilization of Softmax. The Inception-V3 model
also includes the integration of global average pooling, a technique that replaces the
conventional fully connected layers located at the final stage of the network. By reducing
overfitting and parameterizing the model size, the efficiency of the system is enhanced and
its adaptability to new tasks is improved.
into the implementation. These layers included the dense layer, flatten layer, and dropout
layer. The ImageNet dataset was utilized as the default weight in the model, similar to the
implementation observed in the CNN. The Adam optimizer with a learning rate of 0.00001
was used to optimize the accuracy of the Inception-V3 model. Additionally, the categorical
cross-entropy method was implemented to calculate the loss function.
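A minimal sketch of the per-client model described above (Inception-V3 backbone with ImageNet weights, global average pooling, a dense/dropout head, the Adam optimizer with a learning rate of 0.00001, and categorical cross-entropy) could look as follows. The dense-layer width and dropout rate are assumptions, as they are not stated in the text.

```python
import tensorflow as tf

def build_local_model(num_classes=3, weights="imagenet"):
    """Sketch of the per-client classifier; head sizes are assumed."""
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=(100, 100, 3))
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)   # assumed width
    x = tf.keras.layers.Dropout(0.5)(x)                    # assumed rate
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

Each client would instantiate this model locally, train for its allotted epochs, and ship only the resulting weights to the server.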
System specification:
Processor: Intel Xeon CPU, ~2.30 GHz
RAM: 85 GB
GPU: NVIDIA A100 (40 GB GPU RAM)
Hard Disk: 80 GB

Hyperparameter settings:
Optimizer: Adam
Loss: Categorical cross-entropy
Batch size: 16
Image size: 100 × 100
No. of epochs: 50
No. of clients: 5
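For reproducibility, the hyperparameter settings above can be collected into a single configuration object; the dictionary keys are illustrative, not taken from the authors' code.

```python
# Training configuration mirroring the hyperparameter settings above.
config = {
    "optimizer": "adam",
    "loss": "categorical_crossentropy",
    "batch_size": 16,
    "image_size": (100, 100),
    "epochs": 50,            # communication rounds in the FL setting
    "num_clients": 5,
    "learning_rate": 1e-5,   # from the Inception-V3 setup described earlier
}
```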
Accuracy measures the proportion of accurate predictions relative to the ground truth. The F1 score takes the harmonic mean of precision and recall to create a single metric. The respective formulas are presented here.
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (4)

\text{Precision} = \frac{TP}{TP + FP} \quad (5)

\text{Recall} = \frac{TP}{TP + FN} \quad (6)

\text{F1\_score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}} \quad (7)

\text{Specificity} = \frac{TN}{TN + FP} \quad (8)
where TP = True Positive, TN = True Negative, FP = False Positive, and FN = False Negative.
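Equations (4)–(8) translate directly into code; the confusion-matrix counts in the example call are hypothetical, chosen only to exercise the formulas.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Equations (4)-(8) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "specificity": specificity}

# Hypothetical counts: 95 TP, 90 TN, 5 FP, 10 FN.
m = classification_metrics(95, 90, 5, 10)
```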
4. Experimental Results
The suggested technique is evaluated via histopathology images obtained from the
LC25000 dataset [32]. This section presents a concise overview of the outcomes obtained
by the proposed federated learning model when provided with Inception-V3 for the
categorization of histological images from the lung and colon cancer datasets. The results
were categorized into three sections: lung, colon, and combined lung–colon outcomes for
the model.
Every experiment involved the training of five distinct transfer learning models,
namely, Inception-V3, VGG16, ResNet-50, ResNeXt50, and Xception. Initially, transfer
learning models were utilized to train the initial lung cancer images to evaluate the
models’ performance on the dataset. A comparative analysis of the outcomes of the
models was conducted to determine which model was more suitable for the federated
learning approach.
Table 5. Performance analysis of the lung cancer images compared to the base models.
Figure 8. Performance comparisons of federated learning with Inception-V3 with other centralized transfer learning models, i.e., Inception-V3, VGG-16, ResNet-50, ResNeXt-50, and Xception, on lung cancer images.
Figure 9 displays the confusion matrix generated by the proposed technique for lung cancer. The data in Table 6 clearly indicate that the proposed model achieved 100% accuracy in detecting squamous cell carcinoma (lung_scc) and benign lung (lung_bnt) images. Additionally, the model demonstrated 99.60% accuracy in correctly identifying lung adenocarcinoma (lung_aca) images.
Figure 9. Confusion matrix for federated learning with Inception-V3 for lung cancer classification.
Table 6. Class-wise performance analysis of the lung cancer images using federated learning with Inception-V3.
Figure 10. Performance comparisons of federated learning with Inception-V3 with other centralized transfer learning models, i.e., Inception-V3, VGG-16, ResNet-50, ResNeXt-50, and Xception, on colon cancer images.
Figure 11 displays the confusion matrix of the proposed method for colon cancer.
On the basis of the data presented in Table 8, the proposed model clearly exhibited a
remarkable ability to accurately detect colon adenocarcinoma (colon_aca) and colon benign
(colon_bnt) images, achieving a 100% accuracy rate.
Table 7. Performance analysis of colon cancer images compared to the base models.
Figure 11. Confusion matrix for federated learning with Inception-V3 for colon cancer classification.
Table 8. Class-wise performance analysis of the colon cancer images using federated learning with Inception-V3.
Type of Class   Precision   Recall   F1 Score   Specificity   Accuracy
colon_aca       100%        100%     100%       100%          100%
colon_bnt       100%        100%     100%       100%          100%
Macro Average   100%        100%     100%       100%          100%
Micro Average   100%        100%     100%       100%          100%
4.3. Lung and Colon Cancers

Table 9 and Figure 12 provide a comprehensive overview of the performance of the federated learning model in comparison to other base models on lung and colon cancer images. The Inception-V3, VGG16, ResNet50, ResNeXt50, and Xception models achieved average classification accuracies of 98.96%, 98.36%, 98.88%, 98.88%, and 99.10%, respectively. The Xception model demonstrated the highest accuracy, precision, and recall, all at 99.10%. Conversely, the VGG16 model exhibited the lowest performance. Inception-V3 demonstrated a strong performance, comparable to the results achieved by Xception. After implementing the federated learning model with Inception-V3, it became clear that it significantly outperformed all the other methods. The achieved accuracy, precision, and recall were all 99.72%.
Table 9. Performance analysis of lung and colon cancer images compared to the base models.
Figure 12. Performance comparisons of federated learning with Inception-V3 with other centralized transfer learning models, i.e., Inception-V3, VGG-16, ResNet-50, ResNeXt-50, and Xception, on combined lung and colon cancer images.
Figure 13 displays the confusion matrix of the proposed method for lung and colon cancers. On the basis of the data presented in Table 10, the proposed model exhibited a high level of accuracy in detecting benign lung (lung_bnt) images, with a 100% accuracy rate. Similarly, the model demonstrated 99.72% accuracy in correctly identifying lung adenocarcinoma (lung_aca) and squamous cell carcinoma (lung_scc) images. For the colon cancer images, 100% accuracy was achieved for both classes.
Table 10. Class-wise performance analysis of the lung and colon cancer images using federated learning with Inception-V3.

Figure 13. Confusion matrix for federated learning with Inception-V3 for combined lung and colon cancer classifications.
4.4. Client-Wise Results
Figure 14 illustrates the iterative process of updating and optimizing the global model following each communication event and how the clients' individual accuracies vary from the global accuracy. After a total of 50 communication rounds, the accuracy reached 99.867% for lung cancer classification, 100% for colon cancer classification, and 99.72% for combined lung and colon cancer classification. In the local context, each communication round encompassed comprehensive training sessions for all clients. To facilitate the simulation, a single epoch was conducted per local communication round. Following the completion of the simulation, accuracy, loss, and categorical accuracy metrics for all clients were successfully acquired. Despite not being immediately evident, the findings indicate a positive upward trend (thicker red lines). This implies that there was a possibility of an increase in client accuracy, although it cannot be guaranteed.

It was evident that, in accordance with the initial predictions, the clients demonstrated increased accuracy with each successive communication round. In the client-wise accuracy measure, there was a steady increase in accuracy, indicating that performance improved with each communication round.
Figure 14. Global accuracy vs. local accuracy for the combination of the lung, colon and lung–colon methods.
Figure 15. Attention visualization of images employing Grad-CAM, heatmaps, and saliency maps.
Grad-CAM determines the gradients of the target class in relation to the last convolutional layer of the model. Grad-CAM uses these gradients to identify the influential regions of an image that contribute to the classification decision. The heatmap illustrates the significance of each pixel in contributing to the model's classification decision. The color red represents regions that have greater significance for the classification, whereas cooler colors such as blue indicate regions that are less important. Saliency maps are generated by calculating the gradient of the predicted class score of the model with respect to the pixels of the input image. The purpose of this technique is to identify the regions in the input image that have the greatest impact on the model's output. The images were chosen randomly from the dataset and may behave differently from one image to the other.

By examining the Grad-CAM output and heatmaps, it was observed that the classifier distinguishes lung adenocarcinoma by focusing on near-white cell regions, which are indicative of mucosa or connective tissue. For lung squamous cell carcinoma, the classification is based on the fish-scale appearance of the cells under the microscope. In healthy lung tissue (lung_bnt), the classifier identifies red blood cells as a key distinguishing feature. For colon adenocarcinoma, the classifier relies on areas with irregular glandular structures, atypical cell shapes, and disrupted tissue architecture, highlighting the presence of desmoplastic stroma and inflammatory cell infiltrates. In contrast, benign colon tissue is characterized by regular, well-organized glandular structures and uniform cell shapes.
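The saliency-map idea described above, the gradient of the predicted class score with respect to the input pixels, can be sketched framework-agnostically with central finite differences. The tiny linear scorer below is a stand-in for a real network, purely for illustration.

```python
import numpy as np

def saliency_map(predict, image, class_idx, eps=1e-4):
    """Approximate |d score_c / d pixel| for every pixel by central differences."""
    sal = np.zeros_like(image, dtype=float)
    it = np.nditer(image, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        hi, lo = image.copy(), image.copy()
        hi[idx] += eps                       # perturb one pixel up ...
        lo[idx] -= eps                       # ... and down
        sal[idx] = abs(predict(hi)[class_idx] - predict(lo)[class_idx]) / (2 * eps)
    return sal

# Illustrative linear scorer: class scores = W @ flattened pixels.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))                 # 3 classes, 4x4 "image"
predict = lambda img: W @ img.ravel()
img = rng.normal(size=(4, 4))
sal = saliency_map(predict, img, class_idx=1)
```

For a linear scorer the finite-difference saliency recovers the magnitude of the class weights exactly; for a deep network, the same loop (or an autodiff gradient) highlights the pixels the prediction is most sensitive to.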
5. Discussion
5.1. Comparative Analysis
To demonstrate the effectiveness of the proposed method, a comparative analysis was
conducted with previous studies that utilized the LC25000 dataset. There are very few
studies that are exclusively concerned with lung and colon malignancies, and the majority
of them employed CNN-based centralized deep learning techniques. The methodologies
discussed here do not ensure data privacy, as they necessitate access to all training images
and their corresponding classification labels. The proposed methodology not only outperforms the most advanced methods for detecting lung and colon cancers separately and shows comparable performance for detecting lung and colon cancers in combination but also maintains the privacy element of patient data (Table 11).
Table 11. Performance comparison with previous studies on the LC25000 dataset.

Previous Studies | Year | Approaches | Colon | Lung | Lung and Colon
Mangal et al. [9] | 2020 | Deep learning approach using CNN | Accuracy: 96.00% | Accuracy: 97.89% | -
Tasnim et al. [43] | 2021 | CNN with max pooling | Accuracy: 99.67% | - | -
Talukder et al. [16] | 2021 | Deep feature extraction and ensemble learning | Accuracy: 100% | Accuracy: 99.05% | Accuracy: 99.30%
Shandilya et al. [44] | 2021 | Pretrained CNN | - | Accuracy: 98.67% | -
Hadiyoso et al. [10] | 2022 | VGG-19 architecture and CLAHE framework | Accuracy: 98.96% | - | -
Karim et al. [45] | 2022 | Extreme learning machine (ELM)-based DL | - | Accuracy: 98.07% | -
Raju et al. [46] | 2022 | Extreme learning machine (ELM)-based DL | Accuracy: 98.97%; Precision: 98.87%; F1 Score: 98.84% | - | -
Chehade et al. [8] | 2022 | XGBoost | Accuracy: 99.00%; Precision: 98.6%; F1 Score: 98.8% | Accuracy: 99.53%; Precision: 99.33%; F1 Score: 99.33% | Accuracy: 99%
Ren et al. [47] | 2022 | Deep convolutional GAN (LCGAN) | - | Accuracy: 99.84%; Precision: 99.84%; F1 Score: 99.84% | -
Mehmood et al. [13] | 2022 | Transfer learning with class selective image processing | - | - | Accuracy: 98.4%
Khan et al. [30] | 2023 | Transfer learning with a secure IoMT-based approach | - | - | Accuracy: 98.80%
Toğaçar et al. [14] | 2022 | DarkNet-19 model and SVM classifier | - | - | Accuracy: 99.69%
Attallah et al. [48] | 2022 | CNN features with transformation methods | - | - | Accuracy: 99.6%
Masud et al. [7] | 2022 | Deep learning (DL) and digital image processing (DIP) techniques | - | - | Accuracy: 96.33%; Precision: 96.39%; F1 Score: 96.38%
Al-Jabbar et al. [17] | 2023 | Fusion of GoogleNet and VGG-19 | - | - | Accuracy: 99.64%; Precision: 100%
Ananthakrishnan et al. [18] | 2023 | CNN with an SVM classifier | Accuracy: 99.8% | Accuracy: 98.77% | Accuracy: 100%
Proposed Model | 2024 | Federated learning with Inception-V3 | Accuracy: 100%; Precision: 100%; F1 Score: 100% | Accuracy: 99.87%; Precision: 99.87%; F1 Score: 99.87% | Accuracy: 99.72%; Precision: 99.72%; F1 Score: 99.72%
Note: Bold numerical values indicate best results.
In this context, we introduce the proposed federated learning methodology, which can effectively guarantee adherence to regulations while handling patient data and categorizing cancer cells with high accuracy.
The federated averaging technique aggregates model updates from multiple clients
to compute a global model. This aggregation process combines knowledge from diverse
sources while mitigating biases and noise present in individual client datasets. As a result,
the global model benefits from the collective intelligence of all participating clients, leading
to enhanced performance in cancer classification tasks. The framework used here hosts five
clients to demonstrate the federated learning process, but it is possible to scale efficiently
to a large number of devices. Each device performs local computations, and only model
updates are aggregated centrally, reducing the burden on central servers.
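The federated averaging step described above can be sketched as a size-weighted mean over per-client parameter lists; this is a minimal FedAvg illustration, not the authors' implementation.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client parameter lists (FedAvg aggregation)."""
    total = sum(client_sizes)
    averaged = []
    for layers in zip(*client_weights):          # iterate layer-by-layer
        # each client's layer contributes proportionally to its dataset size
        averaged.append(sum(w * (n / total) for w, n in zip(layers, client_sizes)))
    return averaged

# Two clients, one "layer" each, holding 100 and 300 samples respectively.
w_a = [np.array([0.0, 2.0])]
w_b = [np.array([4.0, 2.0])]
global_w = fedavg([w_a, w_b], [100, 300])
```

The server then broadcasts `global_w` back to all clients for the next communication round, so no raw images ever leave a client.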
The utilization of federated learning techniques for lung and colon cancer classifica-
tions represents a significant advancement with important clinical implications. Through
the utilization of decentralized data sources from various healthcare institutions, federated
learning models present unique possibilities to improve diagnostic accuracy and customize
treatment strategies. By engaging in collaborative data sharing while prioritizing patient
privacy, these models have the ability to identify complex patterns within different types
of cancers. This allows for personalized interventions and well-informed clinical decision
making. Real-time decision support systems, powered by federated learning algorithms,
enable healthcare providers to gain timely insights, facilitating proactive management and
enhancing patient outcomes. In addition, the iterative process of federated learning allows
for the ongoing improvement of models, ensuring their ability to adapt to changing clinical
practices and contributing to advancements in precision oncology and population health
management. Federated learning is a revolutionary method in cancer care that promotes
cooperation, creativity, and advancement in the pursuit of better patient care and results.
6. Conclusions
The proposed federated learning approach with the Inception-V3 model to classify
lung and colon histopathological images yielded a significant outcome in accurately distin-
guishing between three subtypes of lung cancer and two subtypes of colon cancer from
histopathological images. The model demonstrated a remarkable accuracy of 99.720%,
as well as a recall, a precision, and an F1 score of 99.720%. The detection of colon cancer
achieved 100% accuracy in classifying both classes. The accuracy of lung cancer classifica-
tion for the three classes was 99.867%. This proves that the proposed federated learning
methodology can effectively guarantee adherence to regulations while dealing with patient data and categorizing cancer cells with high accuracy.
Author Contributions: All authors had an equal contribution in preparing and finalizing the
manuscript. Conceptualization: M.M.H., M.R.I., M.F.A., M.A. and J.H.; methodology, M.M.H.,
M.R.I., M.F.A., M.A. and J.H.; validation: M.M.H., M.R.I., M.F.A., M.A. and J.H.; formal analysis:
M.M.H., M.R.I., M.F.A., M.A. and J.H.; investigation: M.M.H., M.R.I., M.F.A., M.A. and J.H.; data
curation: M.M.H., M.R.I. and M.F.A.; writing—original draft preparation: M.M.H., M.R.I. and M.F.A.;
writing—review and editing: M.M.H., M.R.I., M.F.A., M.A. and J.H.; supervision: M.R.I., M.A. and
J.H. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data is contained within the article.
Conflicts of Interest: The authors declare no conflicts of interest.
Abbreviations
The following table alphabetically lists all the acronyms along with their definitions.
Acronym Stands for
ANN Artificial Neural Network
CAD Computer-aided Diagnosis
CNN Convolutional Neural Network
CT Computed Tomography
colon_aca Colon Adenocarcinoma
colon_bnt Colon Benign
DaaS Data as a Service
DL Deep Learning
DT Decision Tree
ELM Extreme Learning Machine
FL Federated Learning
FedAvg Federated Averaging
GDPR General Data Protection Regulation
HIPAA Health Insurance Portability and Accountability Act
IID Independent and Identically Distributed
IoMT The Internet of Medical Things
LCC Large Cell Carcinoma
lung_aca Lung Adenocarcinoma
lung_bnt Lung Benign
lung_scc Lung Squamous Cell Carcinoma
MRI Magnetic Resonance Imaging
NSCLC Non-small Cell Lung Cancer
RF Random Forest
SCC Squamous Cell Carcinoma
SGD Stochastic Gradient Descent
TL Transfer Learning
WHO World Health Organization
XAI Explainable Artificial Intelligence
XGBoost Extreme Gradient Boosting
References
1. Bray, F.; Laversanne, M.; Sung, H.; Ferlay, J.; Siegel, R.L.; Soerjomataram, I.; Jemal, A. Global cancer statistics 2022: GLOBOCAN
estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA A Cancer J. Clin. 2024, 74, 229–263. [CrossRef]
[PubMed]
2. Cancer Today. Available online: https://gco.iarc.fr/today/online-analysis-pie?v=2020&mode=cancer&mode_population=
continents&population=900&populations=900&key=total&sex=0&cancer=39&type=0&statistic=5&prevalence=0&population_
group=0&ages_group[]=0&ages_group[]=17&nb_items=7&group_cancer=1&include_nmsc=1&include_nmsc_other=1&half_
pie=0&donut=0 (accessed on 13 January 2024).
3. Xi, Y.; Xu, P. Global colorectal cancer burden in 2020 and projections to 2040. Transl. Oncol. 2021, 14, 101174. [CrossRef] [PubMed]
4. Cancer. Available online: https://www.who.int/news-room/fact-sheets/detail/cancer (accessed on 13 January 2024).
5. Office of the Federal Register, National Archives and Records Administration. Public Law 104-191—Health Insurance Portability and Accountability Act of 1996. govinfo.gov, August 1996. Available online: https://www.govinfo.gov/app/details/PLAW-104publ191 (accessed on 13 January 2024).
6. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation). Available online: https://eur-lex.europa.eu/eli/reg/2016/679/oj (accessed on 10 January 2024).
7. Masud, M.; Sikder, N.; Nahid, A.-A.; Bairagi, A.K.; Al Zain, M.A. A Machine Learning Approach to Diagnosing Lung and Colon
Cancer Using a Deep Learning-Based Classification Framework. Sensors 2021, 21, 748. [CrossRef]
8. Chehade, A.H.; Abdallah, N.; Marion, J.-M.; Oueidat, M.; Chauvet, P. Lung and colon cancer classification using medical imaging:
A feature engineering approach. Phys. Eng. Sci. Med. 2022, 45, 729–746. [CrossRef]
9. Mangal, S.; Chaurasia, A.; Khajanchi, A. Convolution Neural Networks for Diagnosing Colon and Lung Cancer Histopathological Images. September 2020. Available online: https://arxiv.org/abs/2009.03878v1 (accessed on 1 February 2024).
10. Hadiyoso, S.; Aulia, S.; Irawati, I.D. Diagnosis of lung and colon cancer based on clinical pathology images using convolutional
neural network and CLAHE framework. Int. J. Appl. Sci. Eng. 2023, 20, 1–7. [CrossRef]
11. You, C.; Zhao, R.; Liu, F.; Dong, S.; Chinchali, S.; Topcu, U.; Staib, L.; Duncan, J. Class-aware adversarial transformers for medical image segmentation. Adv. Neural Inf. Process. Syst. 2022, 35, 29582–29596. [PubMed]
12. You, C.; Xiang, J.; Su, K.; Zhang, X.; Dong, S.; Onofrey, J.; Staib, L.; Duncan, J.S. Incremental Learning Meets Transfer Learning:
Application to Multi-Site Prostate MRI Segmentation; Springer: Cham, Switzerland, 2022; Volume 13573, pp. 3–16. [CrossRef]
13. Mehmood, S.; Ghazal, T.M.; Khan, M.A.; Zubair, M.; Naseem, M.T.; Faiz, T.; Ahmad, M. Malignancy Detection in Lung and
Colon Histopathology Images Using Transfer Learning with Class Selective Image Processing. IEEE Access 2022, 10, 25657–25668.
[CrossRef]
14. Toğaçar, M. Disease type detection in lung and colon cancer images using the complement approach of inefficient sets. Comput.
Biol. Med. 2021, 137, 104827. [CrossRef]
15. Kumar, N.; Sharma, M.; Singh, V.P.; Madan, C.; Mehandia, S. An empirical study of handcrafted and dense feature extraction
techniques for lung and colon cancer classification from histopathological images. Biomed. Signal Process. Control 2022, 75, 103596.
[CrossRef]
16. Talukder, A.; Islam, M.; Uddin, A.; Akhter, A.; Hasan, K.F.; Moni, M.A. Machine learning-based lung and colon cancer detection
using deep feature extraction and ensemble learning. Expert Syst. Appl. 2022, 205, 117695. [CrossRef]
17. Al-Jabbar, M.; Alshahrani, M.; Senan, E.M.; Ahmed, I.A. Histopathological Analysis for Detecting Lung and Colon Cancer
Malignancies Using Hybrid Systems with Fused Features. Bioengineering 2023, 10, 383. [CrossRef]
18. Ananthakrishnan, B.; Shaik, A.; Chakrabarti, S.; Shukla, V.; Paul, D.; Kavitha, M.S. Smart Diagnosis of Adenocarcinoma Using
Convolution Neural Networks and Support Vector Machines. Sustainability 2023, 15, 1399. [CrossRef]
19. You, C.; Zhao, R.; Staib, L.H.; Duncan, J.S. Momentum Contrastive Voxel-Wise Representation Learning for Semi-Supervised Volumetric
Medical Image Segmentation; Springer: Cham, Switzerland, 2022; Volume 13434, pp. 639–652. [CrossRef]
20. You, C.; Zhou, Y.; Zhao, R.; Staib, L.; Duncan, J.S. SimCVD: Simple Contrastive Voxel-Wise Representation Distillation for
Semi-Supervised Medical Image Segmentation. IEEE Trans. Med. Imaging 2022, 41, 2228–2237. [CrossRef] [PubMed]
21. You, C.; Dai, W.; Min, Y.; Staib, L.; Duncan, J.S. Bootstrapping Semi-Supervised Medical Image Segmentation with Anatomical-Aware
Contrastive Distillation; Springer: Cham, Switzerland, 2022; Volume 13939, pp. 641–653. [CrossRef]
22. You, C.; Dai, W.; Min, Y.; Liu, F.; Clifton, D.A.; Zhou, S.K.; Staib, L.; Duncan, J.S. Rethinking Semi-Supervised Medical Image
Segmentation: A Variance-Reduction Perspective. Adv. Neural Inf. Process. Syst. 2023, 36, 9984–10021.
23. You, C.; Dai, W.; Min, Y.; Staib, L.; Sekhon, J.; Duncan, J.S. ACTION++: Improving Semi-Supervised Medical Image Segmentation with
Adaptive Anatomical Contrast; Springer: Cham, Switzerland, 2023; Volume 14223, pp. 194–205. [CrossRef]
24. Konečný, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated Optimization: Distributed Machine Learning for On-Device Intelligence. October 2016. Available online: https://arxiv.org/abs/1610.02527v1 (accessed on 1 February 2024).
25. Konečný, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated Learning: Strategies for Improving Communication Efficiency. October 2016. Available online: https://arxiv.org/abs/1610.05492v2 (accessed on 1 February 2024).
26. Roth, H.R.; Chang, K.; Singh, P.; Neumark, N.; Li, W.; Gupta, V.; Gupta, S.; Qu, L.; Ihsani, A.; Bizzo, B.C.; et al. Federated Learning
for Breast Density Classification: A Real-World Implementation; Springer: Cham, Switzerland, 2020; Volume 12444, pp. 181–191.
[CrossRef]
27. Florescu, L.M.; Streba, C.T.; Şerbănescu, M.-S.; Mămuleanu, M.; Florescu, D.N.; Teică, R.V.; Nica, R.E.; Gheonea, I.A. Federated Learning Approach with Pre-Trained Deep Learning Models for COVID-19 Detection from Unsegmented CT Images. Life 2022, 12, 958. [CrossRef]
28. Hossain, M.; Ahamed, F.; Islam, R.; Imam, R. Privacy Preserving Federated Learning for Lung Cancer Classification. In
Proceedings of the 2023 26th International Conference on Computer and Information Technology, ICCIT 2023, Cox’s Bazar,
Bangladesh, 13–15 December 2023. [CrossRef]
29. Zhang, W.; Zhou, T.; Lu, Q.; Wang, X.; Zhu, C.; Sun, H.; Wang, Z.; Lo, S.K.; Wang, F.-Y. Dynamic-Fusion-Based Federated Learning
for COVID-19 Detection. IEEE Internet Things J. 2021, 8, 15884–15891. [CrossRef] [PubMed]
30. Khan, T.A.; Fatima, A.; Shahzad, T.; Rahman, A.U.; Alissa, K.; Ghazal, T.M.; Al-Sakhnini, M.M.; Abbas, S.; Khan, M.A.; Ahmed, A.
Secure IoMT for Disease Prediction Empowered with Transfer Learning in Healthcare 5.0, the Concept and Case Study. IEEE
Access 2023, 11, 39418–39430. [CrossRef]
31. Peyvandi, A.; Majidi, B.; Peyvandi, S.; Patra, J.C. Privacy-preserving federated learning for scalable and high data quality
computational-intelligence-as-a-service in Society 5.0. Multimed. Tools Appl. 2022, 81, 25029–25050. [CrossRef] [PubMed]
32. Borkowski, A.A.; Bui, M.M.; Thomas, L.B.; Wilson, C.P.; DeLand, L.A.; Mastorides, S.M. Lung and Colon Cancer Histopathological
Image Dataset (LC25000). December 2019. Available online: https://arxiv.org/abs/1912.12142v1 (accessed on 1 February 2024).
33. Bhimji, S.S.; Wallen, J.M. Lung Adenocarcinoma. StatPearls, June 2023. Available online: https://www.ncbi.nlm.nih.gov/books/
NBK519578/ (accessed on 1 February 2024).
34. Walser, T.; Cui, X.; Yanagawa, J.; Lee, J.M.; Heinrich, E.; Lee, G.; Sharma, S.; Dubinett, S.M. Smoking and Lung Cancer: The Role
of Inflammation. Proc. Am. Thorac. Soc. 2008, 5, 811–815. [CrossRef]
35. Ma, Z.; Zhang, M.; Liu, J.; Yang, A.; Li, H.; Wang, J.; Hua, D.; Li, M. An Assisted Diagnosis Model for Cancer Patients Based on
Federated Learning. Front. Oncol. 2022, 12, 860532. [CrossRef]
36. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016;
pp. 2818–2826. [CrossRef]
37. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. 2015. Available online:
http://www.robots.ox.ac.uk/ (accessed on 17 February 2024).
38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [CrossRef]
39. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. In Proceedings
of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017;
pp. 5987–5995. [CrossRef]
40. Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the 30th IEEE Conference on
Computer Vision and Pattern Recognition, CVPR, Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807. [CrossRef]
41. Ribeiro, M.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the
2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, San Diego,
CA, USA, 12–17 June 2016; pp. 97–101. [CrossRef]
42. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks
via Gradient-Based Localization. Int. J. Comput. Vis. 2020, 128, 336–359. [CrossRef]
43. Tasnim, Z.; Chakraborty, S.; Shamrat, F.M.J.M.; Chowdhury, A.N.; Alam Nuha, H.; Karim, A.; Zahir, S.B.; Billah, M. Deep Learning
Predictive Model for Colon Cancer Patient using CNN-based Classification. Int. J. Adv. Comput. Sci. Appl. 2021, 12. [CrossRef]
44. Shandilya, S.; Nayak, S.R. Analysis of Lung Cancer by Using Deep Neural Network; Springer: Singapore, 2022; Volume 814,
pp. 427–436. [CrossRef]
45. Karim, D.Z.; Bushra, T.A. Detecting Lung Cancer from Histopathological Images using Convolution Neural Network. In
Proceedings of the IEEE Region 10 Conference (TENCON), Auckland, New Zealand, 7–10 December 2021; pp. 626–631. [CrossRef]
46. Raju, M.S.N.; Rao, B.S. Lung and colon cancer classification using hybrid principle component analysis network-extreme learning
machine. Concurr. Comput. Pract. Exp. 2022, 35, e7361. [CrossRef]
47. Ren, Z.; Zhang, Y.; Wang, S. A Hybrid Framework for Lung Cancer Classification. Electronics 2022, 11, 1614. [CrossRef]
48. Attallah, O.; Aslan, M.F.; Sabanci, K. A Framework for Lung and Colon Cancer Diagnosis via Lightweight Deep Learning Models
and Transformation Methods. Diagnostics 2022, 12, 2926. [CrossRef] [PubMed]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.