
Machine Learning in Detecting Autism:

Bridging the Diagnostic Gap

A Thesis Submitted

in Partial Fulfillment of the Requirement

for the Award of the Degree of

Master of Technology
in
Computer Science and Engineering
Submitted by
Student Name: Ayush Anand Srivastava, Roll No.: 202210101010002

Under the Guidance of


Supervisor Name: Dr. Satya Bhushan Verma

Department of Computer Science and Engineering

Shri Ramswaroop Memorial University


Lucknow – Deva Road, Barabanki (UP)
June, 2024
DECLARATION

I hereby declare that the Thesis report entitled “Machine Learning in Detecting Autism: Bridging the Diagnostic Gap”, submitted by me to Shri Ramswaroop Memorial University, Lucknow – Deva Road, Barabanki (UP) in partial fulfillment of the requirements for the award of the degree of Master of Technology in Data Science, is a record of bonafide work carried out by me under the guidance of Dr. Satya Bhushan Verma. I further declare that the work reported in this thesis has not been submitted, and will not be submitted, either in part or in full for the award of any other degree in this institute.

Place: Lucknow

Date: 04/05/2024

Signature of student

(Ayush Anand Srivastava)


SHRI RAMSWAROOP MEMORIAL UNIVERSITY

Department of Computer Science and Engineering


Certificate
This is to certify that Ayush Anand Srivastava (202210101010002) has carried out the research work
presented in this thesis entitled “Machine Learning in Detecting Autism: Bridging the Diagnostic Gap”,
for the award of Master of Technology in Data Science from Shri Ramswaroop Memorial University,
Lucknow – Deva Road, Barabanki (UP) under my supervision. The thesis embodies results of original
work and studies carried out by the student himself/herself, and the contents of the thesis do not form
the basis for the award of any other degree to the candidate or to anybody else from this or any other
University/Institution.

Signature of Supervisor(s)

Name(s)

Department(s)

Shri Ramswaroop Memorial University

Month, Year

Signature of Head/Dean

Dr. Satya Bhushan Verma

Head-CSE

Shri Ramswaroop Memorial University


Acknowledgement

First, I would like to thank God almighty for guiding me through all challenges and giving me
the privilege of completing my degree. You will continue to take the wheel in life and lead me
to greater heights. In addition, I would love to express my gratitude to my parents and siblings
for all their prayers, sacrifices, and motivation, which have continued to sustain me immensely.

I would also like to thank my thesis supervisor, Dr. Satya Bhushan Verma, for his
guidance and support throughout the entire process of completing this thesis. Additionally, I
would like to thank my thesis committee members, Dr. Henry Chu and Dr. Anil Kumar Pandey, for
their outstanding comments and suggestions.

Finally, I would like to thank my classmates and colleagues in the School of Computing and
Informatics at the University of Louisiana at Lafayette for their academic interactions and
support. I am indeed very grateful to you all.
Abstract

The rising prevalence of Autism Spectrum Disorder (ASD) underscores the urgency of
advancing diagnostic capabilities and intervention strategies. In this context, the convergence
of computer vision, artificial intelligence, and neuroimaging presents a promising avenue for
enhancing ASD detection and understanding its neurobiological underpinnings. This paper
explores the potential of machine learning algorithms in analyzing MRI data to automatically
identify ASD-related patterns. Leveraging the rich dataset provided by the Autism Brain
Imaging Data Exchange (ABIDE), we propose a framework for ASD detection using T1-
weighted MRI scans.
Our study encompasses conventional machine learning methods, feature selection techniques,
and deep learning architectures to optimize detection accuracy. Through comprehensive
analysis and validation, we demonstrate the efficacy of our approach in detecting ASD and
elucidating neural biomarkers. By integrating technological advancements with clinical
insights, we strive towards more accurate, efficient, and personalized approaches to ASD
diagnosis and intervention.
Furthermore, we delve into the multifaceted nature of ASD, recognizing its heterogeneity and
the challenges it poses for diagnosis and treatment. ASD encompasses a wide spectrum of
symptoms, ranging from mild to severe, and often co-occurs with other neurodevelopmental
conditions such as Attention Deficit Hyperactivity Disorder (ADHD) and anxiety disorders. By
elucidating the distinct subtypes and comorbidities associated with ASD, our study aims to
inform more targeted interventions and support strategies tailored to individuals' specific needs.
Moreover, we highlight the socioeconomic implications of ASD, emphasizing the substantial
financial burden it places on families, healthcare systems, and society at large. Early detection
and intervention are crucial not only for improving outcomes for individuals with ASD but also
for alleviating the economic strain associated with long-term care and support services.
In summary, our research contributes to the growing body of knowledge on ASD diagnosis and
management by leveraging cutting-edge technologies and interdisciplinary approaches. By
bridging the gap between neuroscience, computer science, and clinical practice, we strive to
pave the way for more effective, accessible, and personalized solutions for individuals affected
by ASD and their families.
Table of Contents

Abstract
Acknowledgement
List of Figures
List of Abbreviations
Chapter 1. Introduction
1.1 Overview of the Project
1.2 Background and Context
1.3 Importance and Benefits
1.4 Problem Statement
1.5 Objective and Scope of the Project
1.5.1 Aims and Goals
1.5.2 Timelines and Milestones
1.5.3 Technical Specifications and Requirements
1.6 Technical Specifications
1.7 Requirement of Hardware and Software
Chapter 2. Literature Review
Chapter 3. Design of Project Model
3.1 System Architecture
3.2 System Design
3.3 Data Flow Diagram
3.4 Flow Chart
3.5 System Flow Diagram
3.6 Software Development Life Cycle
3.7 Technology Used
Chapter 4. Experiments, Simulation & Testing
4.1 Machine Learning Algorithms
4.2 Logistic Regression
4.3 Support Vector Machine
4.4 Decision Tree
4.5 Random Forest
4.6 Performance Evaluation
4.7 Evaluation Metrics
Chapter 5. Result and Discussion
Chapter 6. Conclusion and Future Scope
6.1 Conclusion
6.2 Implications and Ethical Considerations
Chapter 7. Future Work
References
1. INTRODUCTION

Computer vision and artificial intelligence are fields in which researchers and companies are
making rapid progress, aiming to build systems that match or even exceed human performance.
These technologies are already applied in many areas, such as medical diagnosis, video
generation, and information security, but one area where they remain underused is
understanding the brain.
In this thesis, we propose a way to use machine learning to help identify a
condition called Autism Spectrum Disorder (ASD). ASD is a condition in which people have
trouble with social interactions and emotions, and they often behave repetitively. It is not
rare, and it can vary a lot from person to person.
There are different types of ASD, like High Functioning Autism and Asperger Syndrome, and
sometimes it's related to other conditions like Attention Deficit Hyperactivity Disorder
(ADHD) or anxiety and depression.
More and more people are being diagnosed with autism, and it costs a lot to treat them.
Detecting autism early can help reduce these costs, so it's important for researchers to find
ways to spot it sooner.
Right now, doctors mostly use standard methods to diagnose ASD, like looking at how
someone behaves and talks. But there's still a lot we don't know about the brain in ASD.
Some new ideas suggest that there are differences in how different parts of the brain work
together in people with ASD.
To understand these differences, scientists use a type of scan called Magnetic Resonance
Imaging (MRI). MRI scans can show us details about the brain's structure and how it
functions. By studying these scans, researchers hope to find clues that can help identify ASD
earlier.
There's a big database called ABIDE that scientists use to study ASD using MRI data. This
database contains scans from people with and without ASD, and it helps researchers compare
different brain patterns.
In this study, the authors use a special kind of MRI scan called T1-weighted MRI scans to try
to detect ASD automatically. They use computer programs to analyze the scans and look for
patterns that might indicate ASD. They try different methods, including some new ones called
deep learning, to see which works best.
Their study shows that using these computer methods could help detect ASD more accurately
and quickly. It also suggests ways to improve these methods in the future. This could be
really helpful for doctors and families dealing with ASD.
Figure 1. MRI scan in different cross-sectional views, where A, P, S, I, R, and L in the figure represent
anterior, posterior, superior, inferior, right, left, respectively. The axial/horizontal view divides the
MRI scan into head and tail/superior and inferior portions, sagittal view breaks the scan into left and
right and coronal/vertical view divides the MRI scan into anterior and posterior portions (Schnitzlein
and Murtagh 1985).

Why ASD is important: Autism Spectrum Disorder (ASD) affects a person's social skills,
communication abilities, and behavior. Detecting ASD early can lead to better support and
intervention, which can significantly improve the individual's quality of life.
Challenges in ASD diagnosis: Diagnosing ASD can be complex because it's a spectrum
disorder, meaning it varies widely among individuals. Additionally, there's no single medical
test for ASD; diagnosis typically involves observation of behavior and developmental history.
The role of technology: Advances in computer vision and artificial intelligence offer
promising tools for assisting in ASD diagnosis. These technologies can analyze large amounts
of data, such as MRI scans, to identify patterns that may indicate ASD.
Understanding brain differences: Researchers believe that differences in brain structure and
function may contribute to ASD. MRI scans provide detailed images of the brain, allowing
scientists to study these differences and potentially identify biomarkers for ASD.
The ABIDE database: The Autism Brain Imaging Data Exchange (ABIDE) is a valuable
resource for ASD research. It contains MRI data from individuals with ASD and typically
developing individuals, enabling researchers to compare brain images and identify differences
associated with ASD.
Machine learning and ASD detection: Machine learning algorithms can analyze MRI data
and learn to recognize patterns associated with ASD. By training these algorithms on large
datasets like ABIDE, researchers can develop models that accurately detect ASD based on brain
scans.
Future directions: Continued research in this field could lead to more accurate and efficient
methods for ASD diagnosis. Improvements in technology, along with collaborations between
researchers and healthcare professionals, may enhance early detection and intervention for
individuals with ASD.
Rise in ASD prevalence: Over the past few decades, there has been a noticeable increase in
the prevalence of Autism Spectrum Disorder (ASD) worldwide. This rise has spurred efforts to
better understand and address the condition, emphasizing the importance of early detection and
intervention.
Financial burden of ASD: ASD imposes significant financial burdens on families and
healthcare systems. The costs associated with diagnosis, therapy, and long-term care can be
substantial, highlighting the need for cost-effective and efficient diagnostic methods.
The complexity of ASD: ASD is a complex neurodevelopmental disorder with diverse
manifestations and underlying causes. Understanding the intricate interplay between genetic,
environmental, and neurological factors is essential for advancing diagnostic and treatment
strategies.
The potential of neuroimaging: Neuroimaging techniques, such as MRI, offer a window into
the brain's structure and function, providing valuable insights into neurological conditions like
ASD. Leveraging these imaging modalities alongside computational approaches holds promise
for elucidating the neural correlates of ASD.
Toward personalized medicine: Tailoring interventions and therapies to individual needs is a
cornerstone of personalized medicine. By harnessing the power of computer algorithms and
neuroimaging data, researchers aim to develop more personalized approaches to ASD diagnosis
and treatment.
Interdisciplinary collaboration: Addressing the complexities of ASD requires
interdisciplinary collaboration between researchers, clinicians, educators, and policymakers.
By fostering synergy among diverse fields such as neuroscience, computer science,
psychology, and public health, we can leverage collective expertise to develop holistic
approaches to ASD diagnosis and intervention.
Ethical considerations: As we navigate the intersection of technology and healthcare, it's
imperative to address ethical considerations surrounding ASD diagnosis and treatment.
Ensuring patient privacy, informed consent, and equitable access to resources are paramount
in the development and implementation of diagnostic frameworks and interventions.
Global impact: ASD is a global health concern, transcending geographical boundaries and
cultural contexts. Our research endeavors to contribute to the global effort to combat ASD by
fostering international collaborations, sharing data and best practices, and advocating for
greater awareness and resources for ASD research and support services worldwide.
Empowering individuals and families: Beyond the scientific and clinical realms, our work
seeks to empower individuals with ASD and their families by providing them with the
knowledge, tools, and support needed to navigate the challenges associated with the condition.
By promoting self-advocacy, community engagement, and inclusive policies, we aim to foster
a more inclusive and supportive environment for individuals living with ASD.
State of the Art
In this part, we talk about different ways scientists have tried to understand and identify
neurodevelopmental disorders, with a focus on Autism Spectrum Disorder (ASD). They've
combined techniques from artificial intelligence (like machine learning and deep learning) with
data from brain scans to study things like how the brain understands words, learns, and feels
emotions. However, using these techniques for understanding psychological and
neurodevelopmental issues like schizophrenia, autism, and anxiety/depression is still tricky
because these conditions are complex.
Some researchers have used a method called multivoxel pattern analysis to detect Major
Depressive Disorder (MDD) from MRI data, reporting an accuracy of 95%. Others have used
classifiers such as Gaussian Naïve Bayes (GNB) to identify ASD in brain scans, achieving an
accuracy of 97%.
Another study looked at structural MRI data to predict various neurodevelopmental disorders
like Alzheimer’s, Autism, and Schizophrenia. They used a technique called Multivariate
Pattern Analysis and achieved accuracies ranging from 59% to 86%.
Deep learning models, which are very complex computer programs inspired by the human
brain, have also been used. One study used a Deep Belief Network to automatically detect
schizophrenia in brain scans, with an accuracy of 90%. Another study used deep neural
networks to analyze brain activity patterns and classify different tasks, like language and
emotion recognition, with an average accuracy of about 50%.
In another study, researchers trained a neural network to detect ASD by learning from MRI
data collected while people were resting. They achieved a classification accuracy of up to 70%.
There's also interesting research on using children's visual behavior, like how they look at
pictures, to detect ASD early. And some studies have combined features from different datasets
to improve ASD detection, like using machine learning algorithms to analyze interactions
between children and robots.
Researchers are also working on methods to handle differences in data collected from different
places, which can affect how accurate the results are. They've proposed techniques like low-
rank representation decomposition and multi-site clustering to address this issue.
Studies have shown that certain parts of the brain, like the corpus callosum, which connects
the two halves of the brain, can be different in people with ASD. For example, some studies
found changes in the size of certain regions of the corpus callosum in individuals with ASD
compared to those without.
In our study, we're using information about these brain regions and other factors to try to
improve the accuracy of ASD detection. We'll explain more about our approach and the data
we're using in the next section.
Figure 2. An example of corpus callosum area segmentation. The figure shows example data for an
individual with ASD in the ABIDE study. Panel A represents the 3D volumetric T1-weighted MRI scan.
Panel B represents the segmentation of the corpus callosum in red. Panel C represents the further
division of the corpus callosum according to the Witelson scheme (Witelson 1989). The regions W1
(rostrum), W2 (genu), W3 (anterior body), W4 (mid-body), W5 (posterior body), W6 (isthmus), and W7
(splenium) are shown in red, orange, yellow, green, blue, purple, and light purple (Kucharsky Hiess et al. 2015).

Additionally, research has explored the use of machine learning algorithms to analyze brain
imaging data in various ways to detect neurodevelopmental disorders. For instance, researchers
have looked into the classification of different types of neurodevelopmental disorders, such as
ASD, schizophrenia, and ADHD, using patterns identified in brain scans.
Some studies have focused on specific brain regions or structures that may be indicative of
certain disorders. For example, alterations in the size or connectivity of specific regions, such
as the amygdala or prefrontal cortex, have been linked to ASD and other neurodevelopmental
conditions.
Moreover, machine learning techniques have been applied to different types of brain imaging
data, including structural MRI (s-MRI) and functional MRI (f-MRI), to uncover patterns
associated with neurodevelopmental disorders. These techniques aim to identify subtle
differences in brain structure or activity that may serve as biomarkers for these conditions.
Furthermore, researchers have explored the potential of combining multiple types of data, such
as genetic information, behavioral assessments, and brain imaging data, to improve the
accuracy of diagnosis and prediction of neurodevelopmental disorders.
Overall, the integration of machine learning algorithms with brain imaging data holds promise
for advancing our understanding of neurodevelopmental disorders and improving diagnostic
and treatment approaches. By identifying unique patterns in brain structure and function
associated with these conditions, researchers aim to develop more personalized and effective
interventions for individuals affected by neurodevelopmental disorders.
In recent years, there has been a growing interest in leveraging machine learning techniques to
analyze neuroimaging data for the early detection and characterization of neurodevelopmental
disorders. This approach holds the potential to revolutionize clinical practice by providing
objective and quantitative measures to aid in diagnosis and treatment planning.
Researchers have explored a wide range of machine learning algorithms, including support
vector machines (SVMs), random forests, and convolutional neural networks (CNNs), to
extract meaningful information from brain images. These algorithms are trained on large
datasets of brain scans, allowing them to learn complex patterns associated with different
neurodevelopmental disorders.
Advancements in neuroimaging technology, such as high-resolution imaging techniques and
multi-modal imaging approaches, have enabled researchers to capture detailed information
about brain structure and function. This rich data source provides valuable insights into the
underlying mechanisms of neurodevelopmental disorders and may help identify novel
biomarkers for early detection.
Machine learning algorithms can be used to integrate information from multiple sources,
including clinical assessments, genetic data, and environmental factors, to develop
comprehensive models for predicting risk and prognosis of neurodevelopmental disorders. By
incorporating diverse datasets, these models can provide a more holistic understanding of the
complex interplay between genetic, environmental, and neurological factors contributing to
these conditions.
Additionally, the application of machine learning in neuroimaging research has facilitated the
development of personalized medicine approaches for neurodevelopmental disorders. By
tailoring interventions to individual characteristics, such as brain morphology, connectivity
patterns, and genetic profiles, clinicians can optimize treatment outcomes and improve long-
term prognosis for patients.
The synergy between machine learning and neuroimaging holds great promise for advancing
our understanding of neurodevelopmental disorders and transforming clinical practice.
Continued research in this interdisciplinary field is essential for unlocking the full potential of
these technologies and improving outcomes for individuals affected by these conditions.
The emergence of big data analytics has enabled researchers to harness vast amounts of
neuroimaging data from large-scale studies and consortia. These datasets, often comprising
thousands of brain scans from diverse populations, provide unprecedented opportunities to
uncover subtle patterns and associations that may have previously gone unnoticed. Machine
learning algorithms, equipped with the ability to process and analyze such massive datasets,
hold immense potential for unlocking new insights into the etiology, progression, and treatment
response of neurodevelopmental disorders.
The advent of deep learning techniques has revolutionized the field of neuroimaging analysis.
Deep neural networks, with their hierarchical architecture and capacity for automatic feature
extraction, have demonstrated remarkable performance in tasks such as image segmentation,
classification, and anomaly detection.
In conclusion, the convergence of machine learning and neuroimaging represents a paradigm
shift in our approach to understanding and addressing neurodevelopmental disorders. By
leveraging advanced computational techniques to analyze complex brain data, researchers can
unlock new insights into the nature of these disorders and pave the way for more effective
diagnostic, therapeutic, and preventive interventions. Continued innovation and collaboration
in this interdisciplinary field hold the key to transforming the lives of individuals affected by
neurodevelopmental disorders and advancing the frontiers of neuroscience and psychiatry.
Database
This research used brain scans from a large public collection called the Autism Brain Imaging
Data Exchange (ABIDE-I). This collection shares brain images from people with autism and
those without it, along with basic information about them. ABIDE-I contains scans from
17 sites around the world, with a total of 1112 participants: 539 with autism and 573
typically developing controls. To protect privacy, the identities of the people in the ABIDE
database are kept confidential, following the Health Insurance Portability and Accountability
Act (HIPAA) rules from 1996.
We used the same kinds of data as another study by Hiess et al. To understand how they did
their research, we'll explain how they prepared the brain scans from ABIDE. They focused on
certain parts of the brain and calculated their sizes and shapes using a method called
preprocessing. This helped them study different areas like the corpus callosum and the total
brain volume.

Preprocessing
Different computer programs were used to measure the size of the corpus callosum, its parts,
and the overall volume inside the skull. These programs are called yuki, FSL, ITK-SNAP, and
brainwash. The corpus callosum is important for connecting and coordinating information
between the two halves of the brain. It's made up of millions of nerve fibers and is the biggest
connection between the brain's two sides. Intracranial volume is a way to estimate the overall
size of the brain and its parts.
For each person in the study, the corpus callosum was measured using the yuki software. Then,
the corpus callosum was divided into its smaller parts automatically using a method called the
Witelson scheme.

Each part of the brain was carefully examined and adjusted using a software called "ITK-
SNAP" to make sure it was accurately identified. Two people checked this to make sure it was
done right. Sometimes, small changes needed to be made by hand to get the size of the corpus
callosum just right in certain brain scans. To check if the measurements were consistent, we
compared them using statistical methods.
We measured the total volume inside the skull for each person using a tool called "brainwash."
This tool uses a special technique called automatic registration to figure out the volume. It
looks at different parts of the brain in the MRI and decides which parts belong inside the skull.
If there were any mistakes in this process, we went back and tried again with the same MRI
scan. Sometimes, we had to manually fix points that weren't correctly identified by the
software.
A feature called "region-based snakes" in the "ITK-SNAP" software helped us correct any
small errors in the volume measurement.
The diagram in Figure 3 shows how we turned the MRI scans into a set of features that we used
to train the model to detect autism.

Figure 3. Schematic overview of the proposed framework.

Table 2. Statistical summary of ABIDE preprocessed data.

Experiments and Results

Traditional Machine Learning Methods


Before we apply any machine learning method, it's crucial to pick the most relevant set of
features or characteristics from our data. This helps minimize differences within the same group
(like individuals with ASD) while maximizing differences between groups (like ASD and
control individuals). We use techniques called feature selection to find the best features by
getting rid of ones that aren't needed for our task. The next section will talk about the different
methods we used to pick features. Then, we'll discuss the traditional machine learning methods
we used in this study. These are methods other than the newer deep learning approach. We'll
also go over the results we got from these traditional methods.
Feature Selection
We used the same features as another study by Hiess et al. This way, we can compare our results
to theirs and see how well our machine learning method works compared to their non-machine
learning approach. Hiess et al. made data from preprocessed MRI scans available for research.
These data include different measurements of the corpus callosum and brain volume; Table 2
shows a summary. We have 12 features for each of the 1100 examples in our dataset.
Choosing the right features to get meaningful results by removing unnecessary ones is a
complex and ongoing task. To make things easier for the computer and improve the
performance of our machine learning algorithms, we tried different techniques to select the best
features from our preprocessed data. Usually, methods based on entropy or correlation are used
for this. So, we tested some of the latest methods based on these ideas to find the features that
help us tell the difference between individuals with ASD and those without it. Here are the
methods we tried:
Information Gain (IG) is a method for picking the best features by measuring how much useful
information a feature provides about a particular class. It does this by looking at entropy,
which is a measure of how much randomness or disorder there is in a variable: features whose
values reduce the entropy of the class label the most are considered the most helpful, because
they give the clearest information.
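
In its standard form, the information gain of a feature X with respect to a class label Y is the reduction in the entropy of Y once the value of X is known:

$$
H(Y) = -\sum_{y} p(y)\,\log_2 p(y), \qquad
IG(X, Y) = H(Y) - H(Y \mid X) = H(Y) - \sum_{x} p(x)\, H(Y \mid X = x)
$$

where the sums run over the observed values of the class label and the feature, respectively.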

The Information Gain Ratio improves on the Information Gain method, which tends to
favor features with many distinct values. It is the ratio of the Information Gain to the
Intrinsic Value, where the Intrinsic Value is the entropy of the feature's own value
distribution; this makes it a more balanced way to select features.
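
In its usual form, for a set of training examples S in which S_x denotes the subset where feature X takes the value x:

$$
IV(X) = -\sum_{x} \frac{|S_x|}{|S|}\,\log_2 \frac{|S_x|}{|S|}, \qquad
IGR(X, Y) = \frac{IG(X, Y)}{IV(X)}
$$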
Chi-Square Method. The Chi-Square (χ²) method is a correlation-based feature selection method
(also known as the Pearson chi-square test) that measures the dependence between two variables,
where two variables A and B are defined as independent if P(AB) = P(A)P(B), or equivalently,
P(A|B) = P(A) and P(B|A) = P(B). In terms of machine learning, the two variables are the
occurrence of the feature and the class label (Doshi 2014). The chi-square method calculates
the correlation strength of each feature by computing the statistic

$$
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
$$

where χ² is the chi-square statistic, O_i is the observed value of feature i, and E_i is the
expected value of feature i.

Figure 4. Results of entropy and correlation based feature selection methods. All features are
represented with their corresponding weights. A represents the result of information gain. B
represents the result of the information gain ratio. C represents the result of the chi-square
method. D represents the result of symmetrical uncertainty.
Symmetrical Uncertainty. Symmetrical Uncertainty (SU) is a relevance indexing or scoring
method (Brown et al. 2012) used to measure the relationship between a feature and the class
label. It normalizes feature scores to the range [0, 1], where 1 indicates that the feature and
the target class are strongly correlated and 0 indicates no relationship between them (Peng,
Long, and Ding 2005). For a class label Y, the symmetrical uncertainty of a feature X is

$$
SU(X, Y) = \frac{2\, IG(X, Y)}{H(X) + H(Y)}
$$

where IG(X,Y) is the information gain and H is the entropy. All four methods (information
gain, information gain ratio, chi-square, and symmetrical uncertainty) compute the
value/importance/weight of each feature for a given task, calculated with respect to the class
label and the feature values. The higher the weight of a feature, the more relevant it is
considered. The weight of each feature is normalized to the range [0, 1]. The results of each
feature selection method are shown in Figure 4.

Figure 4 shows the results of our study on feature selection. The first two graphs display the
importance of different features according to entropy-based methods like information gain and
information gain ratio. The last two graphs show the feature importance based on correlation
methods such as chi-square and symmetrical uncertainty. Although the results of information
gain ratio differ slightly from those of information gain, both methods identify W7 and CC
circularity as the most crucial features. The results from correlation methods are quite similar,
with brain volume, W7, W2, and CC circularity being the most significant discriminant
features.
It's worth noting that the features we found to be discriminant align with those identified in a
study by Hiess et al. They also concluded that brain volume and corpus callosum area are
essential for distinguishing between ASD and control groups. In our study, we also found brain
volume and various corpus callosum sub-regions (labeled as W2, W4, and W7) to be crucial.
The results from correlation methods closely match those presented by Hiess et al.
In our proposed framework, we set a threshold on the results obtained from the feature selection
method to choose a subset of features. This reduces computational complexity and improves
the performance of machine learning algorithms. Through experiments with different threshold
values, we found that the highest average classification accuracy (for ASD detection) is
achieved using the subset of features identified by the chi-square method at a threshold value
of 0.4.
The final feature vector we derived includes brain volume, CC circularity, CC length, W2
(genu), W4 (mid-body), W5 (posterior body), and W7 (splenium), where CC stands for corpus
callosum. Comparing the average classification accuracy with and without feature selection
method, it's evident that training the classifier on a subset of discriminant features improves
both computational complexity and classification accuracy.
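
The thesis does not reproduce the selection script itself, so the following is a minimal sketch of this thresholding step using scikit-learn; the feature names, the min-max normalization of the chi-square weights, and the helper name select_features are illustrative assumptions:

```python
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.preprocessing import MinMaxScaler

# Hypothetical names for the 12 preprocessed ABIDE measurements.
FEATURES = ["brain_volume", "intracranial_volume", "cc_area", "cc_circularity",
            "cc_length", "W1", "W2", "W3", "W4", "W5", "W6", "W7"]

def select_features(X, y, threshold=0.4):
    """Score each feature with chi-square and keep those whose
    normalized weight exceeds the threshold (0.4 in this study)."""
    X = np.asarray(X, dtype=float)
    # chi2 requires non-negative inputs, so rescale each feature to [0, 1].
    X_scaled = MinMaxScaler().fit_transform(X)
    scores, _ = chi2(X_scaled, y)
    weights = (scores - scores.min()) / (scores.max() - scores.min())
    keep = weights > threshold
    return [f for f, k in zip(FEATURES, keep) if k], X[:, keep]
```

With the threshold of 0.4 reported above, such a procedure would retain the seven-feature subset described earlier (brain volume, CC circularity, CC length, W2, W4, W5, and W7).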
The next subsection will discuss the conventional machine learning methods we evaluated in
this study.
Methodology:
1. Data Collection: We gathered structural MRI scans from the Autism Brain Imaging
Data Exchange (ABIDE-I) dataset, which provides imaging data of individuals with
ASD and control participants.
2. Preprocessing: We utilized various software tools such as yuki, fsl, itksnap, and
brainwash to calculate corpus callosum area, its sub-regions, and intracranial volume.
These measurements are crucial for understanding brain structure differences between
ASD and control groups.
3. Manual Correction: After segmentation, we visually inspected and corrected any errors
using ITK-SNAP software. This ensured the accuracy of our measurements.
4. Statistical Analysis: We performed statistical equivalence analysis and intra-class
correlation to assess the agreement between different readers in segmenting corpus
callosum area.
5. Feature Selection: We employed feature selection techniques to identify the most
relevant features for distinguishing between ASD and control groups. This step
involved evaluating information gain and information gain ratio, as well as correlation-
based methods like chi-square and symmetrical uncertainty.
6. Thresholding: We applied thresholds to the feature selection results to choose a subset
of features that would optimize computational efficiency and classification accuracy.

Results:
1. Feature Importance: Our analysis revealed that brain volume, CC circularity, and
various sub-regions of the corpus callosum (such as genu, mid-body, and splenium)
were the most discriminant features for ASD detection.
2. Comparison with Previous Studies: Our findings aligned with previous research,
particularly a study by Hiess et al., which also identified brain volume and corpus
callosum area as crucial for discriminating between ASD and control groups.
3. Performance Improvement: By training the classifier on the selected subset of
discriminant features, we observed improved classification accuracy compared to using
all features. This indicates the effectiveness of our feature selection approach in
enhancing classification performance.
4. Optimal Threshold: Through experimentation, we determined that using the chi-square
method with a threshold value of 0.4 yielded the highest average classification accuracy
for ASD detection.
Overall, our methodology enabled us to identify key structural brain differences associated
with ASD and develop an effective feature selection strategy to enhance classification accuracy.
Significance of Findings:
1. Understanding Brain Structure: Our study sheds light on the importance of specific
brain regions, such as the corpus callosum, in distinguishing between individuals with
ASD and those without. By quantifying structural differences, we contribute to a deeper
understanding of the neurobiological basis of ASD.
2. Diagnostic Biomarkers: The identified features, including brain volume and corpus
callosum sub-regions, hold potential as diagnostic biomarkers for ASD. Clinicians and
researchers can use these biomarkers to develop more accurate and efficient diagnostic
tools.
3. Personalized Interventions: Knowledge of structural brain differences can inform
personalized interventions for individuals with ASD. Tailored therapies targeting
specific brain regions may lead to better outcomes and improved quality of life for
individuals with ASD.
4. Advancements in Machine Learning: Our study demonstrates the utility of machine
learning algorithms in analyzing neuroimaging data and identifying relevant features
for ASD detection. This highlights the role of artificial intelligence in advancing
diagnostic techniques and improving our understanding of complex neurological
disorders.
Implications:
1. Clinical Practice: Our findings have direct implications for clinical practice, where
neuroimaging techniques could be integrated into routine assessments for ASD
diagnosis and treatment planning. Clinicians can leverage the identified biomarkers to
provide more accurate and personalized care.
2. Research Directions: Future research could focus on validating the identified
biomarkers in larger and more diverse populations. Longitudinal studies could also
investigate how structural brain differences evolve over time in individuals with ASD
and their association with clinical outcomes.
3. Intervention Strategies: Insights from our study could inform the development of novel
intervention strategies targeting specific brain regions implicated in ASD. These
interventions could include behavioral therapies, neurofeedback training, or
pharmacological treatments tailored to individual neurobiological profiles.
By advancing our understanding of the neurobiological underpinnings of ASD and leveraging
machine learning techniques, our study contributes to the ongoing efforts to improve ASD
diagnosis and treatment.

Machine Learning in Autism Diagnosis


In recent years, machine learning (ML) has emerged as a promising avenue for addressing these
challenges. The application of ML in ASD diagnosis capitalizes on its capacity to analyse vast
and diverse datasets, allowing for the identification of subtle patterns and relationships that
may elude human observers. Numerous studies have explored the potential of ML models in
classifying individuals with ASD based on behavioural patterns, speech characteristics, and
neuroimaging data.
For instance, in a study by Thabtah (2018), various machine learning algorithms, including
Support Vector Machines (SVM) and Random Forests, demonstrated high accuracy in
distinguishing between individuals with ASD and neurotypical individuals based on
behavioural data. Similarly, Haque et al. (2020) applied deep learning techniques, specifically
Convolutional Neural Networks (CNNs), to analyze neuroimaging data, achieving notable
success in identifying neural markers associated with ASD.

Ethical Considerations in ML-Based Diagnosis


While the potential of ML in autism diagnosis is promising, ethical considerations are
paramount. The sensitive nature of health data, especially in the context of ASD, necessitates
stringent privacy measures and transparent communication regarding data usage. Ensuring
model fairness and addressing potential biases is an ongoing concern, requiring continuous
monitoring and evaluation.
Used Traditional Machine Learning Methods

Classification is about finding patterns or learning concepts from a given set of examples and
predicting their category. To automatically detect ASD from the preprocessed ABIDE dataset
(where features were selected using a feature selection algorithm, as explained in Section
4.1.1), we tested the following well-known traditional machine learning classifiers:
1. Linear Discriminant Analysis (LDA)
2. Support Vector Machine (SVM) with a radial basis function (RBF) kernel
3. Random Forest (RF) with 10 trees
4. Multi-Layer Perceptron (MLP)
5. K-Nearest Neighbor (KNN) with K = 3

We picked classifiers from different categories. For instance, K-Nearest Neighbor (KNN) is a
nonparametric, instance-based learner; Support Vector Machine (SVM) is a large-margin
classifier that maps data to a higher-dimensional space for better classification; and Random
Forest (RF) is a tree-based classifier that splits the set of samples into covering decision
rules. The Multilayer Perceptron (MLP) is inspired by human brain anatomy. A training sketch
follows, after which each classifier is briefly discussed.
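
A minimal sketch of how these five classifiers could be trained and compared on the selected features with scikit-learn; the hyperparameters are the ones stated in the list above, while the synthetic stand-in data and the 10-fold cross-validation protocol are assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Stand-in for the selected ABIDE features and ASD/control labels.
X, y = make_classification(n_samples=1100, n_features=7, random_state=0)

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
    "RF (10 trees)": RandomForestClassifier(n_estimators=10, random_state=0),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
    "KNN (K = 3)": KNeighborsClassifier(n_neighbors=3),
}

for name, clf in classifiers.items():
    # Assumed protocol: 10-fold cross-validated accuracy.
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```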

Linear Discriminant Analysis (LDA). LDA is a statistical method that finds a linear
combination of features that separates the dataset into its corresponding classes. The
resulting combination is used as a linear classifier (Jain and Huang 2004). LDA maximizes
linear separability by maximizing the ratio of between-class variance to within-class variance
for the dataset at hand. Let $\omega_1, \omega_2, \ldots, \omega_L$ be the classes and
$N_1, N_2, \ldots, N_L$ the number of examples in each class, and let
$M_1, M_2, \ldots, M_L$ and $M$ be the class means and the grand mean, respectively. Then the
within-class and between-class scatter matrices $S_w$ and $S_b$ are defined as

$$
S_w = \sum_{i=1}^{L} \sum_{x \in \omega_i} (x - M_i)(x - M_i)^{T}, \qquad
S_b = \sum_{i=1}^{L} N_i \,(M_i - M)(M_i - M)^{T}.
$$
Support Vector Machine (SVM). The SVM classifier segregates samples into their corresponding
classes by constructing decision boundaries known as hyperplanes (Vapnik 2013). It implicitly
maps the dataset into a higher-dimensional feature space, in which a maximum-margin separating
hyperplane can be constructed.
Figure 5. An architecture of Multilayer Perceptron (MLP).

Random Forest (RF). Random Forest belongs to the family of decision-tree methods and is
capable of performing both classification and regression tasks. A classification tree is
composed of nodes and branches that break the set of samples into a set of covering decision
rules (Mitchell 1997). RF is an ensemble classifier consisting of many decorrelated decision
trees, and its output is the mode of the classes output by the individual trees.

Introduction to Neurodevelopmental Disorders


Neurodevelopmental disorders are conditions that affect how the brain grows and functions.
These disorders typically start in early childhood and can affect a person's behavior, learning,
and ability to interact with others.
Examples of neurodevelopmental disorders include autism spectrum disorder (ASD), attention-
deficit/hyperactivity disorder (ADHD), intellectual disability, and specific learning disorders
like dyslexia.
People with neurodevelopmental disorders may experience difficulties in various areas of life,
such as socializing, communicating, and learning new skills. These challenges can vary widely
from person to person and can range from mild to severe.
Understanding neurodevelopmental disorders is important for providing appropriate support
and interventions to individuals affected by these conditions. Research into the causes,
symptoms, and treatments of neurodevelopmental disorders plays a crucial role in improving
outcomes and quality of life for affected individuals and their families.
Neurodevelopmental disorders are a group of conditions that affect the development and
function of the nervous system. These disorders typically emerge during infancy or childhood
and can persist into adolescence and adulthood.
The most common neurodevelopmental disorders include autism spectrum disorder (ASD),
attention-deficit/hyperactivity disorder (ADHD), intellectual disability, and specific learning
disorders. Each of these disorders has its own unique characteristics and challenges, but they
all share the common feature of affecting the brain's development and function.
Individuals with neurodevelopmental disorders may exhibit a wide range of symptoms and
behaviors. These can include difficulties with social interaction, communication, attention,
motor skills, and learning. The severity of symptoms can vary widely, from mild impairments
that may go unnoticed to profound disabilities that significantly impact daily life.
Understanding the underlying causes of neurodevelopmental disorders is an ongoing area of
research. While genetic factors are known to play a role in many cases, environmental
influences and brain development processes also contribute to the onset and expression of these
disorders.
Early detection and intervention are key components of managing neurodevelopmental
disorders. By identifying signs of these disorders early on, healthcare professionals can provide
targeted interventions and support services to help individuals reach their full potential.
Additionally, ongoing research into effective treatments and therapies aims to improve
outcomes and quality of life for individuals living with neurodevelopmental disorders and their
families.

Brain Imaging Techniques and Data Acquisition


Brain imaging techniques are tools used by scientists and healthcare professionals to visualize
and study the structure and function of the brain. These techniques provide valuable insights
into the underlying mechanisms of neurodevelopmental disorders and other neurological
conditions.
Structural Imaging: Structural imaging techniques, such as magnetic resonance imaging
(MRI) and computed tomography (CT) scans, produce detailed images of the brain's anatomy.
MRI uses powerful magnets and radio waves to create high-resolution images of the brain's
structures, including the cortex, white matter, and subcortical regions. CT scans use X-rays to
generate cross-sectional images of the brain, providing information about its size, shape, and
abnormalities.
Functional Imaging: Functional imaging techniques, such as functional magnetic resonance
imaging (fMRI) and positron emission tomography (PET) scans, measure brain activity by
detecting changes in blood flow or metabolic activity. fMRI measures changes in blood
oxygenation levels to identify areas of the brain that are active during specific tasks or
behaviors. PET scans use radioactive tracers to visualize metabolic activity in different brain
regions, helping researchers understand brain function and dysfunction.
Data Acquisition: During brain imaging studies, data acquisition involves collecting and
processing imaging data from study participants. This process typically begins with obtaining
informed consent from participants and ensuring their comfort and safety during the imaging
procedure. Once the data is collected, it undergoes preprocessing steps to correct for artifacts
and ensure image quality. This may involve removing noise, aligning images, and standardizing
the data for analysis.
Challenges and Considerations: While brain imaging techniques provide valuable
information about the brain, they also present certain challenges and considerations. Factors
such as participant motion, scanner noise, and image distortion can affect data quality and
interpretation. Additionally, ethical considerations, such as privacy and confidentiality, must be
carefully addressed to protect participants' rights and ensure responsible research practices.
In summary, brain imaging techniques play a crucial role in advancing our understanding of
neurodevelopmental disorders and brain function. By combining structural and functional
imaging methods with rigorous data acquisition and analysis, researchers can uncover new
insights into the complexities of the brain and develop more effective interventions for
individuals with neurological conditions.
Advanced Brain Imaging Techniques
In recent years, advancements in brain imaging technology have led to the development of
more sophisticated techniques for studying the brain. These advanced methods provide
researchers with new ways to investigate the intricacies of neurodevelopmental disorders and
brain function.
Diffusion Tensor Imaging (DTI): DTI is a specialized MRI technique that measures the
diffusion of water molecules in brain tissue. By mapping the pathways of white matter tracts,
DTI can reveal structural connectivity in the brain. This information is particularly valuable for
studying conditions such as autism spectrum disorder (ASD), where disruptions in brain
connectivity may contribute to symptoms.
Resting-State Functional MRI (rs-fMRI): rs-fMRI is a type of functional imaging that
measures spontaneous brain activity in the absence of a specific task. By analyzing patterns of
connectivity between different brain regions, rs-fMRI can provide insights into the functional
organization of the brain. This technique has been used to investigate alterations in functional
connectivity in various neurodevelopmental disorders, including ADHD and schizophrenia.
Magnetoencephalography (MEG): MEG is a non-invasive imaging technique that measures
the magnetic fields produced by neuronal activity in the brain. By detecting these magnetic
signals, MEG can precisely localize brain activity with millisecond-level temporal resolution.
MEG is particularly useful for studying the dynamics of brain networks and has applications
in both basic research and clinical diagnosis.
Electroencephalography (EEG): EEG measures electrical activity generated by the brain's
neurons using electrodes placed on the scalp. This technique provides real-time information
about brain function and is widely used in clinical settings to diagnose and monitor various
neurological conditions. EEG is also used in research to study brain dynamics during cognitive
tasks and to assess brain connectivity in neurodevelopmental disorders.
Optical Imaging Techniques: Optical imaging methods, such as functional near-infrared
spectroscopy (fNIRS) and optical coherence tomography (OCT), utilize light to visualize brain
structure and function. These techniques offer advantages such as portability, non-invasiveness,
and high temporal resolution, making them suitable for studying brain development and
function in infants and young children.
Integration of Imaging Modalities: Combining multiple imaging modalities, such as
structural MRI, fMRI, and EEG, allows researchers to gain a more comprehensive
understanding of brain structure and function. Integrated approaches enable researchers to
investigate the complex interactions between different brain networks and to elucidate the
underlying mechanisms of neurodevelopmental disorders.
In summary, advanced brain imaging techniques offer powerful tools for studying
neurodevelopmental disorders and brain function. By leveraging the capabilities of these
technologies, researchers can uncover new insights into the brain's complexities and develop
innovative approaches for diagnosis, treatment, and intervention.

Overview of Machine Learning in Healthcare


Machine learning is a branch of artificial intelligence (AI) that involves developing algorithms
and statistical models that enable computers to learn from and make predictions or decisions
based on data without being explicitly programmed. In healthcare, machine learning algorithms
analyze large datasets to identify patterns, trends, and associations that can inform clinical
decision-making, diagnosis, treatment planning, and patient management.
Applications of Machine Learning in Healthcare
1. Diagnosis and Prognosis: Machine learning algorithms can analyze medical imaging
data, such as X-rays, MRIs, and CT scans, to assist in the diagnosis and prognosis of
various diseases and conditions, including cancer, cardiovascular diseases, and
neurological disorders. These algorithms can detect subtle patterns and abnormalities
in medical images that may not be apparent to the human eye, helping clinicians make
more accurate and timely diagnoses.
2. Predictive Analytics: Machine learning models can predict the likelihood of future
health events or outcomes, such as hospital readmissions, adverse drug reactions, and
disease progression. By analyzing electronic health records (EHRs), genetic data, and
other patient-related data, these models can identify risk factors, stratify patients based
on their risk profiles, and support personalized treatment planning and preventive
interventions.
3. Drug Discovery and Development: Machine learning techniques are increasingly
being used in drug discovery and development to expedite the identification of novel
therapeutic targets, predict drug efficacy and safety, and optimize drug candidates. By
analyzing molecular and genomic data, as well as clinical trial data, machine learning
algorithms can identify potential drug candidates, optimize drug formulations, and
predict patient responses to treatment.
4. Clinical Decision Support: Machine learning algorithms can provide clinicians with
real-time decision support by analyzing patient data, medical literature, and clinical
guidelines to assist in diagnosis, treatment selection, and care management. These
algorithms can integrate with electronic health record systems to generate personalized
recommendations, alert clinicians to potential errors or adverse events, and improve
clinical workflow efficiency.
5. Healthcare Operations and Management: Machine learning algorithms can optimize
healthcare operations and management by analyzing administrative and operational
data to improve resource allocation, patient flow, and operational efficiency. These
algorithms can forecast patient demand, optimize staffing levels, and streamline
hospital workflows, leading to cost savings and improved patient outcomes.
Challenges and Considerations
While machine learning holds great promise for transforming healthcare, several challenges
and considerations must be addressed:
1. Data Quality and Interoperability: Machine learning algorithms rely on high-quality,
standardized data for training and validation. Ensuring data quality, completeness, and
interoperability across disparate sources, such as EHRs, medical imaging systems, and
wearable devices, remains a significant challenge.
2. Ethical and Regulatory Considerations: Machine learning algorithms must adhere to
ethical principles, privacy regulations, and healthcare standards to protect patient
privacy, confidentiality, and security. Regulatory compliance, transparency, and
accountability are essential considerations in the development and deployment of
machine learning solutions in healthcare.
3. Clinical Validation and Adoption: Machine learning models must undergo rigorous
clinical validation and evaluation to demonstrate their safety, efficacy, and clinical
utility before widespread adoption in clinical practice. Clinician acceptance, training,
and integration into existing workflows are critical factors influencing the successful
implementation of machine learning solutions in healthcare.
In summary, machine learning has the potential to revolutionize healthcare by enabling data-
driven decision-making, personalized medicine, and improved patient outcomes. However,
addressing the challenges and considerations associated with machine learning in healthcare is
essential to realize its full potential and ensure its responsible and ethical use in clinical
practice.
Feature Selection Methods in Machine Learning
Introduction to Feature Selection
Feature selection is a critical step in the machine learning pipeline that involves identifying the
most relevant and informative features from a dataset while discarding irrelevant or redundant
ones. The goal of feature selection is to improve model performance, reduce overfitting,
enhance interpretability, and accelerate computation by selecting a subset of features that
contribute most to the predictive task.
Types of Feature Selection Methods
1. Filter Methods: Filter methods evaluate the relevance of features independently of the
machine learning model and select features based on statistical measures or heuristics
(a brief scikit-learn sketch of all three families follows this list). Common techniques include:
• Univariate Feature Selection: This method evaluates each feature individually
based on statistical tests, such as chi-square, mutual information, or correlation
coefficients, and selects features with the highest scores.
• Variance Thresholding: Variance thresholding removes features with low
variance, assuming that features with little variation across samples are less
informative.
• Feature Importance Ranking: Techniques like Random Forest or Gradient
Boosting can rank features based on their importance in predicting the target
variable.
2. Wrapper Methods: Wrapper methods evaluate the performance of a machine learning
model using different subsets of features and select the subset that yields the best model
performance. These methods are computationally intensive but tend to yield better
results. Examples include:
• Forward Selection: Forward selection starts with an empty set of features and
iteratively adds the most informative features one at a time until a stopping
criterion is met.
• Backward Elimination: Backward elimination begins with all features and
removes the least informative features one at a time until a stopping criterion is
met.
• Recursive Feature Elimination (RFE): RFE recursively removes features and
evaluates model performance until the optimal subset of features is identified.
3. Embedded Methods: Embedded methods perform feature selection as part of the
model training process. These methods incorporate feature selection directly into the
learning algorithm, optimizing both feature selection and model fitting simultaneously.
Examples include:
• L1 Regularization (Lasso): L1 regularization adds a penalty term to the loss
function that encourages sparsity in the coefficient weights, effectively
performing feature selection.
• Tree-based Methods: Decision tree-based algorithms, such as Random Forest
or Gradient Boosting, inherently perform feature selection by selecting the most
informative features at each split node.
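To make the three families concrete, below is a minimal scikit-learn sketch; the synthetic dataset, the choice of estimators, and k = 10 are illustrative assumptions rather than recommendations.

```python
# Minimal sketch of the three feature-selection families in scikit-learn.
# The synthetic data and all hyperparameters here are illustrative.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif, RFE
from sklearn.linear_model import LassoCV, LogisticRegression

X, y = make_classification(n_samples=200, n_features=50, random_state=0)

# Filter: score each feature independently, keep the 10 best.
X_filter = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

# Wrapper: recursively drop the weakest features of a fitted model.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10).fit(X, y)
X_wrapper = X[:, rfe.support_]

# Embedded: L1 regularization shrinks irrelevant coefficients to zero.
# (LassoCV is a regressor; it is applied to the 0/1 labels here purely
# to illustrate the sparsity idea.)
lasso = LassoCV(cv=5).fit(X, y)
X_embedded = X[:, lasso.coef_ != 0]
```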
Considerations in Feature Selection
• Dimensionality Reduction: Feature selection helps reduce the dimensionality of the
dataset by selecting a subset of the most informative features, which can improve model
performance and reduce computational complexity.
• Model Interpretability: Selecting a smaller set of relevant features can enhance the
interpretability of the machine learning model by focusing on the most influential
predictors.
• Overfitting Mitigation: Feature selection helps prevent overfitting by reducing the
likelihood of the model learning noise or irrelevant patterns from the data.
• Computational Efficiency: By selecting a subset of features, feature selection methods
can accelerate model training and inference, making the machine learning pipeline
more computationally efficient.
In summary, feature selection methods play a crucial role in machine learning by identifying
the most informative features from a dataset, improving model performance, interpretability,
and computational efficiency. Understanding the strengths and limitations of different feature
selection techniques is essential for effectively applying machine learning algorithms to real-
world problems.
Figure: General process of ASD studies using fMRI data and machine learning (taking FC
features for example). ASD, autism spectrum disorder; fMRI, functional magnetic resonance
imaging; FC, functional connectivity.
Additional Feature Selection Techniques
1. Sequential Feature Selection: Sequential feature selection methods search through
different feature subsets, evaluating their performance using a chosen criterion.
Examples include Sequential Forward Selection (SFS) and Sequential Backward
Selection (SBS), which iteratively add or remove features based on model performance
until an optimal subset is found.
2. Principal Component Analysis (PCA): PCA is a dimensionality reduction technique
that transforms the original features into a new set of orthogonal variables called
principal components. While PCA is not strictly a feature selection method, it can be
used to reduce the dimensionality of the dataset by selecting a subset of principal
components that capture the most variance in the data (a short sketch follows this list).
3. Genetic Algorithms: Genetic algorithms mimic the process of natural selection to
evolve a population of potential solutions to a given optimization problem. In the
context of feature selection, genetic algorithms generate and evaluate different feature
subsets, selecting those that yield the best performance based on a fitness function.
4. Sparse Regression Models: Sparse regression models, such as LASSO (Least
Absolute Shrinkage and Selection Operator) and Elastic Net, introduce sparsity in the
coefficient weights of the regression model, effectively performing feature selection by
shrinking irrelevant coefficients to zero.
5. Recursive Feature Addition: Recursive Feature Addition (RFA) is a variant of
recursive feature elimination where features are added to the model one at a time based
on their individual contributions to model performance. This method can be useful for
identifying the most influential features in the dataset.
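To illustrate the PCA option from point 2 above, the following scikit-learn sketch keeps just enough components to explain most of the variance; the 95% threshold and the synthetic data are arbitrary illustrative choices.

```python
# Sketch: dimensionality reduction with PCA, keeping enough components
# to explain 95% of the total variance. Data and threshold are illustrative.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = make_classification(n_samples=200, n_features=50, random_state=0)
X_scaled = StandardScaler().fit_transform(X)  # PCA is sensitive to scale
pca = PCA(n_components=0.95)                  # float: target explained variance
X_reduced = pca.fit_transform(X_scaled)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```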
Evaluation Metrics for Feature Selection
• Model Performance Metrics: Evaluation metrics such as accuracy, precision, recall,
F1-score, and area under the receiver operating characteristic curve (ROC-AUC) can
be used to assess the performance of the machine learning model trained using the
selected features.
• Computational Efficiency: In addition to model performance, it's essential to consider
the computational efficiency of feature selection methods, especially for large datasets.
Methods that require less computational resources or have faster execution times may
be preferred in practice.
• Stability of Feature Rankings: The stability of feature rankings across multiple
iterations or subsets of the data can indicate the reliability of the selected features.
Methods that produce consistent rankings across different samples or partitions of the
dataset are generally more robust.
Conclusion
Feature selection is a fundamental step in the machine learning workflow that can significantly
impact model performance, interpretability, and computational efficiency. By selecting the
most informative features from the dataset while discarding irrelevant or redundant ones,
feature selection methods help improve model generalization and reduce overfitting.
Understanding the various feature selection techniques and their implications is essential for
practitioners to build effective and reliable machine learning models for real-world
applications.
Evaluation of Traditional Machine Learning Classifiers
Introduction to Classifier Evaluation
In machine learning, evaluating the performance of classifiers is crucial to assess their
effectiveness in solving a particular task. Various metrics and techniques are employed to
measure how well a classifier generalizes to unseen data and discriminates between different
classes.
Common Evaluation Metrics
1. Accuracy: Accuracy is perhaps the most intuitive metric, representing the proportion
of correctly classified instances out of the total instances in the dataset. While easy to
understand, accuracy may not be the best measure for imbalanced datasets, where one
class dominates the others.
2. Precision and Recall: Precision measures the proportion of true positive predictions
out of all positive predictions made by the classifier, while recall measures the
proportion of true positive predictions out of all actual positive instances in the dataset.
These metrics are particularly useful when dealing with imbalanced datasets, where the
positive class is rare.
3. F1-Score: The F1-score is the harmonic mean of precision and recall and provides a
balanced measure of a classifier's performance. It is especially useful when there is an
uneven class distribution or when both precision and recall are important.
4. Confusion Matrix: A confusion matrix provides a more detailed breakdown of the
classifier's performance by showing the counts of true positive, true negative, false
positive, and false negative predictions. From the confusion matrix, other metrics such
as accuracy, precision, recall, and F1-score can be computed.
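All four metrics above can be derived from a classifier's predictions; the following scikit-learn sketch uses a small toy example so the values can be checked by hand.

```python
# Sketch: computing the metrics above from toy labels and predictions.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

print(accuracy_score(y_true, y_pred))    # 6 of 8 correct = 0.75
print(precision_score(y_true, y_pred))   # TP / (TP + FP) = 3/4
print(recall_score(y_true, y_pred))      # TP / (TP + FN) = 3/4
print(f1_score(y_true, y_pred))          # harmonic mean = 0.75
print(confusion_matrix(y_true, y_pred))  # layout: [[TN, FP], [FN, TP]]
```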
Cross-Validation Techniques
1. K-Fold Cross-Validation: K-fold cross-validation involves partitioning the dataset
into k equal-sized folds, using k-1 folds for training the classifier and the remaining
fold for validation. This process is repeated k times, with each fold serving as the
validation set exactly once. The final performance metrics are averaged over all
iterations.
2. Stratified Cross-Validation: Stratified cross-validation ensures that each fold contains
approximately the same proportion of instances from each class as the original dataset.
This helps mitigate bias in the evaluation process, especially for imbalanced datasets.
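A minimal scikit-learn sketch of both schemes follows; the random forest baseline and the imbalanced toy dataset are illustrative.

```python
# Sketch: plain k-fold vs. stratified k-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=300, weights=[0.8], random_state=0)
clf = RandomForestClassifier(random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # preserves class ratios

print(cross_val_score(clf, X, y, cv=kf).mean())   # averaged over the 5 folds
print(cross_val_score(clf, X, y, cv=skf).mean())
```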
Receiver Operating Characteristic (ROC) Curve and Area Under the Curve (AUC)
The ROC curve is a graphical representation of the classifier's performance across different
thresholds for binary classification problems. It plots the true positive rate (TPR) against the
false positive rate (FPR) at various threshold settings. The area under the ROC curve (AUC)
provides a single scalar value summarizing the classifier's discriminative ability, with higher
AUC values indicating better performance.
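In code, the ROC points and AUC can be obtained directly from a classifier's predicted scores; a small scikit-learn sketch with toy values:

```python
# Sketch: ROC curve points and AUC from predicted class-1 probabilities.
from sklearn.metrics import roc_auc_score, roc_curve

y_true  = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # toy probabilities for class 1

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one point per threshold
print(roc_auc_score(y_true, y_score))              # single scalar summary
```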
Evaluating traditional machine learning classifiers involves assessing their performance using
various metrics such as accuracy, precision, recall, F1-score, and AUC. Cross-validation
techniques help estimate the generalization performance of classifiers on unseen data, while
the ROC curve provides insights into the trade-off between true positive and false positive
rates. By carefully evaluating classifiers using appropriate metrics and techniques, practitioners
can make informed decisions about model selection and optimization for their specific
application domains.
In addition to the common evaluation metrics mentioned earlier, other important metrics
include:
1. Receiver Operating Characteristic (ROC) Curve: The ROC curve visualizes the
trade-off between the true positive rate (sensitivity) and the false positive rate (1 -
specificity) across different threshold values. It provides a comprehensive overview of
the classifier's performance across all possible classification thresholds.
2. Area Under the ROC Curve (AUC): AUC summarizes the performance of a classifier
by calculating the area under the ROC curve. It ranges from 0 to 1, where a value closer
to 1 indicates better discrimination between the classes. AUC is particularly useful for
evaluating classifiers in imbalanced datasets.
3. Precision-Recall Curve: While the ROC curve is effective for balanced datasets, the
precision-recall curve provides insights into classifier performance in imbalanced
datasets. It plots precision against recall at various threshold values, highlighting the
trade-off between precision and recall.
4. Mean Squared Error (MSE) and Mean Absolute Error (MAE): These metrics are
commonly used for regression tasks. MSE measures the average squared difference
between the predicted and actual values, while MAE measures the average absolute
difference. Lower values indicate better performance.
5. R-Squared (R2) Score: R-squared measures the proportion of the variance in the
dependent variable that is explained by the independent variables in a regression model.
It ranges from 0 to 1, with higher values indicating better fit.
Cross-validation techniques such as leave-one-out cross-validation (LOOCV) and stratified k-
fold cross-validation are commonly used to assess the robustness of a classifier's performance.
LOOCV involves training the model on all but one sample and testing it on the left-out sample,
repeating this process for each sample. Stratified k-fold cross-validation ensures that each fold
maintains the class distribution of the original dataset.
Moreover, techniques such as grid search and random search are employed to optimize
hyperparameters and improve classifier performance. Grid search exhaustively searches
through a predefined set of hyperparameters, while random search samples hyperparameters
randomly from a distribution.
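The contrast between the two strategies can be sketched in scikit-learn as follows; the SVM parameter ranges and n_iter=10 are illustrative.

```python
# Sketch: exhaustive grid search vs. random search over SVM hyperparameters.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)
grid.fit(X, y)                                 # tries all 9 combinations

rand = RandomizedSearchCV(SVC(), {"C": loguniform(1e-2, 1e2),
                                  "gamma": loguniform(1e-3, 1e0)},
                          n_iter=10, cv=5, random_state=0)
rand.fit(X, y)                                 # samples 10 random settings
print(grid.best_params_, rand.best_params_)
```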
Overall, a comprehensive evaluation of traditional machine learning classifiers involves
considering multiple metrics, cross-validation techniques, and hyperparameter optimization
methods to ensure robust performance across different datasets and applications.
Figure: Detection of Autism Spectrum Disorder in Children Using Machine Learning Techniques
Deep Learning Approaches for ASD Detection
Deep learning approaches have gained traction in ASD detection due to their ability to
automatically learn intricate patterns from complex data. Here are some key points regarding
deep learning in ASD detection:
1. Convolutional Neural Networks (CNNs): CNNs are widely used in image-based
tasks, including neuroimaging analysis. They can automatically extract hierarchical
features from brain images, enabling effective classification of ASD and control groups.
CNN architectures like AlexNet, VGG, and ResNet have been adapted for ASD
detection tasks.
2. Recurrent Neural Networks (RNNs): RNNs are suitable for sequential data, such as
time-series data or sequences of brain activity. They have been applied to fMRI data
analysis for capturing temporal dependencies and identifying patterns associated with
ASD. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) are popular
RNN variants used in this context.
3. Autoencoders: Autoencoders are unsupervised learning models that aim to reconstruct
the input data at the output layer. They can learn compact representations of brain
images by encoding the essential features while discarding noise or irrelevant
information. Variants like Variational Autoencoders (VAEs) and Denoising
Autoencoders (DAEs) have been explored for ASD detection.
4. Graph Convolutional Networks (GCNs): GCNs are designed to handle data
structured as graphs, making them suitable for analyzing brain connectivity networks
derived from neuroimaging data. By capturing the complex interactions between brain
regions, GCNs can effectively differentiate between ASD and control subjects based on
functional connectivity patterns.
5. Transfer Learning: Transfer learning leverages pre-trained deep learning models on
large-scale datasets to improve performance on smaller, domain-specific tasks. Fine-
tuning pre-trained models on neuroimaging data from ASD studies can enhance
classification accuracy and reduce the need for large annotated datasets.
6. Ensemble Methods: Ensemble methods combine predictions from multiple deep
learning models to improve robustness and generalization. Techniques like bagging,
boosting, and stacking can be applied to deep learning models for ASD detection,
leveraging diverse architectures and training strategies.
7. Explainability and Interpretability: Despite their remarkable performance, deep
learning models often lack interpretability, making it challenging to understand the
underlying features contributing to ASD detection. Techniques such as attention
mechanisms, saliency maps, and model-agnostic interpretation methods help elucidate
the decision-making process of deep learning models.
Overall, deep learning approaches offer promising avenues for ASD detection by leveraging
the inherent complexity of neuroimaging data. Their ability to learn intricate patterns and
extract meaningful features makes them valuable tools in advancing our understanding of ASD
and improving diagnostic accuracy.
8. Functional Connectivity Analysis: Deep learning models can analyze functional
connectivity networks derived from resting-state fMRI data to identify aberrant
connectivity patterns associated with ASD. By learning representations of functional
brain networks, these models can capture subtle differences in neural connectivity
between ASD and control groups.
9. Brain Morphometry: Deep learning techniques can be applied to analyze structural
MRI data for assessing differences in brain morphology between individuals with ASD
and neurotypical individuals. Convolutional neural networks (CNNs) can extract
features from brain images to identify anatomical abnormalities or variations associated
with ASD.
10. Multi-Modal Fusion: Integrating information from multiple neuroimaging modalities,
such as structural MRI, functional MRI, and diffusion tensor imaging (DTI), can
provide a comprehensive view of brain function and structure in ASD. Deep learning
models capable of fusing multi-modal data can leverage complementary information to
enhance classification accuracy and improve understanding of ASD-related neural
signatures.
11. Longitudinal Analysis: Deep learning approaches enable longitudinal analysis of
neuroimaging data to investigate developmental trajectories and changes over time in
individuals with ASD. By modeling temporal dynamics in brain structure and function,
these models can uncover progressive alterations associated with ASD onset,
progression, or treatment response.
12. Cross-Dataset Generalization: Deep learning models trained on one dataset may
suffer from limited generalization to unseen datasets due to variations in acquisition
protocols, demographics, or clinical characteristics. Techniques like domain adaptation,
adversarial training, and meta-learning can enhance the transferability of deep learning
models across different datasets, improving their robustness and reliability in real-world
clinical settings.
13. Clinical Decision Support Systems: Deep learning models can serve as components
of clinical decision support systems for aiding clinicians in ASD diagnosis and
prognosis. By integrating neuroimaging data with clinical assessments and genetic
information, these systems can provide personalized insights and recommendations for
patient management and treatment planning.
14. Ethical and Privacy Considerations: The deployment of deep learning models in
ASD detection raises ethical concerns regarding data privacy, informed consent, and
algorithmic bias. It is essential to ensure transparency, accountability, and fairness in
the development and deployment of deep learning systems for ASD diagnosis,
prioritizing patient autonomy and well-being.
These points highlight the diverse applications and implications of deep learning in advancing
ASD research and clinical practice. Further exploration and refinement of deep learning
techniques hold the potential to revolutionize ASD diagnosis, treatment, and understanding.
Figure: Schemes of conventional ML algorithms commonly used in MRI-based studies to
diagnose ASD
(A) SVM: support vector machine;
(B) RF: random forest;
(C) DT: decision tree;
(D) KNN: k nearest neighbour.
Cleaning and Combining Data
In traditional diagnosis, doctors gather many kinds of information, such as age, gender, and
specific behaviors. This data is not always perfect: values can be missing, noisy, or incomplete.
To fix this, we applied a data cleaning process. We filled in missing values using scikit-learn's
SimpleImputer, which replaces each missing value with the mean of that feature across the
available samples. After cleaning, we combined the two datasets, ABIDE I and ABIDE II, into
one larger dataset; a sketch of this step follows.
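A minimal sketch of this cleaning-and-merging step is given below; the CSV file names and column handling are hypothetical placeholders for the actual Recon-all feature tables described next.

```python
# Sketch of the cleaning/merge step; file names are hypothetical.
import pandas as pd
from sklearn.impute import SimpleImputer

abide1 = pd.read_csv("abide_I_features.csv")    # hypothetical path
abide2 = pd.read_csv("abide_II_features.csv")   # hypothetical path

combined = pd.concat([abide1, abide2], ignore_index=True)

# Replace each missing value with the mean of its feature column.
numeric_cols = combined.select_dtypes("number").columns
imputer = SimpleImputer(strategy="mean")
combined[numeric_cols] = imputer.fit_transform(combined[numeric_cols])
```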
We took measurements from Recon-all and made different sets of features. For example, F1
had volumes of different parts of the brain's outer layer, while F2 to F5 had volumes of different
brain regions, surface areas, cortical thicknesses, and mean curvatures. F6 included all these
measurements, giving us a total of 342 features.
Preparing Data for Models
For our machine learning models to work, we needed to turn text into numbers, so we encoded
"gender" numerically, with 0 for female and 1 for male. To keep our models fair, we also
standardized the data, rescaling each feature to a similar scale so that features are easier to
compare; a sketch follows.
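Continuing the sketch, the encoding and standardization step might look as follows; the column names "gender" and "label" are assumptions about the table layout.

```python
# Sketch: encode gender as 0/1 and standardize all feature columns.
# Column names are hypothetical; `combined` continues the previous sketch.
from sklearn.preprocessing import StandardScaler

combined["gender"] = combined["gender"].map({"F": 0, "M": 1})

feature_cols = [c for c in combined.columns if c != "label"]  # "label" assumed target
combined[feature_cols] = StandardScaler().fit_transform(combined[feature_cols])
```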
Splitting the Data
We split our combined dataset into two parts: 80% for training and 20% for testing. This helps
us check how well our models work. We also used 10-fold cross-validation: the training data is
divided into 10 equal parts, and the model is trained 10 times, each time on nine of the parts,
with the remaining part used for validation. This helps ensure our models are stable and
accurate.
The test data from KAU was kept separate from training to see how well the model works on
new, unseen data. This way, we could check whether our model can handle different situations.
We also fixed the random_state parameter to control randomness when splitting the data, as
sketched below.
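A sketch of the split and cross-validation setup; the fixed random_state value of 42 and the random forest baseline are illustrative choices, and `combined` and `feature_cols` continue from the sketches above.

```python
# Sketch: 80/20 train/test split with a fixed random_state, then
# 10-fold cross-validation on the training portion only.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X = combined[feature_cols].values
y = combined["label"].values                 # hypothetical target column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)  # 42 is illustrative

clf = RandomForestClassifier(random_state=42)
scores = cross_val_score(clf, X_train, y_train, cv=10)  # 10-fold CV
print(scores.mean(), scores.std())
```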
Choosing the Best Features
With so many features, it is easy for models to get confused or to overfit, so we used feature
selection to pick out the most important features. This helps our models focus on what is really
important and makes them work better.
One method we used is Recursive Feature Elimination with Cross-Validation (RFECV). It works
like a game of elimination: less important features are removed step by step until the best subset
remains. We used scikit-learn's RFECV implementation to do this efficiently, and it helped us
figure out which features are most useful for our models; a minimal sketch follows.
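The sketch below shows the scikit-learn RFECV call described above; the random forest estimator, the step size, and the 5-fold inner cross-validation are illustrative choices, with X_train and y_train continuing from the split above.

```python
# Sketch: recursive feature elimination with cross-validation (RFECV).
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

selector = RFECV(RandomForestClassifier(random_state=42),
                 step=1,                       # drop one feature per round
                 cv=StratifiedKFold(5),
                 scoring="accuracy")
selector.fit(X_train, y_train)
print(selector.n_features_)                    # size of the chosen subset
X_train_sel = selector.transform(X_train)      # keep only selected features
```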
Figure: The flowchart of the RFECV algorithm.
Boruta is an algorithm originally developed in R, but now available in Python as the BorutaPy
library. It works by creating what are called "shadow features" from the original dataset,
essentially by duplicating and shuffling the columns. These shadow features are then combined
with the original ones to make a new dataset. Using this new dataset, a Random Forest (RF)
model is trained, and the importance of each feature is measured using something called the "Z
value." Higher Z values mean more important features.
Now, to make sure we're not getting rid of good features by mistake, Boruta uses a method
based on binomial distribution. It repeats the process many times, each time considering
whether a feature is a "hit" or not. Based on these repetitions, Boruta decides which features
should be kept and which should be thrown out.
Figure: The flowchart of the Boruta algorithm.
In more detail, Boruta operates on two principles: shadow features and binomial testing.
Initially, it creates "shadow features" by duplicating and shuffling the columns of the original
dataset. These shadows are then combined with the original features to create a new dataset.
An RF model is trained on this new dataset, and feature importance is determined iteratively
using the "Z value", which measures the mean accuracy reduction; higher Z values indicate
more significant features. When a feature's importance surpasses a predefined threshold, it is
counted as a "hit". To avoid discarding potentially useful features by chance, Boruta employs
a binomial-distribution approach: by repeating the process and considering the binary outcome
of "hit" or "not hit" for each feature in each repetition, Boruta determines which features
should be retained and which should be discarded.
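A sketch of the procedure with the BorutaPy library follows; the estimator settings and the iteration cap are illustrative.

```python
# Sketch: Boruta feature selection with BorutaPy; settings are illustrative.
import numpy as np
from boruta import BorutaPy
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_jobs=-1, max_depth=5, random_state=42)
boruta = BorutaPy(rf, n_estimators="auto", max_iter=100, random_state=42)
boruta.fit(np.asarray(X_train), np.asarray(y_train))  # expects numpy arrays

print(boruta.support_)    # True where a feature beat its shadow copies
X_train_boruta = boruta.transform(np.asarray(X_train))
```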
K-Nearest Neighbor (KNN) is a nonparametric, instance-based learner; Support Vector Machine
(SVM) is a large-margin classifier that maps data to a higher-dimensional space for better
classification; Random Forest (RF) is a tree-based classifier that breaks the set of samples into
a set of covering decision rules; and Multilayer Perceptron (MLP) is motivated by human brain
anatomy. These classifiers are briefly explained below.

Linear Discriminant Analysis (LDA). LDA is a statistical method that finds a linear combination
of features which separates the dataset into its corresponding classes. The resulting combination
is used as a linear classifier (Jain and Huang 2004). LDA maximizes linear separability by
maximizing the ratio of between-class variance to within-class variance for any particular
dataset. Let $\omega_1, \omega_2, \dots, \omega_L$ be the classes and $N_1, N_2, \dots, N_L$ the number of examples in
each class, respectively. Let $M_1, M_2, \dots, M_L$ and $M$ be the means of the classes and the grand
mean, respectively. Then the within-class and between-class scatter matrices $S_w$ and $S_b$ are
defined as

$$S_w = \sum_{i=1}^{L} P(\omega_i)\,\Sigma_i, \qquad S_b = \sum_{i=1}^{L} P(\omega_i)\,(M_i - M)(M_i - M)^T,$$

where $P(\omega_i)$ is the prior probability and $\Sigma_i$ represents the covariance matrix of class $\omega_i$.
Support Vector Machine (SVM). The SVM classifier segregates samples into their corresponding
classes by constructing decision boundaries known as hyperplanes (Vapnik 2013). It implicitly
maps the dataset into a higher-dimensional feature space and constructs a linear separator with
maximal margin between the classes in that space. For a training set of examples
$\{(x_i, y_i),\ i = 1, \dots, l\}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, +1\}$, a new example $x$ is classified by

$$f(x) = \operatorname{sign}\!\left(\sum_{i=1}^{l} \alpha_i\, y_i\, K(x_i, x) + b\right),$$

where $\alpha_i$ are the Lagrange multipliers of the dual optimization problem separating the two
hyperplanes, $K(\cdot\,,\cdot)$ is a kernel function, and $b$ is the threshold parameter of the hyperplane.
Random Forest (RF). Random Forest belongs to the family of decision trees and is capable of
performing both classification and regression tasks. A classification tree is composed of nodes
and branches that break the set of samples into a set of covering decision rules (Mitchell 1997).
RF is an ensemble tree classifier consisting of many decision trees, and its output is the mode
of the classes output by the individual decision trees.
Figure 5. An architecture of Multilayer Perceptron (MLP).

Multilayer Perceptron (MLP). MLP belongs to the family of neural networks, which consist of
interconnected groups of artificial neurons, called nodes, and connections for processing
information, called edges (Jain, Mao, and Moidin Mohiuddin 1996). A neural network consists
of an input layer, hidden layers, and an output layer. The input layer transmits the inputs, in the
form of a feature vector with weighted values, to the hidden layer. The hidden layer, composed
of activation units or transfer functions (Gardner and Dorling 1998), carries the feature vector
from the first layer with weighted values and performs calculations to produce its output. The
output layer is made up of activation units carrying the weighted output of the hidden layer and
predicts the corresponding class. An example of an MLP with 2 hidden layers is shown in
Figure 5. A multilayer perceptron is described as fully connected, with each node connected to
every node in the next and previous layers. MLP utilizes back-propagation (Hecht-Nielsen 1992)
during training to reduce the error function; the error is reduced by updating the weight values
in each layer. For a training set of examples $X = (x_1, x_2, x_3, \dots, x_m)$ and output $y \in \{0, 1\}$, a
new test example $x$ is classified by the following function:

$$\hat{y} = f\!\left(\sum_{j} w_j\, x_j + b\right),$$

where $f$ is a non-linear activation function, $w_j$ is the weight multiplied by the inputs in each
layer $j$, and $b$ is the bias term.
Figure 6. Results of the 5-fold cross-validation scheme.

K-Nearest Neighbor (KNN). KNN is an instance-based, non-parametric classifier that finds the
training samples closest to a new example based on a target function (Khan 2013c; Acuna and
Rodriguez 2004). Based upon the value of the target function, it infers the value of the output
class. The probability of an unknown sample $q$ belonging to class $y$ can be calculated as follows:

$$p(y \mid q) = \frac{\sum_{k \in K:\, k_y = y} 1 / d(k, q)}{\sum_{k \in K} 1 / d(k, q)},$$

where $K$ is the set of nearest neighbors, $k_y$ is the class of $k$, and $d(k, q)$ is the Euclidean distance
of $k$ from $q$.

Results and Evaluation
We chose to evaluate the performance of our framework using the same evaluation criteria
proposed by Heinsfeld et al. (Heinsfeld et al. 2018), who evaluated their framework on the basis
of k-fold cross-validation and leave-one-site-out classification schemes (Bishop 2006). We have
likewise evaluated the results of the above-mentioned classifiers under these schemes.
Figure 7. Schematic overview of the leave-one-site-out classification scheme.

Figure 8. Results of the leave-one-site-out classification scheme.
k-Fold Cross Validation Scheme. Cross-validation is a statistical technique for evaluating and
comparing learning algorithms by dividing the dataset into two segments: one used to learn or
train the model and the other used to validate it (Kohavi 1995). In the k-fold cross-validation
scheme, the dataset is segmented into k equally sized portions, or folds. Subsequently, k
iterations of learning and validation are performed; within each iteration, (k − 1) folds are used
for training and the remaining fold is used for validation.
Leave-one-site-out Classification Scheme. In this validation scheme, data from one site is used
for testing, to evaluate the performance of the model, while data from the remaining sites is
used for training. This procedure is represented in Figure 7. The framework achieved average
accuracies of 56.21%, 51.34%, 54.61%, 56.26% and 52.16% for linear discriminant analysis
(LDA), Support Vector Machine (SVM), Random Forest (RF), Multilayer Perceptron (MLP)
and K-nearest neighbor (KNN), respectively, for ASD identification using the leave-one-site-out
classification scheme. Results are tabulated in Table 3. Figure 8 presents the recognition result
for each site using the leave-one-site-out classification method. It is interesting to observe that,
across all sites, the maximum ASD classification accuracy is achieved on the USM site data,
with an accuracy of 79.21% by the 3-NN classifier. The second highest accuracy is achieved by
LDA, with an accuracy of 76.32% on the CALTECH site data. This result is consistent with the
result obtained by Heinsfeld et al. (Heinsfeld et al. 2018). The leave-one-site-out results of all
classifiers show variation across sites, which suggests that this variation could be due to
differences in the number of samples available for the training phase. Furthermore, there is
variability in the data across sites; refer to Table 1 for the structural MRI acquisition parameters
used across sites in the ABIDE dataset (Kucharsky Hiess et al. 2015).
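This scheme maps directly onto scikit-learn's LeaveOneGroupOut splitter; the sketch below assumes a `site` array recording each subject's acquisition site, aligned with X and y.

```python
# Sketch: leave-one-site-out evaluation; `site` is a hypothetical array
# giving each subject's acquisition site, aligned with X and y.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

scores = cross_val_score(LinearDiscriminantAnalysis(), X, y,
                         groups=site, cv=LeaveOneGroupOut())
print(scores)  # one accuracy per held-out site, as in Figure 8
```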
Transfer Learning Based Approach
Results obtained with conventional machine learning algorithms, with and without feature
selection, are presented in Section 4.1.3. It can be observed that the average recognition
accuracy for autism detection on the ABIDE dataset remains in the range of 52% to 55% for
the different conventional machine learning algorithms; refer to Table 3. In order to achieve
better recognition accuracy, and to test the potential of a more recent machine learning
technique, i.e. deep learning (LeCun, Bengio, and Hinton 2015), we employed a transfer
learning approach using the VGG16 model (Simonyan and Zisserman 2014). Generally, in
machine learning algorithms the training and test data are drawn from the same distribution.
Transfer learning, on the contrary, allows the distributions used in training and testing to differ
(Pan and Yang 2010). The motivation for employing transfer learning comes from the fact that
training a deep learning network from scratch requires a large amount of data (LeCun, Bengio,
and Hinton 2015), whereas in our case the ABIDE dataset (Di Martino et al. 2014) contains
labeled samples from only 1112 subjects (539 autism cases and 573 healthy control
participants). Transfer learning allows partial retraining of an already trained model (usually
re-training the last layer) (Pan and Yang 2010) while keeping all other layers (trained weights)
intact; these weights were trained on millions of examples for a semantically similar task. We
used the transfer learning approach in our study because we wanted to benefit from a deep
learning model that has achieved high accuracy on visual recognition tasks, i.e. the ImageNet
Large-Scale Visual Recognition Challenge (ILSVRC) (Russakovsky et al. 2015), and is
available for research purposes. A few of the well-known deep learning architectures that
emerged from ILSVRC are GoogleNet (a.k.a. Inception V1) from Google (Szegedy et al. 2015)
and VGGNet by Simonyan and Zisserman (Simonyan and Zisserman 2014). Both of these
architectures are from the family of Convolutional Neural Networks, or CNNs, as they employ
convolution operations to analyze visual input, i.e. images.
Figure 10. Transfer learning results using VGG16 architecture: (A) training accuracy vs
validation accuracy and (B) training loss vs validation loss.
We chose to work with VGGNet in its 16-convolutional-layer configuration (VGG16)
(Simonyan and Zisserman 2014). It is one of the most appealing frameworks because of its
uniform architecture and its robustness on visual recognition tasks; refer to Figure 9. Its
pre-trained model is freely available for research purposes, making it a good choice for transfer
learning. The VGG16 architecture (refer to Figure 9) takes a 224 × 224 image, with a receptive
field size of 3 × 3, a convolution stride of 1 pixel, and padding of 1 (for the 3 × 3 receptive
field). It uses the rectified linear unit (ReLU) (Nair and Hinton 2010) as its activation function.
Classification is done using a softmax classification layer with x units (one per class to
recognize). The other layers are convolution layers and feature pooling layers. Convolution
layers use filters that are convolved with the input image to produce activation or feature maps.
Feature pooling layers are used in the architecture to reduce the size of the image representation,
making computation more efficient and controlling overfitting.
Results
As mentioned earlier, this study was performed using structural MRI (s-MRI) scans from the
Autism Brain Imaging Data Exchange (ABIDE-I) dataset
(http://fcon_1000.projects.nitrc.org/indi/abide/abide_I.html) (Di Martino et al. 2014). The
ABIDE-I dataset spans 17 international sites, with a total of 1112 subjects or samples (539
autism cases and 573 healthy control participants). MRI scans in the ABIDE-I dataset are
provided in the Neuroimaging Informatics Technology Initiative (NIfTI) file format (Cox et al.
2003), where images represent the projection of an anatomical volume onto an image plane.
Initially, all anatomical scans were converted from NIfTI to Tagged Image File Format (TIFF
or TIF), a compressionless format (Guarneri, Vaccaro, and Guarneri 2008), which created a
dataset of approximately 200k TIF images. We did not use all of the TIF images for transfer
learning, as the beginning and trailing portions of the images extracted from individual scans
contain clipped/cropped portions of the region of interest, i.e. the corpus callosum. Thus, we
were left with approximately 100k TIF images with a visibly complete portion of the corpus
callosum. For transfer learning, VGGNet with 16 convolutional layers (VGG16) was used
(Simonyan and Zisserman 2014) (refer to Section 4.2 for an explanation of the VGG16
architecture). The last fully connected dense layer of the pre-trained VGG16 model was
replaced and re-trained with the images extracted from the ABIDE-I dataset. We trained the
last dense layer using a softmax activation function and the ADAM optimizer
(Kingma and Ba 2014) with a learning rate of 0.01. 80% of the TIF images extracted from the
MRI scans were used for training, while the remaining 20% were used for validation. With the
above-mentioned parameters, the proposed transfer learning approach achieved an autism
detection accuracy of 66%. Model accuracy and loss curves are shown in Figure 10. In
comparison with the conventional machine learning methods (refer to Table 3 for their results),
the transfer learning approach gained around 10% in ASD detection.
Conclusion and Future Work
Our research study shows the potential of machine learning (conventional and deep learning)
algorithms for the development of neuroimaging data understanding. We showed how machine
learning algorithms can be applied to structural MRI data for the automatic detection of
individuals facing Autism Spectrum Disorder (ASD). Although the achieved recognition rate is
in the range of 55% to 65%, in the absence of biomarkers such algorithms can still assist
clinicians in the early detection of ASD. Second, it is known that studies combining machine
learning with brain imaging data collected from multiple sites, like ABIDE (Di Martino et al.
2014), to identify autism have demonstrated that classification accuracy tends to decrease
(Arbabshirani et al. 2017); in this study, we observed the same trend. The main conclusions
drawn from this study are as follows:
– Machine learning algorithms applied to brain anatomical scans can help in the automatic
detection of ASD. Features extracted from the corpus callosum and intracranial brain regions
present significant discriminative information for classifying individuals facing ASD against
the control subgroup.
– Feature selection/weighting methods help build a robust classifier for the automatic detection
of ASD. These methods help the framework not only in terms of reducing computational
complexity but also in terms of achieving better average classification accuracy.
– We also provided automatic ASD detection results using Convolutional Neural Networks
(CNN) via the transfer learning approach. This will help readers understand the benefits and
bottlenecks of using a deep learning/CNN approach for analyzing neuroimaging data, which is
difficult to record in large enough quantities for deep learning.
– To enhance the recognition results of the proposed framework, it is recommended to use a
multimodal system. In addition to neuroimaging data, other modalities, i.e. EEG, speech, or
kinesthetic data, can be analyzed simultaneously to achieve better recognition of ASD.
The results obtained using Convolutional Neural Networks (CNN)/deep learning are promising.
One of the challenges in fully utilizing the learning/data-modeling capabilities of CNNs is the
need for a large database to learn a concept (LeCun, Bengio, and Hinton 2015; Zhou, Bin, and
Zhenguo 2018), making them impractical for applications where labeled data is hard to record.
For clinical applications, where obtaining data, especially neuroimaging data, is difficult,
training a deep learning algorithm poses a challenge. One solution to this problem is a hybrid
approach, in which the data-modeling capabilities of conventional machine learning algorithms
(which can learn a concept from small data as well) are combined with deep learning. To bridge
the gap between neuroscience and computer science researchers, we emphasize and encourage
the scientific community to share databases and results for the automatic identification of
psychological ailments.
Autism spectrum disorder (ASD) affects about 1.5% of children worldwide, with males being 4.5 times
more likely to be affected than females. In 2023, the estimated prevalence of ASD in the United States
was 80.9 per 10,000 people, and in Saudi Arabia, it was 100.7 per 10,000 people, showing similar
patterns across many countries. ASD is a developmental disorder characterized by early difficulties in
social communication and interaction, along with repetitive behaviors and interests. Symptoms
typically appear in early childhood and persist throughout life, leading to challenges such as learning
difficulties and social isolation. While the cause of ASD remains unknown, early diagnosis allows for
early interventions that can improve the quality of life for individuals with autism. Presently, diagnostic
methods rely heavily on behavioral assessments, which can be time-consuming, expensive, and
subject to bias.
Structural magnetic resonance imaging (sMRI) is a non-invasive method used to study brain
structure and diagnose disorders, especially in children. It offers high resolution and contrast
without using radiation. sMRI provides various brain tissue sequences like T1 and T2, and it's
also used in longitudinal studies to track brain growth over time. Machine learning (ML) has
become increasingly important in medical imaging analysis, as it can help in building
computer-aided diagnostic systems, analyzing complex data more efficiently, and predicting
disorders automatically. ML models can use both personal and behavioral features, along with
sMRI data, to uncover patterns for predicting disorders like ASD. However, challenges arise
from handling large amounts of data and optimizing algorithms for accuracy.
Feature selection (FS) algorithms help in selecting the most important features for prediction
tasks. There are different types of FS algorithms, including filter, wrapper, and embedded
methods. New optimization techniques, like bio-inspired algorithms, can further improve FS
methods by searching for the best feature subset globally. ML has shown promise in accurately
diagnosing ASD, saving time and effort for experts and enabling effective intervention. This
study aims to improve the early classification accuracy of ASD using sMRI data for children
aged 5 to 10 years and to identify important biomarkers associated with ASD.
To achieve this, a comparative empirical study was conducted using different optimization
algorithms and ML models with two public datasets from the Autism Brain Imaging Data
Exchange Initiative (ABIDE) and local data from King Abdulaziz University (KAU) Hospital.
Recursive feature elimination with cross-validation (RFECV), Boruta, and grey wolf optimizer
(GWO) algorithms for FS, and random search algorithm and GWO algorithm for
hyperparameter tuning were investigated. The study also examined the impact of age and
gender on classification performance. The research questions addressed include whether the
proposed FS methods can improve ML model accuracy in ASD classification, which optimized
models perform best in predicting ASD on the public datasets, and whether combining personal
features data with sMRI yields better results in ASD classification than using sMRI data alone.
In addition to the growing prevalence of ASD worldwide, there's an urgent need for accurate
and efficient diagnostic tools. Current diagnostic methods often rely on behavioral assessments,
which can be subjective, time-consuming, and costly. Furthermore, the cause of ASD remains
unclear, making it challenging to develop effective treatments.
Structural MRI (sMRI) imaging offers a promising avenue for diagnosing ASD by providing
detailed insights into brain structure. This non-invasive technique allows researchers to study
brain morphology and detect abnormalities associated with ASD. Machine learning (ML)
algorithms have emerged as powerful tools in analyzing sMRI data, enabling researchers to
identify patterns and biomarkers associated with ASD. By leveraging ML, researchers can
develop computer-aided diagnostic systems that streamline the diagnostic process and improve
accuracy.
Feature selection (FS) techniques play a crucial role in optimizing ML models for ASD
classification. These techniques help identify the most relevant features from the sMRI data,
reducing computational complexity and improving model performance. By selecting the most
informative features, FS methods enhance the predictive accuracy of ML models, leading to
more reliable diagnostic outcomes.
Recent advancements in FS algorithms, such as Boruta and the grey wolf optimizer (GWO),
offer promising avenues for improving ASD classification. These algorithms enable
researchers to identify important biomarkers associated with ASD and enhance the diagnostic
accuracy of ML models. By combining personal and behavioral features with sMRI data,
researchers can develop more robust diagnostic models that account for the heterogeneous
nature of ASD.
The use of public datasets, such as the Autism Brain Imaging Data Exchange Initiative
(ABIDE), provides researchers with valuable resources for training and validating ML models.
Additionally, the inclusion of local datasets, such as those from King Abdulaziz University
(KAU) Hospital, allows researchers to evaluate model performance in diverse populations and
settings. By conducting comparative studies across different datasets, researchers can gain
insights into the generalizability and robustness of their diagnostic models.
Moving forward, further research is needed to refine and optimize ML algorithms for ASD
classification. Future studies should explore novel FS techniques and optimization algorithms
to improve diagnostic accuracy and streamline the diagnostic process. By leveraging the power
of ML and sMRI imaging, researchers can continue to advance our understanding of ASD and
develop more effective diagnostic tools to support individuals and families affected by this
condition.
As research in the field of neurodevelopmental disorders, particularly autism spectrum disorder
(ASD), continues to advance, there's a growing recognition of the importance of early diagnosis
and intervention. ASD is characterized by a wide range of symptoms, including difficulties in
social communication and interaction, as well as repetitive behaviors and restricted interests.
The spectrum nature of ASD means that individuals can exhibit varying degrees of impairment,
from mild to severe.
Structural MRI (sMRI) imaging has emerged as a valuable tool for studying ASD by providing
detailed images of the brain's structure. These imaging techniques allow researchers to identify
differences in brain morphology and connectivity between individuals with ASD and typically
developing individuals. By analyzing sMRI data using machine learning algorithms,
researchers can extract meaningful patterns and biomarkers associated with ASD, facilitating
more accurate diagnosis and treatment planning.
One of the key challenges in ASD research is the heterogeneity of the disorder, both in terms
of its clinical presentation and underlying neurobiology. This heterogeneity makes it difficult
to identify consistent biomarkers or diagnostic criteria that apply to all individuals with ASD.
However, machine learning offers a promising approach for addressing this challenge by
enabling the development of personalized diagnostic models that account for individual
differences in symptomatology and brain structure.
In addition to improving diagnostic accuracy, machine learning techniques can also enhance
our understanding of the underlying neurobiology of ASD. By analyzing large-scale sMRI
datasets, researchers can identify neural correlates of ASD and gain insights into the biological
mechanisms underlying the disorder. This knowledge can inform the development of targeted
interventions and treatments tailored to the specific needs of individuals with ASD.
Furthermore, the integration of machine learning with other data modalities, such as genetic
and behavioral data, holds promise for uncovering the complex etiology of ASD. By combining
multiple sources of data, researchers can develop comprehensive models of ASD that capture
the interplay between genetic, environmental, and neurological factors contributing to the
disorder.
Overall, the application of machine learning to ASD research represents a powerful approach
for advancing our understanding of the disorder and improving clinical outcomes for
individuals affected by ASD. As technology continues to evolve and datasets grow larger and
more diverse, machine learning techniques will play an increasingly important role in shaping
the future of ASD diagnosis and treatment.
Previous research on autism spectrum disorder (ASD) has predominantly centered on
leveraging machine learning (ML) or deep learning (DL) methodologies to understand and
diagnose the condition. These investigations delved into various aspects of ASD diagnosis,
focusing on the analysis of brain scans and personal data to develop predictive models.
A study conducted by Bahathiq and colleagues scrutinized 45 articles that employed ML
techniques for diagnosing ASD based on brain imaging data. These studies exhibited diversity
in ML algorithms, sample sizes, and input features. However, the availability of public datasets
for such research endeavors remains limited, with the ABIDE datasets emerging as the most
frequently utilized resources. Typically, these studies explored individual ML algorithms like
support vector machine (SVM) or k-nearest neighbor.
Researchers have sought to identify potential biomarkers for ASD by examining different brain
metrics, such as white matter volumes and cortical thickness. For instance, an investigation
found that utilizing average cortical thickness across various brain regions yielded the highest
diagnostic accuracy.
Another research endeavor evaluated a spectrum of 20 ML approaches grounded in brain
networks, revealing that a gradient boosting classifier (GBC) demonstrated superior
performance. Some studies aimed to distinguish individuals with ASD from typically
developing peers using solely brain imaging data, while others integrated these scans with
additional datasets.
Deep learning models have also been employed to diagnose ASD, with notable success
achieved in certain instances, such as attaining an accuracy rate of 65.6% using resting-state
fMRI data and brain volume features. However, these studies often grappled with limitations,
such as small dataset sizes leading to potential overfitting. Moreover, many studies failed to
conduct thorough analyses of the significance of their findings or employ optimization
techniques like hyperparameter tuning.
Furthermore, there has been limited exploration into feature selection techniques in ASD
research utilizing brain imaging data, which are pivotal for identifying the most pertinent
features for accurate diagnosis. Additionally, the comparison of diverse ML models poses
challenges due to variations in study methodologies and participant demographics.
In summary, while existing studies offer valuable insights into ML-based ASD diagnosis, there
is still ample room for further investigation and refinement in this domain. Continued research
efforts are essential to advance our understanding and enhance diagnostic capabilities for ASD.
In this study, we looked at how well we could classify autism spectrum disorder (ASD) using
seven different models based on brain scan data from kids aged 5 to 10. We found that when
we used different methods on the same data, we got a wide range of results.
The initial models we tested didn't perform very well in terms of accuracy and reliability. But
after we adjusted and fine-tuned them, their performance improved. Their accuracy ranged
from around 52.55% to 66.28%. However, some models like NB and MLP didn't do as well,
possibly because they didn't have enough data to train on compared to the number of different
factors we were considering. Also, differences in where the data came from, how the brain
scans were done, and the characteristics of the kids in the study could have affected the results.
We also tried different methods for adjusting the models, like grid search and random search.
Grid search looks at all possible combinations of settings for the models, which takes a lot of
time and computer power. Random search is quicker because it only looks at a subset of
settings. But neither method tells us much about why certain settings work better than others.
In our study, we tried something new: nature-inspired methods like the grey wolf optimizer
(GWO). These methods helped us find the best settings for the models more efficiently. GWO,
which we used for the first time in ASD research, seems promising for future studies because
it could help researchers explore more advanced methods and improve accuracy.
Our main goal wasn't just to make the ASD classification more accurate, but also to understand
the brain differences associated with ASD better. We know that not all brain features are equally
important for classification. So, we used different techniques, like RFECV, Boruta, and GWO-
based algorithms, to pick out the most important features. This helps simplify the models,
prevents them from focusing too much on irrelevant details, and tackles the problem of having
too much data to work with.
Previous studies that did not use feature selection often performed poorly, especially on large datasets containing redundant or uninformative measurements. By pinpointing the brain regions carrying the most important features, we aim to make ASD diagnosis earlier and more accurate. Our results show that feature selection techniques, especially when multiple methods agree on which features matter most, can significantly improve ASD classification accuracy and provide valuable insights into the structural brain differences linked to ASD.
Feature Selection Methods in ASD Classification
Feature selection is a critical step in machine learning (ML) tasks like ASD classification,
where the goal is to choose the most relevant features from a large set of potential predictors
to improve model performance and interpretability.
Why Feature Selection Matters
Imagine you're trying to predict whether someone has ASD based on brain scan data. You might
have hundreds or even thousands of different measurements from the brain scans, but not all
of them are equally important for making accurate predictions. Some may be redundant, noisy,
or irrelevant, while others could be key indicators of ASD.
Challenges in High-Dimensional Data
When dealing with high-dimensional data, such as brain imaging data with many features, there
are several challenges:
1. Curse of Dimensionality: As the number of features increases, the amount of data
needed to reliably estimate model parameters also increases exponentially, leading to
overfitting and poor generalization to new data.
2. Model Interpretability: Models with too many features can be difficult to interpret,
making it hard to understand which factors are driving the predictions.
3. Computational Complexity: Training models with a large number of features requires
more computational resources and time, which may not be feasible in practice.
Types of Feature Selection Methods
There are several types of feature selection methods, each with its own strengths and
weaknesses:
1. Filter Methods: These methods select features based on their statistical properties, such
as correlation with the target variable or information gain. They are computationally
efficient but may overlook feature dependencies.
2. Wrapper Methods: Wrapper methods evaluate the performance of a ML algorithm
using different subsets of features and select the subset that produces the best
performance. They are computationally intensive but tend to yield better results than
filter methods.
3. Embedded Methods: Embedded methods integrate feature selection into the model
training process, such as regularization techniques like LASSO (Least Absolute
Shrinkage and Selection Operator). They are efficient and often yield good results.
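To make the three families concrete, the following minimal sketch (with illustrative parameters on synthetic data) applies one representative of each: a univariate F-test filter, RFE as a wrapper, and LASSO as an embedded method.

```python
# One example from each feature-selection family, on synthetic data
# standing in for morphometric features; parameters are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LassoCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=50, n_informative=8, random_state=1)

# Filter: rank features by a univariate statistic (ANOVA F-score).
filt = SelectKBest(f_classif, k=10).fit(X, y)

# Wrapper: recursive feature elimination driven by a linear SVM's weights.
wrap = RFE(SVC(kernel="linear"), n_features_to_select=10).fit(X, y)

# Embedded: LASSO shrinks uninformative coefficients to exactly zero.
lasso = LassoCV(cv=5).fit(X, y)
embedded_mask = lasso.coef_ != 0

print("filter:  ", np.flatnonzero(filt.get_support()))
print("wrapper: ", np.flatnonzero(wrap.get_support()))
print("embedded:", np.flatnonzero(embedded_mask))
```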
Feature Selection Techniques in ASD Classification
In ASD classification studies using brain imaging data, researchers often employ a combination
of feature selection methods to identify the most discriminative brain regions or measurements
associated with ASD. Some common techniques include:
1. Recursive Feature Elimination (RFE): RFE is a wrapper method that recursively
removes features with low importance based on a ML model's coefficients or feature
weights. It iteratively prunes the feature set until the optimal subset is found.
2. Boruta Algorithm: The Boruta algorithm is a wrapper method inspired by random
forests. It creates "shadow" features by shuffling the values of the original features and
then compares their importance to determine feature relevance. Features that
outperform the shadow features are considered important.
3. Grey Wolf Optimizer (GWO): GWO is a nature-inspired optimization algorithm that
mimics the hunting behavior of grey wolves. In feature selection, GWO can efficiently
search for the optimal subset of features by iteratively updating the solution based on
the fitness of the selected features.
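The sketch below illustrates the first two techniques with commonly used implementations: scikit-learn's RFECV and BorutaPy from the third-party boruta package, which is assumed to be installed. A GWO-based selector would typically be a binary variant of the optimizer sketched earlier, with wolf positions thresholded into 0/1 feature masks and fitness defined as the cross-validated accuracy of the masked feature set.

```python
# RFECV and Boruta sketches on synthetic data; `boruta` (BorutaPy) is a
# third-party dependency and is assumed to be installed.
from boruta import BorutaPy
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

X, y = make_classification(n_samples=150, n_features=50, n_informative=8, random_state=2)

# RFECV: recursively drop the weakest features, using cross-validation
# to decide how many features to keep.
rfecv = RFECV(RandomForestClassifier(n_estimators=100, random_state=2), cv=5)
rfecv.fit(X, y)
print("RFECV kept:", rfecv.n_features_, "features")

# Boruta: compare each real feature's importance against shuffled
# "shadow" copies; keep only features that beat the shadows.
boruta = BorutaPy(
    RandomForestClassifier(n_estimators=100, random_state=2),
    n_estimators="auto", random_state=2,
)
boruta.fit(X, y)  # BorutaPy expects numpy arrays
print("Boruta kept:", int(boruta.support_.sum()), "features")
```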
Benefits of Feature Selection in ASD Classification
Using feature selection techniques in ASD classification studies offers several benefits:
1. Improved Model Performance: By focusing on the most informative features, feature
selection can lead to more accurate and robust models.
2. Enhanced Interpretability: Selecting a subset of relevant features makes it easier to
interpret the model's predictions and understand the underlying factors contributing to
ASD.
3. Reduced Computational Complexity: By reducing the dimensionality of the feature
space, feature selection can decrease computational costs and training time.
4. Insights into ASD Biomarkers: Identifying the most discriminative brain regions or
measurements associated with ASD can provide valuable insights into the
neurobiological basis of the disorder.
Conclusion
Feature selection plays a crucial role in ASD classification using brain imaging data, helping
researchers identify the most relevant features and improve model performance. By leveraging
a combination of filter, wrapper, and embedded methods, researchers can overcome the
challenges of high-dimensional data and uncover valuable insights into the neurobiological
underpinnings of ASD. As ASD classification continues to advance, further research into
innovative feature selection techniques promises to enhance our understanding of the disorder
and improve diagnostic accuracy.
To test whether MRI-derived features differ between individuals with ASD and those without it, we applied an independent-samples t-test in IBM SPSS. The results of this test are shown in a figure, and more detailed statistics can be found in the supplementary material.
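For readers without SPSS access, an equivalent independent-samples t-test can be run in Python with scipy; the two groups below are synthetic stand-ins for a single morphometric feature, not the study's data.

```python
# Equivalent of the SPSS independent-samples t-test in Python (scipy).
# The two arrays are synthetic stand-ins for a morphometric feature
# measured in the ASD and typically developing (TD) groups.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
asd = rng.normal(loc=2.8, scale=0.3, size=40)  # e.g. cortical thickness, ASD group
td = rng.normal(loc=2.6, scale=0.3, size=40)   # e.g. cortical thickness, TD group

# equal_var=False gives Welch's t-test, robust to unequal group variances.
t, p = ttest_ind(asd, td, equal_var=False)
print(f"t = {t:.3f}, p = {p:.4f}")
```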
In our study, we found that different sets of brain measurements varied in how well they could
tell apart people with ASD from those without it. Sets labeled F5 and F6 consistently did the
best across most tests we ran. Sets F1 and F4 also did well sometimes, but F2 and F3 didn't
perform as well.
These findings agree with what Jiao et al. found in their research, where they said that models
based on certain brain measurements were better for diagnosing ASD. They found specific
changes in certain brain regions in children with ASD compared to those without it. These
regions are important for things like social behavior, learning, and repetitive actions. Our study
emphasizes the importance of certain brain curvatures and using multiple measurements.
Other research has also shown that changes in the shape of the outer layer of the brain are
important in ASD, especially in children aged 7.5 to 12.5 years. Our study found that certain
sets of measurements and volume calculations can help with diagnosing ASD. Looking at the
brain's white matter is also useful for finding problems in how the brain's connections work.
Our study confirmed these findings and also found issues with certain brain regions involved
in movement and emotions.
Certain machine learning models consistently did well in our tests, like CatBoost, XGB, and
SVM. CatBoost usually had the highest accuracy, except when we used SVM with dimension-
reduced medical data, where SVM did better. However, models like NB didn't do as well,
possibly because they treat all features as equally important. Including features like age and
gender slightly improved the accuracy of some models. Overall, our best model, GWO-SVM
with F6, found that certain brain regions, especially in the frontal lobe, played a big role in
diagnosing ASD. These regions are important for things like movement, emotions, memory,
and language.
Random Forest
Random forests are made up of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges to a limit as the number of trees in the forest becomes large, and it depends on the strength of the individual trees in the forest and the correlation between them (Breiman, 2001).
RF follows specific rules for tree growing, tree combination, self-testing, and post-processing; it is robust to overfitting and is considered more stable in the presence of outliers and in very high-dimensional parameter spaces than other machine learning algorithms (Caruana and Niculescu-Mizil, 2006; Menze et al., 2009). The concept of variable importance is an implicit feature selection performed by RF with a random subspace methodology, and it is assessed by the Gini impurity criterion index (Ceriani and Verme, 2012). The Gini index is a measure of the predictive power of variables in regression or classification, based on the principle of impurity reduction (Strobl et al., 2007); it is non-parametric and therefore does not rely on the data belonging to a particular type of distribution. For a binary split (white circles in Figure 1), the Gini index of a node n is calculated as follows:

Gini(n) = 1 − Σ_j (p_j)²

where p_j is the relative frequency of class j in node n.


To split a binary node in the best way, the improvement in the Gini index should be maximized. In other words, a low Gini value (i.e., a greater decrease in Gini) means that a particular predictor feature plays a greater role in partitioning the data into the two classes. Thus, the Gini index can be used to rank the importance of features for a classification problem (Sarica, 2017).
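As a brief illustration, scikit-learn exposes this Gini-based ranking directly through a fitted forest's feature_importances_ attribute (the normalized mean decrease in impurity); the dataset here is synthetic.

```python
# Gini-based variable importance from a random forest; sklearn's
# feature_importances_ is the normalized mean decrease in impurity.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=20, n_informative=5, random_state=3)

rf = RandomForestClassifier(n_estimators=500, random_state=3).fit(X, y)

# Rank features by mean decrease in Gini impurity (higher = more important).
ranking = np.argsort(rf.feature_importances_)[::-1]
for idx in ranking[:5]:
    print(f"feature {idx}: importance {rf.feature_importances_[idx]:.3f}")
```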

3.6 Performance Evaluation


We used the ShuffleSplit function from the scikit-learn library to shuffle the dataset and split it for cross-validation into k folds, where k is the number of parts into which the data are divided. k = 10 is the most commonly used value for evaluating machine learning models; other common values include k = 2 and k = 5 (https://machinelearningmastery.com/how-to-configure-k-fold-cross-validation/). Each machine learning model is trained on k − 1 folds of the dataset and evaluated on the remaining fold, repeated k times. The best-performing model is then selected through K(2^p − 1) model evaluations, where p is the number of independent variables and 2^p − 1 is the total number of possible feature subsets, and hence models (Jung, 2015).
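A minimal sketch of this evaluation protocol, assuming a synthetic dataset in place of the study's features:

```python
# Shuffled cross-validation as described above, using scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=4)

# ShuffleSplit draws k independent shuffled train/test splits; here each
# split holds out 10% of the data, repeated k = 10 times.
cv = ShuffleSplit(n_splits=10, test_size=0.1, random_state=4)
scores = cross_val_score(SVC(), X, y, cv=cv, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```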

3.6.1 Evaluation Metrics


The evaluation metrics used to measure model performance include accuracy, precision, recall, and ROC AUC, together with a comparison of performance on all predictors versus the best five predictors (Larabi-Marie-Sainte, 2019).
Accuracy is the percentage of all samples that have been predicted correctly: the ratio of the sum of true positives and true negatives to the total number of predictions made.
Precision is the percentage of samples predicted as positive that are actually positive: the ratio of true positives to all positive predictions (true positives plus false positives). Recall, analogously, is the ratio of true positives to all actual positives (true positives plus false negatives).
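The following toy example shows how these metrics are computed from predictions with scikit-learn; the labels and scores are invented for illustration.

```python
# Computing the evaluation metrics named above from toy predictions.
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                    # toy ground truth
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]                    # toy hard predictions
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]   # toy probabilities

print("accuracy: ", accuracy_score(y_true, y_pred))   # (TP + TN) / total
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("roc_auc:  ", roc_auc_score(y_true, y_score))   # threshold-free ranking quality
```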
Figure 5: ML-based studies workflow

Figure 6: DL-based studies workflow.


PROPOSED TECHNIQUES

Machine Learning Framework


The proposed framework leverages a combination of supervised and unsupervised machine
learning techniques, emphasizing their complementary roles in enhancing the accuracy and
efficiency of autism spectrum disorder (ASD) diagnosis. The process encompasses data
collection, preprocessing, feature extraction, model training, and evaluation.

1. Data Collection
• Diverse data sources, including clinical assessments, behavioral observations, and
potentially genetic and neuroimaging data, form the basis for the dataset.
• Data collected adheres to strict privacy regulations, ensuring confidentiality and informed
consent.
2. Data Preprocessing
• Rigorous data cleaning, handling missing values, and standardizing formats to create a
coherent and usable dataset.
• Feature selection techniques employed to identify the most relevant variables for model
training.
3. Feature Extraction
• Extraction of key features from the dataset, considering linguistic patterns, behavioral nuances, and potential neural markers associated with ASD.
• Feature importance analysis aids in identifying the most influential variables.
4. Model Training
• Supervised learning models, such as Support Vector Machines (SVM) and Random Forests, are employed for classification tasks based on labeled data.
• Unsupervised learning models, including clustering algorithms, uncover hidden
patterns within the dataset.
5. Model Evaluation
• The performance of the trained models is rigorously evaluated using metrics such as
accuracy, precision, recall, and F1 score.
• Ensemble learning techniques may be employed to combine predictions from multiple
models, enhancing overall accuracy.
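As a rough, end-to-end sketch of steps 2–5 under stated assumptions (synthetic data standing in for the collected dataset, illustrative hyperparameters), the following combines preprocessing, two supervised models, and a soft-voting ensemble:

```python
# End-to-end sketch of the proposed framework on placeholder data:
# preprocessing, supervised models, and an ensemble of their predictions.
# Dataset, features, and hyperparameters are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=30, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=5)

# Preprocessing (standardization) is bundled with the SVM in one pipeline.
svm = make_pipeline(StandardScaler(), SVC(probability=True, random_state=5))
rf = RandomForestClassifier(n_estimators=300, random_state=5)

# Soft voting averages the two models' predicted probabilities.
ensemble = VotingClassifier([("svm", svm), ("rf", rf)], voting="soft")
ensemble.fit(X_tr, y_tr)

# Evaluation with accuracy, precision, recall, and F1 per class.
print(classification_report(y_te, ensemble.predict(X_te)))
```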
References
[1]. Dawson, G., Rogers, S., Munson, J., Smith, M., Winter, J., Greenson, J., & Donaldson, A.
(2010). Randomized, controlled trial of an intervention for toddlers with autism: The Early
Start Denver Model. Pediatrics, 125(1), e17–e23. ISSN: 0031-4005
[2]. Thabtah, F. (2018). Machine learning in autistic spectrum disorder behavioral research: A
review and ways forward. Informatics for Health and Social Care, 43(1), 91–114. ISSN:
1753-8157
[3]. Haque, M. A., et al. (2020). A deep learning approach for early detection of autism spectrum
disorder based on electroencephalography. Biomedical Signal Processing and Control, 57,
101722. ISSN: 1746-8094
[4]. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.
[5]. Verma, S. B., & Yadav, A. K. (2019). Detection of hard exudates in retinopathy images. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 8(4), 41–48. eISSN: 2255-2863. DOI: http://dx.doi.org/10.14201/ADCAIJ2019844148
[6]. Pedregosa, F., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. ISSN: 1533-7928
[7]. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. ISSN: 0885-6125
[8]. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (pp. 4765–4774). ISSN: 1049-5258
[9]. Verma, S. B., & Shashi, B. V. (2020). Data transmission in BPEL (Business Process Execution Language). ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 9(3), 105–117. eISSN: 2255-2863. DOI: https://doi.org/10.14201/ADCAIJ202093105117
[10]. Verma, S. B., Brijesh, P., & Gupta, B. K. (2022). Containerization and its architectures: A study. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 11(4), 395–409. eISSN: 2255-2863. DOI: https://doi.org/10.14201/adcaij.28351
[11]. Verma, S. B., & Saravanan, C. (2018, September). Performance analysis of various fusion methods in multimodal biometric. In 2018 International Conference on Computational and Characterization Techniques in Engineering & Sciences (CCTES) (pp. 5–8). IEEE.
[12]. Verma, S. B., & Yadav, A. K. (2021). Hard exudates detection: A review. In Emerging Technologies in Data Mining and Information Security, Advances in Intelligent Systems and Computing, vol. 1286. Springer, Singapore. https://doi.org/10.1007/978-981-15-9927-9_12
