THESIS (1) Ayush
A Thesis Submitted
for the Degree of
Master of Technology
in
Computer Science and Engineering

Submitted by
Ayush Anand Srivastava (Roll No: 202210101010002)
as a partial requirement for the award of the degree of Master of Technology in Data
“………………”. I further declare that the work reported in this project has not been
submitted, and will not be submitted, either in part or in full, for the award of any other degree.
Place: Lucknow
Date: 04/05/2024
Signature of student
Signature of Supervisor(s)
Name(s)
Department(s)
Month, Year
Signature of Head/Dean
Head-CSE
First, I would like to thank God almighty for guiding me through all challenges and giving me
the privilege of completing my degree. You will continue to take the wheel in life and lead me
to greater heights. In addition, I would love to express my gratitude to my parents and siblings
for all their prayers, sacrifices, and motivation, which have continued to sustain me immensely.
I would also like to give thanks to my thesis supervisor, Dr. Satya Bhushan Verma, for his
guidance and support throughout the entire process of completing this thesis. Additionally, I
would like to thank my thesis committee members, Dr. Henry Chu and Dr. Anil Kumar Pandey,
for their outstanding comments and suggestions.
Finally, I would like to thank my classmates and colleagues in the School of Computing and
Informatics at the University of Louisiana at Lafayette for their academic interactions and
support. I am indeed very grateful to you all.
Abstract
The rising prevalence of Autism Spectrum Disorder (ASD) underscores the urgency of
advancing diagnostic capabilities and intervention strategies. In this context, the convergence
of computer vision, artificial intelligence, and neuroimaging presents a promising avenue for
enhancing ASD detection and understanding its neurobiological underpinnings. This paper
explores the potential of machine learning algorithms in analyzing MRI data to automatically
identify ASD-related patterns. Leveraging the rich dataset provided by the Autism Brain
Imaging Data Exchange (ABIDE), we propose a framework for ASD detection using T1-
weighted MRI scans.
Our study encompasses conventional machine learning methods, feature selection techniques,
and deep learning architectures to optimize detection accuracy. Through comprehensive
analysis and validation, we demonstrate the efficacy of our approach in detecting ASD and
elucidating neural biomarkers. By integrating technological advancements with clinical
insights, we strive towards more accurate, efficient, and personalized approaches to ASD
diagnosis and intervention.
Furthermore, we delve into the multifaceted nature of ASD, recognizing its heterogeneity and
the challenges it poses for diagnosis and treatment. ASD encompasses a wide spectrum of
symptoms, ranging from mild to severe, and often co-occurs with other neurodevelopmental
conditions such as Attention Deficit Hyperactivity Disorder (ADHD) and anxiety disorders. By
elucidating the distinct subtypes and comorbidities associated with ASD, our study aims to
inform more targeted interventions and support strategies tailored to individuals' specific needs.
Moreover, we highlight the socioeconomic implications of ASD, emphasizing the substantial
financial burden it places on families, healthcare systems, and society at large. Early detection
and intervention are crucial not only for improving outcomes for individuals with ASD but also
for alleviating the economic strain associated with long-term care and support services.
In summary, our research contributes to the growing body of knowledge on ASD diagnosis and
management by leveraging cutting-edge technologies and interdisciplinary approaches. By
bridging the gap between neuroscience, computer science, and clinical practice, we strive to
pave the way for more effective, accessible, and personalized solutions for individuals affected
by ASD and their families.
Table of Contents
Abstract
Acknowledgement
List of Figures
List of Abbreviations
Chapter 1: Introduction
3.1 System Architecture
3.2 System Design
3.3 Data Flow Diagram
3.4 Flow Chart
3.5 System Flow Diagram
3.6 Software Development Life Cycle
3.7 Technology Used
Chapter 4: Experiments, Simulation & Testing
4.1 Machine Learning Algorithms
4.2 Logistic Regression
4.3 Support Vector Machine
4.4 Decision Tree
4.5 Random Forest
4.6 Performance Evaluation
4.7 Evaluation Metrics
Chapter 1: Introduction

Computer vision and artificial intelligence are fields in which researchers and industry are
making rapid progress toward matching, and in some tasks exceeding, human-level
performance. These technologies are already applied in many areas, such as medical diagnosis,
video generation, and information security. One area where they have seen comparatively little
use, however, is understanding the brain.
In this thesis, we propose a computational approach to help identify a condition called Autism
Spectrum Disorder (ASD). ASD is a condition in which people have difficulty with social
interaction and emotional understanding, and often show repetitive behaviors. It is not rare,
and its presentation varies greatly from person to person.
There are different types of ASD, like High Functioning Autism and Asperger Syndrome, and
sometimes it's related to other conditions like Attention Deficit Hyperactivity Disorder
(ADHD) or anxiety and depression.
More and more people are being diagnosed with autism, and treatment is costly. Detecting
autism early can help reduce these costs, so it is important for researchers to find ways to
identify it sooner.
At present, clinicians diagnose ASD mainly through standardized behavioral assessments,
such as observing how a person behaves and communicates. Much about the brain in ASD
remains unknown, however; newer hypotheses suggest that different parts of the brain work
together differently in people with ASD.
To understand these differences, scientists use a type of scan called Magnetic Resonance
Imaging (MRI). MRI scans can show us details about the brain's structure and how it
functions. By studying these scans, researchers hope to find clues that can help identify ASD
earlier.
There's a big database called ABIDE that scientists use to study ASD using MRI data. This
database contains scans from people with and without ASD, and it helps researchers compare
different brain patterns.
In this study, the authors use a special kind of MRI scan called T1-weighted MRI scans to try
to detect ASD automatically. They use computer programs to analyze the scans and look for
patterns that might indicate ASD. They try different methods, including some new ones called
deep learning, to see which works best.
Their study shows that using these computer methods could help detect ASD more accurately
and quickly. It also suggests ways to improve these methods in the future. This could be
really helpful for doctors and families dealing with ASD.
Figure 1. MRI scan in different cross-sectional views, where A, P, S, I, R, and L in the figure represent
anterior, posterior, superior, inferior, right, left, respectively. The axial/horizontal view divides the
MRI scan into head and tail/superior and inferior portions, sagittal view breaks the scan into left and
right and coronal/vertical view divides the MRI scan into anterior and posterior portions (Schnitzlein
and Murtagh 1985).
Why ASD is important: Autism Spectrum Disorder (ASD) affects a person's social skills,
communication abilities, and behavior. Detecting ASD early can lead to better support and
intervention, which can significantly improve the individual's quality of life.
Challenges in ASD diagnosis: Diagnosing ASD can be complex because it's a spectrum
disorder, meaning it varies widely among individuals. Additionally, there's no single medical
test for ASD; diagnosis typically involves observation of behavior and developmental history.
The role of technology: Advances in computer vision and artificial intelligence offer
promising tools for assisting in ASD diagnosis. These technologies can analyze large amounts
of data, such as MRI scans, to identify patterns that may indicate ASD.
Understanding brain differences: Researchers believe that differences in brain structure and
function may contribute to ASD. MRI scans provide detailed images of the brain, allowing
scientists to study these differences and potentially identify biomarkers for ASD.
The ABIDE database: The Autism Brain Imaging Data Exchange (ABIDE) is a valuable
resource for ASD research. It contains MRI data from individuals with ASD and typically
developing individuals, enabling researchers to compare brain images and identify differences
associated with ASD.
Machine learning and ASD detection: Machine learning algorithms can analyze MRI data
and learn to recognize patterns associated with ASD. By training these algorithms on large
datasets like ABIDE, researchers can develop models that accurately detect ASD based on brain
scans.
Future directions: Continued research in this field could lead to more accurate and efficient
methods for ASD diagnosis. Improvements in technology, along with collaborations between
researchers and healthcare professionals, may enhance early detection and intervention for
individuals with ASD.
Rise in ASD prevalence: Over the past few decades, there has been a noticeable increase in
the prevalence of Autism Spectrum Disorder (ASD) worldwide. This rise has spurred efforts to
better understand and address the condition, emphasizing the importance of early detection and
intervention.
Financial burden of ASD: ASD imposes significant financial burdens on families and
healthcare systems. The costs associated with diagnosis, therapy, and long-term care can be
substantial, highlighting the need for cost-effective and efficient diagnostic methods.
The complexity of ASD: ASD is a complex neurodevelopmental disorder with diverse
manifestations and underlying causes. Understanding the intricate interplay between genetic,
environmental, and neurological factors is essential for advancing diagnostic and treatment
strategies.
The potential of neuroimaging: Neuroimaging techniques, such as MRI, offer a window into
the brain's structure and function, providing valuable insights into neurological conditions like
ASD. Leveraging these imaging modalities alongside computational approaches holds promise
for elucidating the neural correlates of ASD.
Toward personalized medicine: Tailoring interventions and therapies to individual needs is a
cornerstone of personalized medicine. By harnessing the power of computer algorithms and
neuroimaging data, researchers aim to develop more personalized approaches to ASD diagnosis
and treatment.
Interdisciplinary collaboration: Addressing the complexities of ASD requires
interdisciplinary collaboration between researchers, clinicians, educators, and policymakers.
By fostering synergy among diverse fields such as neuroscience, computer science,
psychology, and public health, we can leverage collective expertise to develop holistic
approaches to ASD diagnosis and intervention.
Ethical considerations: As we navigate the intersection of technology and healthcare, it's
imperative to address ethical considerations surrounding ASD diagnosis and treatment.
Ensuring patient privacy, informed consent, and equitable access to resources are paramount
in the development and implementation of diagnostic frameworks and interventions.
Global impact: ASD is a global health concern, transcending geographical boundaries and
cultural contexts. Our research endeavors to contribute to the global effort to combat ASD by
fostering international collaborations, sharing data and best practices, and advocating for
greater awareness and resources for ASD research and support services worldwide.
Empowering individuals and families: Beyond the scientific and clinical realms, our work
seeks to empower individuals with ASD and their families by providing them with the
knowledge, tools, and support needed to navigate the challenges associated with the condition.
By promoting self-advocacy, community engagement, and inclusive policies, we aim to foster
a more inclusive and supportive environment for individuals living with ASD.
State of the Art
In this part, we talk about different ways scientists have tried to understand and identify
neurodevelopmental disorders, with a focus on Autism Spectrum Disorder (ASD). They've
combined techniques from artificial intelligence (like machine learning and deep learning) with
data from brain scans to study things like how the brain understands words, learns, and feels
emotions. However, using these techniques for understanding psychological and
neurodevelopmental issues like schizophrenia, autism, and anxiety/depression is still tricky
because these conditions are complex.
Some researchers have used a method called multivoxel pattern analysis to detect Major
Depressive Disorder (MDD) from MRI data, reporting an accuracy of 95%. Others have used
classifiers such as Gaussian Naïve Bayes (GNB) to identify ASD in brain scans, achieving an
accuracy of 97%.
Another study looked at structural MRI data to predict various neurodevelopmental disorders
like Alzheimer’s, Autism, and Schizophrenia. They used a technique called Multivariate
Pattern Analysis and achieved accuracies ranging from 59% to 86%.
Deep learning models, which are very complex computer programs inspired by the human
brain, have also been used. One study used a Deep Belief Network to automatically detect
schizophrenia in brain scans, with an accuracy of 90%. Another study used deep neural
networks to analyze brain activity patterns and classify different tasks, like language and
emotion recognition, with an average accuracy of about 50%.
In another study, researchers trained a neural network to detect ASD by learning from MRI
data collected while people were resting. They achieved a classification accuracy of up to 70%.
There's also interesting research on using children's visual behavior, like how they look at
pictures, to detect ASD early. And some studies have combined features from different datasets
to improve ASD detection, like using machine learning algorithms to analyze interactions
between children and robots.
Researchers are also working on methods to handle differences in data collected from different
places, which can affect how accurate the results are. They've proposed techniques like low-
rank representation decomposition and multi-site clustering to address this issue.
Studies have shown that certain parts of the brain, like the corpus callosum, which connects
the two halves of the brain, can be different in people with ASD. For example, some studies
found changes in the size of certain regions of the corpus callosum in individuals with ASD
compared to those without.
In our study, we're using information about these brain regions and other factors to try to
improve the accuracy of ASD detection. We'll explain more about our approach and the data
we're using in the next section.
Figure 2. An example of corpus callosum area segmentation. The figure shows example data for an
individual facing ASD in the ABIDE study. Figure A represents 3D volumetric T1-weighted MRI scan.
Figure B represents segmentation of corpus callosum in red. Figure C represents the further division
of corpus callosum according to the Witelson scheme (Witelson 1989). The regions W1 (rostrum),
W2(genu), W3(anterior body), W4(mid-body), W5(posterior body), W6(isthmus), and W7 (splenium)
are shown in red, orange, yellow, green, blue, purple, and light purple (Kucharsky Hiess et al. 2015).
Additionally, research has explored the use of machine learning algorithms to analyze brain
imaging data in various ways to detect neurodevelopmental disorders. For instance, researchers
have looked into the classification of different types of neurodevelopmental disorders, such as
ASD, schizophrenia, and ADHD, using patterns identified in brain scans.
Some studies have focused on specific brain regions or structures that may be indicative of
certain disorders. For example, alterations in the size or connectivity of specific regions, such
as the amygdala or prefrontal cortex, have been linked to ASD and other neurodevelopmental
conditions.
Moreover, machine learning techniques have been applied to different types of brain imaging
data, including structural MRI (s-MRI) and functional MRI (f-MRI), to uncover patterns
associated with neurodevelopmental disorders. These techniques aim to identify subtle
differences in brain structure or activity that may serve as biomarkers for these conditions.
Furthermore, researchers have explored the potential of combining multiple types of data, such
as genetic information, behavioral assessments, and brain imaging data, to improve the
accuracy of diagnosis and prediction of neurodevelopmental disorders.
Overall, the integration of machine learning algorithms with brain imaging data holds promise
for advancing our understanding of neurodevelopmental disorders and improving diagnostic
and treatment approaches. By identifying unique patterns in brain structure and function
associated with these conditions, researchers aim to develop more personalized and effective
interventions for individuals affected by neurodevelopmental disorders.
In recent years, there has been a growing interest in leveraging machine learning techniques to
analyze neuroimaging data for the early detection and characterization of neurodevelopmental
disorders. This approach holds the potential to revolutionize clinical practice by providing
objective and quantitative measures to aid in diagnosis and treatment planning.
Researchers have explored a wide range of machine learning algorithms, including support
vector machines (SVMs), random forests, and convolutional neural networks (CNNs), to
extract meaningful information from brain images. These algorithms are trained on large
datasets of brain scans, allowing them to learn complex patterns associated with different
neurodevelopmental disorders.
Advancements in neuroimaging technology, such as high-resolution imaging techniques and
multi-modal imaging approaches, have enabled researchers to capture detailed information
about brain structure and function. This rich data source provides valuable insights into the
underlying mechanisms of neurodevelopmental disorders and may help identify novel
biomarkers for early detection.
Machine learning algorithms can be used to integrate information from multiple sources,
including clinical assessments, genetic data, and environmental factors, to develop
comprehensive models for predicting risk and prognosis of neurodevelopmental disorders. By
incorporating diverse datasets, these models can provide a more holistic understanding of the
complex interplay between genetic, environmental, and neurological factors contributing to
these conditions.
Additionally, the application of machine learning in neuroimaging research has facilitated the
development of personalized medicine approaches for neurodevelopmental disorders. By
tailoring interventions to individual characteristics, such as brain morphology, connectivity
patterns, and genetic profiles, clinicians can optimize treatment outcomes and improve long-
term prognosis for patients.
The synergy between machine learning and neuroimaging holds great promise for advancing
our understanding of neurodevelopmental disorders and transforming clinical practice.
Continued research in this interdisciplinary field is essential for unlocking the full potential of
these technologies and improving outcomes for individuals affected by these conditions.
The emergence of big data analytics has enabled researchers to harness vast amounts of
neuroimaging data from large-scale studies and consortia. These datasets, often comprising
thousands of brain scans from diverse populations, provide unprecedented opportunities to
uncover subtle patterns and associations that may have previously gone unnoticed. Machine
learning algorithms, equipped with the ability to process and analyze such massive datasets,
hold immense potential for unlocking new insights into the etiology, progression, and treatment
response of neurodevelopmental disorders.
The advent of deep learning techniques has revolutionized the field of neuroimaging analysis.
Deep neural networks, with their hierarchical architecture and capacity for automatic feature
extraction, have demonstrated remarkable performance in tasks such as image segmentation,
classification, and anomaly detection.
In conclusion, the convergence of machine learning and neuroimaging represents a paradigm
shift in our approach to understanding and addressing neurodevelopmental disorders. By
leveraging advanced computational techniques to analyze complex brain data, researchers can
unlock new insights into the nature of these disorders and pave the way for more effective
diagnostic, therapeutic, and preventive interventions. Continued innovation and collaboration
in this interdisciplinary field hold the key to transforming the lives of individuals affected by
neurodevelopmental disorders and advancing the frontiers of neuroscience and psychiatry.
Database
This research used brain scans from the Autism Brain Imaging Data Exchange (ABIDE-I), a
large data collection that shares brain images, together with phenotypic information, from
people with autism and typically developing controls. ABIDE-I aggregates scans from 17 sites
around the world, with a total of 1112 participants; 539 have autism and 573 are typically
developing controls. To protect privacy, participants in the ABIDE database are de-identified
in accordance with the Health Insurance Portability and Accountability Act (HIPAA) of 1996.
We used the same data as the study by Hiess et al., so we briefly explain how the ABIDE brain
scans were prepared in that work. Their preprocessing pipeline focused on specific brain
structures and computed their sizes and shapes, covering measures such as the corpus callosum
area and the total brain volume.
Preprocessing
Several software tools were used to measure the area of the corpus callosum, its sub-regions,
and the total intracranial volume: yuki, FSL, ITK-SNAP, and brainwash. The corpus callosum
is important for connecting and coordinating information between the two hemispheres of the
brain; composed of millions of nerve fibers, it is the largest connection between the brain's two
sides. Intracranial volume provides an estimate of the overall size of the brain and its
compartments.
For each person in the study, the corpus callosum was measured using the yuki software. Then,
the corpus callosum was divided into its smaller parts automatically using a method called the
Witelson scheme.
Each segmentation was visually inspected and, where needed, adjusted in the ITK-SNAP
software to ensure the structure was accurately delineated; two independent readers verified
each result. In some scans, small manual corrections were required to get the corpus callosum
boundary exactly right. To check that the measurements were consistent between readers, we
compared them using statistical methods (statistical equivalence analysis and intra-class
correlation).
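The inter-rater consistency check can be illustrated with a one-way intraclass correlation coefficient (ICC). This is a sketch under the assumption of a one-way random-effects model; the exact ICC form used in the study may differ, and the reader measurements below are made-up placeholders.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for an n_subjects x k_raters table."""
    n, k = len(ratings), len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(ratings, row_means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Two hypothetical readers measuring corpus callosum area (mm^2) on three scans.
areas = [[612.0, 610.5], [587.2, 589.0], [640.1, 641.3]]
agreement = icc_oneway(areas)  # close to 1.0 when the readers agree closely
```

Values near 1 indicate that almost all variance comes from true differences between subjects rather than disagreement between readers.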
We measured the total volume inside the skull for each person using a tool called "brainwash."
This tool uses a special technique called automatic registration to figure out the volume. It
looks at different parts of the brain in the MRI and decides which parts belong inside the skull.
If there were any mistakes in this process, we went back and tried again with the same MRI
scan. Sometimes, we had to manually fix points that weren't correctly identified by the
software.
A feature called "region-based snakes" in the "ITK-SNAP" software helped us correct any
small errors in the volume measurement.
The diagram in Figure 3 shows how we turned the MRI scans into a set of features that we used
to train the model to detect autism.
The Information Gain Ratio improves on the Information Gain method, which tends to favor
features with many distinct values. It is the ratio of the Information Gain to the Intrinsic Value,
where the Intrinsic Value is the entropy of the feature's own value distribution; this gives a
more balanced criterion for selecting features. Mathematically, for a feature X and class label
Y,

IGR(X, Y) = IG(X, Y) / IV(X), where IV(X) = −Σᵢ (|Xᵢ|/|X|) · log₂(|Xᵢ|/|X|)

and Xᵢ denotes the subset of training examples taking the i-th value of X.
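The gain-ratio computation can be sketched in Python for discrete features; this is an illustrative implementation, not the code used in this study (continuous morphometric features would first need to be discretized).

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    conditional = 0.0
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional

def gain_ratio(feature, labels):
    """Information gain divided by the intrinsic value IV(X) = H(X)."""
    iv = entropy(feature)  # intrinsic value: entropy of the feature itself
    return information_gain(feature, labels) / iv if iv > 0 else 0.0
```

For a feature that separates the two classes perfectly, both the information gain and the intrinsic value equal 1 bit, so the gain ratio is 1.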
Chi-Square Method. The Chi-Square (χ²) test is a correlation-based feature selection method
(also known as the Pearson chi-square test) that measures the dependence between two
variables. Two variables A and B are independent if P(AB) = P(A)P(B), or equivalently,
P(A|B) = P(A) and P(B|A) = P(B). In machine learning terms, the two variables are the
occurrence of the feature and the class label (Doshi 2014). The chi-square method calculates
the correlation strength of each feature by computing the statistic

χ² = Σᵢ (Oᵢ − Eᵢ)² / Eᵢ

where χ² is the chi-square statistic, Oᵢ is the observed value of feature i, and Eᵢ is the expected
value of feature i.

Figure 4. Results of entropy and correlation based feature selection methods. All features are
represented with their corresponding weights. A represents the result of information gain. B
represents the result of information gain ratio. C represents the result of the chi-square method. D
represents the result of symmetrical uncertainty.
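As a sketch, the chi-square statistic over a feature/label contingency table can be computed as follows; this is illustrative only, and standard toolkits provide equivalent scorers.

```python
from collections import Counter

def chi_square(feature, labels):
    """Pearson chi-square statistic between a discrete feature and the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))          # joint counts O_i
    f_counts, y_counts = Counter(feature), Counter(labels)
    stat = 0.0
    for fv in f_counts:
        for yv in y_counts:
            expected = f_counts[fv] * y_counts[yv] / n  # E_i under independence
            stat += (observed.get((fv, yv), 0) - expected) ** 2 / expected
    return stat
```

A score of 0 means the feature and the label look independent; larger scores indicate stronger dependence, so features are ranked by descending χ².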
Symmetrical Uncertainty. Symmetrical Uncertainty (SU) is referred to as a relevance indexing
or scoring method (Brown et al. 2012) used to find the relationship between a feature and the
class label. It normalizes feature weights within the range [0, 1], where 1 indicates that the
feature and the target class are strongly correlated and 0 indicates no relationship between them
(Peng, Long, and Ding 2005). For a class label Y, the symmetrical uncertainty for a set of
features X is mathematically denoted as

SU(X, Y) = 2 · IG(X, Y) / (H(X) + H(Y))

where IG(X, Y) represents information gain and H represents entropy. All four methods
(information gain, information gain ratio, chi-square, and symmetrical uncertainty) calculate
the weight/importance of each feature for a given task. The weight of each feature is calculated
with respect to the class label and the feature value computed by each method; the higher the
weight of a feature, the more relevant it is considered. The weight of each feature is normalized
to the range [0, 1]. The results of each feature selection method are shown in Figure 4.
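A minimal sketch of the SU computation, assuming discrete feature values (continuous morphometric features would first be discretized); this mirrors the formula above rather than the study's actual implementation.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(feature, labels):
    """SU(X, Y) = 2 * IG(X, Y) / (H(X) + H(Y)), normalized to [0, 1]."""
    n = len(labels)
    conditional = 0.0
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        conditional += len(subset) / n * entropy(subset)
    info_gain = entropy(labels) - conditional   # IG(X, Y)
    denom = entropy(feature) + entropy(labels)  # H(X) + H(Y)
    return 2 * info_gain / denom if denom > 0 else 0.0
```

A perfectly informative feature yields SU = 1, while a feature independent of the class label yields SU = 0, matching the normalization described in the text.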
Figure 4 shows the results of our study on feature selection. The first two graphs display the
importance of different features according to entropy-based methods like information gain and
information gain ratio. The last two graphs show the feature importance based on correlation
methods such as chi-square and symmetrical uncertainty. Although the results of information
gain ratio differ slightly from those of information gain, both methods identify W7 and CC
circularity as the most crucial features. The results from correlation methods are quite similar,
with brain volume, W7, W2, and CC circularity being the most significant discriminant
features.
It's worth noting that the features we found to be discriminant align with those identified in a
study by Hiess et al. They also concluded that brain volume and corpus callosum area are
essential for distinguishing between ASD and control groups. In our study, we also found brain
volume and various corpus callosum sub-regions (labeled as W2, W4, and W7) to be crucial.
The results from correlation methods closely match those presented by Hiess et al.
In our proposed framework, we set a threshold on the results obtained from the feature selection
method to choose a subset of features. This reduces computational complexity and improves
the performance of machine learning algorithms. Through experiments with different threshold
values, we found that the highest average classification accuracy (for ASD detection) is
achieved using the subset of features identified by the chi-square method at a threshold value
of 0.4.
The final feature vector we derived includes brain volume, CC circularity, CC length, W2
(genu), W4 (mid-body), W5 (posterior body), and W7 (splenium), where CC stands for corpus
callosum. Comparing the average classification accuracy with and without feature selection,
it is evident that training the classifier on a subset of discriminant features both reduces
computational cost and improves classification accuracy.
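The thresholding step can be illustrated as follows. The weights below are made-up placeholders for illustration only; they are NOT the chi-square weights measured in this study, only the feature names come from the text.

```python
def select_features(weights, threshold=0.4):
    """Keep features whose normalized weight meets or exceeds the threshold."""
    ranked = sorted(weights.items(), key=lambda kv: -kv[1])  # highest weight first
    return [name for name, w in ranked if w >= threshold]

# Hypothetical normalized chi-square weights (placeholders, not measured values).
weights = {
    "brain_volume": 0.92, "W7_splenium": 0.88, "cc_circularity": 0.81,
    "W2_genu": 0.63, "cc_length": 0.55, "W4_mid_body": 0.47,
    "W5_posterior_body": 0.44, "W3_anterior_body": 0.21, "W1_rostrum": 0.12,
}
selected = select_features(weights, threshold=0.4)  # drops W1 and W3
```

With a threshold of 0.4, the seven features named in the final feature vector survive while the weakly weighted sub-regions are discarded.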
The next subsection will discuss the conventional machine learning methods we evaluated in
this study.
Methodology:
1. Data Collection: We gathered structural MRI scans from the Autism Brain Imaging
Data Exchange (ABIDE-I) dataset, which provides imaging data of individuals with
ASD and control participants.
2. Preprocessing: We utilized various software tools such as yuki, fsl, itksnap, and
brainwash to calculate corpus callosum area, its sub-regions, and intracranial volume.
These measurements are crucial for understanding brain structure differences between
ASD and control groups.
3. Manual Correction: After segmentation, we visually inspected and corrected any errors
using ITK-SNAP software. This ensured the accuracy of our measurements.
4. Statistical Analysis: We performed statistical equivalence analysis and intra-class
correlation to assess the agreement between different readers in segmenting corpus
callosum area.
5. Feature Selection: We employed feature selection techniques to identify the most
relevant features for distinguishing between ASD and control groups. This step
involved evaluating information gain and information gain ratio, as well as correlation-
based methods like chi-square and symmetrical uncertainty.
6. Thresholding: We applied thresholds to the feature selection results to choose a subset
of features that would optimize computational efficiency and classification accuracy.
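The pipeline from preprocessed features through feature selection to classification can be sketched with scikit-learn. The synthetic data and the parameter choices (k=5, a linear SVM) are illustrative assumptions, since the ABIDE feature table is not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for a morphometric feature table (9 features, 2 classes).
X, y = make_classification(n_samples=200, n_features=9, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale to [0, 1] (chi2 requires non-negative inputs), keep the top-k features
# by chi-square score, then classify with a linear SVM.
pipe = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=5), SVC(kernel="linear"))
pipe.fit(X_train, y_train)
acc = pipe.score(X_test, y_test)  # held-out classification accuracy
```

Wrapping the scaler and selector in the pipeline ensures both are fitted only on training data, avoiding leakage into the held-out evaluation.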
Results:
1. Feature Importance: Our analysis revealed that brain volume, CC circularity, and
various sub-regions of the corpus callosum (such as genu, mid-body, and splenium)
were the most discriminant features for ASD detection.
2. Comparison with Previous Studies: Our findings aligned with previous research,
particularly a study by Hiess et al., which also identified brain volume and corpus
callosum area as crucial for discriminating between ASD and control groups.
3. Performance Improvement: By training the classifier on the selected subset of
discriminant features, we observed improved classification accuracy compared to using
all features. This indicates the effectiveness of our feature selection approach in
enhancing classification performance.
4. Optimal Threshold: Through experimentation, we determined that using the chi-square
method with a threshold value of 0.4 yielded the highest average classification accuracy
for ASD detection.
Overall, our methodology enabled us to identify key structural brain differences associated
with ASD and develop an effective feature selection strategy to enhance classification accuracy.
Significance of Findings:
1. Understanding Brain Structure: Our study sheds light on the importance of specific
brain regions, such as the corpus callosum, in distinguishing between individuals with
ASD and those without. By quantifying structural differences, we contribute to a deeper
understanding of the neurobiological basis of ASD.
2. Diagnostic Biomarkers: The identified features, including brain volume and corpus
callosum sub-regions, hold potential as diagnostic biomarkers for ASD. Clinicians and
researchers can use these biomarkers to develop more accurate and efficient diagnostic
tools.
3. Personalized Interventions: Knowledge of structural brain differences can inform
personalized interventions for individuals with ASD. Tailored therapies targeting
specific brain regions may lead to better outcomes and improved quality of life for
individuals with ASD.
4. Advancements in Machine Learning: Our study demonstrates the utility of machine
learning algorithms in analyzing neuroimaging data and identifying relevant features
for ASD detection. This highlights the role of artificial intelligence in advancing
diagnostic techniques and improving our understanding of complex neurological
disorders.
Implications:
1. Clinical Practice: Our findings have direct implications for clinical practice, where
neuroimaging techniques could be integrated into routine assessments for ASD
diagnosis and treatment planning. Clinicians can leverage the identified biomarkers to
provide more accurate and personalized care.
2. Research Directions: Future research could focus on validating the identified
biomarkers in larger and more diverse populations. Longitudinal studies could also
investigate how structural brain differences evolve over time in individuals with ASD
and their association with clinical outcomes.
3. Intervention Strategies: Insights from our study could inform the development of novel
intervention strategies targeting specific brain regions implicated in ASD. These
interventions could include behavioral therapies, neurofeedback training, or
pharmacological treatments tailored to individual neurobiological profiles.
By advancing our understanding of the neurobiological underpinnings of ASD and leveraging
machine learning techniques, our study contributes to the ongoing efforts to improve ASD
diagnosis and treatment.
Linear Discriminant Analysis (LDA). LDA is a statistical method that finds a linear
combination of features that separates the dataset into its corresponding classes. The
resulting combination is used as a linear classifier (Jain and Huang 2004). LDA maximizes
linear separability by maximizing the ratio of between-class variance to within-class
variance for a given dataset. Let ω1, ω2, ..., ωL and N1, N2, ..., NL be the classes and the
number of examples in each class, respectively. Let M1, M2, ..., ML and M be the class means
and the grand mean, respectively. Then, the within-class and between-class scatter matrices
Sw and Sb are defined as
Sw = Σ_{i=1}^{L} P(ωi) Σi and Sb = Σ_{i=1}^{L} P(ωi) (Mi − M)(Mi − M)^T,
where P(ωi) is the prior probability and Σi represents the covariance matrix of class ωi.
Figure: General process of ASD studies using fMRI data and machine learning (taking FC
features for example). ASD, autism spectrum disorder; fMRI, functional magnetic resonance
imaging; FC, functional connectivity.
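The within-class and between-class scatter matrices used by LDA can be computed directly from data. A minimal NumPy sketch on synthetic two-class data (the variable names and toy distributions are illustrative assumptions, not the thesis's actual features):

```python
import numpy as np

# Synthetic two-class data (hypothetical stand-in for morphometric features).
rng = np.random.default_rng(1)
X1 = rng.normal(0.0, 1.0, size=(40, 3))   # class w1
X2 = rng.normal(2.0, 1.0, size=(60, 3))   # class w2

classes = [X1, X2]
N = sum(len(c) for c in classes)
M = np.vstack(classes).mean(axis=0)        # grand mean

# Within-class scatter: prior-weighted class covariances.
# Between-class scatter: spread of class means around the grand mean.
Sw = np.zeros((3, 3))
Sb = np.zeros((3, 3))
for Xc in classes:
    prior = len(Xc) / N
    Mc = Xc.mean(axis=0)
    Sw += prior * np.cov(Xc, rowvar=False)
    d = (Mc - M).reshape(-1, 1)
    Sb += prior * d @ d.T

# LDA direction: maximizes the between-class to within-class variance ratio.
w = np.linalg.solve(Sw, X1.mean(axis=0) - X2.mean(axis=0))
```

The projection direction `w` is the quantity LDA classifies along; scikit-learn's `LinearDiscriminantAnalysis` performs an equivalent computation internally.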
Additional Feature Selection Techniques
1. Sequential Feature Selection: Sequential feature selection methods search through
different feature subsets, evaluating their performance using a chosen criterion.
Examples include Sequential Forward Selection (SFS) and Sequential Backward
Selection (SBS), which iteratively add or remove features based on model performance
until an optimal subset is found.
2. Principal Component Analysis (PCA): PCA is a dimensionality reduction technique
that transforms the original features into a new set of orthogonal variables called
principal components. While PCA is not strictly a feature selection method, it can be
used to reduce the dimensionality of the dataset by selecting a subset of principal
components that capture the most variance in the data.
3. Genetic Algorithms: Genetic algorithms mimic the process of natural selection to
evolve a population of potential solutions to a given optimization problem. In the
context of feature selection, genetic algorithms generate and evaluate different feature
subsets, selecting those that yield the best performance based on a fitness function.
4. Sparse Regression Models: Sparse regression models, such as LASSO (Least
Absolute Shrinkage and Selection Operator) and Elastic Net, introduce sparsity in the
coefficient weights of the regression model, effectively performing feature selection by
shrinking irrelevant coefficients to zero.
5. Recursive Feature Addition: Recursive Feature Addition (RFA) is a variant of
recursive feature elimination where features are added to the model one at a time based
on their individual contributions to model performance. This method can be useful for
identifying the most influential features in the dataset.
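As an illustration of the sparse-regression approach (item 4 above), the following sketch fits a LASSO model on synthetic data where only the first two of ten features carry signal; the non-zero coefficients identify the selected features. All data and parameter values are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical regression data: only the first two features are informative.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# The L1 penalty shrinks irrelevant coefficients to exactly zero,
# performing feature selection as a side effect of model fitting.
model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(np.abs(model.coef_) > 1e-6)
print(selected)
```

Increasing `alpha` strengthens the penalty and yields a sparser (smaller) selected subset; Elastic Net adds an L2 term to stabilize selection when features are correlated.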
Evaluation Metrics for Feature Selection
• Model Performance Metrics: Evaluation metrics such as accuracy, precision, recall,
F1-score, and area under the receiver operating characteristic curve (ROC-AUC) can
be used to assess the performance of the machine learning model trained using the
selected features.
• Computational Efficiency: In addition to model performance, it's essential to consider
the computational efficiency of feature selection methods, especially for large datasets.
Methods that require less computational resources or have faster execution times may
be preferred in practice.
• Stability of Feature Rankings: The stability of feature rankings across multiple
iterations or subsets of the data can indicate the reliability of the selected features.
Methods that produce consistent rankings across different samples or partitions of the
dataset are generally more robust.
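The model-performance metrics listed above can be computed with scikit-learn on a toy set of labels and predictions (hypothetical values, chosen only to exercise each metric):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical labels and predictions for a small ASD/control test set
# (1 = ASD, 0 = control); scores are predicted probabilities for class 1.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("ROC-AUC  :", roc_auc_score(y_true, scores))   # needs scores, not labels
```

Note that ROC-AUC is computed from the continuous scores rather than the hard predictions, which is why it can differ from accuracy even on the same test set.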
Conclusion
Feature selection is a fundamental step in the machine learning workflow that can significantly
impact model performance, interpretability, and computational efficiency. By selecting the
most informative features from the dataset while discarding irrelevant or redundant ones,
feature selection methods help improve model generalization and reduce overfitting.
Understanding the various feature selection techniques and their implications is essential for
practitioners to build effective and reliable machine learning models for real-world
applications.
Figure 5. An architecture of Multilayer Perceptron (MLP).
Support Vector Machine (SVM). The SVM classifier segregates samples into their
corresponding classes by constructing decision boundaries known as hyperplanes (Vapnik
2013). It implicitly maps the dataset into a higher-dimensional feature space and constructs a
maximal-margin separating hyperplane in that space. For a training set of examples
{(xi, yi), i = 1, ..., l}, where xi is a feature vector and yi ∈ {−1, +1} its label, the decision
function is
f(x) = sign(Σ_{i=1}^{l} αi yi K(xi, x) + b),
where αi are the Lagrange multipliers of the dual optimization problem, K(·, ·) is a kernel
function, and b is the threshold parameter of the hyperplane.
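The SVM decision function described above can be exercised with scikit-learn's SVC; the RBF kernel plays the role of K(·, ·), and the fitted support vectors are the training points with non-zero Lagrange multipliers αi. The data here are synthetic stand-ins:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated point clouds (hypothetical stand-in for the
# morphometric feature vectors used in this study).
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2, 0.5, size=(50, 2)),
               rng.normal(+2, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# The RBF kernel implicitly maps the data into a higher-dimensional space;
# the decision function is sign(sum_i alpha_i * y_i * K(x_i, x) + b).
clf = SVC(kernel="rbf", C=1.0).fit(X, y)
print(clf.score(X, y))            # training accuracy
print(len(clf.support_))          # number of support vectors
```

Only the support vectors enter the decision function, which is what makes the classifier compact even in high-dimensional feature spaces.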
Random Forest (RF). Random Forest belongs to the family of decision-tree methods and is
capable of performing both classification and regression tasks. A classification tree is
composed of nodes and branches that break the set of samples into a set of covering decision
rules (Mitchell 1997). RF is an ensemble classifier consisting of many decorrelated decision
trees, and its output is the mode of the classes output by the individual trees.
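The majority-vote behaviour of Random Forest can be seen directly by querying the individual trees of a fitted scikit-learn ensemble (synthetic data, illustrative parameters):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary task; the ensemble's prediction is the mode (majority
# vote) of the classes predicted by the individual trees.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Each fitted tree can be queried on its own; rf.predict aggregates them.
votes = [tree.predict(X[:1]) for tree in rf.estimators_]
print(rf.predict(X[:1]), "from", len(votes), "tree votes")
```

Bagging and per-split feature subsampling are what decorrelate the trees, so the averaged vote has lower variance than any single tree.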
Multilayer Perceptron (MLP). MLP belongs to the family of neural networks, which consist of
an interconnected group of artificial neurons called nodes and connections for processing
information called edges (Jain, Mao, and Moidin Mohiuddin 1996). A neural network consists
of an input, a hidden, and an output layer. The input layer transmits inputs, in the form of a
feature vector with weighted values, to the hidden layer. The hidden layer, composed of
activation units or transfer functions (Gardner and Dorling 1998), carries the feature vector
from the first layer with weighted values and performs computations on it. The output layer is
made up of activation units carrying the weighted output of the hidden layer and predicts the
corresponding class. An example of an MLP with two hidden layers is shown in Figure 5. A
multilayer perceptron is described as fully connected, with each node connected to every node
in the previous and next layers. MLP utilizes back-propagation (Hecht-Nielsen 1992) during
training to reduce the error function; the error is reduced by updating the weight values in
each layer. For a training set of examples X = (x1, x2, x3, ..., xm) and output y ∈ {0, 1}, a new
test sample is assigned the class predicted at the output layer.
k-Nearest Neighbor (kNN). The kNN classifier assigns a test sample q the majority class
among its nearest neighbors, where K is the set of nearest neighbors, ky is the class of
neighbor k, and d(k, q) is the Euclidean distance of k from q.
Results and Evaluation
We chose to evaluate the performance of our framework using the same evaluation criteria
proposed by Heinsfeld et al. (Heinsfeld et al. 2018), who evaluated the performance of their
framework on the basis of k-fold cross-validation and leave-one-site-out classification
schemes (Bishop 2006). We have also evaluated the results of the above-mentioned classifiers
based on these schemes.
Figure 7. Schematic overview of the leave-one-site-out classification scheme.
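The leave-one-site-out scheme sketched in Figure 7 corresponds to grouped cross-validation, where each fold holds out all subjects from one acquisition site. A minimal sketch with scikit-learn's LeaveOneGroupOut (random data and three hypothetical sites, not the actual ABIDE sites):

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(90, 5))
y = rng.integers(0, 2, size=90)
sites = np.repeat([0, 1, 2], 30)   # three hypothetical acquisition sites

# Each fold holds out every subject from one site, mimicking deployment
# on data from a scanner never seen during training.
logo = LeaveOneGroupOut()
scores = cross_val_score(SVC(), X, y, cv=logo, groups=sites)
print(len(scores))   # one score per held-out site
```

Compared with plain k-fold cross-validation, this scheme gives a more honest estimate of cross-site generalization, which is why multi-site accuracies tend to be lower.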
Transfer Learning Based Approach
Results obtained with conventional machine learning algorithms, with and without feature
selection, are presented in Section 4.1.3. It can be observed that the average recognition
accuracy for autism detection on the ABIDE dataset remains in the range of 52% to 55% for
the different conventional machine learning algorithms; refer to Table 3. In order to achieve
better recognition accuracy and to test the potential of the latest machine learning
techniques, i.e. deep learning (LeCun, Bengio, and Hinton 2015), we employed a transfer
learning approach using the VGG16 model (Simonyan and Zisserman 2014). Generally,
training and test data are drawn from the same distribution in machine learning algorithms.
On the contrary, transfer learning allows the distributions used in training and testing to be
different (Pan and Yang 2010). The motivation for employing transfer learning comes from
the fact that training a deep learning network from scratch requires a large amount of data
(LeCun, Bengio, and Hinton 2015), but in our case the ABIDE dataset (Di Martino et al.
2014) contains labeled samples from only 1112 subjects (539 autism cases and 573 healthy
control participants). Transfer learning allows partial retraining of an already-trained model
(usually re-training only the last layer) (Pan and Yang 2010) while keeping all other layers
(trained weights) intact; these layers are trained on millions of examples for a semantically
similar task. We used the transfer learning approach in our study because we wanted to
benefit from a deep learning model that has achieved high accuracy on visual recognition
tasks, i.e. the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) (Russakovsky
et al. 2015), and is available for research purposes. A few of the well-known deep learning
architectures that emerged from ILSVRC are GoogleNet (a.k.a. Inception V1) from Google
(Szegedy et al. 2015) and VGGNet by Simonyan and Zisserman (Simonyan and Zisserman
2014). Both of these architectures are from the family of Convolutional Neural
Figure 10. Transfer learning results using VGG16 architecture: (A) training accuracy vs
Validation accuracy and (B) training loss vs validation loss.
Networks (CNNs), as they employ convolution operations to analyze visual input, i.e. images.
We chose to work with VGGNet, which consists of 16 convolutional layers (VGG16)
(Simonyan and Zisserman 2014). It is one of the most appealing frameworks because of its
uniform architecture and its robustness on visual recognition tasks; refer to Figure 9. Its pre-
trained model is freely available for research purposes, making it a good choice for transfer
learning. The VGG16 architecture (refer to Figure 9) takes an image of 224 × 224 pixels with
a receptive field size of 3 × 3; the convolution stride is 1 pixel and the padding is 1 (for the
3 × 3 receptive field). It uses the rectified linear unit (ReLU) (Nair and Hinton 2010) as its
activation function. Classification is done using a softmax layer with x units (representing the
x classes to recognize). The other layers are convolution layers and feature pooling layers.
Convolution layers use filters that are convolved with the input image to produce activation
or feature maps. Feature pooling layers are used in the architecture to reduce the size of the
image representation, to make computation efficient, and to control overfitting.
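Retraining only the last layer of a pre-trained network, as in the transfer learning approach above, amounts to fitting a softmax classifier on frozen features. The sketch below uses random vectors as stand-ins for VGG16 activations and plain gradient descent in place of the ADAM optimizer; it illustrates the idea only, not the actual training setup of this study:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, n_classes = 200, 64, 2
feats = rng.normal(size=(n, d))          # frozen "VGG16 features" (stand-in)
labels = (feats[:, 0] > 0).astype(int)   # synthetic ASD/control labels
Y = np.eye(n_classes)[labels]            # one-hot targets

W = np.zeros((d, n_classes))
b = np.zeros(n_classes)
lr = 0.1

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Train only this layer; the "earlier layers" (the feature extractor)
# stay fixed, which is the essence of last-layer transfer learning.
for _ in range(300):
    P = softmax(feats @ W + b)
    W -= lr * (feats.T @ (P - Y) / n)
    b -= lr * (P - Y).mean(axis=0)

acc = (softmax(feats @ W + b).argmax(axis=1) == labels).mean()
print(round(acc, 2))
```

In the real pipeline the frozen features come from the convolutional stack of VGG16, and frameworks such as Keras or PyTorch handle the layer freezing and ADAM updates.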
Results
As mentioned earlier, this study was performed using structural MRI (s-MRI) scans from the
Autism Brain Imaging Data Exchange (ABIDE-I) dataset
(http://fcon_1000.projects.nitrc.org/indi/abide/abide_I.html) (Di Martino et al. 2014). The
ABIDE-I dataset spans 17 international sites, with a total of 1112 subjects (539 autism cases
and 573 healthy control participants). MRI scans in the ABIDE-I dataset are provided in the
Neuroimaging Informatics Technology Initiative (NIfTI) file format (Cox et al. 2003), where
images represent the projection of an anatomical volume onto an image plane. Initially, all
anatomical scans were converted from NIfTI to Tagged Image File Format (TIFF or TIF), a
compression-less format (Guarneri, Vaccaro, and Guarneri 2008), which created a dataset of
approximately 200k TIF images. We did not use all TIF images for transfer learning, as the
beginning and trailing portions of the images extracted from individual scans contain
clipped/cropped portions of the region of interest, i.e. the corpus callosum. Thus, we were left
with approximately 100k TIF images with a visibly complete portion of the corpus callosum.
For transfer learning, VGGNet with 16 convolutional layers (VGG16) was used (Simonyan
and Zisserman 2014) (refer to Section 4.2 for an explanation of the VGG16 architecture). The
last fully connected dense layer of the VGG16 pre-trained model was replaced and re-trained
with the images extracted from the ABIDE-I dataset. We trained the last dense layer using the
softmax activation function and the ADAM optimizer (Kingma and Ba 2014) with a learning
rate of 0.01. 80% of the TIF images extracted from the MRI scans were used for training,
while the remaining 20% were used for validation. With the above-mentioned parameters, the
proposed transfer learning approach achieved an autism detection accuracy of 66%. Model
accuracy and loss curves are shown in Figure 10. In comparison with conventional machine
learning methods (refer to Table 3 for results obtained using the different conventional
machine learning methods), the transfer learning approach gained around 10% in ASD
detection accuracy.
Conclusion and Future Work
Our research study shows the potential of machine learning (conventional and deep learning)
algorithms for advancing the understanding of neuroimaging data. We showed how machine
learning algorithms can be applied to structural MRI data for automatic detection of
individuals with Autism Spectrum Disorder (ASD). Although the achieved recognition rate is
in the range of 55% to 65%, in the absence of established biomarkers such algorithms can
still assist clinicians in early detection of ASD. Second, it is known that studies combining
machine learning with brain imaging data collected from multiple sites, like ABIDE (Di
Martino et al. 2014), to identify autism have demonstrated that classification accuracy tends
to decrease (Arbabshirani et al. 2017). In this study, we observed the same trend. The main
conclusions drawn from this study are as follows:
– Machine learning algorithms applied to brain anatomical scans can help in automatic
detection of ASD. Features extracted from the corpus callosum and intracranial brain regions
present significant discriminative information for classifying individuals with ASD from the
control subgroup.
– Feature selection/weighting methods help build a robust classifier for automatic detection
of ASD. These methods not only reduce the framework's computational complexity but also
yield better average classification accuracy.
– We also provided automatic ASD detection results using Convolutional Neural Networks
(CNN) via the transfer learning approach. This will help readers understand the benefits and
bottlenecks of using deep learning/CNN approaches for analyzing neuroimaging data, which
is difficult to record in quantities large enough for deep learning.
– To enhance the recognition results of the proposed framework, it is recommended to use a
multimodal system. In addition to neuroimaging data, other modalities, e.g. EEG, speech, or
kinesthetic data, can be analyzed simultaneously to achieve better recognition of ASD.
Results obtained using Convolutional Neural Networks (CNN)/deep learning are promising.
One of the challenges in fully utilizing the learning/data-modeling capabilities of CNNs is the
need for a large database to learn a concept (LeCun, Bengio, and Hinton 2015; Zhou, Bin,
and Zhenguo 2018), making them impractical for applications where labeled data is hard to
record. For clinical applications, where acquiring data, especially neuroimaging data, is
difficult, training a deep learning algorithm poses a challenge. One solution to this problem is
a hybrid approach, where the data-modeling capabilities of conventional machine learning
algorithms (which can learn a concept from small data as well) are combined with deep
learning. In order to bridge the gap between neuroscience and computer science researchers,
we emphasize and encourage the scientific community to share databases and results for
automatic identification of psychological ailments.
Autism spectrum disorder (ASD) affects about 1.5% of children worldwide, with males being 4.5 times
more likely to be affected than females. In 2023, the estimated prevalence of ASD in the United States
was 80.9 per 10,000 people, and in Saudi Arabia, it was 100.7 per 10,000 people, showing similar
patterns across many countries. ASD is a developmental disorder characterized by early difficulties in
social communication and interaction, along with repetitive behaviors and interests. Symptoms
typically appear in early childhood and persist throughout life, leading to challenges such as learning
difficulties and social isolation. While the cause of ASD remains unknown, early diagnosis allows for
early interventions that can improve the quality of life for individuals with autism. Presently, diagnostic
methods rely heavily on behavioral assessments, which can be time-consuming, expensive, and
subject to bias.
Structural magnetic resonance imaging (sMRI) is a non-invasive method used to study brain
structure and diagnose disorders, especially in children. It offers high resolution and contrast
without using radiation. sMRI provides various brain tissue sequences like T1 and T2, and it's
also used in longitudinal studies to track brain growth over time. Machine learning (ML) has
become increasingly important in medical imaging analysis, as it can help in building
computer-aided diagnostic systems, analyzing complex data more efficiently, and predicting
disorders automatically. ML models can use both personal and behavioral features, along with
sMRI data, to uncover patterns for predicting disorders like ASD. However, challenges arise
from handling large amounts of data and optimizing algorithms for accuracy.
Feature selection (FS) algorithms help in selecting the most important features for prediction
tasks. There are different types of FS algorithms, including filter, wrapper, and embedded
methods. New optimization techniques, like bio-inspired algorithms, can further improve FS
methods by searching for the best feature subset globally. ML has shown promise in accurately
diagnosing ASD, saving time and effort for experts and enabling effective intervention. This
study aims to improve the early classification accuracy of ASD using sMRI data for children
aged 5 to 10 years and identifying important biomarkers associated with ASD.
To achieve this, a comparative empirical study was conducted using different optimization
algorithms and ML models with two public datasets from the Autism Brain Imaging Data
Exchange Initiative (ABIDE) and local data from King Abdulaziz University (KAU) Hospital.
Recursive feature elimination with cross-validation (RFECV), Boruta, and grey wolf optimizer
(GWO) algorithms for FS, and random search algorithm and GWO algorithm for
hyperparameter tuning were investigated. The study also examined the impact of age and
gender on classification performance. The research questions addressed include whether the
proposed FS methods can improve ML model accuracy in ASD classification, which optimized
models perform best in predicting ASD on the public datasets, and whether combining personal
features data with sMRI yields better results in ASD classification than using sMRI data alone.
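Of the feature selection methods investigated, RFECV is available directly in scikit-learn; the sketch below runs it on synthetic data as a stand-in for the sMRI-derived feature table (the estimator and parameter choices are illustrative assumptions, not the study's exact configuration):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the sMRI-derived feature table.
X, y = make_classification(n_samples=150, n_features=12, n_informative=4,
                           random_state=0)

# RFECV drops the weakest features one at a time and uses cross-validation
# to pick the subset size with the best mean score.
selector = RFECV(LogisticRegression(max_iter=1000),
                 cv=StratifiedKFold(5), scoring="accuracy").fit(X, y)
print(selector.n_features_)       # chosen number of features
print(selector.support_)          # boolean mask of kept features
```

Boruta and GWO-based selection pursue the same goal by different search strategies: Boruta compares features against shuffled "shadow" copies, while GWO searches feature subsets with a population-based metaheuristic.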
Given the growing prevalence of ASD worldwide, there is an urgent need for accurate and
efficient diagnostic tools. Current diagnostic methods often rely on behavioral assessments,
which can be subjective, time-consuming, and costly. Furthermore, the cause of ASD remains
unclear, making it challenging to develop effective treatments.
Structural MRI (sMRI) imaging offers a promising avenue for diagnosing ASD by providing
detailed insights into brain structure. This non-invasive technique allows researchers to study
brain morphology and detect abnormalities associated with ASD. Machine learning (ML)
algorithms have emerged as powerful tools in analyzing sMRI data, enabling researchers to
identify patterns and biomarkers associated with ASD. By leveraging ML, researchers can
develop computer-aided diagnostic systems that streamline the diagnostic process and improve
accuracy.
Feature selection (FS) techniques play a crucial role in optimizing ML models for ASD
classification. These techniques help identify the most relevant features from the sMRI data,
reducing computational complexity and improving model performance. By selecting the most
informative features, FS methods enhance the predictive accuracy of ML models, leading to
more reliable diagnostic outcomes.
Recent advancements in FS algorithms, such as Boruta and the grey wolf optimizer (GWO),
offer promising avenues for improving ASD classification. These algorithms enable
researchers to identify important biomarkers associated with ASD and enhance the diagnostic
accuracy of ML models. By combining personal and behavioral features with sMRI data,
researchers can develop more robust diagnostic models that account for the heterogeneous
nature of ASD.
The use of public datasets, such as the Autism Brain Imaging Data Exchange Initiative
(ABIDE), provides researchers with valuable resources for training and validating ML models.
Additionally, the inclusion of local datasets, such as those from King Abdulaziz University
(KAU) Hospital, allows researchers to evaluate model performance in diverse populations and
settings. By conducting comparative studies across different datasets, researchers can gain
insights into the generalizability and robustness of their diagnostic models.
Moving forward, further research is needed to refine and optimize ML algorithms for ASD
classification. Future studies should explore novel FS techniques and optimization algorithms
to improve diagnostic accuracy and streamline the diagnostic process. By leveraging the power
of ML and sMRI imaging, researchers can continue to advance our understanding of ASD and
develop more effective diagnostic tools to support individuals and families affected by this
condition.
As research in the field of neurodevelopmental disorders, particularly autism spectrum disorder
(ASD), continues to advance, there's a growing recognition of the importance of early diagnosis
and intervention. ASD is characterized by a wide range of symptoms, including difficulties in
social communication and interaction, as well as repetitive behaviors and restricted interests.
The spectrum nature of ASD means that individuals can exhibit varying degrees of impairment,
from mild to severe.
Structural MRI (sMRI) imaging has emerged as a valuable tool for studying ASD by providing
detailed images of the brain's structure. These imaging techniques allow researchers to identify
differences in brain morphology and connectivity between individuals with ASD and typically
developing individuals. By analyzing sMRI data using machine learning algorithms,
researchers can extract meaningful patterns and biomarkers associated with ASD, facilitating
more accurate diagnosis and treatment planning.
One of the key challenges in ASD research is the heterogeneity of the disorder, both in terms
of its clinical presentation and underlying neurobiology. This heterogeneity makes it difficult
to identify consistent biomarkers or diagnostic criteria that apply to all individuals with ASD.
However, machine learning offers a promising approach for addressing this challenge by
enabling the development of personalized diagnostic models that account for individual
differences in symptomatology and brain structure.
In addition to improving diagnostic accuracy, machine learning techniques can also enhance
our understanding of the underlying neurobiology of ASD. By analyzing large-scale sMRI
datasets, researchers can identify neural correlates of ASD and gain insights into the biological
mechanisms underlying the disorder. This knowledge can inform the development of targeted
interventions and treatments tailored to the specific needs of individuals with ASD.
Furthermore, the integration of machine learning with other data modalities, such as genetic
and behavioral data, holds promise for uncovering the complex etiology of ASD. By combining
multiple sources of data, researchers can develop comprehensive models of ASD that capture
the interplay between genetic, environmental, and neurological factors contributing to the
disorder.
Overall, the application of machine learning to ASD research represents a powerful approach
for advancing our understanding of the disorder and improving clinical outcomes for
individuals affected by ASD. As technology continues to evolve and datasets grow larger and
more diverse, machine learning techniques will play an increasingly important role in shaping
the future of ASD diagnosis and treatment.
Previous studies on autism spectrum disorder (ASD) have mainly focused on using machine
learning (ML) or deep learning (DL) methods to understand and diagnose the condition. These
studies looked at a variety of factors, such as brain scans and other personal information, to
develop models for diagnosing ASD.
One study by Bahathiq and colleagues reviewed 45 articles that used ML to diagnose ASD
using brain scans. These studies used different ML techniques, sample sizes, and input features.
However, there are only a few public datasets available, with the ABIDE datasets being the
most commonly used. These studies typically looked at individual ML algorithms like support
vector machine (SVM) or k-nearest neighbor.
Researchers have used various brain measurements, such as white matter volumes and cortical
thickness, to try to find biomarkers for ASD. For example, one study found that using average
cortical thickness for different brain regions led to the best accuracy in diagnosing ASD.
Another study evaluated 20 ML approaches based on brain networks and found that using a
gradient boosting classifier (GBC) achieved the highest accuracy. Some studies focused on
distinguishing individuals with ASD from those who are typically developing using brain scans
alone, while others combined brain scans with other data.
Deep learning models have also been used to diagnose ASD, with one study achieving an
accuracy of 65.6% using resting-state fMRI and brain volume features. However, previous
studies had limitations, such as training models on small datasets, which can lead to unreliable
results. Additionally, many studies did not thoroughly analyze the significance of their findings
or use optimization techniques like hyperparameter tuning.
Furthermore, feature selection techniques, which help identify the most important features for
diagnosis, have not been extensively explored in ASD research using brain scans. Comparing
the performance of different ML models is also challenging due to variations in study methods
and participant characteristics.
Overall, these studies provide valuable insights into the use of ML for diagnosing ASD and
highlight areas for further research and improvement.
In this study, we examined how well autism spectrum disorder (ASD) could be classified
using seven different models based on brain scan data from children aged 5 to 10. We found
that applying different methods to the same data produced a wide range of results.
The initial models we tested did not perform well in terms of accuracy and reliability, but
after tuning, their performance improved, with accuracy ranging from around 52.55% to
66.28%. However, some models, such as NB and MLP, did not do as well, possibly because
they had too little training data relative to the number of factors considered. Differences in
where the data came from, how the brain scans were acquired, and the characteristics of the
children in the study could also have affected the results.
We also evaluated different methods for tuning the models' hyperparameters, such as grid
search and random search. Grid search evaluates all possible combinations of settings, which
takes considerable time and computing power. Random search is quicker because it samples
only a subset of settings. However, neither method reveals much about why certain settings
work better than others. In our study, we additionally applied nature-inspired methods such as
the grey wolf optimizer (GWO), which helped us find good settings more efficiently. GWO,
which we used for the first time in ASD research, appears promising for future studies
because it could help researchers explore more advanced methods and improve accuracy.
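The grey wolf optimizer can be sketched in a few lines: candidate solutions ("wolves") move toward the three best solutions found so far (alpha, beta, delta), with a step size that decays over iterations. This generic minimization sketch on the sphere function illustrates the mechanism only; it is not the configuration used in the study:

```python
import numpy as np

def gwo(f, dim, n_wolves=20, iters=100, lo=-5.0, hi=5.0, seed=0):
    rng = np.random.default_rng(seed)
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    for t in range(iters):
        fitness = np.apply_along_axis(f, 1, wolves)
        alpha, beta, delta = wolves[np.argsort(fitness)[:3]]
        a = 2.0 * (1.0 - t / iters)          # step size decays to 0
        for i in range(n_wolves):
            new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                D = np.abs(C * leader - wolves[i])
                new += leader - A * D        # move toward each leader
            wolves[i] = np.clip(new / 3.0, lo, hi)
    fitness = np.apply_along_axis(f, 1, wolves)
    return wolves[np.argmin(fitness)]

best = gwo(lambda x: np.sum(x ** 2), dim=5)   # minimize the sphere function
print(np.round(best, 3))
```

For hyperparameter tuning or feature selection, the objective `f` would instead evaluate cross-validated model performance for a candidate setting or feature subset.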
Our main goal was not only to make ASD classification more accurate, but also to better
understand the brain differences associated with ASD. Since not all brain features are equally
important for classification, we used several techniques, including RFECV, Boruta, and
GWO-based algorithms, to select the most important features. This simplifies the models,
prevents them from fitting irrelevant details, and addresses the problem of high-dimensional
data.
Previous studies that didn't use feature selection methods often didn't do very well, especially
when dealing with large datasets that had lots of overlapping or unimportant information. By
pinpointing the key areas of the brain with important features, we hope to make diagnosing
ASD earlier and more accurate. Our results show that using feature selection techniques,
especially when multiple methods agree on which features are most important, can significantly
improve ASD classification accuracy and give us valuable insights into the structural
differences in the brain linked to ASD.
Feature Selection Methods in ASD Classification
Feature selection is a critical step in machine learning (ML) tasks like ASD classification,
where the goal is to choose the most relevant features from a large set of potential predictors
to improve model performance and interpretability.
Why Feature Selection Matters
Imagine you're trying to predict whether someone has ASD based on brain scan data. You might
have hundreds or even thousands of different measurements from the brain scans, but not all
of them are equally important for making accurate predictions. Some may be redundant, noisy,
or irrelevant, while others could be key indicators of ASD.
Challenges in High-Dimensional Data
When dealing with high-dimensional data, such as brain imaging data with many features, there
are several challenges:
1. Curse of Dimensionality: As the number of features increases, the amount of data
needed to reliably estimate model parameters also increases exponentially, leading to
overfitting and poor generalization to new data.
2. Model Interpretability: Models with too many features can be difficult to interpret,
making it hard to understand which factors are driving the predictions.
3. Computational Complexity: Training models with a large number of features requires
more computational resources and time, which may not be feasible in practice.
Types of Feature Selection Methods
There are several types of feature selection methods, each with its own strengths and
weaknesses:
1. Filter Methods: These methods select features based on their statistical properties, such
as correlation with the target variable or information gain. They are computationally
efficient but may overlook feature dependencies.
2. Wrapper Methods: Wrapper methods evaluate the performance of an ML algorithm
using different subsets of features and select the subset that produces the best
performance. They are computationally intensive but tend to yield better results than
filter methods.
3. Embedded Methods: Embedded methods integrate feature selection into the model
training process, such as regularization techniques like LASSO (Least Absolute
Shrinkage and Selection Operator). They are efficient and often yield good results.
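As a concrete illustration of an embedded method, the sketch below fits an L1-penalized (LASSO-style) model with scikit-learn and reads off which coefficients survive the penalty. The synthetic data and the penalty strength are illustrative assumptions, not values from this study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for brain-scan features: 30 predictors, 6 informative.
X, y = make_classification(n_samples=300, n_features=30, n_informative=6,
                           random_state=0)

# An L1 penalty drives uninformative coefficients to exactly zero, so
# feature selection happens during model training itself.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

selected = np.flatnonzero(clf.coef_[0])
print("features kept by the L1 penalty:", selected)
```

The features whose coefficients remain nonzero are the ones the embedded method has selected; no separate search over subsets is needed.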
Feature Selection Techniques in ASD Classification
In ASD classification studies using brain imaging data, researchers often employ a combination
of feature selection methods to identify the most discriminative brain regions or measurements
associated with ASD. Some common techniques include:
1. Recursive Feature Elimination (RFE): RFE is a wrapper method that recursively
removes features with low importance based on an ML model's coefficients or feature
weights. It iteratively prunes the feature set until the optimal subset is found.
2. Boruta Algorithm: The Boruta algorithm is a wrapper method inspired by random
forests. It creates "shadow" features by shuffling the values of the original features and
then compares their importance to determine feature relevance. Features that
outperform the shadow features are considered important.
3. Grey Wolf Optimizer (GWO): GWO is a nature-inspired optimization algorithm that
mimics the hunting behavior of grey wolves. In feature selection, GWO can efficiently
search for the optimal subset of features by iteratively updating the solution based on
the fitness of the selected features.
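The RFE variant with cross-validation (RFECV, as used in this study) is available directly in scikit-learn. The following is a minimal sketch on synthetic data; the random-forest estimator and all sizes are illustrative (the Boruta algorithm is omitted here because it requires a third-party package).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

# Synthetic placeholder for the brain-measurement matrix.
X, y = make_classification(n_samples=300, n_features=25, n_informative=6,
                           random_state=42)

# RFECV repeatedly drops the least important features (per the forest's
# importances) and keeps the subset with the best cross-validated accuracy.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=42),
    step=1,
    cv=StratifiedKFold(3),
    scoring="accuracy",
)
selector.fit(X, y)

print("optimal number of features:", selector.n_features_)
print("selected feature indices:",
      [i for i, keep in enumerate(selector.support_) if keep])
```

The `support_` mask identifies the retained features, which in the neuroimaging setting would map back to specific regional measurements.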
Benefits of Feature Selection in ASD Classification
Using feature selection techniques in ASD classification studies offers several benefits:
1. Improved Model Performance: By focusing on the most informative features, feature
selection can lead to more accurate and robust models.
2. Enhanced Interpretability: Selecting a subset of relevant features makes it easier to
interpret the model's predictions and understand the underlying factors contributing to
ASD.
3. Reduced Computational Complexity: By reducing the dimensionality of the feature
space, feature selection can decrease computational costs and training time.
4. Insights into ASD Biomarkers: Identifying the most discriminative brain regions or
measurements associated with ASD can provide valuable insights into the
neurobiological basis of the disorder.
Conclusion
Feature selection plays a crucial role in ASD classification using brain imaging data, helping
researchers identify the most relevant features and improve model performance. By leveraging
a combination of filter, wrapper, and embedded methods, researchers can overcome the
challenges of high-dimensional data and uncover valuable insights into the neurobiological
underpinnings of ASD. As ASD classification continues to advance, further research into
innovative feature selection techniques promises to enhance our understanding of the disorder
and improve diagnostic accuracy.
To test whether MRI-derived features differ between people with ASD and those without it, we applied an independent-samples t-test in IBM SPSS. The results of this test are shown in a figure, and more detailed statistics are provided in the supplementary material.
In our study, we found that different sets of brain measurements varied in how well they could distinguish people with ASD from those without it. Sets F5 and F6 consistently performed best across most of the tests we ran; sets F1 and F4 also did well in some cases, while F2 and F3 performed less well.
These findings are consistent with the work of Jiao et al., who reported that models based on certain brain measurements were better suited for diagnosing ASD. They found specific changes in particular brain regions in children with ASD compared to those without it; these regions are important for social behavior, learning, and repetitive actions. Our study emphasizes the importance of certain cortical curvature measures and of combining multiple measurements.
Other research has also shown that changes in the shape of the outer layer of the brain are
important in ASD, especially in children aged 7.5 to 12.5 years. Our study found that certain
sets of measurements and volume calculations can help with diagnosing ASD. Looking at the
brain's white matter is also useful for finding problems in how the brain's connections work.
Our study confirmed these findings and also found issues with certain brain regions involved
in movement and emotions.
Certain machine learning models, such as CatBoost, XGB, and SVM, performed consistently well in our tests. CatBoost usually achieved the highest accuracy, except when SVM was applied to dimension-reduced medical data, where SVM did better. Models such as NB did not perform as well, possibly because they treat all features as equally important. Including demographic features such as age and gender slightly improved the accuracy of some models. Overall, our best model, GWO-SVM with F6, indicated that certain brain regions, especially in the frontal lobe, played a major role in diagnosing ASD. These regions are important for movement, emotion, memory, and language.
Random Forest
Random forests are made up of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges almost surely to a limit as the number of trees in the forest becomes large; it depends on the strength of the individual trees in the forest and the correlation between them (Breiman, 2001).
RF follows specific rules for tree growing, tree combination, self-testing, and post-processing; it is robust to overfitting and is considered more stable in the presence of outliers and in very high-dimensional parameter spaces than other machine learning algorithms (Caruana and Niculescu-Mizil, 2006; Menze et al., 2009). The concept of variable importance is an implicit feature selection performed by RF with a random subspace methodology, and it is assessed by the Gini impurity criterion index (Ceriani and Verme, 2012). The Gini index is a measure of the predictive power of variables in regression or classification, based on the principle of impurity reduction (Strobl et al., 2007); it is non-parametric and therefore does not rely on the data belonging to a particular type of distribution. For a binary split (white circles in Figure 1), the Gini index of a node n is calculated as

Gini(n) = 1 − Σ_k p_k²,

where p_k is the proportion of samples of class k at node n.
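The Gini impurity criterion described above can be computed directly; the following is a minimal sketch with illustrative class labels.

```python
import numpy as np

def gini_impurity(labels):
    """Gini(n) = 1 - sum_k p_k**2, with p_k the class-k fraction at node n."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini_impurity([0, 0, 1, 1]))  # 0.5: maximally impure binary node
print(gini_impurity([0, 0, 0, 0]))  # 0.0: pure node
```

A split is preferred when it reduces the weighted impurity of the child nodes relative to the parent, which is exactly how the forest accumulates its variable-importance scores.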
1. Data Collection
• Diverse data sources, including clinical assessments, behavioral observations, and
potentially genetic and neuroimaging data, form the basis for the dataset.
• Data collected adheres to strict privacy regulations, ensuring confidentiality and informed
consent.
2. Data Preprocessing
• Rigorous data cleaning, handling missing values, and standardizing formats to create a
coherent and usable dataset.
• Feature selection techniques employed to identify the most relevant variables for model
training.
3. Feature Extraction
• Extraction of key features from the dataset, considering linguistic patterns, behavioural
nuances, and potential neural markers associated with ASD.
• Feature importance analysis aids in identifying the most influential variables.
4. Model Training
• Supervised learning models, such as Support Vector Machines (SVM) and Random
Forests, are employed for classification tasks based on labelled data.
• Unsupervised learning models, including clustering algorithms, uncover hidden
patterns within the dataset.
5. Model Evaluation
• The performance of the trained models is rigorously evaluated using metrics such as
accuracy, precision, recall, and F1 score.
• Ensemble learning techniques may be employed to combine predictions from multiple
models, enhancing overall accuracy.
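The steps above can be sketched end-to-end as a single scikit-learn pipeline. This is a minimal illustration on synthetic data, not the study's actual pipeline; the scaler, the k=10 selection, and the RBF SVM are placeholder choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the collected dataset (step 1).
X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                          random_state=1)

# Steps 2-4: standardize, select the k most relevant features, train an SVM.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=10)),
    ("clf", SVC(kernel="rbf")),
])
pipe.fit(X_tr, y_tr)

# Step 5: evaluate with the metrics named above.
pred = pipe.predict(X_te)
print(f"accuracy={accuracy_score(y_te, pred):.3f}",
      f"precision={precision_score(y_te, pred):.3f}",
      f"recall={recall_score(y_te, pred):.3f}",
      f"f1={f1_score(y_te, pred):.3f}")
```

Wrapping preprocessing, feature selection, and the classifier in one `Pipeline` ensures the selection step is fitted only on training folds, avoiding leakage into the evaluation.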
References
[1]. Dawson, G., Rogers, S., Munson, J., Smith, M., Winter, J., Greenson, J., & Donaldson, A.
(2010). Randomized, controlled trial of an intervention for toddlers with autism: The Early
Start Denver Model. Pediatrics, 125(1), e17–e23. ISSN: 0031-4005
[2]. Thabtah, F. (2018). Machine learning in autistic spectrum disorder behavioral research: A
review and ways forward. Informatics for Health and Social Care, 43(1), 91–114. ISSN:
1753-8157
[3]. Haque, M. A., et al. (2020). A deep learning approach for early detection of autism spectrum
disorder based on electroencephalography. Biomedical Signal Processing and Control, 57,
101722. ISSN: 1746-8094
[4]. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental
disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.
[5]. Verma, S. B., & Yadav, A. K. (2019). Detection of hard exudates in retinopathy images.
ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 8(4),
41–48. eISSN: 2255-2863. DOI: http://dx.doi.org/10.14201/ADCAIJ2019844148
[6]. Pedregosa, F., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine
Learning Research, 12, 2825–2830. ISSN: 1533-7928
[7]. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. ISSN: 0885-6125
[8]. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model
predictions. In Advances in Neural Information Processing Systems (pp. 4765–4774).
ISSN: 1049-5258
[9]. Verma, S. B., & Shashi, B. V. (2020). Data transmission in BPEL (Business Process
Execution Language). ADCAIJ: Advances in Distributed Computing and Artificial
Intelligence Journal, 9(3), 105–117. eISSN: 2255-2863. DOI:
https://doi.org/10.14201/ADCAIJ202093105117
[10]. Verma, S. B., Brijesh, P., & Gupta, B. K. (2022). Containerization and its architectures:
A study. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal,
11(4), 395–409. eISSN: 2255-2863. DOI: https://doi.org/10.14201/adcaij.28351
[11]. Verma, S. B., & Saravanan, C. (2018, September). Performance analysis of various
fusion methods in multimodal biometric. In 2018 International Conference on
Computational and Characterization Techniques in Engineering & Sciences
(CCTES) (pp. 5–8). IEEE.
[12]. Verma, S. B., & Yadav, A. K. (2021). Hard exudates detection: A review. In Emerging
Technologies in Data Mining and Information Security, Advances in Intelligent Systems
and Computing, vol. 1286. Springer, Singapore. https://doi.org/10.1007/978-981-15-9927-
9_12