
Classification of Autism Spectrum

Disorder using Machine Learning


Utkarsh Patel
Third-year Undergrad
Department of E&ECE, IIT Kharagpur

Mentor
Debasis Samanta
Associate Professor
Department of Computer Science, IIT Kharagpur
Abstract
• In this study, we implemented a data-driven approach to classify ASD
patients and typically developing (TD) participants using resting-state
fMRI data.
• SVM and KNN were used for classification.
• Neural networks were also used, for comparison.
• Each classifier was fine-tuned via a grid search over its
hyper-parameters, using cross-validation.
• Finally, a stacked model comprising the fine-tuned classifiers as its
base models was also evaluated.
What is Autism Spectrum Disorder?
Autism Spectrum Disorder
• It is a complex developmental condition that involves persistent
challenges in social interaction, speech and nonverbal
communication, and restricted/repetitive behaviors.
• One in 59 children in the United States is estimated to have autism.
Why Machine Learning?
Autism Spectrum Disorder
• There is no medical test for autism.
• Early diagnosis and treatment are important for reducing the
symptoms of autism and improving the quality of life of people with
autism and their families.
• The diagnostic procedure usually takes a long time, often 5–10 years.
• Hence, the objective is to use machine learning algorithms for
classification.
Why is Functional Connectivity Analysis important?
Role of Functional Connectivity
• Extensive brain-imaging studies have reported that ASD is
associated with altered brain connectivity.
• In autism, some brain regions show weaker-than-normal functional
connectivity, which may underlie difficulties in social behaviour,
while others are more strongly correlated than normal, which may
account for an exceptional ability to concentrate.
• Some reports suggest that many gold medallists at olympiads such as
the IMO, IOI and IPhO are on the autism spectrum.
The Dataset
ABIDE Dataset
• For this task, we used the ABIDE dataset.
• It aggregates functional and structural brain-imaging data
collected from laboratories around the world, to accelerate our
understanding of the neural bases of autism.
• The ABIDE I dataset contains fMRI data from 1112 subjects: 539
individuals with ASD and 573 typical controls.
Pre-processing
Pre-processing
• For this task, I used the data provided by NYU only. The reasons are:
• Data from different laboratories differ in instrument configuration,
the way patients are examined, etc.
• Of all the sites, NYU had the largest number of subjects (172).
• For pre-processing, I used the CPAC pipeline (as the task involves
functional connectivity analysis).
nilearn
• nilearn is a Python library for statistical analysis of neuroimaging data.
• It uses sklearn as its backend.
• In this project, the library is used for:
• Fetching the pre-processed ABIDE I dataset
• Building connectivity matrices from the time-series signals obtained from
the fMRI images
Brain MAP / Atlas
Brain Atlas
• As per Wikipedia, a brain atlas is composed of serial sections along
different anatomical planes of the healthy or diseased, developing or
adult, animal or human brain, where each relevant brain structure is
assigned a set of coordinates to define its outline or volume.
• Brain atlases are contiguous, comprehensive results of visual brain
mapping and may include anatomical, genetic or functional
features.
• In this project, I used the AAL atlas, which parcellates the brain into
116 ROIs.
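
A minimal sketch of how the AAL atlas can be inspected with nilearn. This is
an illustrative aside, not part of the classification pipeline itself, since
ABIDE already ships the extracted AAL time series:

from nilearn.datasets import fetch_atlas_aal

# Fetch the AAL atlas; 'labels' lists the names of its ROIs
aal = fetch_atlas_aal()
print(len(aal.labels))  # 116 ROIs
print(aal.labels[:2])   # e.g. ['Precentral_L', 'Precentral_R']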
Extracting data and building
connectivity matrices
Snippet
from nilearn.datasets import fetch_abide_pcp
from nilearn.connectome import ConnectivityMeasure, sym_to_vec

# Fetch the CPAC-preprocessed ABIDE I data for the NYU site; the
# 'rois_aal' derivative holds per-subject time series over the AAL ROIs
data = fetch_abide_pcp(derivatives=['rois_aal'], SITE_ID=['NYU'])

# Estimate one 116 x 116 correlation matrix per subject
conn_est = ConnectivityMeasure(kind='correlation')
conn_matrices = conn_est.fit_transform(data['rois_aal'])

# Flatten each symmetric matrix into a 6786-dimensional feature vector
# (sym_to_vec is named sym_matrix_to_vec in newer nilearn versions)
features = sym_to_vec(conn_matrices)
labels = data.phenotypic['DX_GROUP']  # 1 = ASD, 2 = typical control
Connectivity Matrix between ROIs
• The connectivity between two ROIs is given by the correlation of their
time-series signals.
• Since the 116 × 116 matrix is symmetric, it is flattened to a vector of
its 116 · 117 / 2 = 6786 unique entries, which is used in training the model.
Model Selection
Model Selection
• After flattening the connectivity matrices, we have feature vectors
x_i, i = 1, 2, …, 172, where each x_i is a 6786-dimensional vector.
• Dimensionality reduction was not used, since every correlation may
matter for classification. (It was tried, but did not increase
accuracy.)
• Various algorithms were tried, but only the Support Vector Machine and
K-Nearest Neighbours classifiers gave satisfactory results.
• The SVM and KNN classifiers were then fine-tuned.
Classification using Support Vector
Machines
Support Vector Machines
• Support Vector Machines are a class of supervised learning models in
which a hyperplane producing the maximum margin between the classes is
used for classification.
• It has several hyper-parameters:
• Kernel – linear, radial, sigmoid, polynomial
• Regularization factor C
• Kernel coefficient γ
• Careful tuning is required to get the best results.
Tuning SVM
• Tuning the regularization constant C: we varied C from $2^{-5}$ to $2^{15}$.
• Tuning the kernel coefficient γ: we varied γ from $2^{-15}$ to $2^{3}$.
• A grid search over these ranges gives the best possible accuracy;
a sketch of the search is shown below.
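
A minimal sketch of this grid search, assuming scikit-learn; features and
labels are the flattened connectivity vectors and ASD/TD diagnoses from the
earlier snippet, and only the radial kernel is shown (the same grid was
repeated for each kernel):

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    'C': (2.0 ** np.arange(-5, 16)).tolist(),      # 2^-5 ... 2^15
    'gamma': (2.0 ** np.arange(-15, 4)).tolist(),  # 2^-15 ... 2^3
}
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5,
                      scoring='accuracy')
search.fit(features, labels)
print(search.best_params_, search.best_score_)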
Linear kernel
Linear Kernel
• For linear kernel, the 5-fold cross-validation accuracy was found to be
59.92% for all values of 𝐶 and 𝛾
Radial kernel
Radial Kernel
• For the radial kernel, the maximum 5-fold cross-validation accuracy was
found to be 67.97%, for γ = 0.000122 (≈ $2^{-13}$) and C ≥ 4.
Sigmoid kernel
Sigmoid kernel
• The maximum 5-fold cross-validation accuracy was found to be
61.04% for 𝛾 = 0.000122 and 𝐶 = 2.
Quadratic kernel
Quadratic kernel
• A 5-fold cross-validation accuracy of 63.41% was observed for
values of γ and C satisfying

$$2\log_2 \gamma + \log_2 C + 25 = 0 \;\Longleftrightarrow\; C\gamma^2 = 2^{-25}$$

Summary for SVM
Summary for SVM

Kernel      Best Accuracy (%)   C        γ
Linear      59.92               any      any
Radial      67.97               ≥ 2^1    2^-13
Sigmoid     61.04               2^1      2^-13
Quadratic   63.41               2^3      2^-14
Classification using
K Nearest Neighbours
K Nearest Neighbours
• In KNN, we have two hyper-parameters
• Value of 𝑘
• Distance function
• sklearn uses the Minkowski metric as the distance function, which has a
hyper-parameter p:

$$d_p(\mathbf{x}, \mathbf{y}) = \Big( \sum_i |x_i - y_i|^p \Big)^{1/p}$$

• p = 1 corresponds to the Manhattan distance, while p = 2 corresponds
to the Euclidean distance.
Tuning KNN classifier
• The KNN classifier is tuned by varying:
• the value of k from 1 to 50
• the value of p in {1, 2}
• A sketch of this search is shown below.
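
A minimal sketch of this search, again assuming scikit-learn and the
features and labels from the earlier snippet:

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

param_grid = {
    'n_neighbors': list(range(1, 51)),  # k from 1 to 50
    'p': [1, 2],                        # Manhattan vs Euclidean distance
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5,
                      scoring='accuracy')
search.fit(features, labels)
print(search.best_params_, search.best_score_)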
Accuracy observed for KNN
• Optimal hyper-parameters: k = 29, p = 1

p    Best Accuracy (%)   k
1    64.55               29
2    63.99               23
SVM vs KNN
SVM vs KNN
• The best accuracy observed for SVM is 67.97% (radial kernel).
• The best accuracy observed for KNN is 64.55% (k = 29, p = 1).
• Hence, SVM outperforms KNN on the NYU dataset.
SVM + KNN
Stacked Model
Stacked Model
• In stacked ensemble learning, we have:
• base models (level 0)
• a meta model (level 1)
• I used the fine-tuned SVM, the fine-tuned KNN and a logistic
regression model as base models.
• Logistic regression was also used as the meta model.
Stacked Model Architecture
• Base models (level 0): LR, SVM, KNN
• Meta model (level 1): LR
• A sketch of this architecture is shown below.
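
A minimal sketch, assuming a scikit-learn version that provides
StackingClassifier (≥ 0.22); the base-model hyper-parameters are the tuned
values reported earlier:

from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Level-0 base models: logistic regression, tuned radial SVM, tuned KNN
base_models = [
    ('lr', LogisticRegression(max_iter=1000)),
    ('svm', SVC(kernel='rbf', C=4, gamma=2**-13)),
    ('knn', KNeighborsClassifier(n_neighbors=29, p=1)),
]
# Level-1 meta model: logistic regression over the base predictions
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression())
stack.fit(features, labels)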
Performance of Stacked model
• The standalone fine-tuned SVM classifier outperforms the stacked model.
Classification using Neural Networks
Neural Networks
• The best ML classifier for this task so far is the SVM, with an
accuracy as high as 67.97%.
• Let's see whether neural networks can outperform the SVM.
Tuning Neural Networks
• Neural networks have a large number of hyper-parameters:
• # of hidden layers
• # of nodes in each hidden layer
• Non-linear activation used in the hidden layers
• Learning rate
• Optimization algorithm (Adam, Momentum, RMSProp, Adagrad)
• Regularization (Dropout, L2-regularization)
• …
Tuning Neural Networks
• For this task, I am using sklearn's MLPClassifier API.
• Some background information:
• It uses the 'Adam' optimizer by default, a stochastic gradient-based method.
• We can vary the (initial) learning rate and the architecture.
• I trained the model for 3000 epochs; a sketch is shown below.
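
A minimal sketch of this experiment, assuming scikit-learn; the
architectures and learning rates shown are illustrative, not the exact
grid from the slides:

from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# () = zero hidden layers, i.e. effectively a logistic regression
for hidden in [(), (64,), (128, 64)]:
    for lr in [1e-4, 1e-3, 1e-2]:
        clf = MLPClassifier(hidden_layer_sizes=hidden,
                            learning_rate_init=lr,
                            solver='adam', max_iter=3000)
        scores = cross_val_score(clf, features, labels, cv=5)
        print(hidden, lr, scores.mean())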
Performance of NN

• We can observe that the model with zero hidden layers (effectively a
logistic regression) performs far better than more complex models over a
wide range of learning rates. However, the peak accuracy is 63.75%, which
is less than that observed for the SVM (67.97%).
Conclusion
Conclusion
• It is clear that a standalone fine-tuned SVM classifier outperforms
both the stacked model and the neural networks for this task.
• However, we may be able to push the classifier's performance further
if we can find a new way to extract functional connectivity, instead of
relying on the correlation-based approach.
Future Work
• Apply new ways of extracting functional connectivity.
• Evaluate the proposed approach on different groups of people, e.g. by
gender and age.
Thank You.
