Article
Applying Eye Tracking with Deep Learning Techniques for
Early-Stage Detection of Autism Spectrum Disorders
Zeyad A. T. Ahmed 1, * , Eid Albalawi 2 , Theyazn H. H. Aldhyani 3, * , Mukti E. Jadhav 4 , Prachi Janrao 5
and Mansour Ratib Mohammad Obeidat 6
Abstract: Autism spectrum disorder (ASD) poses a complex challenge to researchers and practitioners,
with its multifaceted etiology and varied manifestations. Timely intervention is critical in enhancing
the developmental outcomes of individuals with ASD. This paper underscores the paramount
significance of early detection and diagnosis as a pivotal precursor to effective intervention. To this
end, integrating advanced technological tools, specifically eye-tracking technology and deep learning
algorithms, is investigated for its potential to discriminate between children with ASD and their
typically developing (TD) peers. By employing these methods, the research aims to contribute to
refining early detection strategies and support mechanisms. This study introduces innovative deep
learning models grounded in convolutional neural network (CNN) and recurrent neural network
(RNN) architectures, employing an eye-tracking dataset for training. Of note, performance outcomes
have been realised, with the bidirectional long short-term memory (BiLSTM) achieving an accuracy
of 96.44%, the gated recurrent unit (GRU) attaining 97.49%, the CNN-LSTM hybrid reaching 97.94%,
and the LSTM achieving the highest accuracy of 98.33%. These outcomes underscore the efficacy
of the applied methodologies and the potential of advanced computational frameworks in achieving
substantial accuracy levels in ASD detection and classification.

Keywords: autism spectrum disorder; typically developing; deep learning; CNN; LSTM; GRU; eye-tracking technique

Citation: Ahmed, Z.A.T.; Albalawi, E.; Aldhyani, T.H.H.; Jadhav, M.E.; Janrao, P.; Obeidat, M.R.M. Applying Eye Tracking with Deep Learning Techniques for Early-Stage Detection of Autism Spectrum Disorders. Data 2023, 8, 168. https://doi.org/10.3390/data8110168
The significance of early ASD detection lies in its capacity to expedite precise diagnoses, thus enabling timely interventions during critical developmental stages. This temporal alignment enhances the
effectiveness of therapeutic protocols and educational interventions. Moreover, the early
identification of ASD creates a foundation for families to avail themselves of specialised
services and engage with support networks, subsequently enhancing the development
trajectory of affected children. Importantly, support groups serve as crucial resources for
families navigating the challenges posed by autism, offering advice, collective wisdom,
and emotional reinforcement [2–4].
Traditional methods of diagnosing ASD include behavioural observations, historical
records, parental reports, and statistical analysis (Wan et al., 2019 [5]). Advanced technologies
such as eye tracking can capture gaze behaviours, including gaze fixation, blinking, and
saccadic eye movements. In light of this capability, our research
endeavours to make a distinctive contribution by developing a model tailored to scrutinise
divergences in gaze patterns and attentional mechanisms between individuals diagnosed
with autism and those without. The application of this model seeks to illuminate nuances
in gaze-related behaviours, shedding light on potential markers that differentiate the two
cohorts. Nonetheless, the diagnostic facet of autism presents a complex landscape. A
pressing demand emerges for refined diagnostic tools that efficaciously facilitate accurate
and efficient assessments. This diagnostic accuracy, in turn, serves as the bedrock for
informed interventions and personalised recommendations. Our investigation bridges
this exigency by delving into the symbiotic potential between eye-tracking technology and
deep learning algorithms.
A fundamental aspect of the contributions presented in this study encompasses the
deep learning-based model meticulously designed to discern autism through analysing
gaze patterns. This study is among the first to use the dataset since its release to the public
as a standard benchmark. The dataset comprises raw eye-tracking
data collected from 29 children diagnosed with autism spectrum disorder (ASD) and
30 typically developing (TD) children. The visual stimuli used to engage the children
during data collection included both static images (balloons and cartoon characters) and
dynamic videos. In total, the dataset contains approximately 2,165,607 rows of eye-tracking
data recorded while the children viewed the stimuli.
This study achieved an accuracy level of 98%, marking a notable advance. This
achievement is primarily attributed to the strategic
implementation of the long short-term memory (LSTM) architecture, a sophisticated com-
putational framework. The discernible success achieved substantiates the effectiveness
of the rigorous preprocessing methodologies meticulously applied to the dataset, thereby
underlining the robustness and integrity of the research outcomes.
This paper is organised as follows: section two discusses related work and previous
studies that pertain to our research; section three presents the description of the data and
the architecture of the models; section four provides the experimental results of the models
and provides a detailed discussion. Finally, the conclusions of the study are presented.
2. Related Work
Neuroscience and psychology researchers employ eye-tracking equipment to learn
crucial things about human behaviour and decision making. This equipment also aids in
the identification and management of psychological problems like autism. For instance,
people with autism may exhibit atypical gaze patterns and attention, such as a protracted
focus on non-social things and issues synchronising their attention with social interactions.
Eye-tracking technology can detect three main types of eye movements: fixation, saccade,
and blinking. During fixation, the eyes briefly pause while the brain processes visual
information. Fixation duration usually lasts between 150 ms and 300 ms, depending on
the context. For instance, the duration differs when reading on paper (230 ms) compared
to on a screen (553 ms) or when watching a naturalistic scene on a computer (330 ms) [6].
Saccades are rapid eye movements that continuously scan the object to ensure accurate
perception, each taking about 30–120 ms [7]. When the eye-tracking system fails to track
gaze, a blink occurs.
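To make these definitions concrete, the sketch below labels raw gaze samples as fixations, saccades, or blinks with a simple velocity-threshold (I-VT) rule. This is an illustrative sketch, not the method used in this paper: the 30°/s threshold, the column layout, and the treatment of NaN samples as blinks are all assumptions.

```python
import numpy as np

def classify_events(x, y, t, velocity_threshold=30.0):
    """Label each gaze sample as fixation, saccade, or blink using a
    velocity-threshold (I-VT) rule. x, y are gaze angles in degrees,
    t is time in seconds; NaN coordinates are treated as tracking loss
    (blinks). The 30 deg/s threshold is a common but assumed value."""
    vx = np.gradient(x, t)                 # angular velocity components
    vy = np.gradient(y, t)
    speed = np.hypot(vx, vy)               # deg/s
    labels = np.where(speed > velocity_threshold, "saccade", "fixation")
    labels[np.isnan(x) | np.isnan(y)] = "blink"
    return labels

# Tiny synthetic trace at 60 Hz: a fixation, a rapid jump, tracking loss.
t = np.arange(8) / 60.0
x = np.array([1.0, 1.01, 1.02, 5.0, 5.01, np.nan, np.nan, 5.02])
y = np.zeros_like(x)
print(classify_events(x, y, t))
```

Samples adjacent to a blink inherit NaN velocities and would need extra cleanup in practice; the sketch keeps them labelled as fixations for brevity.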
This section provides a comprehensive review of previous study efforts that have
utilised eye-tracking technology to examine disparities in gaze patterns and attentional
mechanisms between individuals who have received a diagnosis of ASD and those who
have not been diagnosed with ASD. This study focuses on the utilisation of artificial
intelligence algorithms to diagnose autism through the analysis of gaze patterns. By
combining eye-tracking technology with AI algorithms, this can help in the early detection
of autism by analysing and classifying these gaze patterns (Ahmed and Jadhav, 2020;
Kollias et al., 2021 [8,9]).
Eye-tracking technology has played a pivotal role in discerning the unique gaze and
face recognition patterns exhibited by individuals with ASD. This subsection delves into
how individuals with ASD and TD children differ in their visual attention toward faces.
A study investigated whether children with ASD exhibit different face fixation patterns
compared to TD children when viewing various types of faces, as proposed by (Kang
et al., 2020a [10]). The study involved 77 children with low-functioning ASD and 80 TD
children, all between the ages of 3 and 6 years. A Tobii TX300 eye-tracking system was
used to collect data. The children sat 60 cm away from the screen and viewed a series of
random facial photos, including own-race familiar faces, own-race unfamiliar faces, and
other-race unfamiliar faces. The features were extracted using the K-means algorithm and
selected based on minimal redundancy and maximal relevance. The SVM classifier was
then used, with the highest accuracy of 72.50% achieved when selecting 32 features out of 64
from unfamiliar other-race faces. For own-race unfamiliar faces, the highest accuracy was
70.63% when selecting 18 features and 78.33% when selecting 48 features. The classification
AUC reached 0.89 when selecting 120 features. The machine learning analysis indicated
differences in the way children with ASD and TD processed own-race and other-race faces.
An approach to identify autism in children using their eye-movement patterns dur-
ing face scanning was proposed by (Liu et al., 2016 [11]). A dataset was collected from
29 children with ASD and 29 TD children aged 4–11 years, using a Tobii T60 eye tracker
with a sample rate of 60 Hz. During the stimuli procedure, the children were asked to
memorise six faces and were then tested by showing them 18 faces and asking if they had
seen that face before. The eye tracker recorded the children’s eye-scanning patterns on the
faces. The K-means algorithm was used to cluster the eye-tracking data according to fixa-
tion coordinates. A histogram was then used to represent the features, and SVM was used
for classification, resulting in an accuracy of 88.51%. The study found that eye-movement
patterns during face scanning can be used to discriminate between children with ASD and
TD children.
The study differentiation of ASD and TD was based on gaze fixation times as pro-
posed by (Wan et al., 2019 [5]). The study included 37 participants with ASD and 37 TD
individuals, all between the ages of 4 and 6 years. The researchers employed a portable
eye-tracking system, specifically the SMI RED250, to collect data. The participants were
placed in a dark, soundproof room and asked to view a 10 s silent video of a young
Asian woman speaking the English alphabet on a 22-inch widescreen LCD display. The
researchers used an SVM classifier, which yielded an accuracy of 85.1%. Analysis of the
results revealed that the ASD group had significantly shorter fixation periods in various
areas, including the eyes, mouth, nose, person, face, outer person, and body.
A validation of eye tracking as a tool for detecting autism was proposed by (Murias
et al., 2018 [12]). Their study involved 25 children with ASD between the ages of 24 and 72
months, and eye tracking was conducted using a Tobii TX300 eye tracker. The Tobii TX300
Eye Tracker is a product of Tobii AB (Stockholm, Sweden), a Swedish technology company
that develops and sells products for eye tracking and attention computing. The children
were seated on their parent’s lap while watching a 3 min video of an actor speaking in
a child-directed manner while four toys surrounded him. The researchers analysed the
children’s eye gaze on the actor, toys, face, eyes, and mouth. The results suggest that
fixations. The study found that there were significant differences between the two groups
for the mean fixation time in browsing tasks and the total fixation count and number of
transitions between items in synthesis tasks.
This subsection related to fixation time and visualising patterns within eye tracking
delves into the specific metrics and patterns that characterise how eyes engage with visual
stimuli. Carette et al. (2018 [18]) presented a method for transforming eye-tracking data
into a visual pattern. They collected data from 59 participants, 30 TD and 29 ASD, using
the SMI RED mobile eye tracker, which records eye movements at a 60 Hz frequency. Their
dataset contains 547 images: 328 belonged to the TD class and 219 to the ASD class. They
used these data to develop a binary classifier using a logistic regression model and achieved
an AUC of approximately 81.9%. Subsequent work on this dataset employed deep learning
and machine learning models, including random forests, SVM, logistic regression, and
naive Bayes, with AUCs ranging from 0.70 to 0.92. Unsupervised techniques such as
deep autoencoders and K-means clustering were proposed by Elbattah et al. (2019 [19]),
with clustering results ranging from 28% to 94%. A CNN-based model by Cilia et al.
(2021 [20]) achieved an accuracy of 90%. Transfer learning models like VGG-16, ResNet,
and DenseNet were evaluated, with VGG-16 having the highest AUC at 0.78 (Elbattah
et al., 2022 [21]). The collective work underscores the growing interest and success in
employing eye-tracking technology and various machine learning techniques to detect
autism, and highlights the need for larger sample sizes and dataset variations.
The eye movement was converted into text sequences using NLP as proposed by
(Elbattah et al., 2020 [22]). To achieve this, the authors employed a deep learning model
for processing sequence learning on raw eye-tracking data using various NLP techniques,
including sequence extraction, sequence segmentation, tokenisation, and one-hot encoding.
The authors utilised CNN and LSTM models. The LSTM model achieved 71% AUC, while
the CNN achieved an AUC of 84%.
The Gaze–Wasserstein approach for autism detection is presented by (Cho et al.,
2016 [23]). The data were collected using Tobii EyeX from 16 TD and 16 ASD children
aged between 2 and 10 years. During the experiment, the children were seated in front
of a screen and shown eight social and non-social stimuli scenes, each lasting for 5 s. The
study utilised the KNN classifier, and performance was measured using F-score metrics.
The overall classification scored 93.96%, with 91.74% for social stimuli scenes and 89.52%
for non-social stimuli scenes. The results suggest that using social stimuli scenes in the
Gaze–Wasserstein approach is more effective than non-social stimuli scenes.
The study employed machine learning to analyse EEG and eye-tracking (Kang et al.,
2020b [24]) data from children, focusing on their reactions to facial photos of different
races. Various features were analysed using a 128-channel system for EEG and the TX300
system for eye tracking. Feature selection was conducted using the minimum redundancy
maximum relevance method. The SVM classifier achieved 68% accuracy in EEG analysis,
identifying differences in power levels in children with autism spectrum disorder (ASD)
compared to typically developing children. Eye-tracking analysis achieved up to 75.89%
accuracy and revealed that children with ASD focused more on hair and clothing rather
than facial features when looking at faces of their own race.
In recent years, there have been substantial research advances in classifying and
identifying ASD using different machine learning algorithms based on features of people
with ASD, such as facial images and eye tracking [25–29]. Thabtah et al. [30] proposed a study that
collected a dataset from ASD newborns, children, and adults; the ASD screening was developed
based on the Q-CHAT and AQ-10 assessment instruments. Omar et al. [31] used a
methodology based on random forest (RF), classification and regression tree (CART), and
iterative random forest for detecting ASD. The system used AQ-10 and 250 real-world datasets with
the ID3 algorithm. Sharma et al. [32] proposed naive Bayes (NB), stochastic gradient descent (SGD),
KNN, RT, and K-star, in conjunction with the CFS-greedy stepwise feature selector.
Satu et al. [33] used a number of methods to detect ASD in order to determine the
distinguishing features that differentiate autism from typical development in individuals
aged 16 to 30 years. Erkan et al. [34] used the KNN, SVM, and RF algorithms
to assess the efficacy of each technique in diagnosing autism spectrum disorders (ASDs).
Akter et al. [35] used the SVM algorithm to demonstrate superior performance on toddler,
older-child, and adult datasets.
Despite the advancements in the amalgamation of eye-tracking methodologies with
artificial intelligence techniques, a discernible research lacuna persists. This underscores
the necessity for developing a model with superior performance capabilities to enhance
classification precision, thereby ensuring a reliable early diagnosis of autism. Table 1
displays the most important previous studies.
3.1. Dataset
In this study, a publicly available dataset named “Eye-Tracking Dataset to Support the
Research on Autism Spectrum Disorder” [36] has been used. It comprises raw statistical
eye-tracking records. The dataset was collected from 29 ASD and 30 TD
children, as shown in Table 2.
The dataset developers used an RED mobile eye tracker with a sampling rate of 60 Hz
for data collection. The eye tracker was connected to a 17-inch screen for stimulus
display. The dataset developers [36] followed standardised procedures, such as conducting
the experiment in a dedicated room. The participants sat 60 cm away from the screen to allow the eye
tracker to track their eye movements by reflecting infrared light. For this investigation,
both static and dynamic visual stimuli were used. The dynamic stimuli included
video scenarios specifically designed to engage children, including balloons
and cartoon characters. The static stimuli comprised various visual elements, such as
facial images, objects, cartoons, and other items intended to capture the participants’
attention. The experiment lasted around five minutes, during which the arrangement of
items was varied. The stimuli also included videos of a human presenter delivering a speech;
the primary objective of these videos was to direct the participants’ attention toward different
elements displayed on the screen. Through these videos, information regarding eye contact,
engagement level, and gaze focus can be obtained. This dataset was used to investigate
differences in the visual patterns of ASD and TD children, such as fixations, saccades, and
blink rate, to understand each subject’s visual attention [36]. After recording all sessions,
the dataset comprised approximately 2,165,607 rows, gathered from individuals diagnosed
with ASD and those with TD. Table 3 explains the features within the statistical dataset.
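As an illustration of how the released data can be inspected, the following pandas sketch loads the raw rows and checks their size and missingness. The file name is an assumption about how the figshare download is saved; the actual archive layout may differ.

```python
import pandas as pd

# Load the raw eye-tracking rows (file name assumed; the dataset itself
# is linked in the Data Availability Statement).
df = pd.read_csv("eye_tracking_dataset.csv")

print(df.shape)         # expected on the order of 2,165,607 rows
print(df.dtypes)        # the features described in Table 3
print(df.isna().sum())  # missing values, handled by forward filling below
```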
Missing values were handled using a forward-filling strategy, replacing each missing value
with the most recent non-missing value found in the associated column. Implementing this
methodology guaranteed the reliability and accuracy of the data analysis procedure
(McKinney, 2012 [38]).
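The forward-filling step described above maps to a single pandas call; a minimal, self-contained illustration (the column names are invented for the example):

```python
import numpy as np
import pandas as pd

# Each NaN is replaced by the most recent non-missing value above it in
# the same column, exactly as described in the text.
df = pd.DataFrame({"gaze_x": [0.41, np.nan, 0.43, np.nan],
                   "gaze_y": [0.20, 0.21, np.nan, 0.24]})
print(df.ffill())
```

NaNs at the very top of a column have no earlier value to copy and would remain; one assumed fallback is a subsequent `bfill()` for those few cells.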
3.2.5. Data Distribution
The input data feature distribution is the frequency or pattern of different values or
value ranges within each feature. It shows how feature values are distributed across the
entire range. Figure 3 shows a graphical representation of this distribution. Understanding
the feature distribution in input data can help with feature selection, scaling, and model
selection in many machine learning tasks. Visualising feature distributions can also reveal
data outliers that may need to be addressed before modelling.
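A quick way to produce such distribution plots, assuming the DataFrame `df` loaded earlier (bin count and figure size are arbitrary choices):

```python
import matplotlib.pyplot as plt

# One histogram per numeric feature to inspect ranges, skew, and
# candidate outliers before scaling and modelling.
df.select_dtypes("number").hist(bins=50, figsize=(14, 10))
plt.tight_layout()
plt.show()
```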
The relationship between two variables can be measured using a correlation coeffi-
cient. A heatmap is a valuable graphical depiction of the correlation coefficients between
two attributes within a dataset, rendering it a beneficial instrument for data analysis. The
heatmap effectively demonstrates the characteristics and orientations of the linkages among
nodes by employing a colour scheme. The heatmap depicted in Figure 4 illustrates the
presence of positive correlations through the use of darker colours, while brighter colours
visually represent negative correlations. This colour scheme enhances the visibility of a
comprehensive entity’s interrelated constituents. Utilising a heatmap facilitates the acqui-
sition of significant insights from a dataset by visually emphasising the interconnections
among different attributes.
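A heatmap like the one in Figure 4 can be produced from pairwise Pearson correlations; a minimal sketch assuming the DataFrame `df` from earlier (the colour map is an arbitrary choice, not the one used in the figure):

```python
import matplotlib.pyplot as plt
import seaborn as sns

corr = df.select_dtypes("number").corr()   # pairwise Pearson correlations
sns.heatmap(corr, cmap="coolwarm", center=0)
plt.title("Feature correlation heatmap")
plt.show()
```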
Class Name | Training Dataset before SMOTE | Training Dataset after SMOTE
ASD | 333,345 | 599,623
TD | 599,623 | 599,623
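The balancing step summarised in the table uses SMOTE [40]; below is a minimal, self-contained sketch of the same idea on synthetic data. The random seed, feature dimensions, and class sizes are placeholders, and in the study the resampling is applied to the training split only.

```python
import numpy as np
import pandas as pd
from imblearn.over_sampling import SMOTE

# Synthetic stand-in for an imbalanced training split.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y = np.array([0] * 700 + [1] * 300)    # 0 = majority class, 1 = minority

# SMOTE synthesises new minority samples until the classes are balanced.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X, y)
print(pd.Series(y_bal).value_counts())  # both classes now at 700
```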
As a result, more advanced variants such as long short-term memory (LSTM) and gated
recurrent units (GRUs) have been developed to tackle this challenge specifically.
The final dense layer contains 34 neurons, and the output from this layer is employed to forecast whether the subject
has ASD or TD, as depicted in Figure 6. The LSTM architecture is intended to manage
long-term dependencies in sequential data, which makes it suitable for eye-tracking data
analysis in autism detection tasks. The interplay between the input and hidden layers and
the use of gates in the LSTM cells enables intricate analysis of the input data and effective
feature extraction for prediction purposes.
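A Keras sketch of an LSTM classifier in this spirit is shown below, using the layer sizes listed in the LSTM parameter table later in the paper (1024 and 512 LSTM units, a 34-unit dense layer, sigmoid output). The input shape and the hidden dense activation are assumptions: how the eye-tracking rows are windowed into sequences is not restated here.

```python
import tensorflow as tf
from tensorflow.keras import layers

n_timesteps, n_features = 10, 20   # placeholder sequence shape (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_timesteps, n_features)),
    layers.LSTM(1024, return_sequences=True),  # first stacked LSTM layer
    layers.LSTM(512),                          # second LSTM layer
    layers.Dense(34, activation="relu"),       # 34-unit layer; activation assumed
    layers.Dense(1, activation="sigmoid"),     # binary ASD/TD prediction
])
model.summary()
```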
Gated recurrent units (GRUs) were designed to address the issue of vanishing and exploding gradients, which can make it challenging to
train deep networks using traditional RNNs. GRUs employ gates, similar to LSTMs, to
regulate the flow of information throughout the network. The GRU architecture comprises
two crucial gates: the update and reset gates. The update gate controls the utilisation of
the previous hidden state in the current hidden state, while the reset gate determines the
extent to which the previous hidden state should be forgotten. The following equations
describe a GRU:

Update gate: z_t = σ(W_z · [h_{t−1}, x_t] + b_z) (6)

Reset gate: r_t = σ(W_r · [h_{t−1}, x_t] + b_r) (7)

Candidate activation: h̃_t = tanh(W · [r_t ⊙ h_{t−1}, x_t] + b) (8)

Hidden state: h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t (9)

where x_t is the input at time t, h_{t−1} is the hidden state from the previous time step, σ
denotes the sigmoid function, ⊙ denotes element-wise multiplication, W_z, W_r, and W are
the weight matrices for the update gate, reset gate, and candidate activation, respectively,
and b_z, b_r, and b are the corresponding biases.
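To make the equations concrete, here is a single GRU step in NumPy following Equations (6)-(9); the dimensions and random weights are placeholders for a shape check only.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, Wz, Wr, W, bz, br, b):
    hx = np.concatenate([h_prev, x_t])              # [h_{t-1}, x_t]
    z = sigmoid(Wz @ hx + bz)                       # update gate, Eq. (6)
    r = sigmoid(Wr @ hx + br)                       # reset gate, Eq. (7)
    h_tilde = np.tanh(W @ np.concatenate([r * h_prev, x_t]) + b)  # Eq. (8)
    return (1 - z) * h_prev + z * h_tilde           # new hidden state, Eq. (9)

# Shape check with hidden size 4 and input size 3.
rng = np.random.default_rng(0)
H, D = 4, 3
Wz, Wr, W = (rng.standard_normal((H, H + D)) for _ in range(3))
bz, br, b = (np.zeros(H) for _ in range(3))
print(gru_step(rng.standard_normal(D), np.zeros(H), Wz, Wr, W, bz, br, b).shape)  # (4,)
```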
These gates play a pivotal role in regulating the flow of information within the GRU
network. Our research utilises GRUs to identify autism via eye-tracking data. The GRU
network, trained on the eye-tracking data, comprises a GRU layer containing 1024 units,
followed by a GRU layer comprising 512 units, and a dense layer containing 34 units.
Finally, the model delivers a binary classification, signifying whether the subject has ASD
or TD. The model architecture is shown in Figure 8.
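A Keras sketch matching this description (GRU 1024 → GRU 512 → Dense 34 → sigmoid); the input shape and hidden dense activation are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

n_timesteps, n_features = 10, 20   # placeholder sequence shape (assumed)

gru_model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_timesteps, n_features)),
    layers.GRU(1024, return_sequences=True),
    layers.GRU(512),
    layers.Dense(34, activation="relu"),    # activation assumed
    layers.Dense(1, activation="sigmoid"),  # binary ASD/TD decision
])
```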
A bidirectional LSTM (BiLSTM) processes the input sequence in both directions, capturing
relationships between the past and future states of the input sequence. In general terms, it
works as follows:

Forward pass: a forward LSTM computes hidden states h_t→ by reading the sequence from
the first time step to the last.

Backward pass: a backward LSTM computes hidden states h_t← by reading the sequence
from the last time step to the first.

For the final representation at time t, the forward and backward hidden states are typically
concatenated: h_t = [h_t→ ; h_t←].
Our research utilises a BiLSTM to identify autism via eye-tracking data. The BiLSTM
network, trained on the eye-tracking data, comprises a bidirectional layer containing
128 LSTM units, followed by an LSTM layer comprising 32 units. Finally, the model delivers
a binary classification, signifying whether the subject has ASD or TD. The model
architecture is shown in Figure 9.
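A corresponding Keras sketch (bidirectional LSTM of 128 units → LSTM 32 → sigmoid output); the input shape is again a placeholder assumption:

```python
import tensorflow as tf
from tensorflow.keras import layers

n_timesteps, n_features = 10, 20   # placeholder sequence shape (assumed)

bilstm_model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_timesteps, n_features)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),  # binary ASD/TD decision
])
```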
4. Experimental Results
This section outlines our study’s experimental setup, evaluation metrics, and model
performance.
4.4. Accuracy
The accuracy metric is a prevalent means of appraising the effectiveness of deep learning
models. It measures the model's predictive performance as the ratio of correct predictions
to all predictions made, as expressed in Formula (13):

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100 (13)
4.4.1. F1 Score
The F1 score represents the harmonic mean of the recall and precision metrics. When
evaluating the model's effectiveness, both precision and recall are considered. A high F1
score indicates a well-balanced trade-off between precision and recall, minimising both
false positive and false negative predictions. The F1 score is computed using Formula (14):

F1-score = 2 × (Precision × Recall) / (Precision + Recall) × 100 (14)
4.4.2. Sensitivity
Sensitivity relates to the true positive rate, signifying the proportion of actual positive
instances that a binary classification model correctly detects. Sensitivity is computed
using Formula (15):

Sensitivity = True Positives / (True Positives + False Negatives) (15)
4.4.3. Specificity
Specificity captures a binary classification model's capability to correctly detect negative
instances. It represents the proportion of actual negative instances that are correctly
identified. In simpler terms, this metric assesses the model's effectiveness in avoiding
false positives: cases that are genuinely negative but inaccurately labelled as positive. The
mathematical representation of specificity is as follows:

Specificity = True Negatives / (True Negatives + False Positives) (16)
4.4.4. Receiver Operating Characteristic (ROC) Curve
The receiver operating characteristic (ROC) curve plots the true positive rate (TPR) against
the false positive rate (FPR). The FPR characterises how often the model misclassifies
negative examples, while the TPR signifies the model's accuracy in identifying positive
examples. The ROC curve thus makes the trade-off between true positives and false
positives in the model apparent.
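All four metrics reduce to ratios of confusion-matrix counts; a small helper illustrating Formulas (13)-(16) (the function name is ours, and the F1 here treats ASD as the positive class):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, F1, sensitivity, and specificity per Formulas (13)-(16)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100          # Formula (13)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                                   # Formula (15)
    f1 = 2 * precision * recall / (precision + recall) * 100  # Formula (14)
    specificity = tn / (tn + fp) * 100                        # Formula (16)
    return accuracy, f1, recall * 100, specificity

# The ROC curve itself needs per-sample scores rather than counts; with
# scikit-learn, see roc_curve and roc_auc_score in sklearn.metrics.
```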
4.5. Results
4.5.1. Deep Learning Classification Results
This section provides the evaluation of the results produced from the deep learning
models used in the statistical eye-tracking dataset-based experimental study aiming at
detecting autism. LSTM, CNN-LSTM, GRU, and BiLSTM are all used as experimental
models in this investigation.
Parameters | Values
LSTM units | 64-unit first bidirectional LSTM layer
LSTM units | 32-unit bidirectional LSTM layer
Activation function | Sigmoid
Optimiser | RMSprop with learning rate: 0.001
Callbacks | Early stopping with patience = 5
Epochs | 200
Batch size | 1024
Validation split | 0.1
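The fitting call implied by this configuration, assuming `model` is one of the Keras sketches above and `X_train`/`y_train` the balanced training split; `restore_best_weights` is an assumed early-stopping option not stated in the table:

```python
import tensorflow as tf

model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
early_stop = tf.keras.callbacks.EarlyStopping(patience=5,
                                              restore_best_weights=True)
history = model.fit(X_train, y_train, epochs=200, batch_size=1024,
                    validation_split=0.1, callbacks=[early_stop])
```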
The confusion matrix for ASD classification using the BiLSTM model, as shown in
Figure 10, reveals 80,642 true positives, 2695 false positives, 5610 false negatives, and
144,296 true negatives.
The BiLSTM model was trained for 200 epochs and achieved a training accuracy of
97.72% and a testing accuracy of 96.44%, as shown in Figure 11. The model showcased a
sensitivity of 93.50% and a specificity of 98.17%. Additionally, with an F1 score of 97.20%,
the model achieved an AUC of 97%.
Figure 11. The training and testing accuracy and loss of BiLSTM model: (a) accuracy; (b) loss.
Parameters | Details
GRU | 1024 units
GRU | 512 units
Dense | 34 units
Activation function | Sigmoid
Optimiser | RMSprop with learning rate: 0.001
Callbacks | Early stopping, model checkpoint
Epochs | 200
Batch size | 1024
Validation split | 0.1
The confusion matrix for ASD classification using the GRU model, as shown in
Figure 12, reveals 80,959 true positives, 2378 false positives, 3473 false negatives, and
146,433 true negatives.
The GRU model was trained for 200 epochs with early stopping at 65 epochs. It
achieves a training accuracy of 98.02% and a testing accuracy of 97.49%, as shown in
Figure 13. The model demonstrates a sensitivity of 95.89%, a specificity of 98.40%, and an
F1 score of 98.04%. Additionally, the model achieves an AUC of 97%.
Figure 13. The training and testing accuracy and loss of GRU model.
Parameters | Details
Conv1D | Filters: 16, kernel size: 2, activation: ReLU, batch normalisation
Conv1D | Filters: 32, kernel size: 2, activation: ReLU, batch normalisation
LSTM | 1024 units
Recurrent dropout | 0.1
Dense | 512 units
Dense | 32 units
Activation function | Sigmoid
Optimiser | RMSprop with learning rate: 0.001
Callbacks | Early stopping with patience = 5
Training | Epochs: 200
Batch size | 1024
Validation split | 0.1
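A Keras sketch of the CNN-LSTM hybrid summarised in the table; the input shape and hidden dense activations are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

n_timesteps, n_features = 10, 20   # placeholder sequence shape (assumed)

cnn_lstm = tf.keras.Sequential([
    tf.keras.Input(shape=(n_timesteps, n_features)),
    layers.Conv1D(16, kernel_size=2, activation="relu"),
    layers.BatchNormalization(),
    layers.Conv1D(32, kernel_size=2, activation="relu"),
    layers.BatchNormalization(),
    layers.LSTM(1024, recurrent_dropout=0.1),
    layers.Dense(512, activation="relu"),   # activation assumed
    layers.Dense(32, activation="relu"),    # activation assumed
    layers.Dense(1, activation="sigmoid"),  # binary ASD/TD decision
])
```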
The confusion matrix for ASD classification using the CNN-LSTM model, as shown
in Figure 14, reveals 81,537 true positives, 1800 false positives, 3008 false negatives, and
146,898 true negatives.
The CNN-LSTM model was trained for 200 epochs, and early stopping took place
at 66 epochs. It attains a training accuracy of 98.41% and a testing accuracy of 97.94%,
as shown in Figure 15. The model showcases a sensitivity of 96.44% and a specificity of
98.79%. With an F1 score of 98.70%, the model proves its effectiveness in autism detection.
Additionally, the model achieves an AUC of 98%.
Figure 15. The training and testing accuracy and loss of CNN-LSTM model: (a) accuracy; (b) loss.
Parameters | Details
LSTM | 1024, 512 units
Dense | 34 units
Activation function | Sigmoid
Optimiser | RMSprop with learning rate: 0.001
Callbacks | Early stopping with patience = 5
Epochs | 200
Batch size | 1024
Validation split | 0.1
The confusion matrix for ASD classification using the LSTM model, as shown in
Figure 16, reveals 81,760 true positives, 1577 false positives, 2315 false negatives, and
147,591 true negatives.
The LSTM model underwent training for 200 epochs, and early stopping occurred at
141 epochs. It achieves a training accuracy of 98.99% and a testing accuracy of 98.33%, as
indicated in Figure 17. The model demonstrates a sensitivity of 97.25% and a specificity of
98.94%. With an F1 score of 98.70%, the model proves its effectiveness in autism detection.
Additionally, the model achieves an AUC of 98%.
Figure 17. The training and testing accuracy and loss of LSTM model: (a) accuracy; (b) loss.
5. Discussion
This research paper aims to enhance the diagnostic process of ASD by combining
eye-tracking data with deep learning algorithms. It investigates different eye-movement
patterns in individuals with ASD compared to TD individuals. This approach can enhance
ASD diagnosis accuracy and pave the way for early intervention programs that benefit
children with ASD. The experimental work investigated the effectiveness of deep learning
models, specifically LSTM, CNN-LSTM, GRU, and BiLSTM, in detecting autism using
a statistical eye-tracking dataset. The findings shed light on the performance of these
models in accurately classifying individuals with autism. Among the models, the LSTM
model achieved an impressive test accuracy of 98.33%. It correctly identified 81,760 of the
84,075 true ASD samples while producing only 1577 false positives. With a high sensitivity
of 97.25%, the model demonstrated its ability to detect
individuals with autism accurately. Moreover, it exhibited a specificity of 98.94%, effectively
identifying non-autistic individuals. The F1 score of 98.70% further emphasises the model’s
effectiveness in autism detection.
The deep learning models performed well on a statistical eye-tracking dataset for
autism detection as presented in Table 9 and Figure 18. Accuracy rates, sensitivities,
and specificities all point to their usefulness in identifying autistic people. These results
suggest that deep learning models may be useful in autism diagnosis. They could help
advance efforts towards better screening for autism and more timely treatments for those
on the spectrum.
Model Name | Test Accuracy (%) | TP | FP | FN | TN | Sensitivity (%) | Specificity (%) | AUC (%) | F1 Score (%)
BiLSTM | 96.44 | 80,642 | 2695 | 5610 | 144,296 | 93.50 | 98.17 | 97 | 97.20
GRU | 97.49 | 80,959 | 2378 | 3473 | 146,433 | 95.89 | 98.40 | 97 | 98.04
CNN-LSTM | 97.94 | 81,537 | 1800 | 3008 | 146,898 | 96.44 | 98.79 | 98 | 98.39
LSTM | 98.33 | 81,760 | 1577 | 2315 | 147,591 | 97.25 | 98.94 | 98 | 98.70
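As a consistency check, the accuracy, sensitivity, and specificity in the LSTM row can be re-derived from its confusion-matrix counts with Formulas (13), (15), and (16):

```python
tp, fp, fn, tn = 81_760, 1_577, 2_315, 147_591
accuracy = (tp + tn) / (tp + tn + fp + fn) * 100  # -> 98.33
sensitivity = tp / (tp + fn) * 100                # -> 97.25
specificity = tn / (tn + fp) * 100                # -> 98.94
print(f"{accuracy:.2f} {sensitivity:.2f} {specificity:.2f}")
```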
Table 10. Comparative analysis of our classification models with existing models.

Authors | Participants | Type of Study/Stimuli | Preprocessing | Method | Results
Liu et al. (2016) [11] | 29 ASD and 29 TD children (4–11 years) | Memorisation and recognition of faces | K-means clustering, histogram representation | SVM | 88.51%
Wan et al. (2019) [5] | 37 ASD and 37 TD individuals (4–6 years) | Video of a woman speaking the English alphabet | — | SVM | 85.1%
Kong et al. (2022) [13] | 55 ASD and 40 TD toddlers (1–3 years), 37 ASD and 41 TD preschoolers (3–5 years) | Video of a person moving their mouth | — | SVM | 80% classification rate for toddlers, 71% for preschoolers
Yaneva et al. (2018) [14] | 18 TD and 18 ASD individuals (<18 years) | Eye-gaze patterns while browsing and searching web pages (e.g., Yahoo and Apple) | Gaze features: time to first view, time viewed, fixations, revisits | Logistic regression (gaze and non-gaze features) | Search task classifier: 75% accuracy; browse task classifier: 71% accuracy
Elbattah et al. (2020) [22] | 30 TD and 29 ASD at age ≈3–13 years | Static and dynamic natural scenes | NLP techniques: extraction of sequences, segmentation of sequences, tokenisation, and one-hot encoding | CNN and LSTM | CNN 84% AUC and LSTM 71% AUC
Our proposed model | 30 TD and 29 ASD at age ≈3–13 years | — | Data preprocessing, feature selection | LSTM | Testing accuracy of 98%
6. Conclusions
In the dynamic tapestry of contemporary ASD research, this study firmly posits
itself as a beacon of innovative methodologies and impactful outcomes. Navigating the
intricate labyrinth of ASD diagnosis, the research underscores the critical imperative of early
detection, a tenet foundational to effective intervention strategies. At the methodological
core lies our deployment of deep learning techniques, uniquely integrating CNN and RNN
with an eye-tracking dataset. The performance metrics offer a testament to this integration’s
success. Specifically, the BiLSTM yielded an accuracy of 96.44%, the GRU achieved 97.49%,
the CNN-LSTM hybrid model secured 97.94%, and the LSTM model notably excelled with
an accuracy of 98.33%.
Upon a systematic scrutiny of extant literature, it becomes unequivocally evident that
our proposed model stands unparalleled in its efficacy. This prowess can be attributed to
our meticulous data preprocessing techniques and the discerning selection of features. It is
worth emphasising that the judicious feature selection played a pivotal role in accentuating
the distinctions between individuals with autism and their neurotypical counterparts,
leading our LSTM model to realise a remarkable 98% accuracy. This research converges
rigorous scientific exploration with the overarching goal of compassionate care for individ-
uals with ASD. It reiterates the profound significance of synergising proactive diagnosis,
community engagement, and robust advocacy. As a beacon for future endeavours, this
study illuminates a path where, through holistic approaches, children with ASD can truly
realise their innate potential amidst the multifaceted challenges presented by autism.
There is a definite path forward for future research to include a larger and more
diverse sample, drawing from a larger population of people with ASD and TD individuals.
Increasing the size of the sample pool could help researchers spot more patterns and details
in the data. Importantly, a larger sample size would strengthen the statistical validity of the
results, increasing the breadth with which they can be applied to a wider population.
Author Contributions: Conceptualization, Z.A.T.A., E.A. and T.H.H.A.; methodology Z.A.T.A., E.A.
and T.H.H.A.; software, Z.A.T.A., E.A. and T.H.H.A.; validation Z.A.T.A., E.A. and T.H.H.A.; formal
analysis, M.E.J., P.J. and M.R.M.O.; investigation, M.E.J., P.J. and M.R.M.O.; resources, M.E.J., P.J. and
M.R.M.O.; data curation, M.E.J., P.J. and M.R.M.O.; writing—original draft preparation, M.E.J., P.J.
and M.R.M.O.; writing—review and editing, Z.A.T.A., E.A. and T.H.H.A.; visualization, M.E.J., P.J.
and M.R.M.O.; project administration, Z.A.T.A., E.A. and T.H.H.A.; funding acquisition, Z.A.T.A.,
E.A. and T.H.H.A. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the Deanship of Scientific Research, Vice President for
Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No.4714].
Data Availability Statement: The data presented in this study are available here: https://figshare.com/articles/dataset/Eye_tracking_Dataset_to_Support_the_Research_on_Autism_Spectrum_Disorder/20113592 (accessed on 22 August 2023).
References
1. Maenner, M.J.; Shaw, K.A.; Bakian, A.V.; Bilder, D.A.; Durkin, M.S.; Esler, A.; Furnier, S.M.; Hallas, L.; Hall-Lande, J.;
Hudson, A.; et al. Prevalence and Characteristics of Autism Spectrum Disorder among Children Aged 8 Years—Autism and
Developmental Disabilities Monitoring Network, 11 Sites, United States, 2020. MMWR Surveill. Summ. 2023, 72, 1. [CrossRef]
[PubMed]
2. Øien, R.A.; Vivanti, G.; Robins, D.L. Editorial SI: Early Identification in Autism Spectrum Disorders: The Present and Future, and
Advances in Early Identification. J. Autism Dev. Disord. 2021, 51, 763–768. [CrossRef]
3. Salgado-Cacho, J.M.; Moreno-Jiménez, M.P.; Diego-Otero, Y. Detection of Early Warning Signs in Autism Spectrum Disorders: A
Systematic Review. Children 2021, 8, 164. [CrossRef]
4. Zwaigenbaum, L.; Brian, J.A.; Ip, A. Early Detection for Autism Spectrum Disorder in Young Children. Paediatr. Child Health 2019,
24, 424–432. [CrossRef] [PubMed]
5. Wan, G.; Kong, X.; Sun, B.; Yu, S.; Tu, Y.; Park, J.; Lang, C.; Koh, M.; Wei, Z.; Feng, Z.; et al. Applying Eye Tracking to Identify
Autism Spectrum Disorder in Children. J. Autism Dev. Disord. 2019, 49, 209–215. [CrossRef]
6. Galley, N.; Betz, D.; Biniossek, C. Fixation Durations—Why Are They So Highly Variable? Das Ende von Rational Choice? Zur
Leistungsfähigkeit Der Rational-Choice-Theorie 2015, 93, 1–26.
7. MacKenzie, I.S.; Zhang, X. Eye Typing Using Word and Letter Prediction and a Fixation Algorithm. In Proceedings of the 2008
Symposium on Eye Tracking Research & Applications, Savannah, GA, USA, 26–28 March 2008; pp. 55–58.
8. Ahmed, Z.A.T.; Jadhav, M.E. A Review of Early Detection of Autism Based on Eye-Tracking and Sensing Technology. In
Proceedings of the 2020 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 26–28
February 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 160–166.
9. Kollias, K.-F.; Syriopoulou-Delli, C.K.; Sarigiannidis, P.; Fragulis, G.F. The Contribution of Machine Learning and Eye-Tracking
Technology in Autism Spectrum Disorder Research: A Systematic Review. Electronics 2021, 10, 2982. [CrossRef]
10. Kang, J.; Han, X.; Hu, J.-F.; Feng, H.; Li, X. The Study of the Differences between Low-Functioning Autistic Children and Typically
Developing Children in the Processing of the Own-Race and Other-Race Faces by the Machine Learning Approach. J. Clin.
Neurosci. 2020, 81, 54–60. [CrossRef] [PubMed]
11. Liu, W.; Li, M.; Yi, L. Identifying Children with Autism Spectrum Disorder Based on Their Face Processing Abnormality: A
Machine Learning Framework. Autism Res. 2016, 9, 888–898. [CrossRef] [PubMed]
12. Murias, M.; Major, S.; Davlantis, K.; Franz, L.; Harris, A.; Rardin, B.; Sabatos-DeVito, M.; Dawson, G. Validation of Eye-Tracking
Measures of Social Attention as a Potential Biomarker for Autism Clinical Trials. Autism Res. 2018, 11, 166–174. [CrossRef]
13. Kong, X.-J.; Wei, Z.; Sun, B.; Tu, Y.; Huang, Y.; Cheng, M.; Yu, S.; Wilson, G.; Park, J.; Feng, Z.; et al. Different Eye Tracking Patterns
in Autism Spectrum Disorder in Toddler and Preschool Children. Front. Psychiatry 2022, 13, 899521. [CrossRef] [PubMed]
14. Yaneva, V.; Ha, L.A.; Eraslan, S.; Yesilada, Y.; Mitkov, R. Detecting Autism Based on Eye-Tracking Data from Web Searching Tasks.
In Proceedings of the 15th International Web for All Conference, Lyon, France, 23–25 April 2018; pp. 1–10.
15. Eraslan, S.; Yaneva, V.; Yesilada, Y.; Harper, S. Web Users with Autism: Eye Tracking Evidence for Differences. Behav. Inf. Technol.
2019, 38, 678–700. [CrossRef]
16. Yaneva, V.; Temnikova, I.; Mitkov, R. A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults. In
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portorož, Slovenia, 23–28
May 2016; pp. 480–487.
17. Eraslan, S.; Yesilada, Y.; Yaneva, V.; Ha, L.A. ‘Keep It Simple!’: An Eye-Tracking Study for Exploring Complexity and Distin-
guishability of Web Pages for People with Autism. Univers Access Inf. Soc. 2021, 20, 69–84. [CrossRef]
18. Carette, R.; Elbattah, M.; Dequen, G.; Guerin, J.-L.; Cilia, F. Visualization of Eye-Tracking Patterns in Autism Spectrum Disorder:
Method and Dataset. In Proceedings of the 2018 Thirteenth International Conference on Digital Information Management
(ICDIM), Berlin, Germany, 24–26 September 2018.
19. Elbattah, M.; Carette, R.; Dequen, G.; Guérin, J.L.; Cilia, F. Learning clusters in autism spectrum disorder: Image-based
clustering of eye-tracking scanpaths with deep autoencoder. In Proceedings of the 2019 41st Annual international conference of
the IEEE engineering in medicine and biology society (EMBC), Berlin, Germany, 23–27 July 2019; IEEE: Piscataway, NJ, USA, 2019;
pp. 1417–1420.
20. Cilia, F.; Carette, R.; Elbattah, M.; Dequen, G.; Guérin, J.L.; Bosche, J.; Vandromme, L.; Le Driant, B. Computer-aided screening of autism spectrum disorder: Eye-tracking study using data visualization and deep learning. JMIR Hum. Factors 2021, 8, e27706. [CrossRef]
21. Elbattah, M.; Guérin, J.-L.; Carette, R.; Cilia, F.; Dequen, G. Vision-based approach for autism diagnosis using transfer learning and eye-tracking. In Proceedings of the HEALTHINF 2022: 15th International Conference on Health Informatics, Online, 9–11 February 2022; pp. 256–263.
22. Elbattah, M.; Guérin, J.-L.; Carette, R.; Cilia, F.; Dequen, G. NLP-based approach to detect autism spectrum disorder in saccadic eye movement. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, ACT, Australia, 1–4 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1581–1587.
23. Cho, K.W.; Lin, F.; Song, C.; Xu, X.; Hartley-McAndrew, M.; Doody, K.R.; Xu, W. Gaze-Wasserstein: A quantitative screening
approach to autism spectrum disorders. In Proceedings of the 2016 IEEE Wireless Health (WH), Bethesda, MD, USA, 25–27
October 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–8.
24. Kang, J.; Han, X.; Song, J.; Niu, Z.; Li, X. The identification of children with autism spectrum disorder by SVM approach on EEG
and eye-tracking data. Comput. Biol. Med. 2020, 120, 103722. [CrossRef] [PubMed]
25. Satu, M.S.; Azad, M.S.; Haque, M.F.; Imtiaz, S.K.; Akter, T.; Barua, L.; Rashid, M.; Soron, T.R.; Al Mamun, K.A. Prottoy: A Smart
Phone Based Mobile Application to Detect Autism of Children in Bangladesh. In Proceedings of the 2019 4th International
Conference on Electrical Information and Communication Technology (EICT), Khulna, Bangladesh, 17–19 December 2019;
pp. 1–6.
26. Akter, T.; Ali, M.H.; Khan, M.I.; Satu, M.S.; Moni, M.A. Machine Learning Model to Predict Autism Investigating Eye-Tracking
Dataset. In Proceedings of the 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques
(ICREST), Online, 5–7 January 2021; pp. 383–387.
27. Akter, T.; Ali, M.H.; Khan, M.I.; Satu, M.S.; Uddin, M.; Alyami, S.A.; Ali, S.; Azad, A.; Moni, M.A. Improved Transfer-Learning-
Based Facial Recognition Framework to Detect Autistic Children at an Early Stage. Brain Sci. 2021, 11, 734. [CrossRef] [PubMed]
28. Alkahtani, H.; Aldhyani, T.H.H.; Alzahrani, M.Y. Deep Learning Algorithms to Identify Autism Spectrum Disorder in Children-
Based Facial Landmarks. Appl. Sci. 2023, 13, 4855. [CrossRef]
29. Alkahtani, H.; Ahmed, Z.A.T.; Aldhyani, T.H.H.; Jadhav, M.E.; Alqarni, A.A. Deep Learning Algorithms for Behavioral Analysis
in Diagnosing Neurodevelopmental Disorders. Mathematics 2023, 11, 4208. [CrossRef]
30. Thabtah, F.; Kamalov, F.; Rajab, K. A New Computational Intelligence Approach to Detect Autistic Features for Autism Screening.
Int. J. Med. Inform. 2018, 117, 112–124. [CrossRef]
31. Omar, K.S.; Mondal, P.; Khan, N.S.; Rizvi, M.R.K.; Islam, M.N. A Machine Learning Approach to Predict Autism Spectrum
Disorder. In Proceedings of the 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE),
Bazar, Bangladesh, 7–9 February 2019; pp. 1–6.
32. Sharma, M. Improved Autistic Spectrum Disorder Estimation Using Cfs Subset with Greedy Stepwise Feature Selection Technique.
Int. J. Inf. Technol. 2019, 14, 1251–1261. [CrossRef]
33. Satu, M.S.; Sathi, F.F.; Arifen, M.S.; Ali, M.H.; Moni, M.A. Early Detection of Autism by Extracting Features: A Case Study in Bangladesh. In Proceedings of the 2019 International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh, 10–12 January 2019; pp. 400–405.
34. Erkan, U.; Thanh, D.N. Autism Spectrum Disorder Detection with Machine Learning Methods. Curr. Psychiatry Res. Rev. Former.
Curr. Psychiatry Rev. 2019, 15, 297–308.
35. Akter, T.; Satu, M.S.; Khan, M.I.; Ali, M.H.; Uddin, S.; Lio, P.; Quinn, J.M.; Moni, M.A. Machine Learning-Based Models for Early
Stage Detection of Autism Spectrum Disorders. IEEE Access 2019, 7, 166509–166527. [CrossRef]
36. Cilia, F.; Carette, R.; Elbattah, M.; Guérin, J.-L.; Dequen, G. Eye-tracking dataset to support the research on autism spectrum
disorder. Res. Sq. 2022. [CrossRef]
37. Dasu, T.; Johnson, T. Exploratory Data Mining and Data Cleaning; John Wiley & Sons: Hoboken, NJ, USA, 2003.
38. McKinney, W. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython; O’Reilly Media, Inc.: Sebastopol, CA,
USA, 2012.
39. Buitinck, L.; Louppe, G.; Blondel, M.; Pedregosa, F.; Mueller, A.; Grisel, O.; Niculae, V.; Prettenhofer, P.; Gramfort, A.; Grobler, J.; et al. API design for machine learning software: Experiences from the scikit-learn project. arXiv 2013, arXiv:1309.0238.
40. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [CrossRef]
41. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [CrossRef] [PubMed]
42. Graves, A. Generating sequences with recurrent neural networks. arXiv 2013, arXiv:1308.0850.
43. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef]
44. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv
2014, arXiv:1412.3555.
45. Zhou, P.; Shi, W.; Tian, J.; Qi, Z.; Li, B.; Hao, H.; Xu, B. Attention-based bidirectional long short-term memory networks for relation
classification. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12
August 2016; Volume 2, Short Papers. pp. 207–212.
46. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
47. Hossin, M.; Sulaiman, M.N. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manag.
Process 2015, 5, 1.