
September 2024

A Comprehensive Eye Tracking Based Method to Provide Visual Features of
Children with Autism Spectrum Disorder
Thi Duyen Ngo a,1 , Thi Quynh Hoa Nguyen a , Duc Duy Le a , Dang Khoa Ta a ,
Thi Cam Huong Nguyen b , Nu Tam An Nguyen b and Thanh Ha Le a
a Human-Machine Interaction Laboratory, Faculty of Information Technology,
University of Engineering and Technology, Vietnam National University, Hanoi,
Vietnam
b Faculty of Special Education, Hanoi National University of Education

Abstract.
BACKGROUND: Atypical visual processing and social attention patterns are
commonly observed in children with Autism Spectrum Disorder (ASD), making
the application of eye-tracking technology an area of growing interest in the re-
search community. However, most existing studies focus on isolated aspects within
the fields of special education or psychology, without any technical approaches to
identify these visual features.
OBJECTIVE: The purpose of this paper is to develop an eye-tracking method that
provides detailed visual features of children with ASD.
METHODS: We designed the eye-tracking-based method around the requirements of
applying eye-tracking technology across various aspects of supporting children with
ASD, and conducted an experiment with 29 typically developing children and 26
children with ASD.
RESULTS: The results demonstrate differences in visual processing and attention
between children with ASD and their typically developing peers when viewing
stimuli with and without social elements; special education experts evaluated these
differences as beneficial in supporting children with ASD. Additionally, an initial
pre-screening of children with ASD using the extracted statistical eye movement
features with an SVM model achieved an accuracy of 90.91%.
CONCLUSIONS: This method can be utilized by experts to analyze, support diag-
nosis, adjust intervention strategies, and evaluate the effectiveness of interventions
for children with ASD, thereby swiftly identifying and addressing their challenges,
enhancing their learning abilities, and improving their social integration as well as
quality of life.
Keywords. Autism Spectrum Disorder, eye-tracking technology, visual features,
classification

1 Corresponding Author: Thi Duyen Ngo, Human-Machine Interaction Laboratory, Faculty of Information
Technology, University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam. E-mail:
[email protected].

1. Introduction

Autism Spectrum Disorder (ASD) is an early-onset neurodevelopmental disorder char-
acterized by significant impairments in social interaction and communication, as well
as repetitive behaviors and interests [1]. In 2020, the Centers for Disease Control and
Prevention (CDC) estimated that the prevalence of ASD children among 8-year-olds in
the United States was 1 in 36, while globally, the prevalence was reported to be around
1 in 100 children [1, 2]. ASD typically manifests within the first three years of a child’s
life, with the earliest indicators being challenges in using language for communication.
By preschool age, deficits in social skills and repetitive or overly restrictive behaviors
become more pronounced. Children with ASD often do not seek out others when happy,
show or point to objects of interest, or call their parents by name [3]. This not only sig-
nificantly impacts the quality of life and learning process of the affected children but also
subjects their families to social stigma and underestimation [4]. Therefore, it is crucial
to have effective assessments to detect ASD early, along with methods to assist and in-
tervene in their communication and integration into society, with the aim of enhancing
their quality of life.
Children with ASD are considered to have a greater tendency to process visual in-
formation. They are frequently captivated by initial images when encountering objects
and are particularly attracted to those with vibrant colors or movements. Their perceptual
and observational processes are significantly influenced by attentional mechanisms [5].
Studies have indicated that, in comparison to children with typical development (TD),
those with ASD frequently exhibit atypical visual processing and social attention pat-
terns. Unlike their age-matched TD peers, children with ASD tend to avoid eye contact
or gazing at faces, which is evident in their reduced time spent looking at eyes and faces
and their tendency to focus on irrelevant objects [6, 7, 8]. Von Hofsten et al. [9] suggested
that children with ASD shift their attention more frequently during conversations than
when observing turn-taking objects. Their findings indicated a heightened tendency for
attention shifts in ASD children during social interactions. Thus, analyzing gaze patterns
can provide insights into the atypical behaviors associated with children with ASD and
help differentiate them from TD children.
In recent years, Eye Tracking (ET), a non-invasive and convenient measurement
tool, has garnered significant attention from researchers worldwide in the study and sup-
port of children with ASD. ET analyzes human behavior by tracking visual attention to
process various types of information from visual stimuli [10]. Published studies have em-
ployed ET to evaluate visual perception and social attention characteristics [15, 17, 18],
identify emotional recognition features [16, 19, 20], and investigate learning abilities in
children with ASD [21]. For instance, Hosozawa et al. [22] have identified that when
children with ASD viewed short video clips from children’s films or TV programs, they
tended to avert their gaze from the actors prematurely during speech segments and over-
all focused less on faces than TD children. Leveraging these specific perceptual features,
several studies have focused on diagnosing children with ASD using machine learning
techniques [23, 24, 25, 26]. Furthermore, studies have utilized eye-tracking technology
to support interventions [24, 25, 26, 27] and assess the effectiveness of interventions in
ASD children [28, 29]. However, most of these studies address isolated issues within the
fields of special education or psychology. To the best of our knowledge, there is a lack of
comprehensive technological methods focused on developing an integrated solution for
the automatic collection, preprocessing, and extraction of visual features. Such informa-
tion is crucial for addressing various tasks, including the diagnosis of ASD, interventions
for children with ASD, and the evaluation of these interventions.
In this paper, we have proposed a comprehensive eye-tracking-based method aimed
at providing visual features of children with ASD. This method supports various appli-
cations aiming at supporting ASD children, including detection, severity classification,
interventions, and the evaluation of intervention effectiveness. Furthermore, these ex-
tracted visual features and their visualizations can assist experts in diagnosing and de-
signing more tailored intervention strategies for each child, providing recommendations
to address existing challenges, teach essential skills, and enhance opportunities for social
integration. The experimental findings demonstrate that the proposed method effectively
distinguishes differences in visual processing between children with ASD and typically
developing (TD) children when exposed to social and non-social visual stimuli, corrobo-
rating observations made in clinical practice. These differences, validated by special ed-
ucation experts, are recognized as valuable in supporting interventions for children with
ASD. Furthermore, by utilizing the extracted statistical eye movement features as input
for a machine learning algorithm, the results reveal the method’s significant potential for
the automated pre-screening of children with ASD.
The remainder of this paper is structured as follows: Section 2 reviews related pre-
vious research. Section 3 illustrates the proposed methodology for providing visual fea-
tures in children with ASD. Section 4 details the experimental procedures and presents
the results. Finally, Section 5 provides the conclusion and discusses potential directions
for future research.

2. Related works

In recent years, eye-tracking technology has proven effective in detecting atypical gaze
behavior related to social interactions in children with ASD. An individual’s interest and
attention to an object can be determined using various measures: the number of fixations,
the average duration of eye fixations, the amount of time spent on a visual stimulus, and
the ability to shift gaze; where fixation and saccade analysis has been applied in most
studies [11, 12, 13]. A fixation refers to a period during which our visual gaze remains
at a specific location, whereas a saccade describes the rapid eye movements between
two successive fixation points [14]. Some studies have applied eye-tracking technology
to analyze attentional patterns, cognitive development, learning abilities, and social in-
teractions, revealing a reduced preference for social stimuli in ASD children compared
to their TD peers. Sasson et al. [15] found that children with ASD showed a greater
decrease in fixation time and sustained fixation on faces when objects of circumscribed
interests (e.g., trains) were present, indicating a broad impact on several aspects of social
attention. Differences in visual attention were also evident, as TD children showed a sig-
nificant increase in both fixation counts and fixation duration on the eyes, whereas chil-
dren with ASD exhibited a significant increase in fixation duration on the mouth [17, 18].
In addition to findings consistent with these studies, Julia Vacas et al. [16] discovered
that children with ASD exhibited heightened emotional sensitivity, demonstrating atyp-
ical visual orientation towards objects when these objects competed with neutral faces.
They tend to identify positive emotions, such as happiness, more easily than negative
emotions, such as anger [19, 20]. In learning contexts, Thompson et al. [21] discovered
that ASD children engaged with e-books for only half the time they were displayed, with
half of that time focused on salient stimuli, and showed slightly better attention to print
when text was both read aloud and highlighted compared to when it was only presented
or read aloud. These findings underscore the potential of eye-tracking technology for the
detection of early signs and the assessment of ASD.
Based on the identification of distinctive characteristics in visual processing and
social attention in children with ASD, several studies have focused on applying eye-
tracking technology to support the diagnosis of this condition. Guobin Wan et al. [23]
showed a 10-second video of a woman speaking to a group of children aged 4–6 and used
fixation duration on the mouth and body, along with Support Vector Machine (SVM)
analysis, to distinguish between children with ASD and TD children, achieving an accu-
racy of 85.1%. Jessica S. Oliveira et al. [24] proposed a computational method that in-
tegrates concepts of Visual Attention Models (VAM), image processing techniques, and
artificial intelligence to develop a model for supporting the diagnosis of children with
ASD, utilizing ET data. Jiannan Kang et al. [25] combined Electroencephalogram and
ET data while ASD children viewed own-race and other-race stranger faces, using an
SVM classification model to achieve an accuracy of 85.44%. Ibrahim Abdulrab Ahmed
et al. [26] used fixations and saccades as ET input features, combining machine learning
and deep learning techniques to diagnose ASD in children, achieving a high accuracy of
99.8%.
Intervention is typically regarded as a process designed to support individuals facing
challenges in learning, behavior, emotional, and social domains due to deficiencies in
specific skills. Eye-tracking can be a valuable tool for designing plans to enhance learn-
ing in children with ASD, making the intervention process more effective. Fabienne Giu-
liani et al. [27] combined eye-tracking with the TEACCH intervention method to support
two adolescents with ASD, aged 14 and 16. ET data were used to assess visual charac-
teristics and make tailored adjustments to their intervention programs. Quan Wang et al.
[28] implemented gaze-contingent adaptive cueing to guide children with ASD towards
typical looking patterns of an actress in videos, finding that this approach effectively im-
proved their attention to social faces on screen. Additionally, some studies have proposed
combining eye-tracking systems with virtual reality (VR) to enhance communication,
joint attention, and learning in children with ASD [29, 30]. Eye-tracking technology has
the potential to offer an objective assessment of the effectiveness of intervention meth-
ods and to test hypotheses related to early intervention theories for children with ASD.
Kim et al. [31] conducted a quality review of technology-assisted reading interventions
for ASD students using eye-tracking technology, finding that these technologies can ben-
efit these students in learning various reading skills, such as word recognition through
images, and vocabulary comprehension. Additionally, Trembath et al. [32] investigated
the hypothesis that children with ASD learn more effectively when responding to visual
instructions (e.g., images) compared to verbal instructions.
However, most of these studies focus on addressing individual issues in psychology
and special education. Typically, they offer visual information that specialists can use to
propose specific solutions for supporting individuals with ASD without providing clear
technical and technological descriptions. In several countries, during autism interven-
tion, teachers and specialists need extensive visual information about the children, but
this information is mainly gathered through their observations. As a result, the outcomes
depend on the observers’ skills and experience, leading to subjective, potentially inaccu-
rate, or incomplete results. Furthermore, to the best of our knowledge, there has been lit-
tle research aimed at developing a comprehensive solution for the automatic collection,
preprocessing, and extraction of visual features. Such information could address various
issues, including autism diagnosis, intervention, and intervention assessment.

3. Methodology

3.1. Method overview

In this paper, we have proposed an eye-tracking-based method designed to provide visual
features of children with ASD, with the aim of supporting various applications such as
detection, severity classification, interventions, and evaluation of intervention effective-
ness. Based on the analysis of applying eye-tracking technology across various aspects
to support children with ASD, it is evident that this method must meet three essential cri-
teria: (i) capturing eye movements of children with ASD, (ii) processing and calculating
visual features from raw ET data and mapping them to specific Areas of Interest (AOIs),
and (iii) visually presenting this information effectively.
To address the first criterion, capturing eye movements, previous studies have em-
ployed external eye trackers, where participants view visual stimuli such as images or
videos displayed on a screen [6, 15]. The raw eye movement data collected is then an-
alyzed to extract visual features, with fixation and saccade being the most commonly
used metrics. These features are favored because they have proven to be sufficient for
cognitive and attention research, negating the necessity of analyzing the entire spectrum
of eye movement data [40]. For the second criterion, processing and mapping visual fea-
tures to AOIs, various methods are utilized to calculate fixation and saccade features.
Spatial characteristics are derived using velocity-based, dispersion-based, and area-based
attributes, while temporal characteristics are calculated based on duration sensitivity and
algorithmic adaptation. Salvucci and Goldberg [36] conducted a comparative evalua-
tion of fixation identification algorithms, revealing that velocity-based and dispersion-
based algorithms yield similar performance, whereas area-based algorithms are generally
more restrictive. Crucially, the mapping of these eye movement features to specific AOIs
within visual stimuli is paramount, as AOI analysis provides semantically rich metrics
that are especially valuable for research focused on attentional processes [14]. Nonethe-
less, prior research has predominantly employed a limited set of stimuli, such as faces
and socially relevant images, for research purposes, allowing for manual AOI annotation
in such contexts. However, in the context of intervention, intervention assessment, and
personalization, it is essential to develop a method capable of accurately handling large
and varied sets of visual stimuli through automated or semi-automated processes. The
third criterion, visually presenting the processed information, is particularly critical for
supporting interventions and assessments for children with ASD. It is essential to ensure
that the visualized data is not only accessible but also reduces the workload for teach-
ers and experts, facilitating timely detection and appropriate intervention. Heatmaps and
scanpaths are the most widely used visualization techniques in eye-tracking research, as
they effectively capture and display the gaze patterns and focus areas of children on the
visual stimuli [33].

As illustrated in Figure 1, the proposed method comprises the following compo-
nents: an eye tracker device for capturing ET signals; a monitor for displaying visual
stimuli (e.g., images, videos); a visual stimuli integration module; an eye movements
recording module; a data preprocessing module; an AOI recognition module; a feature
extraction module; and a visualization module. Specifically, when using the method, the
child views the monitor while the visual stimuli integration module displays content to
engage their visual information processing. The eye movement recording module cap-
tures ET signals from the eye tracker, which are then processed by the data preprocessing
module and forwarded to the feature extraction module. The AOI recognition module
identifies object regions within visual stimuli and transmits this information to the fea-
ture extraction module. Subsequently, the feature extraction module processes this data
to derive insights into the child’s visual information processing patterns. The extracted
data is then visualized through various formats, such as heat maps and scan paths, which
can be saved as image or video files or displayed on an auxiliary screen. Additionally,
these statistical eye movement features can serve as input for machine learning classifiers
within the classification module, facilitating early pre-screening of children with ASD.
This integrated approach enables experts and educators to gain critical insights for the
analysis, classification, and intervention in the treatment of children with ASD.
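As a rough illustration of the classification module, the sketch below trains an SVM on statistical eye movement feature vectors. The data here are synthetic, and the number of features, the RBF kernel, and all parameter values are assumptions for illustration, not the paper's configuration; only the group sizes (26 ASD, 29 TD) follow the experiment described later.

```python
# Hedged sketch of the classification module: an SVM over statistical eye
# movement feature vectors. Feature values, dimensionality, and kernel
# settings below are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_asd, n_td, n_features = 26, 29, 12
asd = rng.normal(0.4, 0.15, (n_asd, n_features))   # synthetic ASD features
td = rng.normal(0.7, 0.15, (n_td, n_features))     # synthetic TD features
X = np.vstack([asd, td])
y = np.array([1] * n_asd + [0] * n_td)             # 1 = ASD, 0 = TD

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)
predictions = clf.predict(X)
```

In practice the feature vectors would come from the feature extraction module, and accuracy would be estimated with held-out data rather than predictions on the training set.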

Figure 1. Method overview diagram

3.2. Visual stimuli integration

The visual stimuli integration module receives input in the form of a list of visual stimu-
lus images provided by special education experts or teachers. A visual stimulus can be an
image or video containing content (e.g., a human face, social scene, food, toy) designed
to stimulate the viewer’s visual attention, selected based on the specific objectives of
the task at hand. Given the tendency of children with ASD to show reduced attention to
social factors, the selection of visual stimuli content typically ensures a comprehensive
representation of both social elements (e.g., faces, conversations) and nonsocial elements
(e.g., toys, vehicles), allowing for a thorough and complete assessment of their visual
attention characteristics. The selected visual stimuli are resized to fit the screen dimen-
sions. They are then compiled into a video sequence with the following structure: ini-
tially, a red “+” symbol is displayed on a gray background for 0.5 seconds to capture the
child’s attention. This is followed by the presentation of the visual stimuli for t_present = 5
seconds to ensure children’s sustained attention and sufficient gaze data collection. After-
ward, the red “+” symbol reappears on a gray background for another 0.5 seconds to
regain the child’s focus. This sequence is repeated until all the visual stimuli have been
presented. Figure 2 depicts an example of a visual stimuli video. To prevent the exper-
iment from becoming overly lengthy and to maintain the child’s focus, the number of
images is limited to 12.
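The stimulus-video structure above can be sketched as a simple schedule builder. This is an illustration only: `build_timeline` and its names are hypothetical, and the trailing “+” segment after the last stimulus is omitted for brevity.

```python
# Illustrative sketch of the stimulus timeline: a 0.5 s fixation cross before
# each stimulus, each stimulus shown for t_present = 5 s, at most 12 images.
CROSS_S, T_PRESENT_S, MAX_IMAGES = 0.5, 5.0, 12

def build_timeline(stimuli):
    """Return (label, start, end) segments for the stimulus video."""
    timeline, t = [], 0.0
    for name in stimuli[:MAX_IMAGES]:
        timeline.append(("cross", t, t + CROSS_S)); t += CROSS_S
        timeline.append((name, t, t + T_PRESENT_S)); t += T_PRESENT_S
    return timeline

schedule = build_timeline([f"stim_{i}" for i in range(12)])
total_duration = schedule[-1][2]   # 12 * (0.5 + 5.0) = 66.0 seconds
```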

Figure 2. An example of the visual stimuli video

3.3. Eye movement recording

The eye movement recording module captures the children’s eye movements in response
to each visual stimulus displayed on the monitor, storing the raw data within the system.
During the experimental process, the system must be connected to an eye-tracking de-
vice. This device features an infrared camera mounted on a lightweight frame that can
be attached to the monitor. The eye tracker emits infrared light onto the participant’s
eye, with a portion of the light reflecting off the cornea. The method then identifies the
center of the eye (the point where the gaze direction passes through) and the location
of the corneal reflection by capturing images of the participant’s eye using one or more
infrared cameras within the eye tracker. The gaze vector is determined from the infor-
mation about the eye’s center and the corneal reflection point. Finally, the coordinates
of the gaze point on the screen are identified as the intersection of the gaze vector with
the screen. Before starting ET recording, it is essential to calibrate the eye tracker for
each participant. During recording, the eye movement signals captured by the device are
represented within the system as a sequence of points (x, y, t), where x and y denote the
gaze coordinates on the screen, and t represents the timestamp for each data point. The
sequence of timestamps depends on the sampling rate of the eye tracker used.
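The final geometric step described above, intersecting the gaze vector with the screen, can be sketched as a ray-plane intersection. The coordinate frame and the convention that the screen lies in the plane z = 0 are assumptions for illustration, not the device's actual model.

```python
# Hedged sketch: intersect the gaze ray (eye center + gaze vector) with the
# screen, assumed here to lie in the plane z = 0 (an illustrative convention).
def gaze_on_screen(eye_center, gaze_vector):
    """eye_center, gaze_vector: (x, y, z); returns (x, y) on the screen or None."""
    ex, ey, ez = eye_center
    gx, gy, gz = gaze_vector
    if gz == 0:
        return None                      # gaze parallel to the screen plane
    s = -ez / gz                         # ray parameter at z = 0
    if s <= 0:
        return None                      # screen lies behind the gaze direction
    return (ex + s * gx, ey + s * gy)    # gaze point coordinates on the screen

point = gaze_on_screen((0.0, 0.0, 60.0), (0.1, -0.05, -1.0))
```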

3.4. Data preprocessing

The ET signals collected may contain noise, which can arise from imperfections in the
data collection setup, environmental interferences during the experiments, as well as
minor eye movements such as tremors or micro-saccades. This noise can significantly
affect the performance and accuracy of eye movement characteristic calculations. In this
study, we minimized noise by applying a moving average filter with a window size of
3 to the x and y coordinate sequences independently. Figure 3 illustrates an example of
the results obtained from this signal denoising process. With proper adjustments, this
method effectively eliminates high-frequency oscillations in the data; however, excessive
smoothing could potentially result in data loss.
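A minimal sketch of this denoising step, applying a window-3 moving average to one coordinate sequence; edge samples use a shrunken window, which is an assumption about edge handling rather than the paper's stated choice.

```python
# Sketch of the denoising step: moving average with window size 3, applied
# independently to the x and y coordinate sequences.
def moving_average(values, window=3):
    """Smooth a 1-D sequence; edges fall back to a shrunken window."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

xs = [100, 102, 150, 104, 103]    # a high-frequency spike at index 2
smoothed = moving_average(xs)     # the spike is attenuated toward its neighbors
```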

Figure 3. An example of ET signals before (left) and after (right) denoising. The signals become smoother,
while still preserving the key features of the sampled data after applying the moving average filter.

ET data loss can occur when the eye tracker is unable to estimate the pupil’s direc-
tion from the captured eye image, often due to children blinking (an action that may ac-
count for approximately 2% of the recorded data) or when the child’s gaze shifts outside
the tracking range of the device. In children with ASD, particularly those with impaired
attention spans, the likelihood of ET signal loss may be higher compared to TD children.
Reconstructing ET signals through interpolation is crucial, as it not only ensures smooth
transitions between adjacent data points but also preserves the key characteristics of pupil
size — namely, its temporal continuity and slow variation on the order of hundreds of
milliseconds [37]. Several interpolation methods have been employed to manage missing
ET data (e.g., linear interpolation, polynomial interpolation, spline interpolation); how-
ever, linear interpolation is the most commonly used method due to its simplicity and
lowest mean error relative to the ground truth [38]. Let (x', y', t') be the missing point,
and (x1, y1, t1), (x2, y2, t2) be the two closest known points in time to the missing
point. The linear interpolation is given by Equation 1:

x' = x1 + (t' − t1)(x2 − x1) / (t2 − t1)
y' = y1 + (t' − t1)(y2 − y1) / (t2 − t1)    (1)

A record can be deemed of sufficient quality based on a threshold related to the
proportion of time the child spends focusing on the screen relative to the total time they
participate in the experiment. If the child’s attention time is less than 70% of the total
experimental duration, the record is considered of inadequate quality and is excluded.
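Equation 1 and the 70% quality threshold can be sketched as follows; the function names are hypothetical.

```python
# Sketch of Equation 1: linearly interpolate a missing gaze sample between the
# two nearest known samples, plus the record-quality check (>= 70% valid time).
def interpolate_gap(p1, p2, t):
    """p1 = (x1, y1, t1), p2 = (x2, y2, t2); return (x, y) at time t."""
    x1, y1, t1 = p1
    x2, y2, t2 = p2
    frac = (t - t1) / (t2 - t1)
    return (x1 + frac * (x2 - x1), y1 + frac * (y2 - y1))

def record_ok(valid_time, total_time, threshold=0.70):
    """A record is kept only if valid samples cover >= 70% of the session."""
    return valid_time / total_time >= threshold

x, y = interpolate_gap((100.0, 200.0, 0.0), (110.0, 220.0, 10.0), 5.0)
```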

3.5. AOI recognition

Area of Interest (AOI) analysis is a technique that assigns eye movements to specific
regions within a visual scene [14]. Each of these regions contains an object that may be
related to humans, such as a face, eyes, or mouth, or could represent animals or inan-
imate objects displayed at particular coordinates on the monitor. Unlike obtaining eye
movement measures across the entire scene, AOI analysis offers semantically targeted
eye movement metrics that are especially valuable for research focused on attention. In
previous research utilizing eye-tracking technology to support children with ASD, AOIs
were delineated entirely manually. This approach heavily relied on the expertise and
knowledge of the individual segmenting the visual stimulus, which could potentially in-
troduce errors. Moreover, given that the proposed method aims to support intervention,
intervention assessment, and personalization, it is crucial to establish a method capa-
ble of accurately processing large and diverse sets of visual stimuli. In this study, we
have suggested two distinct methods for identifying AOIs using the Segment Anything
Model. These methods include automatic AOIs identification and semi-automatic AOIs
identification.

Segment Anything Model


The Segment Anything Model (SAM) [34] is an advanced image segmentation model
trained on an extensive dataset comprising one billion masks across 11 million images.
This dataset is significantly larger, containing 11 times more images and 400 times more
masks than the largest existing dataset, Open Images. Hence, SAM exhibits superior
accuracy compared to other models. Additionally, SAM can accept input prompts from
various systems, such as eye-tracking technology, where prompts are conveyed through
users’ eye gestures.
The architecture of the SAM comprises three main components: the Image Encoder,
the Prompt Encoder, and the Mask Decoder. Specifically, the Image Encoder converts
images into embeddings. The Prompt Encoder encodes points and bounding boxes as
positional encodings and processes text through convolutions, subsequently adding these
encodings to the image embeddings. The Mask Decoder then utilizes the image embed-
dings, the prompt embeddings, and an output token to generate the corresponding mask.

Automatic AOIs identification


Objects in an image can be automatically identified by employing a grid of 32×32 points
as prompts for the model, with each point corresponding to a set of predicted masks for
the objects to be segmented. Subsequently, the model’s IoU prediction module is used
to select confident masks, from which the most stable masks (those maintaining simi-
lar shapes when the threshold probability is between 0.5 − δ and 0.5 + δ ) are chosen.
Finally, duplicate masks are filtered out using non-maximal suppression (NMS). To en-
hance the quality of smaller masks, each section of the image is cropped and enlarged
for processing, with overlapping areas to minimize masks being split. This process is
illustrated in Figure 4. Additionally, AOIs can be automatically identified using chil-
dren’s gaze points as prompts for the model. Specifically, these gaze points can serve as
prompts, and similar to the method using a grid of points, each gaze point selects a mask.
The IoU prediction module is then employed to select the most stable masks. This auto-
matic AOIs identification method can be applied in various applications to provide edu-
cators and specialists with insights into children’s visual characteristics, especially when
dealing with complex data types (e.g., images with multiple objects, moving videos) or
unlabeled data.

Semi-automatic AOIs identification


Figure 4. An example of using SAM to automatically recognize all internal objects in an image

Figure 5. Three steps of PeyeMMV algorithm to determine the fixations list

For certain tasks such as classifying children with ASD and analyzing their visual
processing capabilities, it is essential to develop a standardized test that is applicable to a
diverse range of children, reusable, and highly accurate. In this context, utilizing a
semi-automatic AOIs identification method yields superior results. Specifically, a labeler
selects an object, and the SAM then automatically determines the mask for that object. The
labeler subsequently refines this mask to achieve higher quality. According to research
on the SAM model, this semi-automatic labeling method is faster than manual labeling
from scratch, with the average time per mask decreasing from 34 seconds to 14 sec-
onds. Besides object selection by clicking, the SAM model can also receive prompts by
drawing a bounding box or sketching a rough mask over an object.

3.6. Features extraction

Given the input as a sequence of gaze points obtained from the eye movements recording
module, the features extraction module calculates the eye movement features of children.
These extracted features include fixation, identified using the PeyeMMV algorithm [35],
and saccade, determined through the I-VT algorithm. Both the velocity threshold and
dispersion threshold-based algorithms have been shown to achieve high accuracy and
computational efficiency when applied in practice [36].
Fixations are identified using the PeyeMMV algorithm, which utilizes a two-step
spatial threshold (denoted as parameters t1 and t2 ) along with a minimum duration thresh-
old. Figure 5 illustrates the three steps of PeyeMMV to determine the list of fixations.
Beginning from the initial point, the mean coordinates are computed as long as the Eu-
clidean distance between this mean point and the current point remains less than t1 . If
the distance exceeds t1 , a new fixation cluster is established. Subsequently, the distance
between each point within a cluster and the cluster’s mean point is calculated, with any
point exceeding the threshold t2 being excluded from the cluster. The duration of each
fixation is determined by the time difference between its start and end points. Fixation
clusters with a duration shorter than the minimum specified value are removed from the
list. A calculated fixation is formatted as (x, y, d,tstart ,tend , no gaze points), where x, y
represent the coordinates of the fixation point, d is the duration of the fixation point and
d = tend − tstart , and no gaze points represents the number of gaze points in that fixation
cluster.
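The three steps above can be sketched in a simplified form as follows. This is an illustrative reimplementation of the described logic, not the reference PeyeMMV code [35]; the gaze-sample format and the threshold values are assumptions.

```python
from math import dist, fsum

def peyemmv_like_fixations(gaze, t1=50.0, t2=30.0, min_dur=0.1):
    """Two-step spatial clustering with a minimum-duration filter.

    gaze: list of (x, y, timestamp) tuples.  t1 and t2 are in the same
    units as the coordinates (e.g. pixels), min_dur in seconds; the
    defaults here are illustrative, not calibrated parameters.
    """
    # Step 1: grow a cluster while each new point stays within t1 of its mean.
    clusters, current = [], []
    for x, y, t in gaze:
        if current:
            mx = fsum(p[0] for p in current) / len(current)
            my = fsum(p[1] for p in current) / len(current)
            if dist((x, y), (mx, my)) >= t1:
                clusters.append(current)   # too far: close the cluster
                current = []
        current.append((x, y, t))
    if current:
        clusters.append(current)

    fixations = []
    for cluster in clusters:
        # Step 2: drop points farther than t2 from the cluster mean.
        mx = fsum(p[0] for p in cluster) / len(cluster)
        my = fsum(p[1] for p in cluster) / len(cluster)
        kept = [p for p in cluster if dist((p[0], p[1]), (mx, my)) <= t2]
        if not kept:
            continue
        # Step 3: enforce the minimum duration threshold.
        t_start, t_end = kept[0][2], kept[-1][2]
        if t_end - t_start < min_dur:
            continue
        fx = fsum(p[0] for p in kept) / len(kept)
        fy = fsum(p[1] for p in kept) / len(kept)
        fixations.append((fx, fy, t_end - t_start, t_start, t_end, len(kept)))
    return fixations
```

Feeding in a trace that dwells near (100, 100) and then jumps to (400, 400) yields two fixation tuples in the (x, y, d, t_start, t_end, n_gaze_points) format described above.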
The extracted fixation data can be further processed in the subsequent step for
visualization or integrated with the list of AOIs identified in the AOI recognition
module. The calculated statistical features include fixation duration, fixation count,
and time to first fixation. Fixation duration is the total time, in seconds, spent on
fixations within the target AOI. Fixation count refers to the number of fixations within
the target AOI. Time to first fixation denotes the interval, in seconds, between the onset
of the visual stimulus and the first fixation on the target AOI.
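Assuming fixations in the tuple format above and a rectangular AOI, the three statistics might be computed as in the following sketch (a hypothetical illustration, not the paper's implementation).

```python
def aoi_features(fixations, aoi, stimulus_onset=0.0):
    """Statistical features for one rectangular AOI given as (x0, y0, x1, y1).

    fixations: time-ordered (x, y, d, t_start, t_end, n_gaze_points) tuples,
    as produced by the fixation-detection step.
    """
    x0, y0, x1, y1 = aoi
    hits = [f for f in fixations if x0 <= f[0] <= x1 and y0 <= f[1] <= y1]
    return {
        "fixation_duration": sum(f[2] for f in hits),           # total seconds in AOI
        "fixation_count": len(hits),                            # number of fixations
        "time_to_first_fixation": (hits[0][3] - stimulus_onset  # onset -> first hit
                                   if hits else None),
    }

fixs = [(120, 130, 0.5,  0.2, 0.7,  12),   # inside the AOI
        (500, 500, 0.3,  0.8, 1.1,   9),   # outside
        (140, 110, 0.25, 1.2, 1.45,  6)]   # inside
print(aoi_features(fixs, aoi=(100, 100, 200, 200)))
# -> {'fixation_duration': 0.75, 'fixation_count': 2, 'time_to_first_fixation': 0.2}
```

A child who never fixates the AOI gets `None` for time to first fixation, which a downstream step would need to handle (e.g. by capping at the stimulus duration).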
The I-VT algorithm for saccade calculation employs a velocity threshold to identify
which gaze points fall within a saccade. It starts by computing the velocity between
consecutive gaze points. If the velocity exceeds the threshold, the gaze point is
considered the beginning of a saccade; the first subsequent point at which the velocity
falls back below the threshold marks the end of the saccade. The duration of the saccade
is then calculated as the time difference between its end and start points, and saccades
with a duration shorter than the minimum duration threshold are excluded.
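A minimal sketch of this I-VT logic follows; the threshold values and the gaze-sample format are illustrative assumptions, not the study's calibrated settings.

```python
from math import dist

def ivt_saccades(gaze, v_threshold=1000.0, min_dur=0.01):
    """I-VT sketch: a run of samples whose point-to-point velocity exceeds
    v_threshold is a saccade; the first slow point after the run ends it.

    gaze: (x, y, t) tuples; v_threshold is in coordinate units per second,
    min_dur in seconds.  A saccade still open at the end of the trace is
    ignored in this sketch.
    """
    saccades, start = [], None
    for (x0, y0, t0), (x1, y1, t1) in zip(gaze, gaze[1:]):
        v = dist((x0, y0), (x1, y1)) / (t1 - t0)
        if v >= v_threshold:
            if start is None:
                start = t0                  # saccade onset
        elif start is not None:
            if t0 - start >= min_dur:       # drop too-short saccades
                saccades.append((start, t0, t0 - start))
            start = None
    return saccades
```

On a 100 Hz trace with a rapid jump between two dwell positions, this returns a single (start, end, duration) triple spanning the fast segment.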

3.7. Visualization

The developed method aims to provide detailed visual characteristics of children with
ASD. One application of this method is to offer visual insights to assist professionals
and educators. Therefore, it is essential to present the results in a manner that enables
experts to easily observe eye movements, facilitating accurate diagnoses, classifications,
and intervention strategies for the children. The visualization module is employed to
display the features obtained from previous modules. In this module, the list of fixations
from the prior step is used to illustrate the information through heatmaps and scanpaths.
Heatmaps can illustrate the spatial distribution of a child’s eye movements in re-
sponse to visual stimuli. While they do not depict the sequence of fixations, they analyze
the spatial distribution of fixation points, highlighting areas within the visual stimulus
where the child focuses more or less. In the proposed method, heatmaps are generated
by applying a Gaussian mask to the input visual stimulus to visualize fixation frequency
through color representation. Initially, the system creates a Gaussian distribution repre-
senting fixation points as a 2D Gaussian mask of specified size. Using this Gaussian dis-
tribution and the fixation array, a heat map is produced, normalized, and converted into
an 8-bit grayscale representation. The heatmap is then resized to match the dimensions
of the input stimulus image, and a color map is applied to transform it into a colored
heatmap, which is finally overlaid onto the input image.
Scanpaths illustrate the eye movement patterns during a task, consisting of a se-
quence of fixation points and saccades. Typically, scanpaths are represented as a series
of connected nodes (denoting fixations) and edges (indicating saccadic movements be-
tween consecutive fixations) overlaid on the visual stimulus image. The method gener-
ates a mask featuring fixations with sizes proportional to the duration of each fixation,
and lines connecting successive fixations representing saccadic movements. The size of
each fixation is determined relative to the duration of the longest fixation, with other fix-
ations scaled accordingly. By utilizing scanpaths, experts can analyze the areas of focus,
the initial points of gaze, the regions receiving the most attention, and the sequence of
eye movements within the visual stimulus data.
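The geometry the renderer draws (duration-scaled nodes plus the saccadic edges between consecutive fixations) can be sketched as follows; the maximum radius is an illustrative assumption.

```python
def scanpath_geometry(fixations, max_radius=30.0):
    """Node radii proportional to fixation duration, scaled against the
    longest fixation, plus the edges linking consecutive fixations.

    fixations: (x, y, d, t_start, t_end, n_gaze_points) tuples in order.
    """
    longest = max(f[2] for f in fixations)
    nodes = [(f[0], f[1], max_radius * f[2] / longest) for f in fixations]
    edges = [((a[0], a[1]), (b[0], b[1]))            # saccadic links
             for a, b in zip(fixations, fixations[1:])]
    return nodes, edges
```

The longest fixation always receives the full radius, so relative dwell time is readable directly from node size in the overlay.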

3.8. Automatic classification

Another critical aspect of the proposed method is its potential to assist specialists
in the early identification of children with ASD. Eye-tracking features have demonstrated
potential as biomarkers for identifying atypical social attention and possible visual
information-processing deficits in children with ASD. In this study, to evaluate the
effectiveness of the extracted features in automatically supporting the early diagnosis of
children with ASD, we used the statistical eye movement features derived from the
features extraction module as input to a machine learning classifier. The three primary
eye movement features, time to first fixation, fixation duration, and fixation count, were
computed for each of the 12 visual stimuli; these features were then combined into a
single input feature vector and fed into a machine learning model to classify children as
either ASD or TD.

        Total   Male   Female   Age   Myopia   Usable recordings
ASD      26      23       3     3-7      0            24
TD       29      16      13     6-7      1            26
Table 1. Data description

We
employed the SVM classifier, with parameters selected through a grid search algorithm.
SVM is a supervised learning algorithm that has been previously utilized for distinguish-
ing between individuals with and without ASD [23, 39]. The primary objective of the
SVM classifier is to construct an optimal hyperplane within a multidimensional space
using labeled training data. Classification of testing samples is performed based on the
sign of their distance from the hyperplane, with the magnitude of this distance indicat-
ing the likelihood of the samples belonging to a particular category. Given the limited
amount of data, we trained this SVM model using a 5-fold cross-validation approach.
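A sketch of the input assembly and the fold scheme is given below. The classifier itself was an SVM with grid-searched parameters (e.g. via a standard ML library), which is not reproduced here; the deterministic split below is only an illustration of 5-fold partitioning, not the study's exact procedure.

```python
def build_feature_vector(per_stimulus):
    """Concatenate the three metrics over the 12 stimuli into a single
    36-dimensional vector (3 metrics x 12 stimuli), one vector per child."""
    vec = []
    for feats in per_stimulus:              # one dict per stimulus, 12 in total
        vec += [feats["time_to_first_fixation"],
                feats["fixation_duration"],
                feats["fixation_count"]]
    return vec

def five_fold_indices(n, k=5):
    """Deterministic k-fold split of n samples into (train, test) index lists."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
            for i in range(k)]
```

With 50 usable recordings, each fold holds out 10 children for testing and trains on the remaining 40, and every child appears in exactly one test fold.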

4. Experiment and Results

4.1. Participants

To evaluate the proposed method, we conducted an experiment on a total of 55
Vietnamese children, comprising 29 TD children and 26 children with ASD. Specifically,
the 29 TD children were second graders from a combined school, while the 26 children
with ASD had been diagnosed by experts and were receiving interventions at a Special
Education Center in Vietnam. Experts in autism spectrum disorder diagnosed these
children according to the diagnostic criteria of the Diagnostic and Statistical Manual of
Mental Disorders, 5th Edition (DSM-5) [42], after clinical examination using the
M-CHAT (Modified Checklist for Autism in Toddlers) scale [43]. The cognitive abilities
of the ASD group averaged around an IQ of 60, and the children exhibited significant
challenges in verbal communication. All participants
were aged between 3 and 7 years, with no severe psychiatric disorders (e.g., schizophre-
nia, bipolar disorder) and no visual impairments requiring treatment (e.g., strabismus).
Both groups of children participated in a “free-viewing” task that allowed them to freely
observe objects displayed on a screen. This task was specifically designed to ensure that
it is suitable for participants across different age groups. All data collection procedures
were approved by the supervising teachers. After collecting data from these 55 children,
we obtained 50 usable recordings. Data excluded involved instances where the children
lacked sufficient concentration or had myopia. Detailed information regarding the data
is presented in Table 1.

Figure 6. The 12 images appearing in the stimuli video. Each stimulus was labeled sequentially from obj1 to
obj12, arranged from left to right and top to bottom.

4.2. Visual stimuli

The experimental task was designed as a free-viewing activity, allowing children to freely
observe any stimuli displayed on the screen to capture their visual information and at-
tentional processes. The visual stimuli were selected based on specific criteria, which
included incorporating both familiar and unfamiliar objects for both groups of children,
covering social and non-social elements, and ensuring gender neutrality. These stimuli
were subsequently validated by special education experts to ensure their suitability for
evaluating children’s attention and interest levels. The stimulus set comprised 12 images, each measur-
ing 900 × 900 pixels, with each image featuring an object positioned randomly. These
images were compiled into a single video, where each image was displayed for 5 sec-
onds. Figure 6 illustrates the images presented in the stimuli video.

4.3. Procedure

Eye movements were recorded using a Tobii eye tracker with a sampling rate of 90 Hz
and an eye-tracking distance of 50-95 cm. The visual stimuli were displayed on a 14-inch
screen with Full HD resolution (1920×1080 pixels) and a refresh rate of 60Hz.
Each child participating in the experiment was seated in a quiet room alongside two
individuals: a supervising teacher and a technician responsible for adjusting the equip-
ment. The experimental room was devoid of any objects that might attract the children’s
attention. For each child, the data collection device was adjusted so that the child’s eye
level was aligned with the height of the display screen, with the distance from the child’s
eyes to the center of the screen being approximately 60 cm. This setup was designed to
ensure consistency across all participants during the experiment.

For each child, the eye-tracking device needed to be calibrated twice before com-
mencing the experiment. Initially, the teacher would use verbal cues and hand gestures
to direct the child’s attention to the screen, after which the technician would play the
visual stimulus video and allow the child to observe freely. Typically, for children with
typical development, data collection could be completed in a single session, lasting ap-
proximately 5 to 10 minutes. However, for children with ASD, who might exhibit dis-
tractibility or fail to focus on the screen, data collection might need to be repeated after
a rest period, extending the total duration to 20 to 30 minutes.

4.4. Data analysis

The metrics time to first fixation, fixation duration, and fixation count as visual features
were extracted for both groups: ASD children and TD children in 12 different AOIs us-
ing the proposed method. To examine visual attention patterns toward different objects
in both groups, mixed-design ANOVAs were conducted. ANOVA [44] is one of the most
widely applied statistical methods for hypothesis testing. Key values commonly used in
ANOVA include the F-statistic, which represents the ratio of differences between groups
(or conditions) to variability within groups, and the p-value, which indicates the probabil-
ity of observing an F-value equal to or greater than the actual value if the null hypothesis
is true. A p-value less than 0.05 is typically considered statistically significant, allowing
researchers to reject the null hypothesis and conclude that the factor under investigation
has an effect. Additionally, partial eta-squared (ηp²) measures the effect size, indicating
the extent to which the independent variable influences the dependent variable. In this
study, each ANOVA used one of the dependent variables (time to first fixation, fixation
duration, fixation count), with group (ASD, TD) as the between-subject factor and the
12 AOIs as the within-subject factor. Significant interactions between group and AOI
would indicate that children with ASD exhibit reduced visual attention when observing
different objects presented on the screen.
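As an illustration of the F ratio described above, a one-way version can be computed directly; the study itself used a mixed-design ANOVA, which additionally models the within-subject AOI factor, so this sketch only demonstrates the between/within variance ratio.

```python
from statistics import mean

def one_way_f(groups):
    """F = between-group mean square / within-group mean square.

    groups: list of lists of observations, one inner list per group.
    Returns the F statistic and its (between, within) degrees of freedom.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_b, df_w = k - 1, n - k
    return (ss_between / df_b) / (ss_within / df_w), (df_b, df_w)
```

A large F means group means differ by much more than the scatter within groups would explain, which is what the reported F(1, 48) values quantify.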

4.5. Results

Tables 2, 3, and 4 present the mean values of the eye-tracking metrics, including time
to first fixation, fixation duration, and fixation count, for both groups. Each stimu-
lus, depicted in Figure 6, is labeled from ob j1 to ob j12 . Additionally, Figure 7 illus-
trates the mean values and variances of these three metrics for selected representa-
tive AOIs. The mixed-design ANOVA analyses of visual attention metrics revealed dis-
tinct group-level differences between children with ASD and TD across various mea-
sures. For time to first fixation, a significant main effect of group was observed,
F(1, 48) = 7.11, p = .012, ηp² = .173, with children in the ASD group demonstrating sig-
nificantly longer times to first fixation compared to the TD group. Children with ASD
showed slower visual responses compared to TD children when stimuli appeared on the
screen (See Figure 7). The type of AOI significantly influenced first fixation times irre-
spective of group, p < .001, ηp² = .174, whereas the interaction between group and AOI
was non-significant, p = .647, ηp² = .023, indicating consistent fixation patterns across
AOIs for both groups. For fixation duration, group differences were also significant,
F(1, 48) = 6.35, p = .017, ηp² = .157, with children in the ASD group displaying re-
duced focus, as evidenced by shorter average fixation durations compared to TD children.

Figure 7. Estimated average values for Fixation Count (top), Fixation Duration (middle), and Time to First
Fixation (bottom) illustrating the visual attention patterns of ASD and TD groups while observing several
objects.

AOI type showed a substantial effect on fixation duration, p < .001, ηp² = .137, while an
interaction effect, p = .043, ηp² = .052, suggested that group differences varied across
specific AOIs. Finally, for fixation count, significant differences between groups
emerged, F(1, 48) = 7.21, p = .011, ηp² = .175, with lower fixation counts observed
in the ASD group. AOI type accounted for considerable variability in fixation counts,
p < .001, ηp² = .280, but no significant interaction was detected, p > .05, ηp² = .051.

        obj1   obj2   obj3   obj4   obj5   obj6   obj7   obj8   obj9   obj10  obj11  obj12
ASD     0.48   0.98   0.28   1.02   0.90   0.87   0.48   0.59   0.41   0.62   0.29   0.69
TD      0.39   0.79   0.16   0.35   0.46   0.47   0.84   0.52   0.22   0.53   0.32   0.34
Table 2. Statistics of the average value of time to first fixation

        obj1   obj2   obj3   obj4   obj5   obj6   obj7   obj8   obj9   obj10  obj11  obj12
ASD     3.08   1.97   3.24   2.98   1.96   1.63   2.19   2.03   1.99   2.80   2.90   1.79
TD      3.57   3.03   4.08   3.75   2.67   3.19   2.88   3.02   2.94   2.32   3.82   2.20
Table 3. Statistics of the average value of fixation duration

        obj1   obj2   obj3   obj4   obj5   obj6   obj7   obj8   obj9   obj10  obj11  obj12
ASD     19.22  14.22  21.00  15.22  10.56  8.78   9.56   12.89  12.56  18.56  18.00  12.78
TD      22.67  17.33  23.67  16.47  12.93  13.47  11.93  19.07  16.33  15.67  19.07  12.93
Table 4. Statistics of the average value of fixation count

Heatmaps are also extracted and averaged between the two groups of children, as il-
lustrated in Figure 8. For non-social objects (e.g., toys, spinning tops), children with ASD
exhibit gaze patterns that are largely similar to those of the TD group. When viewing
stimulus object 1, children with ASD tend to focus on the button that triggers interactive
surprises rather than on the toy bear that creates the surprises. From this observation, we
deduced that children with ASD are more interested in understanding the mechanisms
that produce actions and surprises than in the animals involved. In other words, chil-
dren with ASD, especially those with average and above-average cognitive abilities, can
comprehend causality. Conversely, when it comes to objects with social elements (e.g.,
puppet faces, eyes on toy cars), children with ASD demonstrate atypical gaze patterns
compared to TD children. Most children with ASD avoid looking at the eyes or faces
on these objects and exhibit slower reactions and less focus when viewing social objects
compared to non-social ones, unlike the TD group.
The findings indicate that the proposed method is effective in providing valuable
insights into the visual characteristics of children with ASD when observing stimuli im-
ages. Key screening traits in children with ASD can be identified through their prefer-
ences, interests, number of fixation points, and gaze patterns, including direction. The
visual processing capabilities of children with ASD are sufficiently developed to under-
stand basic life issues and needs. Observations reveal that characteristics typically seen
in direct interactions, such as aversion to eye contact, gaze avoidance, and reduced fo-
cus, are also present when children view images. For familiar objects, children show re-
sponses similar to those in direct environments. These results have been evaluated and
positively received by experts in the field of special education, who affirm that the visual

(a) Heatmaps of TD children (b) Heatmaps of children with ASD

Figure 8. Heatmaps of children with ASD and TD children

characteristics and presentations of the solution are beneficial in supporting children with
ASD.
We also utilized the extracted eye movement features from a cohort of 55 chil-
dren to facilitate the early identification of ASD using a machine learning classifier.
The SVM model achieved a commendable average accuracy of 90.91%, with a sensi-
tivity of 86.67% (the proportion of correctly identified ASD children) and a specificity
of 96.67% (the proportion of correctly identified TD children). Furthermore, we investi-
gated whether any specific stimuli could effectively differentiate between ASD and TD
children by analyzing the discriminative weights of input features derived from SVM
coefficients. The analysis revealed that two stimuli—one featuring puppets and the other
animated cars—significantly distinguished ASD children from their TD counterparts.
However, this result, though promising, is lower than those reported in several state-of-
the-art studies utilizing eye-tracking technology for early ASD diagnosis. We identified
that a key factor contributing to this discrepancy is the selection of visual stimuli used
during the eye movement data recording. Previous studies have predominantly employed
stimuli with strong social elements, such as facial images or conversation videos, to em-
phasize the differences in social attention between ASD and TD children [23, 40, 41]. In
contrast, our study used 12 visual stimuli designed to support intervention, which did not
prominently feature social cues. Despite this, the preliminary results underscore the po-
tential of the proposed method in providing valuable eye movement features, particularly

when children are exposed to stimuli containing both social and non-social elements,
thereby aiding in the automatic pre-screening of ASD using advanced machine learning
techniques.

5. Conclusions

We have developed a comprehensive eye-tracking-based method designed to provide
detailed visual features of children with ASD. This method was built upon existing re-
search; however, no previous studies have focused on integrating these findings into a
single solution that automatically provides visual characteristic information of children.
The method involves collecting eye movements data as children observe stimuli videos
displayed on a monitor. This data is then preprocessed and combined with AOI informa-
tion identified from the stimuli to extract various visual features. These features are visu-
alized using tools such as scanpaths and heatmaps. To evaluate the proposed method, we
conducted an experiment with 26 children with ASD and 29 TD children. The results in-
dicate that the proposed method can effectively provide visual features and visualizations
of children with ASD, revealing differences in visual processing and attention between
ASD children and their TD peers when observing visual stimuli with and without social
factors. Additionally, the statistical eye movement features extracted through this method
show significant potential for use in the automatic pre-screening of children with ASD
using machine learning classifiers. Therefore, our method can be used to support experts
in analyzing, supporting diagnosis, adjusting intervention strategies, and evaluating the
intervention effectiveness for children with ASD.

Acknowledgements

This research was funded by the research project QG.23.39 of Vietnam National Univer-
sity, Hanoi.

Author contributions

CONCEPTION: Thi Duyen Ngo, Thi Cam Huong Nguyen, Nu Tam An Nguyen, Thanh
Ha Le and Thi Quynh Hoa Nguyen.
PERFORMANCE OF WORK: Thi Duyen Ngo, Thi Cam Huong Nguyen, Nu Tam An
Nguyen, Thanh Ha Le and Thi Quynh Hoa Nguyen.
INTERPRETATION OR ANALYSIS OF DATA: Duc Duy Le, Thi Quynh Hoa Nguyen,
Dang Khoa Ta, Thi Duyen Ngo, Nu Tam An Nguyen and Thi Cam Huong Nguyen.
PREPARATION OF THE MANUSCRIPT: Duc Duy Le, Thi Duyen Ngo, Thi Quynh
Hoa Nguyen and Thanh Ha Le.
REVISION FOR IMPORTANT INTELLECTUAL CONTENT: Thi Duyen Ngo and
Thanh Ha Le.
SUPERVISION: Thi Duyen Ngo and Thanh Ha Le.

Conflict of interest

The authors report there are no competing interests to declare.

References

[1] Zeidan J, Fombonne E, Scorah J, Ibrahim A, Durkin MS, Saxena S et al.


Global prevalence of autism: A systematic review update. Autism Research. 2022
May;15(5):778-790. doi:10.1002/aur.2696
[2] Maenner MJ, Warren Z, Williams AR, et al. Prevalence and Characteristics of
Autism Spectrum Disorder Among Children Aged 8 Years — Autism and Devel-
opmental Disabilities Monitoring Network, 11 Sites, United States, 2020. MMWR
Surveill Summ 2023;72(No. SS-2):1–14. doi:10.15585/mmwr.ss7202a1
[3] Lord C, Elsabbagh M, Baird G, Veenstra-Vanderweele J. Autism spectrum disorder.
Lancet. 2018;392(10146):508-520. doi:10.1016/S0140-6736(18)31129-2
[4] Cidav Z, Marcus SC, Mandell DS. Implications of childhood autism
for parental employment and earnings. Pediatrics. 2012;129(4):617-623.
doi:10.1542/peds.2011-2700
[5] Leekam S, Baron-Cohen S, Perrett DI, Milders M, Brown S. Eye-direction detec-
tion: a dissociation between geometric and joint attention skills in autism. British
Journal of Developmental Psychology. 1997 Mar;15:77-95.
[6] Fujioka T, Inohara K, Okamoto Y, et al. Gazefinder as a clinical supplementary
tool for discriminating between autism spectrum disorder and typical development
in male adolescents and adults. Mol Autism. 2016;7:19. doi:10.1186/s13229-016-
0083-y
[7] Falck-Ytter T, Fernell E, Hedvall AL, von Hofsten C, Gillberg C. Gaze performance
in children with autism spectrum disorder when observing communicative actions.
J Autism Dev Disord. 2012;42(10):2236-2245. doi:10.1007/s10803-012-1471-6
[8] Norbury CF, Brock J, Cragg L, Einav S, Griffiths H, Nation K. Eye-movement
patterns are associated with communicative competence in autistic spectrum
disorders. J Child Psychol Psychiatry. 2009;50(7):834-842. doi:10.1111/j.1469-
7610.2009.02073.x
[9] von Hofsten C, Uhlig H, Adell M, Kochukhova O. How children with autism
look at events. Research in Autism Spectrum Disorders. 2009;3(2):556–569.
doi:10.1016/j.rasd.2008.12.003
[10] Poole A, Ball L. Eye tracking in human-computer interaction and usability research:
Current status and future prospects. Encyclopedia of Human Computer Interaction.
2006;211-219.
[11] Landry R, Bryson SE. Impaired disengagement of attention in young children with
autism. J Child Psychol Psychiatry. 2004;45(6):1115-1122. doi:10.1111/j.1469-
7610.2004.00304.x
[12] Sasson NJ, Turner-Brown LM, Holtzclaw TN, Lam KS, Bodfish JW. Children with
autism demonstrate circumscribed attention during passive viewing of complex so-
cial and nonsocial picture arrays. Autism Res. 2008;1(1):31-42. doi:10.1002/aur.4

[13] Sabatos-DeVito M, Schipul SE, Bulluck JC, Belger A, Baranek GT. Eye Tracking
Reveals Impaired Attentional Disengagement Associated with Sensory Response
Patterns in Children with Autism. J Autism Dev Disord. 2016;46(4):1319-1333.
doi:10.1007/s10803-015-2681-5
[14] Mahanama B, Jayawardana Y, Rengarajan S, Jayawardena G, Chukoskie L, Snider
J, Jayarathna S. Eye movement and pupil measures: A review. Frontiers in Com-
puter Science. 2022;3:1-22. doi:10.3389/fcomp.2021.733531
[15] Sasson NJ, Touchstone EW. Visual attention to competing social and object im-
ages by preschool children with autism spectrum disorder. J Autism Dev Disord.
2014;44(3):584-592. doi:10.1007/s10803-013-1910-z
[16] Vacas J, Antolı́ A, Sánchez-Raya A, Pérez-Dueñas C, Cuadrado F. Visual
preference for social vs. non-social images in young children with autism
spectrum disorders. An eye tracking study. PLoS One. 2021;16(6):e0252795.
doi:10.1371/journal.pone.0252795
[17] Bataineh E, Almourad MB, Marir F, Stocker J. Visual attention toward Socially
Rich context information for Autism Spectrum Disorder (ASD) and Normal Devel-
oping Children: An Eye Tracking Study. In: Proceedings of the 16th International
Conference on Advances in Mobile Computing and Multimedia. New York, NY,
USA: ACM; 2018. doi:10.1145/3282353.3282856
[18] Zhang K, Yuan Y, Chen J, Wang G, Chen Q, Luo M. Eye Tracking Research on
the Influence of Spatial Frequency and Inversion Effect on Facial Expression Pro-
cessing in Children with Autism Spectrum Disorder. Brain Sci. 2022;12(2):283.
Published 2022 Feb 18. doi:10.3390/brainsci12020283
[19] Tsang V. Eye-tracking study on facial emotion recognition tasks in individuals
with high-functioning autism spectrum disorders. Autism. 2018;22(2):161-170.
doi:10.1177/1362361316667830
[20] Matsuda S, Minagawa Y, Yamamoto J. Gaze Behavior of Children with ASD
toward Pictures of Facial Expressions. Autism Res Treat. 2015;2015:617190.
doi:10.1155/2015/617190
[21] Thompson JL, Plavnick JB, Skibbe LE. Eye-Tracking Analysis of Atten-
tion to an Electronic Storybook for Minimally Verbal Children With Autism
Spectrum Disorder. The Journal of Special Education. 2019;53(1):41-50.
doi:10.1177/0022466918796504
[22] Hosozawa M, Tanaka K, Shimizu T, Nakano T, Kitazawa S. How children with spe-
cific language impairment view social situations: an eye tracking study. Pediatrics.
2012;129(6):e1453-e1460. doi:10.1542/peds.2011-2278
[23] Wan G, Kong X, Sun B, et al. Applying Eye Tracking to Identify Autism
Spectrum Disorder in Children. J Autism Dev Disord. 2019;49(1):209-215.
doi:10.1007/s10803-018-3690-y
[24] Oliveira JS, Franco FO, Revers MC, et al. Computer-aided autism diagnosis
based on visual attention models using eye tracking. Sci Rep. 2021;11(1):10131.
doi:10.1038/s41598-021-89023-8
[25] Kang J, Han X, Song J, Niu Z, Li X. The identification of children with autism
spectrum disorder by SVM approach on EEG and eye-tracking data. Comput Biol
Med. 2020;120(103722):103722. doi:10.1016/j.compbiomed.2020.103722

[26] Ahmed IA, Senan EM, Rassem TH, Ali MAH, Shatnawi HSA, Alwazer SM, Al-
shahrani M. Eye Tracking-Based Diagnosis and Early Detection of Autism Spec-
trum Disorder Using Machine Learning and Deep Learning Techniques. Electron-
ics. 2022;11(4):530. doi:10.3390/electronics11040530
[27] Giuliani F. Case Report: Using Eye-Tracking as Support for the TEACCH Program
and Two Teenagers with Autism Spectrum Disorders. Journal of Clinical Neuro-
science. 2016;1. doi:10.4172/jnscr.1000104
[28] Wang Q, Wall CA, Barney EC, et al. Promoting social attention in 3-year-olds
with ASD through gaze-contingent eye tracking. Autism Res. 2020;13(1):61-73.
doi:10.1002/aur.2199
[29] Feng Y, Cai Y. A Gaze Tracking System for Children with Autism Spectrum Dis-
orders. Gaming Media and Social Effects. 2017:137-145. doi:10.1007/978-981-10-
0861-0_10
[30] Mei C, Zahed BT, Mason L, Quarles J. Towards Joint Attention Training for
Children with ASD - a VR Game Approach and Eye Gaze Exploration. 2018
IEEE Conference on Virtual Reality and 3D User Interfaces (VR). 2018:289-296.
doi:10.1109/VR.2018.8446242
[31] Kim SY, Rispoli M, Mason RA, Lory C, Gregori E, Roberts CA, Whitford D,
David M. A Systematic Quality Review of Technology-Aided Reading Interven-
tions for Students With Autism Spectrum Disorder. Remedial and Special Educa-
tion. 2022;43(6):404-420. doi:10.1177/07419325211063612
[32] Trembath D, Vivanti G, Iacono T, Dissanayake C. Accurate or assumed: visual
learning in children with ASD. J Autism Dev Disord. 2015;45(10):3276-3287.
doi:10.1007/s10803-015-2488-4
[33] Banire B, Al-Thani D, Qaraqe M, Khowaja K, Mansoor B. The Effects of Visual
Stimuli on Attention in Children With Autism Spectrum Disorder: An Eye-Tracking
Study. IEEE Access. 2020;8:225663-74. doi:10.1109/ACCESS.2020.3045042
[34] Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, et al. Segment
Anything. 2023. doi:10.48550/arXiv.2304.02643
[35] Krassanakis V. PeyeMMV: Python implementation of EyeMMV's fixa-
tion detection algorithm. Software Impacts. 2023;15(100475):100475.
doi:10.1016/j.simpa.2023.100475
[36] Salvucci DD, Goldberg JH. Identifying fixations and saccades in eye-tracking
protocols. In: Proceedings of the symposium on Eye tracking research &
applications - ETRA ’00. New York, New York, USA: ACM Press; 2000.
doi:10.1145/355017.355028
[37] Strauch C, Georgi J, Huckauf A, Ehlers J. Slow trends - A problem in analysing
pupil dynamics. In: Proceedings of the 2nd International Conference on Physiologi-
cal Computing Systems. SCITEPRESS - Science and Technology Publications;
2015. doi:10.5220/0005329400610066
[38] Grootjen JW, Weingärtner H, Mayer S. Uncovering and addressing blink-related
challenges in using eye tracking for interactive systems. In: Proceedings of the CHI
Conference on Human Factors in Computing Systems. New York, NY, USA: ACM;
2024. p. 1–23. doi:10.1145/3613904.3642086
[39] Zhao Z, Tang H, Zhang X, Qu X, Hu X, Lu J. Classification of Children With
Autism and Typical Development Using Eye-Tracking Data From Face-to-Face
Conversations: Machine Learning Model Development and Performance Evalua-
tion. J Med Internet Res. 2021;23(8):e29328. doi:10.2196/29328

[40] Ozturk MU, Arman AR, Bulut GC, Findik OTP, Yilmaz SS, Genc HA, et al. Sta-
tistical analysis and multimodal classification on noisy eye tracker and applica-
tion log data of children with autism and ADHD. Intell Automat Soft Comput.
2018;24(4):891-905. doi:10.31209/2018.100000058
[41] Minissi ME, Chicchi Giglioli IA, Mantovani F, Alcañiz Raya M. Assessment of
the Autism Spectrum Disorder Based on Machine Learning and Social Visual
Attention: A Systematic Review. J Autism Dev Disord. 2022;52(5):2187-2202.
doi:10.1007/s10803-021-05106-5
[42] American Psychiatric Association. Diagnostic and statistical manual of mental dis-
orders (5th ed.). 2013. doi:10.1176/appi.books.9780890425596
[43] Robins DL, Fein D, Barton ML, Green JA. The Modified Checklist for Autism
in Toddlers: an initial study investigating the early detection of autism and perva-
sive developmental disorders. J Autism Dev Disord. 2001 Apr;31(2):131-44. doi:
10.1023/a:1010738829569.
[44] Ståhle L, Wold S. Analysis of variance (ANOVA). Chemometrics and Intelligent
Laboratory Systems. 1989;6(4):259-272.
