Computer Science Proposal-2
Design
Chapter 1
1. Introduction
1.1. Background
Data has grown exponentially, changing how information is viewed, analysed, and
communicated across fields (Liang et al., 2023). As data volume, velocity, and variety
increase, good visualisation approaches are essential (Kehrer & Hauser, 2013). Data
visualisation helps humans understand complex data by visualising it for exploration, pattern
Moreover, data visualisation originated in computer science, statistics, graphic design, and
cognitive psychology (Ware, 2019). Jacques Bertin (1983) and Edward Tufte (1983)
pioneered graphic data display. Bertin's major work, "Semiology of Graphics," presented
graphic semiology's methodology for building visual representations based on data attributes.
However, Tufte stressed the need for clarity and simplicity in data visualisation by
Data visualisation has evolved to address the difficulties of rising data source complexity and
diversity. Visual analytics, interactive visualisations, and dimensionality reduction help analyse
dimensional areas to reveal patterns and structures (Tenenbaum et al., 2000). Visual analytics
helps people make informed decisions with automated data analysis and interactive
visualisations (Keim et al., 2008). Users can filter, zoom, and brush data in interactive
visualisations (Yi et al., 2007). Interactive features enhance data exploration, hypothesis
Recently, data visualisation and machine learning have enabled innovation (Wu et al., 2024).
Machine learning automates visualisation design, improves visual analytics, and analyses
complex datasets (Wongsuphasawat et al., 2017; Endert et al., 2017). Recommender systems can
suggest visual representations using data features (Wongsuphasawat et al., 2017; Moritz et al.,
2019). Deep learning models automatically label and categorise visual patterns, speeding up
In addition, user experience and engagement in data visualisation are becoming more important
as technology advances (Lam et al., 2012). Feedback and interaction data connect visualisations to
users' mental models and decision-making processes, improving usability and relevance.
Visualisation user preferences have been studied using eye-tracking, think-aloud protocols, and
algorithms also improve visualisation design (Siirtola & Mäkinen, 2005). Inspired by natural
evolution, these algorithms iteratively build and assess solutions, selecting the fittest for
reproduction and mutation (Eiben & Smith, 2003). Hence, evolutionary algorithms can
1.2. Objectives
This project aims to create data-driven visualisations that adapt to user needs. The proposed
visualisation technique combines data analysis and machine learning. The research goals
include:
1. Create a framework for generating and optimising visualisations using machine
2. Integrate user feedback and interaction data into visualisation design to improve user
experience.
3. Assess how effectively and usably the developed visualisations communicate complex
information.
What machine learning algorithms and techniques can optimise visualisations based
How can user feedback and interaction data improve the user experience in
visualisation design?
What evaluation methods can evaluate the usefulness and usability of evolving
Data visualisation and machine learning have advanced, yet there are no systematic
visualisation design iterations reduce data potential (Wongsuphasawat et al., 2015). Data
sources and complexity may outpace traditional visualisation design methodologies (Endert
et al., 2017).
Recently, machine learning has been used to automate and enhance data visualisation,
boosting user engagement and handling complex data (Gumelar, 2019). However, there is no
systematic approach to gathering and integrating user feedback and interaction data to develop
visualisations (Schmidt, 2020). This gap can prevent user-centred visualisations that
meet user needs.
Unsystematic data visualisation, machine learning, and UX design generate data- and user-
al., 2015). Traditional visualisation methods may fail as data sources and complexity grow
(Endert et al., 2017). Lam et al. (2012) argue that machine learning aids visual analytics and
automation, although data analysis and user experience are treated separately. However, automated
visualisation may ignore user input.
Manual iterations on complex designs are laborious, and this approach can significantly reduce the
insights drawn from data (Eilemann, 2019). Traditional designs may struggle with data source variability and
Visualisation design integrates user input and interaction data poorly. Schmidt (2020)
indicates that many systems lack user input, making visualisations less intuitive. Luo et al.
(2020) also highlight that a major challenge with automatic visualisation systems is that they
can produce visualisations without understanding the user's intent, which can mislead users.
This gap creates an opportunity for user-centric data visualisation driven by interaction
data. Data study
This research will use machine learning to develop user-focused data-driven visualisations.
The proposed method will simplify visualisation design, remove manual iterations, and
improve complex information visualisation. Combining machine learning and data visualisation will
benefit both fields and enable multidisciplinary collaboration. Integrated data analysis,
visualisation design, and user experience could assist scientific research, business
intelligence, and decision support system visualisation (Endert et al., 2017). Customised
visualisations tailored to user goals and data characteristics can clarify complex
1. Literature Review
Visualising data involves computer science, statistics, graphic design, and cognitive
psychology. Charts, graphs, and maps simplify complex data (Munzner, 2014). Data
visualisations use the human visual system's ability to collect and evaluate visual information
Researchers present data using various visualisation methods and criteria. Bertin (1983)
introduced graphic semiology, a method for building visual representations from data attributes.
Tufte (1983) advised raising the data-ink ratio and minimising chartjunk to simplify data presentation.
Data visualisation has recently changed in response to growing data source complexity and
diversity (Minch, 2023). Visual analytics, interactive visualisations, and dimensionality reduction
are used to examine huge datasets (Tenenbaum et al., 2000). Hasugian et al. (2023) examine
dimensionality reduction approaches such as PCA, LDA, t-SNE, and UMAP for visualising high-dimensional and
User research and data analysis drive data-driven design (Hartson & Pyla, 2012). Design decisions
rest on empirical data and user insights rather than intuition or subjective choices. This method
is used in UX, product, and marketing design. Visualisations communicate data and address user
goals (Wongsuphasawat et al., 2016). Analysing data structure, distribution, and relationships
helps designers choose suitable graphics (Munzner, 2014). Design and user research may
improve visualisations. User interviews, usability testing, and think-aloud sessions improve designer
Machine learning automates data visualisation design and analytics (Wongsuphasawat
et al., 2017). Deep learning, clustering, and dimensionality reduction shape visualisation.
Recommender systems suggest displays based on data features (Wongsuphasawat et al., 2017; Moritz et al., 2019). Pattern and
Machine learning aids pattern detection, anomaly detection, and decision support in visual analytics (Endert et
al., 2017). Labelling visual patterns speeds up complex dataset exploration and analysis in
User experience determines data visualisation success. Feedback and interaction data connect
Through eye-tracking, think-aloud protocols, and interaction logs, studies examine user behaviour and
preferences. These insights inform visualisation design. Interactive visualisations enable
real-time data manipulation (Yi et al., 2007). Filtering, zooming, and brushing modify
Siirtola and Mäkinen (2005) apply nature-inspired algorithms to visualisation design. These
algorithms iteratively create and evaluate solutions to pick the fittest for reproduction and
Siirtola and Mäkinen (2005) argue that evolutionary algorithms can optimise visual representations
for many design goals and constraints. An evolutionary algorithm can improve the information a
visualisation conveys while accounting for visual clutter, design constraints, and user preferences. They
Evolutionary methods can automate and optimise complex design tasks (Parmee, 2012). Data
visualisations can be customised using evolutionary algorithms, machine learning, and input
2. Research Methodology
Before processing input data, the framework will check for missing values, outliers, and
manipulations. The researcher will use imputation, outlier detection, and normalisation to
The data will next be analysed for statistical attributes (e.g., mean, variance, skewness), data
types (categorical, numerical), and domain-specific characteristics. These traits will reveal
The extracted features will be used to build initial visualisations using D3.js (Bostock et al.,
2011) and Vega-Lite (Satyanarayan et al., 2017). These visualisations will seed the optimisation.
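The feature extraction and initial visualisation steps described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's implementation: the function names describe_column and initial_spec, the rule that maps data types to chart marks, and the sample columns are all hypothetical. Only the Vega-Lite specification keys (mark, encoding, field, type, aggregate) follow the published Vega-Lite grammar.

```python
import json
import math
from statistics import mean, pvariance

VEGA_LITE_SCHEMA = "https://vega.github.io/schema/vega-lite/v5.json"


def describe_column(values):
    """Return the data type of a column and, for numeric data, basic statistics."""
    present = [v for v in values if v is not None]  # ignore missing values
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if not numeric:
        return {"kind": "categorical", "distinct": len(set(present))}
    mu = mean(present)
    var = pvariance(present, mu)
    sd = math.sqrt(var)
    # Population skewness: mean of the cubed standardised values (0 if constant).
    skew = 0.0 if sd == 0 else mean([((v - mu) / sd) ** 3 for v in present])
    return {"kind": "numerical", "mean": mu, "variance": var, "skewness": skew}


def initial_spec(features, x, y):
    """Build a starting Vega-Lite spec from column features (assumed heuristic)."""
    kinds = (features[x]["kind"], features[y]["kind"])
    if kinds == ("numerical", "numerical"):
        mark = "point"  # scatter plot for two numeric fields
        encoding = {
            "x": {"field": x, "type": "quantitative"},
            "y": {"field": y, "type": "quantitative"},
        }
    elif kinds == ("categorical", "numerical"):
        mark = "bar"  # mean of the numeric field per category
        encoding = {
            "x": {"field": x, "type": "nominal"},
            "y": {"field": y, "type": "quantitative", "aggregate": "mean"},
        }
    else:
        mark = "bar"  # crude fallback: count rows per value of x
        encoding = {
            "x": {"field": x, "type": "nominal"},
            "y": {"aggregate": "count", "type": "quantitative"},
        }
    return {"$schema": VEGA_LITE_SCHEMA, "mark": mark, "encoding": encoding}


if __name__ == "__main__":
    # Placeholder columns for illustration only.
    columns = {"group": ["a", "b", "a"], "weight": [3.0, 4.5, None]}
    features = {name: describe_column(vals) for name, vals in columns.items()}
    print(json.dumps(initial_spec(features, "group", "weight"), indent=2))
```

The resulting dictionary can be serialised to JSON and rendered by any Vega-Lite runtime. In the proposed framework, such a specification would only be a starting candidate for the optimisation stage.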
According to objective functions and constraints, an evolutionary algorithm will evaluate and
The user experience refers to the usability, interactivity, and overall experience of
a visualisation, informed by user input and interaction data (Lam et al., 2012).
User research preferences, visualisation best practices (Tufte, 1983; Bertin, 1983),
To find novel solutions, the evolutionary algorithm will use mutation and crossover (Eiben &
Smith, 2015). The fittest candidates will be picked for the following generation based on
The evolutionary technique optimises visualisations using user feedback and interaction data.
Eye-tracking, think-aloud protocols, and interaction logs will record user preferences.
Eye-tracking data can indicate user confusion or visual interest (Goldberg & Helfman, 2010).
Think-aloud protocols let users give qualitative feedback and articulate their reasoning while
using the visualisations. Interaction logs that record how users filter, zoom, and brush
visualisations can also be helpful.
User data will change the evolutionary algorithm's objective functions and constraints to meet
user preferences. If users have trouble with visual encoding or interactivity, the system may
prioritise alternatives.
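The evolutionary loop described above, with user feedback reshaping the objective, could look roughly like the sketch below. It is a toy example under stated assumptions: the design space, the penalty dictionary standing in for aggregated user feedback, the truncation selection scheme, and all function names are hypothetical and are not taken from the proposal or from any library.

```python
import random

# Hypothetical design space: each gene is one visual design decision.
DESIGN_SPACE = {
    "mark": ["bar", "line", "point", "area"],
    "color_scheme": ["category10", "viridis", "greys"],
    "show_grid": [True, False],
    "interaction": ["none", "zoom", "brush"],
}


def random_design(rng):
    """Sample one candidate visualisation design."""
    return {gene: rng.choice(options) for gene, options in DESIGN_SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {gene: rng.choice([a[gene], b[gene]]) for gene in DESIGN_SPACE}


def mutate(design, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        gene: rng.choice(DESIGN_SPACE[gene]) if rng.random() < rate else value
        for gene, value in design.items()
    }


def fitness(design, penalties):
    """Toy objective: subtract a penalty for every choice users struggled with."""
    return -sum(penalties.get((gene, value), 0.0) for gene, value in design.items())


def evolve(penalties, generations=30, pop_size=20, seed=0):
    """Evolve designs; the fittest half survives and breeds the next generation."""
    rng = random.Random(seed)
    population = [random_design(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda d: fitness(d, penalties), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        population = parents + children
    return max(population, key=lambda d: fitness(d, penalties))


if __name__ == "__main__":
    # Placeholder feedback: users had trouble with area marks and brushing.
    feedback_penalties = {("mark", "area"): 1.0, ("interaction", "brush"): 0.5}
    print(evolve(feedback_penalties))
```

Updating the penalty dictionary between generations is one simple way to let feedback change the objective function. A real system would need a richer fitness measure that also covers data communication quality and design constraints.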
The evolving visualisations will be analysed and validated using various methods:
Quantitative evaluation: Use controlled trials and user studies to assess visualisation's
data delivery and analytical capabilities. Task completion, accuracy, and subjective
Interviews, focus groups, and surveys with experts and end-users for qualitative
and traditional design models. This will show if the proposed technique outperforms
previous ones.
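A comparison of this kind could be summarised with a simple effect-size calculation, as in the sketch below. It assumes task completion times have been collected for a baseline design and an evolved design; the numbers shown are placeholders for illustration, not study results, and the function names are hypothetical. Cohen's d with a pooled standard deviation is one common choice among several.

```python
import math
from statistics import mean, stdev


def cohens_d(baseline, evolved):
    """Effect size: difference in means divided by the pooled standard deviation."""
    n_b, n_e = len(baseline), len(evolved)
    pooled = math.sqrt(
        ((n_b - 1) * stdev(baseline) ** 2 + (n_e - 1) * stdev(evolved) ** 2)
        / (n_b + n_e - 2)
    )
    return (mean(evolved) - mean(baseline)) / pooled


def compare_conditions(baseline, evolved):
    """Summarise one metric (e.g. task completion time) for two designs."""
    return {
        "mean_baseline": mean(baseline),
        "mean_evolved": mean(evolved),
        "difference": mean(evolved) - mean(baseline),
        "cohens_d": cohens_d(baseline, evolved),
    }


if __name__ == "__main__":
    # Placeholder completion times in seconds, for illustration only.
    baseline_times = [10, 12, 14]
    evolved_times = [7, 9, 11]
    print(compare_conditions(baseline_times, evolved_times))
```

For completion times, a negative difference favours the evolved design; for accuracy, a positive one does. A full study would add a significance test and report sample sizes alongside the effect size.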
The data-driven visualisation design approach will improve with iteration and validation.
Iterative and gradual research will strengthen the framework. Steps in iterations include:
feedback.
Test the component with controlled trials, user research, and experts. Testing the
qualitative performance and usability statistics may be necessary (Han et al., 2022).
Improve the component using prior data. Changes may include algorithms, objective
the framework.
This iterative process will improve the framework utilising the newest research and ideas. It
will also allow progressive component creation and testing, reducing system development
The research is interdisciplinary, so collaboration with other experts is crucial. Consulting
data visualisation experts and practitioners to guarantee best practices and domain-specific
researchers, designers, and domain professionals may be consulted (Munzner, 2014; Ware,
2019).
Working with machine learning experts to use cutting-edge techniques and methods.
needed (Eiben & Smith, 2015; LeCun et al., 2015; Ricci, 2011).
Applying perception, cognition, and user-centred design principles with user experience specialists and
cognitive psychologists. This will guarantee that data visualisations are intuitive and
Testing updated visualisations in real-world settings with domain experts and end-
healthcare, finance, and environmental science, therefore this may include working
Ethics will guide the research. This involves:
• Ensuring user data privacy and confidentiality in accordance with data protection rules and
institutional norms during research. Anonymous or aggregated user data, safe data storage and
transmission, and informed consent from
Follow ethical criteria for human subject research, as established by IRBs or research
Addressing potential biases and limits of created methodologies candidly. Bias audits,
or high-stakes areas. Develop standards or best practices for deploying and applying
with ethics and responsible innovation specialists may help (Schiff et al., 2020).
This study can help build trustworthy and responsible data visualisation systems that
Improved data visualisations lead to faster task completion, accuracy, and subjective
al., 2005).
Partner and stakeholder input will shape evaluation metrics. A complete framework review
https://sadbhavnapublications.org/research-enrichment-material/2-Statistical-Books/Outlier-Analysis.pdf.
Amar, R., Eagan, J., & Stasko, J. (2005). Low-level components of analytic activity in
Madison: U of Wisconsin P.
Bostock, M., Ogievetsky, V., & Heer, J. (2011). D³ data-driven documents. IEEE
Brooke, J. (1996). SUS: A ‘Quick and Dirty’ Usability Scale. Usability Evaluation In
Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum Likelihood from Incomplete
Data Via the EM Algorithm. Journal of the Royal Statistical Society: Series B
6161.1977.tb01600.x.
Dignum, V. (2018). Ethics in artificial intelligence: introduction to the special issue. Ethics
doi:https://doi.org/10.1007/s10676-018-9450-z.
Eiben, A.E. and Smith, J.E. (2015). Introduction to Evolutionary Computing. [online]
doi:https://doi.org/10.1007/978-3-662-44874-8.
doi:https://doi.org/10.5167/uzh-168812.
Endert, A., Ribarsky, W., Turkay, C., Wong, B.L.W., Nabney, I., Blanco, I.D. and Rossi, F.
(2017). The State of the Art in Integrating Machine Learning into Visual Analytics.
doi:https://doi.org/10.1111/cgf.13092.
Goldberg, J.H. and Helfman, J.I. (2010). Scanpath clustering and aggregation.
doi:https://doi.org/10.1145/1743666.1743721.
Han, J., Pei, J., & Tong, H. (2022). Data mining: concepts and techniques. Morgan
Kaufmann.
Hart, S.G. and Staveland, L.E. (1988). Development of NASA-TLX (Task Load Index):
https://www.sciencedirect.com/science/article/pii/S0166411508623869.
Hartson, R. and Pyla, P.S. (2012). The UX Book: Process and Guidelines for Ensuring a
hl=en&lr=&id=w4I3Y64SWLoC&oi=fnd&pg=PP1&dq=Hartson.
Hasugian, P.M., Mawengkang, H., Sihombing, P. and Efendi, S. (2023). Review of High-
doi:https://doi.org/10.1109/icosnikom60230.2023.10364377.
Kehrer, J., & Hauser, H. (2013). Visualization and visual analysis of multifaceted scientific
495-513.
Keim, D.A., Mansmann, F., Schneidewind, J., Thomas, J. and Ziegler, H. (2008). Visual
doi:https://doi.org/10.1007/978-3-540-71080-6_6.
Lam, H., Bertini, E., Isenberg, P., Plaisant, C., & Carpendale, S. (2012). Empirical studies in
LeCun, Y., Bengio, Y. and Hinton, G. (2015). Deep Learning. Nature, 521(7553), pp.436–
444. doi:https://doi.org/10.1038/nature14539.
Liang, R., Huang, C., Zhang, C., Li, B., Saydam, S., & Canbulat, I. (2023). The fusion of data
visualisation and data analytics in the process of mining digitalisation. IEEE Access.
Luo, Y., Qin, X., Chai, C., Tang, N., Li, G., & Li, W. (2020). Steerable self-driving data
490.
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. and Galstyan, A. (2021). A Survey on
Bias and Fairness in Machine Learning. ACM Computing Surveys, [online] 54(6),
pp.1–35. doi:https://doi.org/10.1145/3457607.
Minch, B. (2023). In search of the most efficient and memory-saving visualization of high
doi:https://doi.org/10.48550/arxiv.2303.05455.
Moritz, D., Wang, C., Nelson, G.L., Lin, H., Smith, A.M., Howe, B. and Heer, J. (2019).
Parmee, I.C. (2012). Evolutionary and Adaptive Computing in Engineering Design. Springer
doi:https://doi.org/10.5220/0009181903090316.
doi:https://doi.org/10.1057/palgrave.ivs.9500086.
Tenenbaum, J.B., de Silva, V. and Langford, J.C. (2000). A Global Geometric Framework for Nonlinear Dimensionality
doi:https://doi.org/10.1126/science.290.5500.2319.
Wang, J., Hazarika, S., Li, C. and Shen, H.-W. (2019). Visualization and Visual Analysis of
Wongsuphasawat, K., Moritz, D., Anand, A., Mackinlay, J., Howe, B., & Heer, J. (2015).
649-658.
Wongsuphasawat, K., Qu, Z., Moritz, D., Chang, R., Ouk, F., Anand, A., Mackinlay, J.,
Howe, B. and Heer, J. (2017). Voyager 2: Augmenting visual analysis with partial
Wu, Y., Wan, Y., Zhang, H., Sui, Y., Wei, W., Zhao, W., Xu, G. and Jin, H. (2024).
Automated Data Visualization from Natural Language via Large Language Models: