What is data visualization?
Data visualization is the graphical display of abstract information for two
purposes: sense-making (also called data analysis) and communication.
Statistical information is abstract (it describes things that are not physical).
Visualization translates the abstract into physical attributes of vision (length,
position, size, shape, and color, to name a few), and this translation can only
succeed if we understand a bit about visual perception and cognition.
To visualize data effectively, we must follow design principles that are derived
from an understanding of human perception.
What is visual perception?
What is perception?
Perception is the sensory experience of the world. It involves both recognizing
environmental stimuli and actions in response to these stimuli. Through this process,
we gain information about the environment.
Five senses: vision, touch, hearing, smell, and taste.
All perception involves signals that travel through the nervous system; these signals
result from physical or chemical stimulation of the sensory system.
Vision involves light striking the retina of the eye, smell is mediated by odor
molecules, and hearing involves pressure waves.
Visual perception is the ability to perceive our surroundings through the light
that enters our eyes. The visual perception of colors, patterns, and structures
has been of particular interest in relation to graphical user interfaces (GUIs)
because these are perceived exclusively through vision.
Physiologically, visual perception happens when the eye focuses light on the retina. Within
the retina, there is a layer of photoreceptor (light-receiving) cells that convert light into
a series of electrochemical signals to be transmitted to the brain. The brain can identify
an image seen for as little as 13 milliseconds, according to a 2014 study at MIT in the
United States.
Different attributes of visual perception are widely used in GUI design. Many designers
apply Gestalt principles (i.e., how humans structure visual stimuli) to the design of GUIs so
as to create interfaces that are easy for users to perceive and understand.
There are some things that you can do that might help you perceive more in the
world around you—or at least focus on the things that are important.
● Pay attention. Perception requires you to attend to the world around you.
● Make meaning of what you perceive. The recognition stage is an essential part of
perception since it allows you to make sense of the world around you. By
placing objects in meaningful categories, you are able to understand and react
appropriately.
● Take action. The final step of the perceptual process involves some sort of action
in response to the environmental stimulus. This could involve a variety of
actions, such as turning your head for a closer look or turning away to look at
something else.
Cognition refers to "the mental action or process of acquiring knowledge and
understanding through thought, experience, and the senses."
Cognition refers to a range of mental processes relating to the
acquisition, storage, manipulation, and retrieval of information.
The ability to reason logically is an excellent example of cognition, as are
problem solving and making judgments about information.
Vision begins in the eye, which receives the inputs in the
form of light, and finishes in the brain, which interprets
those inputs and gives us the information we need from
the data we receive.
What is the story contained in these numbers (trends, patterns…)?
Sales Data
Same information in the form of a graph
Power of data visualization: verbal processing vs. visual communication
● Domestic sales are considerably and consistently higher than international sales.
● Domestic sales trend upward over the year as a whole.
● International sales, in contrast, remain relatively flat, with one exception: they decreased sharply
in August.
● Domestic sales exhibit a cyclical pattern (up, up, down) that repeats on a quarterly
basis, always reaching a peak in the last month of the quarter and then declining dramatically in
the first month of the next.
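As a rough illustration of why such patterns are easier to see than to read, the sketch below encodes monthly values as bar lengths in plain text. The numbers are hypothetical, chosen only to mimic the patterns listed above; they are not the figures from the slide's sales table, and the helper name `bar` is an assumption made for this example.

```python
# Hypothetical monthly sales in arbitrary units (NOT the slide's data).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
domestic = [10, 12, 14, 11, 13, 15, 12, 14, 16, 13, 15, 17]
international = [5, 5, 5, 5, 5, 5, 5, 2, 5, 5, 5, 5]


def bar(value, scale=1):
    """Encode a number as a bar whose length is proportional to the value."""
    return "#" * round(value * scale)


# Print both series side by side; length is the visual attribute doing the work.
for month, dom, intl in zip(MONTHS, domestic, international):
    print(f"{month}  domestic {bar(dom):<18} international {bar(intl)}")
```

Even in this crude form, the quarterly up, up, down rhythm and the August dip stand out without reading a single number.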
Data visualization is effective because it shifts the balance between
perception and cognition to take fuller advantage of the brain's
abilities.
Seeing (i.e., visual perception), which is handled by the visual cortex
located in the rear of the brain, is extremely fast and efficient. We see
immediately, with little effort.
Thinking (i.e., cognition), which is handled primarily by the cerebral
cortex in the front of the brain, is much slower and less efficient.
Data visualization shifts the balance toward greater use of visual
perception, taking advantage of our powerful eyes whenever
possible.
Data: a definition
Data is a set of variables that capture various aspects of
the world: Student ID, Male vs. Female, UG vs. PG, CPI, etc.
A dataset also contains a set of observations (also called
records) over these variables. For example:
ID = 20224856
CPI = 9.3
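The distinction between variables and observations can be sketched in code. In this minimal example the variables are the fields of a record type and each observation is one instance of it. The class and field names are assumptions made for illustration, and only the ID and CPI of the first record come from the slide; every other value is made up.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    """One observation (record); each field is one variable."""
    student_id: int  # identifier
    gender: str      # categorical: "Male" or "Female"
    level: str       # categorical: "UG" or "PG"
    cpi: float       # quantitative


# The dataset is a collection of observations over the same variables.
dataset = [
    StudentRecord(20224856, "Female", "UG", 9.3),  # ID and CPI from the slide; rest hypothetical
    StudentRecord(20224857, "Male", "PG", 8.1),    # entirely hypothetical
]

variables = [f.name for f in fields(StudentRecord)]
print("variables:", variables)
print("observations:", len(dataset))
```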
Each variable may be either independent or dependent:
• An independent variable is not controlled or affected by another
variable (e.g., time in a time-series dataset)
• A dependent variable is affected by a variation in one or more
associated independent variables (e.g., temperature in a region)
Multivariate data-visualization-based investigation of projectiles in sports
Inspired by the basic Gestalt principle that the whole is
greater than the sum of its parts
E Tufte and S Few and T Munzner
Defining visualization (vis)
Computer-based visualization systems provide visual representations of datasets
designed to help people carry out tasks more effectively.
Why have a human in the loop?
Visualization is suitable when there is a need to augment human capabilities
rather than replace people with computational decision-making methods.
● don’t need vis when fully automatic solution exists and is trusted
● many analysis problems ill-specified
○ don’t know exactly what questions to ask in advance
● possibilities
○ long-term use for end users (ex: exploratory analysis of scientific data)
○ presentation of known results (ex: New York Times Upshot)
○ stepping stone to assess requirements before developing models
○ help automatic solution developers refine & determine parameters
○ help end users of automatic solutions verify, build trust
Why use an external representation?
● external representation: replace cognition with perception
[Cerebral: Visualizing Multiple Experimental Conditions
on a Graph with Biological Context. Barsky, Munzner,
Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis)
14(6):1253-1260, 2008.]
Why depend on vision?
● human visual system is high-bandwidth channel to brain
○ overview possible due to background processing
■ subjective experience of seeing everything simultaneously
■ significant processing occurs in parallel and pre-attentively
● sound: lower bandwidth and different semantics
○ overview not supported
■ subjective experience of sequential stream
● touch/haptics: impoverished record/replay capacity
○ only very low-bandwidth communication thus far
● taste, smell: no viable record/replay devices
Why represent all the data?
● summaries lose information, details matter
○ confirm expected and find unexpected patterns
○ assess validity of statistical model
Anscombe’s Quartet
Identical statistics
x mean 9
x variance 10
y mean 7.5
y variance 3.75
x/y correlation 0.816
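The identical statistics listed above can be checked directly. The sketch below uses the commonly published values of Anscombe's quartet and only the Python standard library. It assumes the slide's variances are population variances (dividing by n), which is what reproduces 10 and 3.75; the sample variance (dividing by n - 1) would give about 11 and 4.13 instead. The helper names `correlation` and `summarize` are choices made for this example.

```python
from math import sqrt
from statistics import mean, pvariance

# Commonly published values of Anscombe's quartet (sets I-III share the same x).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(ssx * ssy)


def summarize(xs, ys):
    """Summary statistics of the kind listed on the slide (population variance)."""
    return {
        "x_mean": mean(xs),
        "x_var": pvariance(xs),
        "y_mean": mean(ys),
        "y_var": pvariance(ys),
        "corr": correlation(xs, ys),
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four summaries agree to about two decimal places, yet a scatterplot of each set looks completely different, which is the case for showing the data rather than only its summary.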
What resource limitations are we faced with?
Vis designers must take into account three very different kinds of resource limitations:
those of computers, of humans, and of displays.
● computational limits
○ computation time, system memory
● display limits
○ pixels are precious & most constrained resource
○ information density: ratio of space used to encode info vs unused whitespace
■ tradeoff between clutter and wasting space
■ find sweet spot between dense and sparse
● human limits
○ human time, human memory, human attention
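The information-density idea from the display limits above can be made concrete with a small sketch. It assumes a display modeled as a grid of cells, where 1 marks a cell used to encode information and 0 marks unused whitespace, and it takes density to be the used fraction of all cells, which is one possible reading of the ratio. The function name and both grids are hypothetical.

```python
def information_density(grid):
    """Fraction of display cells used to encode information (1 = used, 0 = whitespace)."""
    total = sum(len(row) for row in grid)
    used = sum(sum(row) for row in grid)
    return used / total if total else 0.0


# Two hypothetical 2x4 displays: one mostly empty, one nearly full.
sparse = [[0, 0, 0, 0],
          [0, 1, 0, 0]]
dense = [[1, 1, 1, 1],
         [1, 1, 0, 1]]

print("sparse:", information_density(sparse))
print("dense:", information_density(dense))
```

A very low value suggests wasted space and a very high one suggests clutter; where the sweet spot lies depends on the task and the data.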
Why analyze?
● imposes structure on huge design space
○ scaffold to help you think systematically about choices
○ analyzing existing as stepping stone to designing new
○ most possibilities ineffective for particular task/data combination