2 Vis Basics
Data Visualization
Review
• What is the purpose of visualization?
• Goal: data transfer
• Data → Insight (learning, knowledge extraction)
• Method: visual transfer
• Map: data → visual
• Visualization (communication bandwidth)
• Map⁻¹: visual → data insight
Visual Mappings
Visualization
PolarEyes
Visualization Pipeline
Task
User interaction
Data Table: Canonical data model
• Visualization requires structure, data model
• (All?) information can be modeled as data tables
Data Table
Attributes (aka: dimensions, variables, fields, columns, …)
Values
Data Types:
• Quantitative
• Ordinal
• Categorical
• Nominal
Items (aka: tuples, cases, records, data points, rows, …)
Attributes
• Dependent variables (measured)
• Independent variables (controlled)
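The data table model above can be sketched in a few lines of Python. This is only an illustration: the `Attribute` class, the film-style schema, and the item values are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One column of the data table (hypothetical helper class)."""
    name: str
    data_type: str  # "quantitative", "ordinal", or "categorical"
    role: str       # "independent" (controlled) or "dependent" (measured)


# Hypothetical schema: the attributes (columns) of the table.
ATTRIBUTES = [
    Attribute("year", "quantitative", "independent"),
    Attribute("subject", "categorical", "independent"),
    Attribute("length", "quantitative", "dependent"),
    Attribute("rating", "ordinal", "dependent"),
]

# Hypothetical items (rows): one value per attribute.
ITEMS = [
    {"year": 1994, "subject": "drama", "length": 142, "rating": "high"},
    {"year": 2001, "subject": "comedy", "length": 98, "rating": "medium"},
    {"year": 2010, "subject": "drama", "length": 121, "rating": "low"},
]


def column(items, name):
    """Return all values of one attribute, in item order."""
    return [item[name] for item in items]


def validate(items, attributes):
    """Check that every item has exactly one value for each attribute."""
    names = {a.name for a in attributes}
    return all(set(item) == names for item in items)
```

Keeping the data type and role next to each attribute name is one possible design; it lets later steps, such as choosing a visual mapping, look up how each column should be treated.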
Visual marks:
• Points
• Lines
• Areas
• Volumes
• Glyphs
Visual Mapping: Step 2
1. Map: data items → visual marks
2. Map: data attributes → visual properties of marks
• Year → x
• Length → y
• Popularity → size
• Subject → color
• Award? → shape
Visual Mapping Definition Language
• Films → dots
• Year → x
• Length → y
• Popularity → size
• Subject → color
• Award? → shape
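As a rough sketch, the mapping above can be written as a small declarative spec plus a function that turns items into marks. The film values, the `linear_scale` helper, the palette, the pixel ranges, and the star/circle shapes are assumptions made for illustration; the slides only name the attribute-to-property pairs.

```python
# Hypothetical film items; values are invented for illustration.
FILMS = [
    {"year": 1970, "length": 95, "popularity": 3, "subject": "drama", "award": False},
    {"year": 1985, "length": 120, "popularity": 8, "subject": "comedy", "award": True},
    {"year": 2000, "length": 150, "popularity": 5, "subject": "drama", "award": False},
]

# The mapping from the slide: items become dots, attributes become properties.
MAPPING = {
    "mark": "dot",
    "x": "year",
    "y": "length",
    "size": "popularity",
    "color": "subject",
    "shape": "award",
}

PALETTE = ["blue", "orange", "green", "red"]  # assumed color set


def linear_scale(value, lo, hi, out_lo, out_hi):
    """Map value from [lo, hi] to [out_lo, out_hi] linearly."""
    if hi == lo:
        return (out_lo + out_hi) / 2
    return out_lo + (value - lo) / (hi - lo) * (out_hi - out_lo)


def to_marks(items, mapping, width=400, height=300):
    """Apply the mapping: one visual mark per data item."""
    def extent(attr):
        values = [item[attr] for item in items]
        return min(values), max(values)

    x_lo, x_hi = extent(mapping["x"])
    y_lo, y_hi = extent(mapping["y"])
    s_lo, s_hi = extent(mapping["size"])
    categories = sorted({item[mapping["color"]] for item in items})

    marks = []
    for item in items:
        marks.append({
            "mark": mapping["mark"],
            "x": linear_scale(item[mapping["x"]], x_lo, x_hi, 0, width),
            # Screen y grows downward, so the output range is flipped.
            "y": linear_scale(item[mapping["y"]], y_lo, y_hi, height, 0),
            "size": linear_scale(item[mapping["size"]], s_lo, s_hi, 4, 20),
            "color": PALETTE[categories.index(item[mapping["color"]]) % len(PALETTE)],
            "shape": "star" if item[mapping["shape"]] else "circle",
        })
    return marks
```

Calling `to_marks(FILMS, MAPPING)` yields one dictionary per film that a drawing layer could render. Swapping two entries in `MAPPING` changes the design without touching the data.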
The Simple Stuff
• Univariate
• Bivariate
• Trivariate
Univariate
• Dot plot
• Bar chart (item vs. attribute)
• Tukey box plot
• Histogram
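Two of the univariate views above reduce to simple computations before anything is drawn. The sketch below uses only the Python standard library; the quartile method ("inclusive"), the 1.5 × IQR fence rule, and equal-width bins are common conventions assumed here, not choices stated in the slides.

```python
import statistics


def tukey_box(values):
    """Summary values needed to draw a Tukey box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers extend to the most extreme values inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences are drawn individually.
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


def histogram(values, bins):
    """Count values in equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # All-equal data has zero width; put everything in the first bin.
        # The maximum value is clamped into the last bin.
        index = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts
```

For example, `tukey_box([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])` reports 100 as an outlier, while a 3-bin histogram of the same data puts nine values in the first bin, which shows how a single extreme value can flatten a histogram.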
Bivariate
• Scatterplot
Trivariate
• 3D scatterplot, spin plot
• 2D plot + size (or color…)
Visualization Design
HCI Design Process
• # of items
• Value range (e.g. bits/value)
User Tasks
• Easy stuff: Forms can do this
• Reduce to only 1 data item or value
• Stats: Min, max, average, %
• Search: known item
• Hard stuff: Visualization can do this!
• Require seeing the whole
• Patterns: distributions, trends, frequencies, structures
• Outliers: exceptions
• Relationships: correlations, multi-way interactions
• Tradeoffs: combined min/max
• Comparisons: choices (1:1), context (1:M), sets (M:M)
• Clusters: groups, similarities
• Anomalies: data errors
• Paths: distances, ancestors, decompositions, …
Design the Visualization Pipeline
Task
User interaction
Design
• Methods:
• Optimize tasks on data, scenarios
• Apply principles
• Build on existing solutions
• Brainstorm
• Artifacts:
• Paper sketches
• Mockups (PowerPoint, Macromedia, …)
• Prototypes (VB, …)
• Implementation
HCI UI Evaluation Metrics
• User learnability:
• Learning time
• Retention time
• User performance (*** measure while users perform benchmark tasks):
• Performance time
• Success rates
• Error rates, recovery
• Clicks, actions
• User satisfaction:
• Surveys
• Effectiveness
• Cleveland’s rules
• Expressiveness
• Encodes all data
• Encodes only the data
Ranking Visual Properties
Quantitative data, most accurate first (Cleveland and McGill):
1. Position
2. Length
3. Angle, slope
4. Area, volume
5. Color
Categorical data (Mackinlay's hypothesis):
1. Position
2. Color, shape
3. Length
4. Angle, slope
5. Area, volume
Design guideline:
• Map more important data attributes to more accurate visual attributes (based on user task)
Example
• Hard drives for sale: price ($), capacity (MB), quality rating (1-5)
Pie vs. Bar
• Data: population of the 50 states
• Pie: state and population both overloaded on the circumference
• Bar: state on x, pop on y
Stacked Bar
• States: AK, AL, AR, CA, CO, …
Eliminate “Chart Junk” (Tufte)
• Attempt simplicity (e.g. am I using 3D just for coolness?)
Increase Data Density (Tufte)
• Calculate data/pixel
“A pixel is a terrible thing to waste.” (Shneiderman)
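The data/pixel calculation can be sketched as below. It assumes that "data" means the number of values shown (items × attributes) and that the denominator is the full pixel area of the display; both are interpretations, and the function name is hypothetical.

```python
def data_density(num_items, num_attributes, width_px, height_px):
    """Data values shown per pixel of display area."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("display dimensions must be positive")
    return (num_items * num_attributes) / (width_px * height_px)
```

For instance, a chart of 50 items with 2 attributes each on a 500 × 200 pixel canvas shows 100 values over 100,000 pixels, a density of 0.001 values per pixel.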
Interaction Approach
• Direct Manipulation (Shneiderman)
• Visual representation
• Rapid, incremental, reversible actions
• Pointing instead of typing
• Immediate, continuous feedback
Information Visualization Mantra
(Shneiderman)
Show me the data!
How (not) to Lie with Visualization
Information Types
• Multi-dimensional: databases,…
• 1D: timelines,…
• 2D: maps,…
• 3D: volumes,…
• Hierarchies/Trees: directories,…
• Networks/Graphs: web, communications,…
• Document collections: digital libraries,…