Week 02.1 Chaptr002
— Chapter 2 —
Arslan Anjum
Quartile Deviation
A measure similar to the range is the interquartile range (Q). It is the
difference between the third quartile (Q3) and the first quartile (Q1). Thus
Q = Q3 - Q1
Where ‘n’ is the number of observations.
Solution:
After arranging the observations in ascending order, we
get
1040, 1080, 1120, 1200, 1240, 1320, 1342, 1360, 1440,
1470, 1600, 1680, 1720, 1730, 1750, 1755, 1785, 1880,
1885, 1960.
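The calculation for the sorted observations above can be sketched in Python. Two points are assumptions rather than something the slide states: the k-th quartile is taken at position k(n+1)/4 with linear interpolation (the "exclusive" method of Python's statistics module), and the quartile deviation is taken as half the interquartile range.

```python
from statistics import quantiles

# Sorted observations from the solution above (n = 20).
data = [1040, 1080, 1120, 1200, 1240, 1320, 1342, 1360, 1440, 1470,
        1600, 1680, 1720, 1730, 1750, 1755, 1785, 1880, 1885, 1960]

# Assumption: quartiles sit at positions k*(n+1)/4 and are interpolated
# linearly between neighbouring observations ("exclusive" method).
q1, median, q3 = quantiles(data, n=4, method="exclusive")

iqr = q3 - q1                  # interquartile range, Q = Q3 - Q1
quartile_deviation = iqr / 2   # semi-interquartile range

print("Q1 =", q1, " Q3 =", q3)
print("Q =", iqr, " quartile deviation =", quartile_deviation)
```

Under these assumptions Q1 = 1260 and Q3 = 1753.75, so Q = 493.75 and the quartile deviation is 246.875. A different quartile convention would shift these values slightly.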
Boxplot Analysis
Five-number summary of a distribution
Minimum, Q1, Median, Q3, Maximum
Boxplot
Data is represented with a box
The ends of the box are at the first and third
quartiles, i.e., the height of the box is IQR
The median is marked by a line within the
box
Whiskers: two lines outside the box extended
to Minimum and Maximum
Outliers: points beyond a specified outlier
threshold, plotted individually
Boxplot in Matlab
>> d = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110];  % sample data; 110 lies far from the rest
>> boxplot(d);  % requires the Statistics and Machine Learning Toolbox
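To see which numbers a boxplot is built from, here is a small Python sketch that computes the five-number summary for the same data and flags outliers. The threshold of 1.5 × IQR beyond the box is an assumed (though common) choice, and the function name boxplot_stats is hypothetical. Quartile conventions differ between tools, so the box edges may not match MATLAB's plot exactly.

```python
from statistics import quantiles

def boxplot_stats(values, k=1.5):
    """Five-number summary plus points lying more than k * IQR outside the box."""
    xs = sorted(values)
    # Assumption: "exclusive" quartiles, i.e. positions k*(n+1)/4.
    q1, median, q3 = quantiles(xs, n=4, method="exclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "min": xs[0], "q1": q1, "median": median, "q3": q3, "max": xs[-1],
        "iqr": iqr, "outliers": outliers,
    }

d = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
stats = boxplot_stats(d)
for name, value in stats.items():
    print(name, value)
```

With this convention only the value 110 falls beyond the upper threshold, so it would be drawn as an individual point rather than covered by a whisker.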
Visualization of Data Dispersion: 3-D Boxplots
Quantile Plot
Displays all of the data (allowing the user to assess both
the overall behavior and unusual occurrences)
Plots quantile information
For data x_i sorted in increasing order, f_i indicates that
approximately 100 f_i% of the data are below or equal to the value x_i
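The points of a quantile plot can be sketched as follows. The slide does not give a formula for f_i; the code assumes the common choice f_i = (i - 0.5) / N for the i-th smallest of N values, and the function name quantile_points is hypothetical.

```python
def quantile_points(values):
    """Return (f_i, x_i) pairs for a quantile plot.

    Assumption: f_i = (i - 0.5) / N, where i is the 1-based rank of x_i.
    """
    xs = sorted(values)
    n = len(xs)
    return [((i - 0.5) / n, x) for i, x in enumerate(xs, start=1)]

d = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
points = quantile_points(d)
for f, x in points:
    print(f"{f:.3f}  {x}")
```

Plotting x_i against f_i shows every observation, so an unusually large value such as 110 appears as a sharp jump at the right end of the curve.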
Scatter plot
Provides a first look at bivariate data to see clusters of
points, outliers, etc.
Each pair of values is treated as a pair of coordinates and
plotted as points in the plane
Positively and Negatively Correlated Data
Uncorrelated Data
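What a scatter plot shows visually can also be summarized with a number. The sketch below uses the Pearson correlation coefficient, which the slides do not name, so treat it as one possible measure; the three small data sets are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical illustration data.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]    # tends to increase with x
falling = [6, 4, 5, 4, 2]   # tends to decrease with x
no_trend = [3, 5, 1, 5, 3]  # no linear trend in x

r_pos = pearson_r(x, rising)
r_neg = pearson_r(x, falling)
r_zero = pearson_r(x, no_trend)
print(round(r_pos, 3), round(r_neg, 3), round(r_zero, 3))
```

A value near +1 corresponds to points rising from lower left to upper right, a value near -1 to points falling from upper left to lower right, and a value near 0 to no linear pattern.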
Chapter 2: Getting to Know Your Data
Data Visualization
Summary
Data Visualization
Why data visualization?
Gain insight into an information space by mapping data onto graphical
primitives
Provide qualitative overview of large data sets
Search for patterns, trends, structure, irregularities, relationships among
data
Help find interesting regions and suitable parameters for further
quantitative analysis
Provide a visual proof of computer representations derived
Categorization of visualization methods:
Pixel-oriented visualization techniques
Geometric projection visualization techniques
Icon-based visualization techniques
Hierarchical visualization techniques
Visualizing complex data and relations
Pixel-Oriented Visualization Techniques
For a data set of m dimensions, create m windows on the screen, one for
each dimension
The m dimension values of a record are mapped to m pixels at the
corresponding positions in the windows
The colors of the pixels reflect the corresponding values
(a) Income (b) Credit limit (c) Transaction volume (d) Age
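The mapping from records to pixels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: values are rescaled to grey levels 0 to 255 per dimension, pixels are laid out row by row, records are sorted by the first attribute, and the customer records and the function name pixel_windows are hypothetical.

```python
def pixel_windows(records, width):
    """Map m-dimensional records to m windows of grey levels (0 to 255).

    Record k occupies the same (row, column) position in every window.
    """
    m = len(records[0])
    windows = []
    for dim in range(m):
        column = [rec[dim] for rec in records]
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1  # avoid division by zero for a constant column
        levels = [round(255 * (v - lo) / span) for v in column]
        # Lay the pixels out row by row, `width` pixels per row.
        rows = [levels[i:i + width] for i in range(0, len(levels), width)]
        windows.append(rows)
    return windows

# Hypothetical customers: (income, credit limit, transaction volume, age)
customers = [
    (50000, 4000, 9, 45),
    (20000, 1000, 5, 23),
    (80000, 9000, 30, 52),
    (35000, 2500, 12, 31),
]
customers.sort(key=lambda rec: rec[0])  # order pixels by income
windows = pixel_windows(customers, width=2)
for name, window in zip(["income", "credit limit", "volume", "age"], windows):
    print(name, window)
```

Because every window uses the same ordering, comparing the brightness at one position across the windows shows how the attributes of a single record relate to each other.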
Geometric Projection Visualization Techniques
Landscapes
Used by permission of B. Wright, Visible Decisions Inc.
News articles visualized as a landscape
• • •
gender, education, etc.
A 5-piece stick figure (1 body and 4 limbs with different angles/lengths)
Two attributes are mapped to the axes; the remaining attributes are mapped to the angle or length of the limbs
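The stick-figure mapping can be sketched as a small Python function. Everything specific here is an assumption made for illustration: the six attributes, their value ranges, the choice to encode the remaining attributes as limb angles between -90 and 90 degrees, and the names scale and stick_figure.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Linearly rescale value from [lo, hi] to [out_lo, out_hi]."""
    return out_lo + (value - lo) * (out_hi - out_lo) / (hi - lo)

def stick_figure(record, ranges):
    """Turn a 6-attribute record into one icon: a position and four limb angles."""
    x_attr, y_attr, *limb_attrs = record
    (x_lo, x_hi), (y_lo, y_hi), *limb_ranges = ranges
    return {
        # The first two attributes place the figure on the axes.
        "x": scale(x_attr, x_lo, x_hi, 0.0, 1.0),
        "y": scale(y_attr, y_lo, y_hi, 0.0, 1.0),
        # The remaining four attributes set the limb angles in degrees.
        "limb_angles": [
            scale(v, lo, hi, -90.0, 90.0)
            for v, (lo, hi) in zip(limb_attrs, limb_ranges)
        ],
    }

# Hypothetical record: (age, income, education years, household size,
#                       weekly work hours, commute minutes)
ranges = [(0, 100), (0, 100000), (0, 20), (1, 9), (0, 80), (0, 120)]
icon = stick_figure((25, 40000, 10, 5, 40, 30), ranges)
print(icon)
```

Drawing one such icon per record produces a texture in which regions of similar records look alike.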
Hierarchical Visualization Techniques
Worlds-within-Worlds
Fix all other parameters at constant values and draw other worlds (1-, 2-, or
3-dimensional) that take these parameters as their axes
Software that uses this paradigm
N-vision: Dynamic interaction through data, including rotation, scaling
(inner) and translation (inner/outer)
Auto Visual
Dimensional Stacking
Figure labels: attribute 1, attribute 2, attribute 3, attribute 4
Tree-Map
Screen-filling method which uses a hierarchical partitioning of
the screen into regions depending on the attribute values
The x- and y-dimension of the screen are partitioned alternately
according to the attribute values (classes)
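The alternating partitioning can be sketched with a recursive "slice-and-dice" layout in Python. The assumptions are that each node carries a size, that a child receives a share of its parent's rectangle proportional to that size, and that the split direction flips at every level; the example hierarchy and the function name treemap are hypothetical.

```python
def treemap(node, x, y, w, h, split_x=True, out=None):
    """Slice-and-dice tree-map layout.

    node is (name, size, children). Each child gets a strip of the parent's
    rectangle proportional to its size; the split direction alternates
    between x and y from one level to the next.
    """
    if out is None:
        out = []
    name, size, children = node
    out.append((name, x, y, w, h))
    offset = 0.0  # fraction of the parent already handed out
    for child in children:
        share = child[1] / size
        if split_x:   # partition along the x-dimension
            treemap(child, x + offset * w, y, share * w, h, False, out)
        else:         # partition along the y-dimension
            treemap(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out

# Hypothetical hierarchy; a parent's size is the sum of its children's sizes.
tree = ("root", 10, [
    ("A", 6, [("A1", 3, []), ("A2", 3, [])]),
    ("B", 4, []),
])
rects = treemap(tree, 0, 0, 100, 100)
for name, x, y, w, h in rects:
    print(f"{name}: x={x:.0f} y={y:.0f} w={w:.0f} h={h:.0f}")
```

Here the root is split along x into A and B, and A is then split along y into A1 and A2, so the leaf rectangles fill the whole screen area.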
https://support.office.com/
InfoCube
A 3-D visualization technique where hierarchical
information is displayed as nested semi-transparent
cubes
The outermost cubes correspond to the top-level
data, while the subnodes or the lower-level data
are represented as smaller cubes inside the
outermost cubes, and so on