1152cs191 Data Visualization Unit III
12/10/2024
CO3
Level of learning domain (based on revised Bloom’s taxonomy): K2
Correlation of COs with Student Outcomes ABET EAC and CAC
Engineering Knowledge, Problem Analysis, Ethics, Individual & Team Work, Project Management & Finance, Mathematical Concepts, Software Development, Transferring Skills
CO3 3 2 2
CO3 2 2
• For each record, a graphical representation, mark, or other aesthetic entity is drawn
at its associated k-dimensional point.
• Point plots can be defined to display individual records or summary records, and
can be structured by various projection techniques.
A scatter plot matrix is a grid (or matrix) of scatter plots used to visualize
bivariate relationships between combinations of variables. Each scatter plot in
the matrix visualizes the relationship between a pair of variables, allowing many
relationships to be explored in one chart.
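The grid structure described above can be sketched in a few lines of Python. This is only an illustration: the function name scatter_plot_matrix, the variable names, and the sample values are made up for the example, and leaving the diagonal cells empty is an assumed convention.

```python
def scatter_plot_matrix(records, names):
    """Return {(row_variable, column_variable): [(x, y), ...]} for every panel.

    records: list of tuples, one value per variable.
    names:   variable names, in the same order as the values in each record.
    Diagonal cells (a variable against itself) are skipped; in many tools
    they hold the variable label instead of a plot (assumed convention).
    """
    panels = {}
    for r, row_name in enumerate(names):
        for c, col_name in enumerate(names):
            if r == c:
                continue
            # Column variable on the horizontal axis, row variable on the vertical axis.
            panels[(row_name, col_name)] = [(rec[c], rec[r]) for rec in records]
    return panels


# Illustrative values only (not taken from any real data set).
names = ["mpg", "hp", "weight"]
records = [(30, 70, 2000), (20, 110, 3000), (15, 150, 4000)]
panels = scatter_plot_matrix(records, names)
print(len(panels))             # number of off-diagonal panels
print(panels[("mpg", "hp")])   # hp on x, mpg on y
```

With n variables the grid has n x n cells, so n(n - 1) panels remain once the diagonal is left out; each pair of variables appears twice, once with the axes swapped.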
Minitab Procedure
Select Graph >> Matrix plot...
Under Matrix of plots, select the Simple plot.
In the box labeled Graph variables, specify the variables you want included in
your plot.
Select OK. A new graph window should appear containing the scatter plot
matrix.
https://online.stat.psu.edu/stat501/lesson/create-simple-matrix-scatter-plots
2. Assuming that you are projecting the data into K dimensions (e.g., for display
purposes, K is usually between 1 and 3), create an M by K matrix, L, to contain
the locations for the projected points. These M locations can initially be randomly
chosen, or techniques such as principal component analysis (PCA) can be used to
create reasonable initial positions.
3. Compute an M by M matrix, Ls, that contains the similarities between all pairs
of points in L.
Department of Computer Science & Engineering, Data Visualization, 12/10/2024
Multidimensional Scaling
4. Compute the value of stress, S, which is a measure of the differences between
Ds and Ls. Many such stress measures exist; most assume that the coordinate
systems have been normalized so that the maximum distance between points is
1.0.
7. Return to step 3.
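The loop formed by the steps listed here can be sketched as follows. The parts not shown in the list are assumptions made only so that the sketch runs: Ds is built from Euclidean distances in the original space and scaled so that its maximum is 1.0, stress is the sum of squared differences between Ds and Ls, the loop stops when stress no longer changes noticeably, and points are moved by a plain gradient step. The names mds, rate, and tol are hypothetical.

```python
import math
import random


def pairwise_distances(points):
    """M-by-M matrix of Euclidean distances between all pairs of points."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(ds, ls):
    """Sum of squared differences between Ds and Ls (one of many stress measures)."""
    m = len(ds)
    return sum((ds[i][j] - ls[i][j]) ** 2 for i in range(m) for j in range(i + 1, m))


def mds(data, k=2, max_iter=500, rate=0.02, tol=1e-9, seed=0):
    rng = random.Random(seed)
    m = len(data)
    # Assumption: Ds holds distances in the original space, scaled to a maximum of 1.0.
    ds = pairwise_distances(data)
    largest = max(max(row) for row in ds) or 1.0
    ds = [[d / largest for d in row] for row in ds]
    # Step 2: M-by-K matrix L of randomly chosen initial locations.
    L = [[rng.random() for _ in range(k)] for _ in range(m)]
    history = []
    for _ in range(max_iter):
        ls = pairwise_distances(L)        # step 3: distances between projected points
        history.append(stress(ds, ls))    # step 4: stress between Ds and Ls
        # Assumed stopping rule: stress has stopped changing.
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
        # Assumed update rule: move each point along the negative stress gradient.
        new_L = []
        for i in range(m):
            move = [0.0] * k
            for j in range(m):
                if i != j and ls[i][j] > 0:
                    pull = (ds[i][j] - ls[i][j]) / ls[i][j]
                    for a in range(k):
                        move[a] += pull * (L[i][a] - L[j][a])
            new_L.append([L[i][a] + 2 * rate * move[a] for a in range(k)])
        L = new_L                          # step 7: go back to step 3
    return L, history


# Illustrative 4-dimensional points projected to K = 2.
data = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (1, 1, 1, 1)]
layout, history = mds(data)
print(history[0], history[-1])  # stress before and after the iterations
```

Random starting positions keep the sketch short; as noted in step 2, PCA could supply better initial positions.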
For example, the dimension representing the number of cylinders can be broken
down into 5 new dimensions: having 1 or 2 cylinders, having 3 or 4 cylinders,
having 5 or 6, having 7, or having 8. The number of new dimensions can be
determined algorithmically or manually.
This is similar to identifying bins in data (such as the grouping of low, medium,
and high for prices of cars).
Thus, for each record, each new vector of dimensions has exactly one dimension
with the value 1, and all the others have value zero.
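A minimal sketch of this vectorization for the cylinders example, assuming the five bins named above. The names CYLINDER_BINS and vectorize are hypothetical.

```python
# Bins from the cylinders example: 1 or 2, 3 or 4, 5 or 6, 7, 8 (inclusive ranges).
CYLINDER_BINS = [(1, 2), (3, 4), (5, 6), (7, 7), (8, 8)]


def vectorize(value, bins=CYLINDER_BINS):
    """Replace one value by a 0/1 vector with a single 1 marking its bin."""
    vector = [1 if low <= value <= high else 0 for low, high in bins]
    if sum(vector) != 1:
        raise ValueError(f"value {value} does not fall into exactly one bin")
    return vector


print(vectorize(4))  # [0, 1, 0, 0, 0]
print(vectorize(8))  # [0, 0, 0, 0, 1]
```

Each record then contributes five new 0/1 dimensions in place of the single cylinders dimension.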
Vectorized RadViz
• Line Graphs
• A line graph is a univariate visualization technique where the vertical axis represents the range of values for the variable and the horizontal axis represents some ordering of the records in the data set.
Four versions of line graphs for a subset of the AAUP data set: superimposed, stacked, ordered superimposed,
and ordered stacked. Ordering is based on the first dimension, which represents salaries of full professors.
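The ordering used in the ordered versions can be sketched as below: records are sorted by one dimension, and each dimension then yields one line whose x-position is the record's rank in that ordering. The function name line_graph_series and the sample values are made up for the illustration; they are not AAUP data.

```python
def line_graph_series(records, order_by=None):
    """Return one list of y-values per dimension; the index in each list is the x-position.

    order_by: index of the dimension used to order the records,
              or None to keep the original record order.
    """
    if order_by is None:
        rows = list(records)
    else:
        rows = sorted(records, key=lambda rec: rec[order_by])
    n_dims = len(rows[0])
    return [[row[d] for row in rows] for d in range(n_dims)]


# Illustrative values only.
records = [(90, 60, 45), (70, 55, 40), (110, 75, 50)]
unordered = line_graph_series(records)
ordered = line_graph_series(records, order_by=0)
print(ordered[0])  # first dimension is now non-decreasing
print(ordered[1])  # second dimension, in the same record order
```

Drawing all series on one shared axis gives the superimposed version; how the stacked version arranges them is not specified here and is left out of the sketch.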
Parallel Coordinates
Parallel coordinates, also called ||-coords and PCP (for parallel coordinates
plot), were first introduced by Inselberg in 1985 as a mechanism for
studying high-dimensional geometry.
• Circular area graphs: like a line graph, but with the area under the
line filled in with a color or texture;
Bar Charts/Histograms
Examples of multivariate glyphs
Treemaps and their many variants are the most popular form of rectangular
space-filling layout.
Pseudocode for drawing a hierarchy using a treemap
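Since the pseudocode itself is not reproduced here, the following is a sketch of one common treemap variant, the slice-and-dice layout, and not necessarily the version the caption refers to. It assumes every leaf has a positive size, an internal node's size is the sum of its leaves, and the split direction alternates at each level. The dictionary keys and function names are hypothetical.

```python
def total_size(node):
    """Size of a leaf, or the sum of the sizes of all leaves below an internal node."""
    children = node.get("children", [])
    if not children:
        return node["size"]
    return sum(total_size(child) for child in children)


def treemap(node, x, y, width, height, horizontal=True, rects=None):
    """Slice-and-dice layout: returns (name, x, y, width, height) for every leaf.

    The rectangle of an internal node is divided among its children in
    proportion to their sizes; the split direction flips at each level.
    """
    if rects is None:
        rects = []
    children = node.get("children", [])
    if not children:
        rects.append((node["name"], x, y, width, height))
        return rects
    total = total_size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = total_size(child) / total
        if horizontal:
            treemap(child, x + offset * width, y, share * width, height, False, rects)
        else:
            treemap(child, x, y + offset * height, width, share * height, True, rects)
        offset += share
    return rects


# Illustrative hierarchy: leaf A (size 2) and internal node B with leaves C and D.
tree = {"name": "root", "children": [
    {"name": "A", "size": 2},
    {"name": "B", "children": [{"name": "C", "size": 1}, {"name": "D", "size": 1}]},
]}
for rect in treemap(tree, 0, 0, 8, 4):
    print(rect)
```

The leaf rectangles tile the whole drawing area, which is what makes the layout space-filling.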
The drawing of such trees is influenced the most by two factors: the
fan-out degree (i.e., the number of children a parent node can have)
and the depth (i.e., the distance from the root to the furthest node).
1. Slice the drawing area into equal-height slabs, based on the depth of
the tree.
2. For each level of the tree, determine how many nodes need to be drawn.
3. Divide each slice into equal-sized rectangles based on the number of nodes
at that level.
4. Draw each node in the center of its corresponding rectangle.
5. Draw a link from the center-bottom of each node to the center-top
of its child node(s).
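The five steps can be sketched as a function that computes node centers and link endpoints instead of drawing them. The representation of the tree as a dictionary of child lists, the node_height parameter, and the function name layout_tree are assumptions made for the example.

```python
def layout_tree(children, root, width, height, node_height=10):
    """Return (positions, links) for a basic top-down tree layout.

    children: dict mapping a node to the list of its child nodes.
    positions: node -> (x, y) center of the node.
    links: ((x1, y1), (x2, y2)) from a parent's center-bottom to a child's center-top.
    """
    # Step 2: collect the nodes at each level, from the root downwards.
    levels = [[root]]
    while True:
        next_level = [c for node in levels[-1] for c in children.get(node, [])]
        if not next_level:
            break
        levels.append(next_level)

    # Step 1: one equal-height slab per level of the tree.
    slab_height = height / len(levels)

    positions = {}
    for depth, nodes in enumerate(levels):
        # Step 3: equal-sized rectangles for the nodes at this level.
        cell_width = width / len(nodes)
        for i, node in enumerate(nodes):
            # Step 4: the node sits at the center of its rectangle.
            positions[node] = ((i + 0.5) * cell_width, (depth + 0.5) * slab_height)

    # Step 5: links from center-bottom of the parent to center-top of each child.
    links = []
    for parent, kids in children.items():
        px, py = positions[parent]
        for kid in kids:
            kx, ky = positions[kid]
            links.append(((px, py + node_height / 2), (kx, ky - node_height / 2)))
    return positions, links


# Illustrative tree: a has children b and c; b has child d.
tree = {"a": ["b", "c"], "b": ["d"]}
positions, links = layout_tree(tree, "a", width=100, height=90)
print(positions)
print(links)
```

In this example node d is centered in its slab rather than under its parent b, which shows why the basic algorithm benefits from enhancements that move child nodes closer to their parents.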
Many enhancements can be made to this rather basic algorithm in order
to improve space utilization and move child nodes closer to their parents.
• Spread terminal nodes evenly across the drawing area and center parent
nodes above them.
• Position the root node in the center of the display and lay out child nodes
radially, rather than vertically.
There are many other possibilities, including graphs with weighted edges,
undirected graphs, graphs with cycles, disconnected graphs, and so on.