data visualization

The document provides an overview of various data visualization techniques, including bar graphs, line graphs, pie charts, and scatter plots, each serving specific purposes in representing data. It also discusses the use of coordinate systems, colors, and methods for visualizing amounts, distributions, and proportions in data. Additionally, it highlights the importance of correctly conveying data through effective visualization strategies.

Uploaded by

Shrish Rajankar
Copyright
© All Rights Reserved

• The first and foremost objective of data visualization is to convey data
correctly.
Different types of data visualizations
• Bar Graphs
• Bar graphs are one of the most commonly used types of graphs for data
visualization. They represent data using rectangular bars where the length
of each bar corresponds to the value it represents. Bar graphs are effective
for comparing data across different categories or groups.
• Side-by-side bar graph and stacked bar graph
• In both chart types, the bar for each category is split into colored bar
segments, one per subgroup. In a stacked bar graph, the bar segments within
a category are placed on top of one another; in a side-by-side bar graph,
they are placed next to each other.
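The difference between the two layouts comes down to where each segment is positioned. The following is a minimal sketch of that geometry, not a plotting routine; the function names, the `group_width` parameter, and the sample numbers are hypothetical and chosen only for illustration.

```python
def stacked_layout(values_by_category):
    """Return (category_index, bottom, height) for every segment of a stacked bar graph."""
    segments = []
    for i, values in enumerate(values_by_category):
        bottom = 0.0
        for v in values:
            # Each segment starts where the previous one in the same category ended.
            segments.append((i, bottom, v))
            bottom += v
    return segments


def side_by_side_layout(values_by_category, group_width=0.8):
    """Return (x_left, width, height) for every bar of a side-by-side bar graph."""
    bars = []
    for i, values in enumerate(values_by_category):
        width = group_width / len(values)
        start = i - group_width / 2  # the group is centered on the category index
        for j, v in enumerate(values):
            # Every bar starts at zero height and is shifted sideways instead.
            bars.append((start + j * width, width, v))
    return bars


# Made-up data: two categories, each with two subgroups.
sales = [[3, 5], [4, 2]]
stacked = stacked_layout(sales)
grouped = side_by_side_layout(sales)
```

In the stacked layout the total height of a category is easy to read, while the side-by-side layout keeps every bar on a common baseline, which makes the individual segments easier to compare.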
• Line Graphs
• Line graphs are used to display data over time or continuous intervals.
They consist of points connected by lines, with each point representing a
specific value at a particular time or interval
• Pie Charts
• Pie charts are circular graphs divided into sectors, where each sector
represents a proportion of the whole. The size of each sector corresponds
to the percentage or proportion of the total data it represents. Pie charts
are effective for showing the composition of a whole and comparing
different categories as parts of a whole.
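As a rough illustration of how sector size follows from proportion, the sketch below converts a list of values into start and end angles in degrees. The function name and the sample values are assumptions made for this example.

```python
def pie_sectors(values):
    """Return (start_angle, end_angle) in degrees for each sector of a pie chart."""
    total = sum(values)
    sectors = []
    start = 0.0
    for v in values:
        sweep = 360.0 * v / total  # the angle is proportional to the share of the whole
        sectors.append((start, start + sweep))
        start += sweep
    return sectors


# Made-up shares: 50%, 30%, and 20% of the whole.
sectors = pie_sectors([50, 30, 20])
```

Because the angles always add up to 360 degrees, the chart can only show parts of a single whole.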
• Scatter Plots
• Scatter plots are used to visualize the relationship between two variables.
Each data point in a scatter plot represents a value for both variables, and
the position of the point on the graph indicates the values of the
variables.
• Area Charts
• Area charts are similar to line graphs but with the area below the line
filled in with color. They are used to represent cumulative totals or stacked
data over time.
• Radar Charts
• A radar chart, also known as a spider chart or a web chart, is a graphical
method of displaying multivariate data in the form of a two-dimensional
chart.
• Histograms
• Histograms are similar to bar graphs but are used specifically to represent
the distribution of continuous data.
• The basic difference between a bar graph and a histogram is that a bar
graph is used to represent categorical data whereas a histogram is used to
represent numerical data.
• Bar graph: Compares data across categories
• Histogram: Shows the distribution of data in a set
• Bar graph: Shows the value for a category of data
• Histogram: Shows the frequency of data points in a range
• Bar graph: Uses rectangular bars with lengths proportional to the data
values
• Histogram: Uses rectangular bars with no gaps between them
• Bar graph: Bars can be rearranged in any order
• Histogram: Bars must be presented in numerical order from lowest to
highest
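The frequencies a histogram displays come from counting how many data points fall into each numeric range (bin). Below is a minimal sketch using equal-width bins; the function name, the bin boundaries, and the sample ages are made up for illustration.

```python
def histogram_counts(data, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for x in data:
        if low <= x <= high:
            # The top edge belongs to the last bin, so clamp the index.
            idx = min(int((x - low) / width), n_bins - 1)
            counts[idx] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts


# Made-up ages, binned into four ranges of 20 years each.
ages = [2, 15, 22, 24, 31, 38, 45, 47, 52, 71]
edges, counts = histogram_counts(ages, 0, 80, 4)
```

The bins are contiguous and ordered, which is why histogram bars touch and cannot be rearranged the way bar graph categories can.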
• Treemap Charts
• Treemap charts are a type of data visualization that represent hierarchical
data as a set of nested rectangles. Each rectangle, or "tile," in the treemap
represents a category or subcategory of the data, and the size of the
rectangle corresponds to a quantitative value, such as the proportion or
absolute value of that category within the dataset.
• Pareto Charts
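• A Pareto chart is a specific type of chart that combines both bar and line
graphs: the bars show individual values sorted in descending order, and the
line shows the cumulative total.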
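A Pareto chart is conventionally built from bars sorted from largest to smallest plus a line of cumulative percentages. The sketch below computes both series; the function name and the defect counts are hypothetical.

```python
def pareto_table(counts):
    """Return (label, value, cumulative_percent) rows sorted by value, largest first."""
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(counts.values())
    running = 0
    rows = []
    for label, value in ordered:
        running += value
        # value feeds the bar, the cumulative percentage feeds the line
        rows.append((label, value, 100.0 * running / total))
    return rows


# Made-up defect counts for illustration only.
defects = {"scratch": 10, "dent": 50, "crack": 30, "stain": 10}
rows = pareto_table(defects)
```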
• Heatmap
• A heatmap is a graphical representation of data that uses a system of color
coding to represent different values.
• Sinaplots and violin plots are both useful visualization tools for
displaying distributions of data.
• Ridgeline plot (joyplot)
• Ridgeline plots are a useful alternative to violin plots and are often
useful when visualizing very large numbers of distributions or changes in
distributions over time.
• Mosaic plot
• A mosaic plot is a graph that shows the relationship between two or more
categorical variables.
Data types in data visualization
Use of coordinate system in Data
Visualization
• Coordinate systems in data visualization are used to position data on a
graph, and to align and analyze it.
• Cartesian coordinates
• A Cartesian coordinate system can have two axes representing two
different units. This situation arises quite commonly whenever we’re
mapping two different types of variables to x and y.
• Nonlinear Axes
• Relationship between linear and square-root scales. The dots correspond
to data values 0, 1, 4, 9, 16, 25, 36, 49, which are evenly-spaced numbers
on a square-root scale, since they are the squares of the integers from 0 to
7. We can display these data points on a linear scale, we can square-root-
transform them and then show on a linear scale, or we can show them on
a square-root scale.
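The relationship described here can be checked numerically. In this small sketch (variable names are illustrative), the data values are unevenly spaced on a linear scale but become evenly spaced after a square-root transform.

```python
import math

values = [0, 1, 4, 9, 16, 25, 36, 49]

# Positions on a square-root scale are the square roots of the data values.
positions = [math.sqrt(v) for v in values]

# Gaps between neighboring points on each scale.
linear_gaps = [b - a for a, b in zip(values, values[1:])]
sqrt_gaps = [b - a for a, b in zip(positions, positions[1:])]
```

The linear gaps grow with every step, while every gap on the square-root scale is exactly one unit.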
• Coordinate systems with curved axes
• All the coordinate systems we have encountered so far have used two
straight axes positioned at a right angle to each other even if the axes
themselves established a nonlinear mapping from data values to positions.

• There are other coordinate systems, however, where the axes themselves
are curved. (polar coordinate system)

• Polar coordinates can be useful for data of a periodic nature, such that
data values at one end of the scale can be logically joined to data values at
the other end. For example, consider the days in a year. December 31st is
the last day of the year, but it is also one day before the first day of the
year.
• Relationship between Cartesian and polar coordinates. (a) Three data
points shown in a Cartesian coordinate system. (b) The same three data
points shown in a polar coordinate system. We have taken
the x coordinates from part (a) and used them as angular coordinates and
the y coordinates from part (a) and used them as radial coordinates. The
circular axis runs from 0 to 4 in this example, and therefore x = 0 and x = 4
are the same locations in this coordinate system.
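The mapping in this example can be sketched as follows: x becomes an angle and y becomes a radius. The function name is hypothetical, and the choice of zero angle and counter-clockwise direction is an assumption; the original figure may orient the axis differently.

```python
import math


def to_polar_xy(x, y, x_min=0.0, x_max=4.0):
    """Map (x, y) to screen coordinates, using x as the angle and y as the radius."""
    # One full turn of the circle covers the range x_min..x_max.
    theta = 2 * math.pi * (x - x_min) / (x_max - x_min)
    return (y * math.cos(theta), y * math.sin(theta))


start = to_polar_xy(0, 2)
end = to_polar_xy(4, 2)
quarter = to_polar_xy(1, 2)
```

Since the circular axis runs from 0 to 4, `start` and `end` land on the same point (up to floating-point rounding), which is what makes polar coordinates suitable for periodic data.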
• Daily temperature normals for four selected locations in the
U.S., shown in polar coordinates. The radial distance from the
center point indicates the daily temperature in Fahrenheit,
and the days of the year are arranged counter-clockwise
starting with Jan.
Use of colors to represent data values
• Color in data visualizations is used
• i. to distinguish groups of data from each other,
• ii. to represent data values, and
• iii. to highlight.
• Color as a tool to distinguish: for example, different countries on a map
or different manufacturers of a certain product.

Example qualitative color scales


• Color as a tool to represent data values
• Color can also be used to represent quantitative data values, such as
income, temperature, or speed. In this case, we use a sequential color
scale. Such a scale contains a sequence of colors that clearly indicate
which values are larger or smaller than which other ones, and how distant
two specific values are from each other.
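One simple way to build such a scale is to interpolate between a light and a dark color. The sketch below does this linearly in RGB; the function name and the two endpoint colors are illustrative assumptions, not the actual values of any published scale, and plain RGB interpolation is not guaranteed to look perceptually uniform.

```python
def sequential_color(value, vmin, vmax, light=(222, 235, 247), dark=(8, 48, 107)):
    """Map a value in [vmin, vmax] to an RGB color between a light and a dark blue."""
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp values outside the range to the endpoints
    # Blend each color channel; larger values move toward the dark endpoint.
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(light, dark))


low_color = sequential_color(0, 0, 100)
mid_color = sequential_color(50, 0, 100)
high_color = sequential_color(100, 0, 100)
```

Larger values get darker colors, so the ordering of values stays visible, and values that are close together receive similar colors.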

Example sequential color scales. The ColorBrewer Blues scale is a
monochromatic scale that varies from dark to light blue. The Heat and
Viridis scales are multi-hue scales that vary from dark red to light
yellow and from dark blue via green to light yellow, respectively.
• Color as a tool to highlight
• Color can also be an effective tool to highlight specific elements in the
data.

Example accent color scales, each with four base colors and three accent
colors. Accent color scales can be derived in several different ways: (top)
we can take an existing color scale and lighten and/or partially desaturate
some colors while darkening others; (middle) we can take gray values and
pair them with colors; (bottom) we can use an existing accent color scale,
e.g. the one from the ColorBrewer project.
• From 2000 to 2010, the two neighboring southern states Texas and Louisiana
have experienced among the highest and lowest population growth across the
U.S.
• Representing - Amounts, Distribution, and
Proportions
• Visualizing amounts
• In many scenarios, we are interested in the magnitude of some set
of numbers. For example, we might want to visualize the total sales
volume of different brands of cars, or the total number of people
living in different cities, or the age of Olympians performing
different sports. In all these cases, we have a set of categories (e.g.,
brands of cars, cities, or sports) and a quantitative value for each
category. I refer to these cases as visualizing amounts, because the
main emphasis in these visualizations will be on the magnitude of
the quantitative values. The standard visualization in this scenario is
the bar plot, which comes in several variations, including simple
bars as well as grouped and stacked bars. Alternatives to the bar
plot are the dot plot and the heatmap.
Visualizing amounts
• Visualizing distributions: Histograms and
density plots
• We frequently encounter the situation where we would like to
understand how a particular variable is distributed in a dataset. To
give a concrete example, we will consider the passengers of the
Titanic. There were approximately 1300 passengers on the Titanic
(not counting crew), and we have reported ages for 756 of them.
We might want to know how many passengers of what ages there
were on the Titanic, i.e., how many children, young adults, middle-
aged people, seniors, and so on. We call the relative proportions of
different ages among the passengers the age distribution of the
passengers.
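Besides a histogram, a distribution like this can be summarized with a density plot. The sketch below is a minimal Gaussian kernel density estimate; the function name, the bandwidth, and the ages are made up for illustration and are not the Titanic data.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function estimating the density of data with Gaussian kernels."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one bell curve per data point, each centered on that point.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


# Made-up ages for illustration only.
sample_ages = [2, 15, 22, 24, 31, 38, 45, 47, 52, 71]
density = gaussian_kde(sample_ages, bandwidth=5.0)

# Approximate the area under the curve on a grid from -50 to 180 in steps of 0.5.
step = 0.5
grid = [i * step for i in range(-100, 361)]
area = sum(density(x) for x in grid) * step
```

The area under a density curve is 1, so the curve shows relative proportions rather than raw counts. The bandwidth controls how smooth the curve is, and the result can change noticeably with different choices.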
Visualizing distributions: Histograms
and density plots
• Visualizing proportions
• We often want to show how some group, entity, or amount breaks
down into individual pieces that each represent a proportion of the
whole. Common examples include the proportions of men and
women in a group of people, the percentages of people voting for
different political parties in an election, or the market shares of
companies.
• Proportions can be visualized as pie charts, side-by-side bars, or
stacked bars. As for amounts, when we visualize proportions with
bars, the bars can be arranged either vertically or horizontally. Pie
charts emphasize that the individual parts add up to a whole and
highlight simple fractions. However, the individual pieces are more
easily compared in side-by-side bars. Stacked bars look awkward for
a single set of proportions, but can be useful when comparing
multiple sets of proportions.
Visualizing proportions
