UNIT-5 (1)
Python for Data Visualization: Visualization with matplotlib – box plot, line plots –
scatter plots – visualizing errors – density and contour plots – histograms, binnings,
and density – three-dimensional plotting – geographic data – data analysis using
StatsModels and seaborn – graph plotting using Plotly – interactive data visualization using
Bokeh.
Output:
In Matplotlib, the figure (an instance of the class plt.Figure) can be thought of as a single
container that contains all the objects representing axes, graphics, text, and labels.
The axes (an instance of the class plt.Axes) are what we see above: a bounding box with ticks
and labels, which will eventually contain the plot elements that make up our visualization.
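As a minimal sketch of the figure and axes described above, the following creates an empty figure, adds an axes to it, and draws a sine curve on that axes. The variable names fig and ax are only a common convention, and the sine data is an assumption chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()   # the figure: container for everything that is drawn
ax = plt.axes()      # the axes: bounding box with ticks and labels

x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x))   # plot elements are added to the axes
```

Calling plt.show() afterwards displays the figure.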
Example:
#Simple Line Plots
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 5, 7, 8]
plt.plot(x, y)
plt.show()
Output:
UNIT-5 Data Visualization with Matplotlib
Example:
#Plot a Line Plot Logarithmically in Matplotlib
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 5, 10) # [0, 0.55, 1.11, 1.66, 2.22, 2.77, 3.33, 3.88, 4.44, 5]
y = np.exp(x) # [1, 1.74, 3.03, 5.29, 9.22, 16.08, 28.03, 48.85, 85.15, 148.41]
plt.plot(x, y)
plt.yscale('log') # logarithmic y-axis, so the exponential curve appears as a straight line
plt.show()
Output:
Example:
#Customizing Line Plots in Matplotlib
import matplotlib.pyplot as plt
import numpy as np
x = np.random.randint(low=1, high=10, size=25)
plt.plot(x, color = 'blue', linewidth=3, linestyle='-.')
plt.show()
Output:
Similarly, we can adjust the line style using the linestyle keyword:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 1000)
plt.plot(x, x + 4, linestyle='-') # solid
plt.plot(x, x + 5, linestyle='--') # dashed
plt.plot(x, x + 6, linestyle='-.') # dashdot
plt.plot(x, x + 7, linestyle=':'); # dotted
plt.show()
Output:
Example:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 1000)
plt.plot(x, x + 0, '-g') # solid green
plt.plot(x, x + 1, '--c') # dashed cyan
plt.plot(x, x + 2, '-.k') # dashdot black
plt.plot(x, x + 3, ':r'); # dotted red
plt.show()
Output:
Labeling Plots
Labeling of plots includes titles, axis labels, and simple legends.
Titles and axis labels are the simplest such labels.
Example:
#Labeling Plots
import matplotlib.pyplot as plt
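import numpy as np
x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x)) # sample curve so that the labels describe actual data
plt.title("Labeling Plots")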
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.show()
Output:
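The labeling example above sets a title and axis labels but no legend. A minimal sketch of a simple legend follows: each line is given a label, and the legend is built from those labels. The sine and cosine curves and the label texts are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")   # label is the text shown in the legend
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_title("Sine and Cosine")
ax.set_xlabel("x")
ax.set_ylabel("y")
legend = ax.legend()                    # builds the legend from the labels
```

Calling plt.show() afterwards displays the figure with its title, axis labels, and legend.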
Scatter Plots
Another commonly used plot type is the simple scatter plot, a close cousin of the line plot.
Instead of points being joined by line segments, here the points are represented individually
with a dot, circle, or other shape.
Example:
#Scatter Plots with plt.plot
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.plot(x, y, '*', color='blue')
plt.show()
Output:
The third argument in the function call plt.plot is a character that represents the type of
symbol used for the plotting.
Additional keyword arguments to plt.plot specify a wide range of properties of the lines and
markers.
Example:
#Properties of lines and markers
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.plot(x, y, '-s', color='blue',
         markersize=15, linewidth=2,
         markerfacecolor='red',
         markeredgecolor='green',
         markeredgewidth=2)
plt.ylim(-1.2, 1.2)
plt.show()
Output:
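The examples above draw scatter plots with plt.plot, where every marker has the same appearance. Matplotlib also provides a scatter function, which lets the size and color of each point be set individually. The sketch below uses randomly generated data, and the particular sizes, colors, and colormap are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)        # fixed seed so the figure is repeatable
x = rng.normal(size=50)
y = rng.normal(size=50)
colors = rng.random(50)               # one color value per point
sizes = 300 * rng.random(50)          # one marker area per point

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=colors, s=sizes, alpha=0.4, cmap="viridis")
fig.colorbar(points, ax=ax)           # shows how color values map to colors
```

Calling plt.show() afterwards displays the figure. Because size and color can carry extra variables, this form is useful when a third or fourth dimension of the data has to be shown on a two-dimensional plot.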
Visualizing Errors
For any scientific measurement, accurate accounting for errors is nearly as important as, if
not more important than, accurate reporting of the number itself.
In visualization of data and results, showing these errors effectively can make a plot convey
much more complete information.
Basic Error bars
Error bars indicate the uncertainty of each data point, that is, how far the plotted value may
deviate from the actual value.
Error bars often display the standard deviation of the measurements at each point, while the
plotted points or line depict the measured values themselves.
Graphs in research papers usually need to display the error value along with the data points,
to show the amount of uncertainty present in the data.
The function matplotlib.pyplot.errorbar() plots the given x and y values and draws an error bar
of size xerr (horizontal) and yerr (vertical) at each point.
Example:
#Visualizing Errors
#Basic Error Bars
import matplotlib.pyplot as plt
import numpy as np
x =[1, 2, 3, 4, 5, 6, 7]
y =[1, 2, 1, 2, 1, 2, 1]
plt.plot(x,y)
plt.errorbar(x, y, xerr=0.3,yerr=0.2, fmt='s', color='red',ecolor='blue');
plt.show()
Output:
Example:
#Visualizing errors
import numpy as np
import matplotlib.pyplot as plt
x=[1,2,3,4,5]
y=[2,4,6,8,10]
plt.errorbar(x,y,xerr=0.3,yerr=0.5,fmt='o',color='red',ecolor='blue')
plt.show()
Output:
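Both examples above use one constant error for every point. The errorbar function also accepts an array of errors, one per data point. The sketch below assumes synthetic measurements of a sine curve whose uncertainty differs from point to point; the data and styling values are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 20)
dy = 0.2 + 0.6 * rng.random(20)            # assumed uncertainty of each measurement
y = np.sin(x) + dy * rng.normal(size=20)   # noisy measurements of sin(x)

fig, ax = plt.subplots()
bars = ax.errorbar(x, y, yerr=dy, fmt="o", color="black",
                   ecolor="lightgray", elinewidth=3, capsize=0)
```

Calling plt.show() afterwards displays the figure. Points with larger dy values get longer vertical bars, so the reader can see at a glance which measurements are less certain.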
In a contour plot, the x and y values represent positions on the plot, and the z values are
represented by the contour levels.
The way to prepare such data is to use the np.meshgrid function, which builds two-dimensional
grids from one-dimensional arrays:
Example:
#contour plots
import numpy as np
import matplotlib.pyplot as plt
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X,Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
fig,ax=plt.subplots(1,1)
cp = ax.contourf(X, Y, Z)
ax.set_title('Filled Contours Plot')
ax.set_xlabel('x (cm)')
ax.set_ylabel('y (cm)')
plt.show()
Output:
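To see what np.meshgrid produces in the contour example above, the following small sketch builds a grid from two short arrays. The array values are arbitrary and chosen only to keep the output readable.

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([0, 1])
X, Y = np.meshgrid(x, y)   # both results have shape (len(y), len(x)) = (2, 3)
Z = X**2 + Y**2            # one z value per grid point

print(X)   # [[0 1 2]
           #  [0 1 2]]
print(Y)   # [[0 0 0]
           #  [1 1 1]]
print(Z)   # [[0 1 4]
           #  [1 2 5]]
```

Each row of X repeats the x values and each column of Y repeats the y values, so X[i, j] and Y[i, j] together give the coordinates of one grid point.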
UNIT-5 Python for Data Visualization
Example:
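#Contour Plot
import numpy as np
import matplotlib.pyplot as plt
# Rebuild the grid from the previous example so that this example runs on its own
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
plt.figure()
cp = plt.contour(X, Y, Z, colors='magenta')
plt.title('Contour Plot')
plt.xlabel('x (cm)')
plt.ylabel('y (cm)')
plt.show()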
Output:
orientation - {'horizontal', 'vertical'}, optional. Default is 'vertical'
If 'horizontal', barh will be used for bar-type histograms and the bottom kwarg will be the left
edges.
color - color or array_like of colors or None, optional. Default is None
Color spec or sequence of color specs, one per dataset. Default (None) uses the standard line color
sequence.
label - str or None, optional. Default is None
Other parameters
**kwargs - Patch properties. The ** syntax allows a variable number of keyword arguments to be
passed to a Python function; here they set properties of the drawn patches (bars).
Example:
#Histograms
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-white') # this style was named 'seaborn-white' before Matplotlib 3.6
data = np.random.randn(1000)
print(plt.hist(data))
Output:
(array([ 6., 27., 73., 168., 234., 227., 167., 65., 25., 8.]),
array([-3.04715424, -2.43211248, -1.81707073, -1.20202897, -0.58698722, 0.02805453,
0.64309629, 1.25813804, 1.87317979, 2.48822155, 3.1032633 ]),
<BarContainer object of 10 artists>)
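The output above shows that hist returns the bin counts, the bin edges, and the drawn patches. As a sketch of how binning and density interact, the following uses 30 bins and density=True, which rescales the bar heights so that the total bar area is 1. It also calls np.histogram, which computes the binning without drawing anything. The data, bin counts, and styling values are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
data = rng.normal(size=1000)

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(data, bins=30, density=True,
                                 alpha=0.5, histtype="stepfilled",
                                 color="steelblue", edgecolor="none")

# np.histogram computes the binning alone, without drawing anything
raw_counts, raw_edges = np.histogram(data, bins=5)
print(raw_counts)   # number of samples that fall in each of the 5 bins
```

Calling plt.show() afterwards displays the histogram. With n bins there are always n + 1 edges, because each bin is bounded by a left and a right edge.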
Example:
#Histograms,Binnings and Density
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
data = pd.DataFrame(data, columns=['x', 'y'])
plt.hist(data["x"], alpha=0.5)
plt.hist(data["y"], alpha=0.5)
plt.show()
Output:
Example:
#Kernel Density Estimation
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
data = pd.DataFrame(data, columns=['x', 'y'])
sns.kdeplot(data["x"], fill=True) # fill replaces the deprecated shade keyword (seaborn 0.11+)
sns.kdeplot(data["y"], fill=True)
plt.show()
Output:
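To make clear what a kernel density estimate computes, the following sketch builds one by hand with NumPy: a small Gaussian bump is placed on every sample and the bumps are averaged. This is not how seaborn is implemented internally. The bandwidth rule used here is a common normal-reference rule of thumb and is an assumption of this sketch, so the curve may differ slightly from the one kdeplot draws.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=500)

# Bandwidth from a normal-reference rule of thumb (an assumption of this sketch)
h = 1.06 * samples.std() * len(samples) ** (-1 / 5)

grid = np.linspace(-6, 6, 601)
# Place one Gaussian bump of width h on every sample and average the bumps
z = (grid[:, None] - samples[None, :]) / h
density = np.exp(-0.5 * z**2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

fig, ax = plt.subplots()
ax.hist(samples, bins=30, density=True, alpha=0.3)   # histogram for comparison
ax.plot(grid, density)                               # smooth density estimate
```

Calling plt.show() afterwards displays both curves. A smaller bandwidth h gives a bumpier estimate, and a larger h gives a smoother one.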
Three-Dimensional Plotting
Matplotlib was initially designed with only two-dimensional plotting in mind. Around the
time of the 1.0 release, some three-dimensional plotting utilities were built on top of
Matplotlib’s two-dimensional display, and the result is a convenient (if somewhat limited) set
of tools for three-dimensional data visualization.
We enable three-dimensional plots by importing the mplot3d toolkit:
from mpl_toolkits import mplot3d
We can create a three-dimensional axes by passing the keyword projection='3d' to any of the
normal axes creation routines:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = plt.axes(projection='3d')
Three-dimensional plotting is one of the functionalities that benefits immensely from viewing
figures interactively rather than statically, because an interactive figure can be rotated to
inspect the data from different angles.
The different types of three-dimensional plots are:
Three-Dimensional Points and Lines
(Three-dimensional Line or scatter Plot)
Example 1:
#Three-Dimensional Plotting
import matplotlib.pyplot as plt
fig = plt.figure()
ax = plt.axes(projection='3d')
plt.show()
Output:
Example 2:
#3-D line plot
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits import mplot3d
fig = plt.figure()
ax = plt.axes(projection='3d')
# Data for a three-dimensional line
z = np.linspace(0,15,1000)
x = np.sin(z)
y = np.sin(z)
ax.plot3D(x, y, z, color='green') # plot3D joins the points with a line
plt.show()
Output:
Example 3:
#3-D Scatter plot
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits import mplot3d
fig = plt.figure()
ax = plt.axes(projection='3d')
z = 15*np.random.randn(100)
x= np.sin(z) +0.1*np.random.randn(100)
y= np.sin(z) +0.1*np.random.randn(100)
ax.scatter3D(x,y,z,color='blue')
plt.show()
Output:
Example 4:
#3D-Contour Plot
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits import mplot3d
def f(x, y):
    return np.sin(np.sqrt(x ** 2 + y ** 2))
x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)
ax = plt.axes(projection='3d')
ax.contour3D(X, Y, Z, 50, cmap='Reds')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
Output:
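The same gridded data used in Example 4 can also be drawn as a filled surface. The sketch below uses the plot_surface method of a three-dimensional axes. The colormap and the viewing angles are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))   # same function as in Example 4

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
surface = ax.plot_surface(X, Y, Z, cmap="viridis", edgecolor="none")
ax.set_title("surface")
ax.view_init(elev=35, azim=45)     # elevation and azimuth of the viewpoint, in degrees
```

Calling plt.show() afterwards displays the surface. Changing the elev and azim values passed to view_init shows the same surface from a different direction.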
Geographic Data
One common type of visualization in data science is that of geographic data.
Matplotlib’s main tool for geographic data visualization is the Basemap toolkit, which is
one of several Matplotlib toolkits that live under the mpl_toolkits namespace.
Basemap can feel a bit clunky to use, and even simple visualizations often take a long time
to render.
Nevertheless, Basemap is a useful tool for Python users to have in their virtual toolbelts.
Map Projections
Map projections are used to project a spherical map, such as that of the Earth, onto a flat
surface. It is impossible to do this without somehow distorting the map or breaking its continuity.
Depending on the intended use of the map projection, there are certain map features (e.g.,
direction, area, distance, shape, or other considerations) that are useful to maintain.
The Basemap package implements several projections, all referenced by a short format code. Some
of the more common ones are:
Cylindrical projections
Pseudo-cylindrical projections
Perspective projections
Conic projections
Cylindrical projection
The simplest of map projections are cylindrical projections, in which lines of constant
latitude and longitude are mapped to horizontal and vertical lines, respectively.
This type of mapping represents equatorial regions quite well, but results in extreme
distortions near the poles.
The spacing of latitude lines varies between different cylindrical projections, leading to
different conservation properties, and different distortion near the poles.
The example below uses the equidistant cylindrical projection (projection='cyl'). Other
cylindrical projections are the Mercator (projection='merc') and the cylindrical
equal-area (projection='cea') projections.
The additional arguments to Basemap for this view specify the latitude (lat) and longitude
(lon) of the lower-left corner (llcrnr) and upper-right corner (urcrnr) for the desired map,
in units of degrees.
Example:
#Cylindrical Projection
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (10,8))
m = Basemap(projection='cyl',llcrnrlat=-80,urcrnrlat=80,llcrnrlon=-180,urcrnrlon=180)
m.drawcoastlines()
m.fillcontinents(color='tan',lake_color='lightblue')
m.drawcountries(linewidth=1, linestyle='solid', color='k' )
m.drawmapboundary(fill_color='lightblue')
plt.title("Cylindrical Equidistant Projection", fontsize=20)
plt.show()
Output:
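The distortion near the poles described above can be quantified for the Mercator projection using its standard formulas, without Basemap. On a unit-radius Mercator map, a latitude is drawn at height ln(tan(pi/4 + latitude/2)), and features at that latitude are stretched by a factor of 1/cos(latitude) relative to the equator. The function names below are hypothetical helpers written for this sketch.

```python
import numpy as np

def mercator_y(lat_deg):
    """Vertical map coordinate of a latitude on a unit-radius Mercator map."""
    lat = np.radians(lat_deg)
    return np.log(np.tan(np.pi / 4 + lat / 2))

def mercator_scale(lat_deg):
    """Local stretch factor relative to the equator: 1 / cos(latitude)."""
    return 1.0 / np.cos(np.radians(lat_deg))

# Print latitude, map height, and stretch factor for a few latitudes
for lat in (0, 30, 60, 80):
    print(lat, round(float(mercator_y(lat)), 3), round(float(mercator_scale(lat)), 3))
```

At 60 degrees latitude the stretch factor is 2, and it grows without bound toward the poles, which is why cylindrical maps are usually cut off before reaching 90 degrees.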
Pseudo-cylindrical projections
Pseudo-cylindrical projections relax the requirement that meridians (lines of constant
longitude) remain vertical; this can give better properties near the poles of the projection.
The Mollweide projection (projection='moll') is one common example of this, in which all
meridians are elliptical arcs.
It is constructed so as to preserve area across the map: though there are distortions near the
poles, the area of small patches reflects the true area.
Other pseudo-cylindrical projections are the sinusoidal (projection='sinu') and Robinson
(projection='robin') projections.
The extra arguments to Basemap here refer to the central latitude (lat_0) and longitude
(lon_0) for the desired map.
Example:
#Pseudo-Cylindrical Projection
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(num=None, figsize=(8, 8) )
m = Basemap(projection='moll',lon_0=0,resolution='c')
m.drawcoastlines()
m.fillcontinents(color='tan',lake_color='lightblue')
# draw parallels and meridians.
m.drawparallels(np.arange(-90.,91.,30.),labels=[True,True,False,False],dashes=[2,2])
m.drawmeridians(np.arange(-180.,181.,60.),labels=[False,False,False,False],dashes=[2,2])
m.drawmapboundary(fill_color='lightblue')
plt.title("Mollweide Projection")
plt.show()
Output:
Perspective projections
Perspective projections are constructed using a particular choice of perspective point,
similar to if you photographed the Earth from a particular point in space (a point which,
for some projections, technically lies within the Earth!).
One common example is the orthographic projection (projection='ortho'), which shows one
side of the globe as seen from a viewer at a very long distance. Thus, it can show only half
the globe at a time.
Other perspective-based projections include the gnomonic projection (projection='gnom') and
the stereographic projection (projection='stere').
These are often the most useful for showing small portions of the map.
Example:
#Perspective Projections
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
m = Basemap(projection='ortho',lat_0=0, lon_0=0) # named m so the built-in map() is not shadowed
m.drawmapboundary(fill_color='aqua')
m.fillcontinents(color='coral',lake_color='aqua')
m.drawcoastlines()
x, y = m(0, 0) # convert longitude/latitude to map coordinates
m.plot(x, y, marker='1',color='m')
plt.show()
Output:
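The statement that an orthographic view shows only half the globe can be checked with the standard orthographic projection formulas, without Basemap. The sketch below projects a point on a unit sphere and reports whether it lies on the visible hemisphere. The function name and its return convention are hypothetical choices made for this sketch.

```python
import numpy as np

def orthographic(lat_deg, lon_deg, lat_0=0.0, lon_0=0.0):
    """Project a point on a unit sphere; return (x, y, visible) for a view centered on (lat_0, lon_0)."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    lat0, lon0 = np.radians(lat_0), np.radians(lon_0)
    x = np.cos(lat) * np.sin(lon - lon0)
    y = np.cos(lat0) * np.sin(lat) - np.sin(lat0) * np.cos(lat) * np.cos(lon - lon0)
    # Cosine of the angular distance from the map center; negative means the far side
    cos_c = np.sin(lat0) * np.sin(lat) + np.cos(lat0) * np.cos(lat) * np.cos(lon - lon0)
    return x, y, cos_c >= 0

print(orthographic(0, 0))      # map center: on the visible hemisphere
print(orthographic(0, 180))    # opposite side of the globe: hidden
```

A point is visible only when its angular distance from the map center is at most 90 degrees, which is exactly one hemisphere.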
Conic Projections
A conic projection projects the map onto a single cone, which is then unrolled.
This can lead to very good local properties, but regions far from the focus point of the cone
may become very distorted.
One example of this is the Lambert conformal conic projection (projection='lcc').
It projects the map onto a cone arranged in such a way that two standard parallels (specified
in Basemap by lat_1 and lat_2) have well-represented distances, with scale decreasing
between them and increasing outside of them.
Other useful conic projections are the equidistant conic (projection='eqdc') and the Albers
equal-area (projection='aea') projections.
Example:
#Conic Projections
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
m = Basemap(width=12000000, height=9000000, projection='lcc', resolution=None,
            lat_1=45., lat_2=55, lat_0=50, lon_0=-107.)
m.etopo()
plt.show()
Output:
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for
integers).
drawlsmask() - Draw a mask between the land and sea, for use with projecting images on one or
the other
drawmapboundary() - Draw the map boundary, including the fill color for oceans
drawrivers() - Draw rivers on the map
fillcontinents() - Fill the continents with a given color; optionally fill lakes with another color
Political boundaries
drawcountries() - Draw country boundaries
drawstates() - Draw US state boundaries
drawcounties() - Draw US county boundaries
Map features
drawgreatcircle() - Draw a great circle between two points
drawparallels() - Draw lines of constant latitude
drawmeridians() - Draw lines of constant longitude
drawmapscale() - Draw a linear scale on the map
Whole-globe images
bluemarble() - Project NASA’s blue marble image onto the map
shadedrelief() - Project a shaded relief image onto the map
etopo() - Draw an etopo relief image onto the map
warpimage() - Project a user-provided image onto the map
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
data = pd.DataFrame(data, columns=['x', 'y'])
for col in 'xy':
    sns.kdeplot(data[col], fill=True) # fill replaces the deprecated shade keyword
plt.show()
Output:
Example:
# import packages
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
# loading the csv file
df = pd.read_csv('headbrain1.csv')
print(df.head())
# fitting the model
df.columns = ['Head_size', 'Brain_weight']
model = smf.ols(formula='Head_size ~ Brain_weight', data=df).fit()
# model summary
print(model.summary())
Output:
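The example above depends on the file headbrain1.csv. As a sketch that runs without any data file, the following fits the same kind of ordinary least squares model to synthetic data. The true intercept (2) and slope (3), the noise level, and the column names x and y are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=200)   # true line plus noise
df = pd.DataFrame({"x": x, "y": y})

model = smf.ols(formula="y ~ x", data=df).fit()       # formula reads: response ~ predictor
print(model.params)      # estimated Intercept and slope
print(model.rsquared)    # fraction of variance explained by the fit
```

In the formula, the variable to the left of ~ is the response and the variable to the right is the predictor, so the estimated coefficients should come out close to the values used to generate the data.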
Plotly allows endless customization of graphs, which makes plots more meaningful and
understandable for others.
Plotly has several features that make it preferable to other visualization tools or
libraries:
Simplicity: The syntax is simple as each graph uses the same parameters.
Interactivity: This allows users to interact with graphs on display, allowing for
a better storytelling experience. Zooming in and out, point value display, panning
graphs, and hovering over data values allow us to spot outliers or anomalies in
massive numbers of sample points.
Attractiveness: The visuals are attractive and suit a wide range
of audiences.
Customizability: It allows endless customization of graphs which makes plots
more meaningful and understandable for others. It allows users to personalize
graphs.
Modules of Plotly
1. Plotly Express: The plotly.express module is used to produce professional-looking plots easily.
import plotly.express as px
2. Plotly Graph Objects: The plotly.graph_objects module is used to produce more complex plots
than Plotly Express.
import plotly.graph_objects as go
Example:
#Data Visualization using plotly
import plotly.express as px
# Creating the Figure instance
fig = px.line(x=[1, 2, 3], y=[1, 2, 3])
# showing the plot
fig.show()
Output:
With Bokeh, we can easily visualize large data sets and create different charts in an attractive and
elegant manner.
Benefits of Bokeh:
Bokeh allows us to build complex statistical plots quickly and through simple commands
Bokeh provides output in various media such as HTML, notebook and server
We can also embed Bokeh visualizations in Flask and Django apps
Bokeh can transform visualizations written in other libraries such as
matplotlib, seaborn, ggplot
Bokeh offers flexibility for applying interactions, layouts and different styling options to
visualizations.
Bokeh provides two visualization interfaces to users:
bokeh.models : A low-level interface that provides high flexibility to application
developers.
bokeh.plotting : A high-level interface for creating visual glyphs.
Visualization with Bokeh
Bokeh offers features that are both powerful and flexible, combining simplicity with
highly advanced customization.
It provides multiple visualization interfaces to the user, such as the bokeh.models and
bokeh.plotting interfaces listed above.
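As a minimal sketch of the high-level bokeh.plotting interface, the following creates a figure with interactive tools and adds a line glyph and a marker glyph to it. The data, title, axis labels, and the chosen set of tools are assumptions made for illustration.

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(0, 10, 50)
y = np.sin(x)

p = figure(title="Sine curve", x_axis_label="x", y_axis_label="sin(x)",
           tools="pan,wheel_zoom,box_zoom,reset,hover")
p.line(x, y, line_width=2, legend_label="sin(x)")   # line glyph
p.scatter(x, y, size=6, color="red")                # marker glyph on the same figure
```

To display the result, import show from bokeh.plotting and call show(p); Bokeh then writes an HTML page and opens it in the browser, where the plot can be panned, zoomed, and inspected by hovering.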
Tutorial Questions:
1. What is matplotlib? Specify the two interfaces used by it with example python code.
2. Write a Python program to draw a line plot using your own data and explain the various
attributes of a line plot.
3. Write a Python program to visualize your own data using a scatter plot and explain its
parameters.
4. Demonstrate the creation of line plots and scatter plots in matplotlib using different parameters
with corresponding Python code.
5. Briefly explain geographic data with Basemap, with example programs.
(or) Demonstrate different common map projections with example programs.
(or) Explain in detail the functions of mpl_toolkits for geographic data visualization.
6. Elaborate the usage of histogram for data exploration and explain its attributes.
7. Demonstrate Matplotlib functions that can be helpful to display three-dimensional data in two
dimensions.
8. Explore Seaborn plots with example programs.
9. Elaborate error visualization methods in pyplot.
10. Showcase three-dimensional drawing in matplotlib with corresponding Python code.
(or) Discuss in detail the three-dimensional plotting of the matplotlib module.
11. Appraise the following: i) Histogram ii) Binning iii) Density, with appropriate Python code.
12. Outline the data analysis using StatsModels with example programs.
13. Outline graph plotting using Plotly.
14. Describe interactive data visualization using Bokeh.