Unit V Data Visualization

The document discusses data visualization techniques using Matplotlib in Python. It covers topics like line plots, scatter plots, error bars, histograms, subplots, and three-dimensional plots. Specifically, it provides code examples for creating line plots with Matplotlib, customizing scatter plots by changing marker colors and styles, and using the errorbar function to add error bars to plots.

Uploaded by

nirmalamp

CS3352 - UNIT V - DATA VISUALIZATION

Importing Matplotlib – Line plots – Scatter plots – visualizing errors – density and
contour plots – Histograms – legends – colors – subplots – text and annotation –
customization – three dimensional plotting – Geographic Data with Basemap –
Visualization with Seaborn.

DATA VISUALIZATION
Data visualization is the graphical representation of information and data.
By using visual elements like charts, graphs, and maps, data visualization tools provide an
accessible way to see and understand trends, outliers, and patterns in data.
Additionally, it provides an excellent way for employees or business owners to present data to non-
technical audiences without confusion.
ADVANTAGES

● Easily sharing information.
● Interactively explore opportunities.
● Visualize patterns and relationships.
DISADVANTAGES

● Biased or inaccurate information.
● Correlation doesn’t always mean causation.
● Core messages can get lost in translation.

Why data visualization is important

● The importance of data visualization is simple: it helps people see, interact with,
and better understand data. Whether simple or complex, the right visualization can
bring everyone on the same page, regardless of their level of expertise.
● It’s hard to think of a professional industry that doesn’t benefit from making data
more understandable. Every STEM field benefits from understanding data, and
so do fields in government, finance, marketing, history, consumer goods, service
industries, education, sports, and so on.

Importing Matplotlib

Most of the Matplotlib utilities lie under the pyplot submodule, and are usually
imported under the plt alias:

import matplotlib.pyplot as plt


Now the Pyplot package can be referred to as plt.

Matplotlib Plotting

Plotting x and y points


The plot() function is used to draw points (markers) in a diagram.

By default, the plot() function draws a line from point to point.

The function takes parameters for specifying points in the diagram.

Parameter 1 is an array containing the points on the x-axis.

Parameter 2 is an array containing the points on the y-axis.

If we need to plot a line from (0, 0) to (6, 250), we have to pass two arrays [0, 6]
and [0, 250] to the plot function.

LINE PLOT

Example

Draw a line in a diagram from position (0,0) to position (6,250)

import matplotlib.pyplot as plt


import numpy as np
xpoints = np.array([0, 6])
ypoints = np.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()
OUTPUT
The x-axis is the horizontal axis.

The y-axis is the vertical axis.

Plotting Without Line

To plot only the markers, pass the shortcut string notation parameter 'o', which
means 'rings'.

Draw two points in the diagram, one at position (1, 3) and one at position (8, 10):
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 8])
ypoints = np.array([3, 10])
plt.plot(xpoints, ypoints, 'o')
plt.show()

OUTPUT
Multiple Points
You can plot as many points as you like, just make sure you have the same number
of points in both axes.

Example

Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and
finally to position (8, 10)

import matplotlib.pyplot as plt


import numpy as np
xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])
plt.plot(xpoints, ypoints)
plt.show()

OUTPUT
Default X-Points
If we do not specify the points on the x-axis, they will get the default values 0, 1, 2,
3, etc., depending on the length of the y-points.

So, if we plot only the y-points, as in the example below, the x-points default to
[0, 1, 2, 3, 4, 5]:

import matplotlib.pyplot as plt


import numpy as np
ypoints = np.array([3, 8, 1, 10, 5, 7])
plt.plot(ypoints)
plt.show()
OUTPUT
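
The default x-values described above can be checked directly. The following is a minimal sketch, assuming the same y-points as the example; the variable names line and default_x are introduced here only for illustration. It reads back the x-data that Matplotlib filled in.

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10, 5, 7])

# plot() returns a list of Line2D objects, one per plotted line
(line,) = plt.plot(ypoints)

# x-values that Matplotlib filled in because none were passed
default_x = line.get_xdata()
print(default_x)
```

Calling plt.show() afterwards displays the same line as in the example above.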
Matplotlib Markers
We can use the keyword argument marker to emphasize each point with a specified
marker:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o')
plt.show()
OUTPUT

Matplotlib Scatter

Creating Scatter Plots


With Pyplot, you can use the scatter() function to draw a scatter plot.

The scatter() function plots one dot for each observation. It needs two
arrays of the same length, one for the values of the x-axis, and one for
values on the y-axis:

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()

OUTPUT

The observation in the example above is the result of 13 cars passing by.

The X-axis shows how old the car is.

The Y-axis shows the speed of the car when it passes.

Are there any relationships between the observations?

It seems that the newer the car, the faster it drives, but that could be a coincidence,
after all we only registered 13 cars.

Compare Plots
In the example above, there seems to be a relationship between speed and age, but
what if we plot the observations from another day as well? Will the scatter plot tell us
something else?

import matplotlib.pyplot as plt


import numpy as np

#day one, the age and speed of 13 cars:


x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
#day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)
plt.show()

OUTPUT

Colors
You can set your own color for each scatter plot with the color or the c argument:

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'hotpink')
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = '#88c999')
plt.show()
OUTPUT
Color Each Dot
We can even set a specific color for each dot by using an array of colors as value for
the c argument:

Note: We cannot use the color argument for this, only the c argument.

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])
plt.scatter(x, y, c=colors)
plt.show()

OUTPUT
matplotlib.pyplot.errorbar() in Python
Pyplot is a state-based interface to Matplotlib that provides a
MATLAB-like way of plotting.
matplotlib.pyplot.errorbar() Function:
The errorbar() function in the pyplot module of the Matplotlib library is used to plot
y versus x as lines and/or markers with attached error bars.

Syntax: matplotlib.pyplot.errorbar(x, y, yerr=None, xerr=None, fmt='',
ecolor=None, elinewidth=None, capsize=None, barsabove=False,
lolims=False, uplims=False, xlolims=False, xuplims=False, errorevery=1,
capthick=None, *, data=None, **kwargs)

Parameters: This method accepts the following parameters:
● x, y: The horizontal and vertical coordinates of the data points.
● fmt: Optional. A format string for the data points and connecting lines.
● xerr, yerr: Optional. The error bar sizes, given as a scalar or an array. The
values must not be negative.
● ecolor: Optional. The color of the error bar lines. The default value is None.
● elinewidth: Optional. The line width of the error bar lines. The default value
is None.
● capsize: Optional. The length of the error bar caps in points. The default
value is None.
● barsabove: Optional. A boolean; if True, the error bars are drawn above the
plot symbols. The default value is False.
● lolims, uplims, xlolims, xuplims: Optional. Booleans that indicate that a
value gives only upper/lower limits.
● errorevery: Optional. An integer used to draw error bars on a subset of the
data.
Returns: This returns a container that comprises the following:
● plotline: The Line2D instance of the x, y plot markers and/or line.
● caplines: A tuple of Line2D instances of the error bar caps.
● barlinecols: A tuple of LineCollection instances with the horizontal and
vertical error ranges.

Example

import numpy as np
import matplotlib.pyplot as plt
xval = np.arange(0.1, 4, 0.5)
yval = np.exp(xval)
plt.errorbar(xval, yval, xerr = 0.4, yerr = 0.5)
plt.title('Matplotlib.pyplot.errorbar() example')
plt.show()

OUTPUT
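
Several of the parameters and return values listed above can be seen together in one call. The following is a rough sketch, assuming made-up per-point uncertainties (y_err) and an output file name chosen only for illustration. It passes an array to yerr, sets ecolor, elinewidth and capsize, and unpacks the returned container.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)
y = x ** 2
# hypothetical per-point uncertainties, one value per data point
y_err = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

fig, ax = plt.subplots()
container = ax.errorbar(x, y, yerr=y_err, fmt='o', ecolor='gray',
                        elinewidth=1.5, capsize=4)

# the container unpacks into the three parts listed under "Returns"
plotline, caplines, barlinecols = container
print(type(plotline).__name__)  # class of the marker/line object
print(len(caplines))            # number of cap line objects
print(len(barlinecols))         # number of error-range collections

fig.savefig('errorbar_sketch.png')  # or plt.show() in an interactive session
```

Because only yerr is given here, the sketch draws vertical bars only; adding xerr would add horizontal ones.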

Errorbar graph in Python using Matplotlib


Error bars are a graphical enhancement that visualizes the variability
of the plotted data on a Cartesian graph. They can be applied to graphs to
provide an additional layer of detail on the presented data, as the graphs below
show.
Error bars indicate estimated error or uncertainty to give a general sense of
how precise a measurement is. This is done through markers drawn over the
original graph and its data points.

To visualize this information, error bars draw lines that extend from the center
of the plotted data point, or from the edge of the bar in a bar chart.

The length of an error bar helps to reveal the uncertainty of a data point, as shown in the
graph below.
A short error bar shows that the values are concentrated, signaling that the plotted average
value is more reliable, while a long error bar indicates that the values are more spread
out and less reliable.

Depending on the type of data, each pair of error bars tends to be of
equal length on both sides. If the data is skewed, however, the lengths on each side
are unbalanced.

# importing matplotlib
import matplotlib.pyplot as plt
# making a simple plot
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 1, 2, 1, 2, 1]
# creating error
y_error = 0.2
# plotting graph
plt.plot(x, y)
plt.errorbar(x, y,
             yerr = y_error,
             fmt = 'o')
OUTPUT

<ErrorbarContainer object of 3 artists>

Example 2: Adding Some error in x value.

# importing matplotlib
import matplotlib.pyplot as plt
# making a simple plot
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 1, 2, 1, 2, 1]
# creating error
x_error = 0.5
# plotting graph
plt.plot(x, y)
plt.errorbar(x, y,
             xerr = x_error,
             fmt = 'o')

OUTPUT

<ErrorbarContainer object of 3 artists>

Example 3: Adding error in x & y


# importing matplotlib
import matplotlib.pyplot as plt
# making a simple plot
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 1, 2, 1, 2, 1]
# creating error
x_error = 0.5
y_error = 0.3
# plotting graph
plt.plot(x, y)
plt.errorbar(x, y,
             yerr = y_error,
             xerr = x_error,
             fmt = 'o')

OUTPUT
<ErrorbarContainer object of 3 artists>
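
The three examples above use one symmetric error value for every point. The earlier remark that skewed data gives error bars of unequal length on each side could be sketched as follows, assuming invented lower and upper uncertainties; the names lower, upper and asymmetric_error are illustrative only. Passing a 2 x N array to yerr gives the lower errors in the first row and the upper errors in the second.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.0, 3.5, 3.0, 4.5, 5.0])

# hypothetical skewed uncertainties: small below, larger above each point
lower = np.array([0.1, 0.2, 0.1, 0.3, 0.2])
upper = np.array([0.5, 0.8, 0.6, 1.0, 0.9])

# a 2 x N array: first row is the lower error, second row is the upper error
asymmetric_error = np.vstack([lower, upper])

fig, ax = plt.subplots()
ax.errorbar(x, y, yerr=asymmetric_error, fmt='o', capsize=3)
ax.set_title('Asymmetric error bars')
fig.savefig('asymmetric_errorbars.png')  # or plt.show() in an interactive session
```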
HISTOGRAMS
A histogram visualizes the frequency distribution of a dataset
by splitting it into small equal-sized intervals called bins. The NumPy histogram()
function is similar to the hist() function of the Matplotlib library. The only difference
is that the NumPy histogram() gives a numerical representation of the dataset,
while hist() gives a graphical representation of the dataset.

Creating Numpy Histogram


NumPy has a built-in numpy.histogram() function that computes the frequency
distribution of the data in numerical form. When the result is drawn, rectangles of
equal horizontal size correspond to the class intervals, called bins, and their variable
heights correspond to the frequencies.
Syntax:
numpy.histogram(data, bins=10, range=None, density=None, weights=None)

Attributes of the above function are listed below:

Attribute   Parameter

data        array or sequence of arrays to be binned
bins        int, sequence of scalars, or str; an int defines the number of equal-width
            bins in the range, default is 10
range       optional parameter, sets the lower and upper range of the bins
normed      optional legacy keyword in older NumPy releases; same as the density
            attribute, but gives incorrect results for unequal bin widths, so use
            density instead
weights     optional parameter, defines an array of weights having the same
            dimensions as data
density     optional parameter; if False, the result contains the number of samples
            in each bin; if True, the result contains the probability density
            function at each bin
The function has two return values: hist, which gives the array of values of the histogram,
and edge_bin, which is an array of float datatype containing the bin edges, with length one
more than hist.
Example:
# Import libraries
import numpy as np
# Creating dataset
a = np.random.randint(100, size=(50))
# Creating histogram
hist, bins = np.histogram(a, bins=[0, 10, 20, 30, 40,
                                   50, 60, 70, 80, 90,
                                   100])
# printing histogram
print()
print(hist)
print(bins)
print()

OUTPUT
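
The two return values and the density attribute described above can be checked numerically. The following is a minimal sketch, assuming a seeded random sample so the run is repeatable; the names rng, counts and density are introduced here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)         # seeded so the run is repeatable
data = rng.integers(0, 100, size=50)   # 50 hypothetical integer samples
edges = np.arange(0, 101, 10)          # bin edges 0, 10, ..., 100

counts, bin_edges = np.histogram(data, bins=edges)
density, _ = np.histogram(data, bins=edges, density=True)

print(counts.sum())                    # total number of binned samples
print(len(bin_edges) - len(counts))    # how many more edges than bins
# area under the density histogram: bar height times bin width, summed
print(np.sum(density * np.diff(bin_edges)))
```

With density=True the bar heights are scaled so that the total area is 1, which is what makes the result a probability density rather than a count.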

Graphical representation
The numeric representation of the histogram above can be converted into graphical
form. The hist() function in the pyplot submodule of Matplotlib takes the dataset
array and the bin array as parameters and creates a histogram of the
corresponding data values.

# import libraries
from matplotlib import pyplot as plt
import numpy as np
# Creating dataset
a = np.random.randint(100, size=(50))
# Creating plot
fig = plt.figure(figsize=(10, 7))
plt.hist(a, bins=[0, 10, 20, 30,
                  40, 50, 60, 70,
                  80, 90, 100])
plt.title("Numpy Histogram")
# show plot
plt.show()
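
The relationship between the numerical and the graphical histogram can be made concrete. As a small sketch, assuming a seeded sample and illustrative variable names, the following shows that hist() returns the same counts and edges that numpy.histogram() computes, together with the drawn bars.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = rng.integers(0, 100, size=50)
edges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

fig, ax = plt.subplots()
# hist() draws the bars and also returns the numbers behind them
counts, bin_edges, patches = ax.hist(data, bins=edges)

np_counts, np_edges = np.histogram(data, bins=edges)
print(np.array_equal(counts, np_counts))  # compare the two sets of counts
print(len(patches))                       # number of drawn bars
```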

Pandas: How to Create and Customize Plot Legends


Plot legends give meaning to a visualization by assigning labels to the various
plot elements. We previously saw how to create a simple legend; here we'll take
a look at customizing the placement and aesthetics of the legend in Matplotlib.

The simplest legend can be created with the plt.legend() command, which
automatically creates a legend for any labeled plot elements.
You can use the following basic syntax to add a legend to a plot in
pandas:
plt.legend(['A', 'B', 'C', 'D'], loc='center left', title='Legend Title')

The following example shows how to use this syntax in practice.


Example: Create and Customize Plot Legend in Pandas
Suppose we have the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'A':7, 'B':12, 'C':15, 'D':17}, index=['Values'])

We can use the following syntax to create a bar chart to visualize the
values in the DataFrame and add a legend with custom labels:

import matplotlib.pyplot as plt
#create bar chart
df.plot(kind='bar')

#add legend to bar chart


plt.legend(['A Label', 'B Label', 'C Label', 'D Label'])
<matplotlib.legend.Legend at 0x7fbc84015820>

We can also use the loc argument and the title argument to modify the
location and the title of the legend:

import matplotlib.pyplot as plt
#create bar chart
df.plot(kind='bar')
#add custom legend to bar chart
plt.legend(['A Label', 'B Label', 'C Label', 'D Label'],
loc='upper left', title='Labels')

Lastly, we can use the prop argument with a size entry to modify the font size in the legend:
import matplotlib.pyplot as plt
#create bar chart
df.plot(kind='bar')
#add custom legend to bar chart
plt.legend(['A Label', 'B Label', 'C Label', 'D Label'], prop={'size': 20})

OUTPUT
<matplotlib.legend.Legend at 0x7fbc83fd9fa0>
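
The statement that plt.legend() picks up any labeled plot elements can be illustrated without pandas. This is a minimal sketch, assuming two example curves; the labels, the legend title and the output file name are chosen only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
# each labeled line becomes one legend entry
ax.plot(x, np.sin(x), label='sin(x)')
ax.plot(x, np.cos(x), label='cos(x)')

legend = ax.legend(loc='upper right', title='Functions')
labels = [text.get_text() for text in legend.get_texts()]
print(labels)

fig.savefig('legend_sketch.png')  # or plt.show() in an interactive session
```

Labeling each element where it is plotted keeps the legend text next to the data it describes, instead of relying on the order of a separate list of names.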

Set Pandas DataFrame background color and font color in Python
The basic idea behind styling is to make the data more readable and impactful for the end user. We can
change the color and format of the displayed data in order to communicate insight
more efficiently. For more impactful visualization of a pandas DataFrame, we generally
use the DataFrame.style property, which returns a Styler object that has a number of useful methods
for formatting and visualizing data frames.

Using DataFrame.style property

● df.style.set_properties: By using this, we can use built-in functionality to manipulate data
frame styling, from font color to background color.
# Importing the necessary libraries
import pandas as pd
import numpy as np
# Seeding random data from numpy
np.random.seed(24)
# Making the DataFrame
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4),
                                 columns=list('BCDE'))],
               axis=1)
# DataFrame without any styling
print("Original DataFrame:\n")
print(df)
print("\nModified Styling DataFrame:")
df.style.set_properties(**{'background-color': 'black',
                           'color': 'green'})

● df.style.highlight_null: With the help of this, we can highlight the missing or null values
inside the data frame.

# Replacing the located values with NaN (Not a Number)
df.iloc[0, 3] = np.nan
df.iloc[2, 3] = np.nan
df.iloc[4, 2] = np.nan
df.iloc[7, 4] = np.nan

# Highlight the NaN values in DataFrame
# (older pandas releases name this argument null_color)
print("\nModified Styling DataFrame:")
df.style.highlight_null(color='red')

OUTPUT

● df.style.highlight_min: For highlighting the minimum value in each column
throughout the data frame.

# Highlight the Min values in each column
print("\nModified Styling DataFrame:")
df.style.highlight_min(axis=0)

OUTPUT

● df.style.highlight_max: For highlighting the maximum value in each
column throughout the data frame.

# Highlight the Max values in each column
print("\nModified Styling DataFrame:")
df.style.highlight_max(axis=0)

OUTPUT

Using User-defined Function

● We can modify a DataFrame's styling using a user-defined function. With the help of this function,
we can customize the font color of positive data values inside the data frame.
# function to set the text color of positive
# values in DataFrames
def color_positive_green(val):
    """
    Takes a scalar and returns a string with
    the css property `'color: green'` for positive
    values, black otherwise.
    """
    if val > 0:
        color = 'green'
    else:
        color = 'black'
    return 'color: %s' % color
# recent pandas releases also offer this method under the name Styler.map
df.style.applymap(color_positive_green)

OUTPUT

SUBPLOTS

How to Create Subplots in Matplotlib with Python
Matplotlib: Matplotlib is a visualization library in Python
for 2D plots of arrays.
Matplotlib is a multi-platform data visualization library built on NumPy
arrays and designed to work with the broader SciPy stack.
Subplots: The matplotlib.pyplot.subplots() method provides a
way to plot multiple plots on a single figure.
Given the number of rows and columns, it returns a tuple (fig, ax),
giving a single figure fig with an array of axes ax.
Approach
● Import Packages
● Import or create some data
● Create Subplot Objects
● Draw a Plot with it

Example 1:

# importing packages
import matplotlib.pyplot as plt
import numpy as np

# making subplots objects


fig, ax = plt.subplots(3, 3)

# draw graph
for i in ax:
    for j in i:
        j.plot(np.random.randint(0, 5, 5), np.random.randint(0, 5, 5))

plt.show()

OUTPUT
Example – 2

# importing packages
import matplotlib.pyplot as plt
import numpy as np
# making subplots objects
fig, ax = plt.subplots(2, 2)
# draw graph
ax[0][0].plot(np.random.randint(0, 5, 5), np.random.randint(0, 5, 5))
ax[0][1].plot(np.random.randint(0, 5, 5), np.random.randint(0, 5, 5))
ax[1][0].plot(np.random.randint(0, 5, 5), np.random.randint(0, 5, 5))
ax[1][1].plot(np.random.randint(0, 5, 5), np.random.randint(0, 5, 5))
plt.show()

OUTPUT

Example – 3

# importing packages
import matplotlib.pyplot as plt
import numpy as np
# making subplots objects
fig, ax = plt.subplots(2, 2)
# create data
x = np.linspace(0, 10, 1000)
# draw graph
ax[0, 0].plot(x, np.sin(x), 'r-.')
ax[0, 1].plot(x, np.cos(x), 'g--')
ax[1, 0].plot(x, np.tan(x), 'y-')
ax[1, 1].plot(x, np.sinc(x), 'c.-')
plt.show()
OUTPUT
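
The examples above index the axes array either with nested loops or with explicit positions. One alternative could look like the following sketch, which assumes a 2 x 3 grid and sine curves chosen only for illustration. It iterates over ax.flat and uses the sharex and sharey arguments of subplots() so that all panels use the same axis limits.

```python
import matplotlib.pyplot as plt
import numpy as np

# 2 rows x 3 columns; sharex/sharey keep the axis limits aligned
fig, ax = plt.subplots(2, 3, sharex=True, sharey=True)
print(ax.shape)  # ax is a 2-D NumPy array of Axes objects

x = np.linspace(0, 2 * np.pi, 200)
# ax.flat walks over every Axes without nested loops
for k, axis in enumerate(ax.flat, start=1):
    axis.plot(x, np.sin(k * x))
    axis.set_title(f'sin({k}x)')

fig.tight_layout()  # reduce overlap between titles and tick labels
fig.savefig('subplots_sketch.png')  # or plt.show() in an interactive session
```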

Text and Annotation


Basic Annotation
The basic text() function places text at an arbitrary
position on the Axes. A common use case of text is to annotate
some feature of the plot, and the annotate() method provides
helper functionality to make annotations easy.
In an annotation, there are two points to consider:
1. The location being annotated, represented by the argument
xy, and
2. The location of the text, xytext.
Both of these arguments are (x, y) tuples.

Syntax: matplotlib.pyplot.annotate(text, xy, xytext=None, xycoords='data',
textcoords=None, arrowprops=None, annotation_clip=None, **kwargs)
Parameters: This method accepts the following parameters:
● text: The text of the annotation (named s in older Matplotlib releases).
● xy: The point (x, y) to annotate.
● xytext: Optional. The position (x, y) to place the text at.
● xycoords: Optional. A string naming the coordinate system that xy is
given in.
● textcoords: Optional. A string naming the coordinate system that xytext
is given in, which may be different from the coordinate system used for xy.
● arrowprops: Optional. A dict of properties for the arrow drawn between
xy and xytext. Its default value is None.
● annotation_clip: Optional. A boolean. Its default value is None, which
behaves as True.
Returns: This method returns the annotation.

Example:1

# Implementation of matplotlib.pyplot.annotate()
# function
import matplotlib.pyplot as plt
import numpy as np
fig, geeeks = plt.subplots()
t = np.arange(0.0, 5.0, 0.001)
s = np.cos(3*np.pi*t)
line = geeeks.plot(t, s, lw=2)
# Annotation
geeeks.annotate('Local Max', xy=(3.3, 1),
                xytext=(3, 1.8),
                arrowprops=dict(facecolor='green',
                                shrink=0.05),)
geeeks.set_ylim(-2, 2)
# Plot the Annotation in the graph
plt.show()
OUTPUT
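
The example above uses annotate() only. To contrast it with the basic text() function described earlier, here is a short sketch, assuming a sine curve and label positions picked purely for illustration. text() places a plain label, while annotate() ties the words at xytext to the point xy with an arrow.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)

# text(): a plain label at data position (0.5, -0.8), no arrow
label = ax.text(0.5, -0.8, 'y = sin(x)')

# annotate(): xy is the point being marked, xytext is where the words go
peak_x = np.pi / 2
note = ax.annotate('first peak', xy=(peak_x, 1.0), xytext=(3.0, 1.4),
                   arrowprops=dict(arrowstyle='->'))
ax.set_ylim(-1.5, 2.0)

print(label.get_position())
print(note.get_text())
fig.savefig('annotation_sketch.png')  # or plt.show() in an interactive session
```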
Python Pandas - Options and Customization
Pandas provides an API to customize some aspects of its behavior; the display options
are the ones used most often.
The API is composed of five relevant functions. They are −
● get_option()
● set_option()
● reset_option()
● describe_option()
● option_context()
Let us now understand how the functions operate.

get_option(param)
get_option takes a single parameter and returns the value as given in the output below −
display.max_rows

Displays the default maximum number of rows. The interpreter reads this value and uses
it as the upper limit on the number of rows to display.
import pandas as pd
print(pd.get_option("display.max_rows"))
Its output is as follows −
60
display.max_columns

Displays the default maximum number of columns. The interpreter reads this value and uses
it as the upper limit on the number of columns to display.
import pandas as pd
print(pd.get_option("display.max_columns"))
Its output is as follows −
20
Here, 60 and 20 are the default configuration parameter values.

set_option(param,value)
set_option takes two arguments and sets the value to the parameter as shown below −
display.max_rows

Using set_option(), we can change the default number of rows to be displayed.


import pandas as pd
pd.set_option("display.max_rows",80)
print(pd.get_option("display.max_rows"))
Its output is as follows −
80
display.max_columns

Using set_option(), we can change the default number of columns to be displayed.


import pandas as pd
pd.set_option("display.max_columns",30)
print(pd.get_option("display.max_columns"))
Its output is as follows −
30

reset_option(param)
reset_option takes an argument and sets the value back to the default value.
display.max_rows

Using reset_option(), we can change the value back to the default number of rows to be
displayed.
import pandas as pd
pd.reset_option("display.max_rows")
print(pd.get_option("display.max_rows"))
Its output is as follows −
60

describe_option(param)
describe_option prints the description of the argument.
display.max_rows

Using describe_option(), we can print the description of the display.max_rows
option.
import pandas as pd
pd.describe_option("display.max_rows")
Its output is as follows −
display.max_rows : int
If max_rows is exceeded, switch to truncate view. Depending on
'large_repr', objects are either centrally truncated or printed as
a summary view. 'None' value means unlimited.
In case python/IPython is running in a terminal and `large_repr`
equals 'truncate' this can be set to 0 and pandas will auto-detect
the height of the terminal and print a truncated object which fits
the screen height. The IPython notebook, IPython qtconsole, or
IDLE do not run in a terminal and hence it is not possible to do
correct auto-detection.
[default: 60] [currently: 60]

option_context()
The option_context context manager is used to set an option temporarily within a with statement.
Option values are restored automatically when you exit the with block −
display.max_rows

Using option_context(), we can set the value temporarily.

import pandas as pd
with pd.option_context("display.max_rows",10):
    print(pd.get_option("display.max_rows"))
print(pd.get_option("display.max_rows"))
Its output is as follows −
10
60
Note the difference between the first and the second print statements. The first statement
prints the value set by option_context(), which is temporary within the with
context itself. After the with context, the second print statement prints the configured
value.
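
The restore behavior can be confirmed by comparing the option before and after the block. The following is a minimal sketch with illustrative variable names; it also sets two options in a single option_context() call.

```python
import pandas as pd

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", 10, "display.max_columns", 5):
    # several option/value pairs can be set in one call
    inside_rows = pd.get_option("display.max_rows")
    inside_cols = pd.get_option("display.max_columns")

after = pd.get_option("display.max_rows")

print(inside_rows, inside_cols)
print(before == after)  # compares the setting before and after the block
```

Unlike set_option(), this approach needs no matching reset_option() call, because leaving the with block undoes the change.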
Frequently used Parameters

Sr.No   Parameter & Description

1       display.max_rows
        Maximum number of rows to display
2       display.max_columns
        Maximum number of columns to display
3       display.expand_frame_repr
        Controls whether wide DataFrames stretch across pages
4       display.max_colwidth
        Maximum column width
5       display.precision
        Precision for decimal numbers
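
As an illustration of one of these parameters, the sketch below changes display.precision temporarily and compares the printed form of a small DataFrame. The data values and variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"value": [3.14159265, 2.71828183]})

# rendered with the current (default) display precision
default_text = df.to_string()

with pd.option_context("display.precision", 2):
    short_text = df.to_string()  # rendered with two decimal places

print(default_text)
print(short_text)
```

The option only affects how numbers are displayed; the values stored in the DataFrame are unchanged.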

3D plotting in Python using matplotlib


Data visualization is one area where a large number of libraries have been
developed in Python.
Among these, Matplotlib is the most popular choice for data visualization.
While initially developed for plotting 2-D charts like histograms, bar charts, scatter
plots, line plots, etc., Matplotlib has extended its capabilities to offer 3D plotting
modules as well.
We begin by plotting a single point in a 3D coordinate space. Then we’ll move on to
more complicated plots like 3D Gaussian surfaces, 3D polygons, etc.

Plot a single point in a 3D space


Let us begin by going through every step necessary to create a 3D plot in Python,
with an example of plotting a point in 3D space.

Step 1: Import the libraries


import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

The first one is a standard import statement for plotting using matplotlib, which you
would see for 2D plotting as well.
The second import, of the Axes3D class, enables 3D projections in older Matplotlib
releases. Recent releases register the '3d' projection automatically, so the import is
harmless but optional. It is not used anywhere else in the code.
Step 2: Create figure and axes
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, projection='3d')

Output:

Here we are first creating a figure of size 4 inches X 4 inches.


We then create a 3-D axis object by calling the add_subplot method and specifying the
value ‘3d’ to the projection parameter.
We will use this axis object ‘ax’ to add any plot to the figure.
Step 3: Plot the point

After we create the axes object, we can use it to create any type of plot we want in
the 3D space.
To plot a single point, we will use the scatter() method, and pass the three
coordinates of the point.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(2,3,4) # plot the point (2,3,4) on the figure
plt.show()

OUTPUT

Plotting a 3D continuous line


Now that we know how to plot a single point in 3D, we can similarly plot a
continuous line passing through a list of 3D coordinates.

We will use the plot() method and pass 3 arrays, one each for the x, y, and z
coordinates of the points on the line.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-4*np.pi,4*np.pi,50)
y = np.linspace(-4*np.pi,4*np.pi,50)
z = x**2 + y**2
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot(x,y,z)
plt.show()

OUTPUT
We are generating x, y, and z coordinates for 50 points.
The x and y coordinates are generated using np.linspace, which gives 50 uniformly
distributed points between -4π and +4π. The z coordinate is simply the sum of the
squares of the corresponding x and y coordinates.

Customizing a 3D plot
Let us plot a scatter plot in 3D space and look at how we can customize its
appearance in different ways based on our preferences
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
np.random.seed(42)
xs = np.random.random(100)*10+20
ys = np.random.random(100)*5+7
zs = np.random.random(100)*15+50
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xs,ys,zs)
plt.show()

OUTPUT
Let us now add a title to this plot

Adding a title

We will call the set_title method of the axes object to add a title to the plot.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
np.random.seed(42)
xs = np.random.random(100)*10+20
ys = np.random.random(100)*5+7
zs = np.random.random(100)*15+50
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xs,ys,zs)
# set the title after the axes object has been created
print(ax.set_title("Atom velocity distribution"))
plt.show()

OUTPUT

Text(0.5, 0.92, 'Atom velocity distribution')
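
The introduction to this section mentions 3D Gaussian surfaces as a more complicated plot. A rough sketch of one is given below, assuming an unnormalized Gaussian centered at the origin and axis labels chosen only for illustration. It uses the plot_surface() method of the 3D axes object and adds axis labels next to the title.

```python
import matplotlib.pyplot as plt
import numpy as np

# grid of (x, y) positions covering [-3, 3] in both directions
x = np.linspace(-3, 3, 60)
y = np.linspace(-3, 3, 60)
X, Y = np.meshgrid(x, y)

# unnormalized 2-D Gaussian centered at the origin
Z = np.exp(-(X**2 + Y**2) / 2)

fig = plt.figure(figsize=(5, 4))
ax = fig.add_subplot(111, projection='3d')
surface = ax.plot_surface(X, Y, Z, cmap='viridis')

ax.set_title('Gaussian surface')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('height')
fig.savefig('gaussian_surface.png')  # or plt.show() in an interactive session
```

np.meshgrid turns the two 1-D coordinate arrays into 2-D grids, which is the form plot_surface() expects for X, Y and Z.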


Plotting data on a map (Example Gallery)

Following are a series of examples that illustrate how to use Basemap instance methods
to plot your data on a map. More examples are included in the examples directory of
the basemap source distribution. There are a number of Basemap instance methods for
plotting data:

● contour(): draw contour lines.
● contourf(): draw filled contours.
● imshow(): draw an image.
● pcolor(): draw a pseudocolor plot.
● pcolormesh(): draw a pseudocolor plot (faster version for regular
meshes).
● plot(): draw lines and/or markers.
● scatter(): draw points with markers.
● quiver(): draw vectors.
● barbs(): draw wind barbs.
● drawgreatcircle(): draw a great circle.

Many of these instance methods simply forward to the corresponding
matplotlib Axes instance method, with some extra pre/post processing and
argument checking. You can also plot on the map directly with the
matplotlib pyplot interface, or the OO api, using the Axes instance associated
with the Basemap.
For more specifics of how to use the Basemap instance methods, see The
Matplotlib Basemap Toolkit API.

Here are the examples (many of which utilize the netcdf4-python module to
retrieve datasets over http):

● Plot contour lines on a basemap

from mpl_toolkits.basemap import Basemap


import matplotlib.pyplot as plt
import numpy as np
# set up orthographic map projection with
# perspective of satellite looking down at 45N, 100W.
# use low resolution coastlines.
map = Basemap(projection='ortho',lat_0=45,lon_0=-100,resolution='l')
# draw coastlines, country boundaries, fill continents.
map.drawcoastlines(linewidth=0.25)
map.drawcountries(linewidth=0.25)
map.fillcontinents(color='coral',lake_color='aqua')
# draw the edge of the map projection region (the projection limb)
map.drawmapboundary(fill_color='aqua')
# draw lat/lon grid lines every 30 degrees.
map.drawmeridians(np.arange(0,360,30))
map.drawparallels(np.arange(-90,90,30))
# make up some data on a regular lat/lon grid.
nlats = 73; nlons = 145; delta = 2.*np.pi/(nlons-1)
lats = (0.5*np.pi-delta*np.indices((nlats,nlons))[0,:,:])
lons = (delta*np.indices((nlats,nlons))[1,:,:])
wave = 0.75*(np.sin(2.*lats)**8*np.cos(4.*lons))
mean = 0.5*np.cos(2.*lats)*((np.sin(2.*lats))**2 + 2.)
# compute native map projection coordinates of lat/lon grid.
x, y = map(lons*180./np.pi, lats*180./np.pi)
# contour data over the map.
cs = map.contour(x,y,wave+mean,15,linewidths=1.5)
plt.title('contour lines over filled continent background')
plt.show()

OUTPUT
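
The synthetic grid used in the contour example can be inspected without Basemap. The following is a minimal sketch using only NumPy; the names rows, cols, lats_deg and lons_deg are introduced here for illustration and are not part of the original example. It suggests that the grid runs from the north pole to the south pole in latitude and once around the globe in longitude, in 2.5 degree steps.

```python
import numpy as np

# Same grid construction as in the contour example.
nlats, nlons = 73, 145
delta = 2.0 * np.pi / (nlons - 1)
rows, cols = np.indices((nlats, nlons))
lats = 0.5 * np.pi - delta * rows   # radians, north pole in the first row
lons = delta * cols                 # radians, starting at 0

# Convert to degrees, as done before calling map(...) in the example.
lats_deg = np.degrees(lats)
lons_deg = np.degrees(lons)

print(lats_deg.shape)                      # grid shape: (nlats, nlons)
print(lats_deg[0, 0], lats_deg[-1, 0])     # approximately 90 and -90
print(lons_deg[0, 0], lons_deg[0, -1])     # approximately 0 and 360
print(lats_deg[0, 0] - lats_deg[1, 0])     # grid spacing in degrees
```

Checking the grid this way before projecting it makes it easier to tell a data problem from a projection problem.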

Data Visualization using Seaborn (a Python library)

Data visualization is the presentation of data in pictorial format.

It is extremely important for data analysis, and Python supports it well, primarily
because of its ecosystem of data-centric packages.

Visualization helps to understand the data and its significance by summarizing and
presenting a large amount of data in a simple, easy-to-understand format, and it helps
communicate information clearly and effectively.

Pandas and Seaborn are two of those packages; together they make importing and
analyzing data much easier.

Seaborn
Seaborn is a Python visualization library for statistical graphics plotting.
It is built on top of the Matplotlib library and is closely integrated with
the data structures from pandas.
It provides a high-level interface for drawing attractive and informative statistical
graphics.
● Usage: Those who want to create enhanced data visuals, especially in color.
About Seaborn’s Pros and Cons:
● Pro: Includes higher-level interfaces and settings than Matplotlib does.
● Pro: Relatively simple to use, just like Matplotlib.
● Pro: Easier to use when working with DataFrames.
● Con: Like Matplotlib, its visualizations tend to be simpler than those of other tools.
Its features help in:
● Built-in themes for styling Matplotlib graphics
● Visualizing univariate and bivariate data
● Fitting and visualizing linear regression models
● Plotting statistical time series data
● Working with NumPy and Pandas data structures
Seaborn - Installation
Installing Seaborn should be straightforward; it can be installed with pip
(pip install seaborn).
The following command imports Pandas:
# Pandas for managing datasets
import pandas as pd
Now, let us import the Matplotlib library, which helps us customize our plots.
# Matplotlib for additional customization
from matplotlib import pyplot as plt
We will import the Seaborn library with the following command:
# Seaborn for plotting and styling
import seaborn as sb
Sample code:
# import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb
# IPython/Jupyter magic: render plots inline (notebook only)
%matplotlib inline

Importing Data as Pandas DataFrame


We will import a dataset. This dataset loads as a Pandas DataFrame by default. Any
function available on a Pandas DataFrame works on this DataFrame.
Example

# Seaborn for plotting and styling
import seaborn as sb
# load_dataset() downloads the example data, so it needs an internet connection
df = sb.load_dataset('tips')
print(df.head())

OUTPUT

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
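
Because the result is an ordinary DataFrame, the usual Pandas operations apply to it. The following is a minimal sketch that rebuilds the five rows printed by df.head() by hand, so that it runs without downloading anything; the column tip_pct and the variables mean_tip and tip_by_sex are hypothetical names introduced only for illustration.

```python
import pandas as pd

# The five rows shown by df.head(), rebuilt by hand (no download needed).
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No"] * 5,
    "day": ["Sun"] * 5,
    "time": ["Dinner"] * 5,
    "size": [2, 3, 3, 2, 4],
})

# Ordinary DataFrame operations work directly on the data.
mean_tip = df["tip"].mean()                      # column mean
tip_by_sex = df.groupby("sex")["tip"].mean()     # grouped mean
df["tip_pct"] = 100 * df["tip"] / df["total_bill"]  # hypothetical derived column

print(round(mean_tip, 3))
print(tip_by_sex)
print(df[["total_bill", "tip", "tip_pct"]])
```

The same calls work unchanged on the full dataset returned by load_dataset('tips').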

To view all the available datasets in the Seaborn library, you can use the
get_dataset_names() function:

# Seaborn for plotting and styling
import seaborn as sb
df = sb.load_dataset('tips')
print(df.head())
print(sb.get_dataset_names())
OUTPUT

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

['anagrams', 'anscombe', 'attention', 'brain_networks',
'car_crashes', 'diamonds', 'dots', 'dowjones', 'exercise',
'flights', 'fmri', 'geyser', 'glue', 'healthexp', 'iris',
'mpg', 'penguins', 'planets', 'seaice', 'taxis', 'tips',
'titanic']

Visualizing data is one step, and making the visualization more pleasing is another.
Visualization plays a vital role in communicating quantitative insights to an audience
and catching their attention.

Aesthetics means a set of principles concerned with the nature and appreciation of
beauty, especially in art. Visualization is the art of representing data in the most
effective and easiest possible way. Matplotlib supports extensive customization, but
to make use of it one must know which settings to tweak to achieve an attractive plot.
Unlike Matplotlib, Seaborn comes packed with customized themes and a high-level
interface for customizing and controlling the look of Matplotlib figures.

import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
sinplot()
plt.show()

OUTPUT
To change the same plot to Seaborn defaults, use the set() function:
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set()
sinplot()
plt.show()

OUTPUT

The above two figures show the difference between the default Matplotlib and Seaborn plots.
The data represented is the same, but the representation style differs.

Basically, Seaborn splits the Matplotlib parameters into two groups:

● Plot styles
● Plot scale

The interface for manipulating the styles is set_style(). Using this function you can set
the theme of the plot. Five themes are available:

● darkgrid
● whitegrid
● dark
● white
● ticks

The default theme of the plot is darkgrid, which we have seen in the
previous example.
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set_style("whitegrid")
sinplot()
plt.show()

OUTPUT
The difference between the above two plots is the background color.
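
The themes can also be compared without drawing anything. The following is a rough sketch that relies on seaborn's axes_style() function, which returns the parameter dictionary for a named theme without applying it; the Agg backend line is an assumption made so that the sketch runs without a display, and the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import seaborn as sb

themes = ["darkgrid", "whitegrid", "dark", "white", "ticks"]

# axes_style(name) returns the theme's parameters without changing global settings.
styles = {name: sb.axes_style(name) for name in themes}

for name in themes:
    s = styles[name]
    print(f"{name:10s} facecolor={s['axes.facecolor']}  grid={s['axes.grid']}")
```

The two keys printed here, axes.facecolor and axes.grid, are the ones that account for the visible background and grid differences between themes.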

Removing Axes Spines

In the white and ticks themes, we can remove the top and right axis spines using
the despine() function.

Example

import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set_style("white")
sinplot()
sb.despine()
plt.show()

OUTPUT

In regular plots, we use the left and bottom axes only. Using the despine() function, we
can remove the unnecessary right and top axes spines in a single call; in plain
Matplotlib, each spine has to be hidden individually.
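
The same result can be reached in plain Matplotlib by hiding the spines one at a time. The following is a minimal sketch comparing the two approaches side by side; the axes names ax_mpl and ax_sb and the sample data are illustrative, and the Agg backend is assumed so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
from matplotlib import pyplot as plt
import seaborn as sb

fig, (ax_mpl, ax_sb) = plt.subplots(1, 2)
for ax in (ax_mpl, ax_sb):
    ax.plot([0, 1, 2], [0, 1, 4])

# Plain Matplotlib: hide each unwanted spine individually.
for side in ("top", "right"):
    ax_mpl.spines[side].set_visible(False)

# Seaborn: a single call; the top and right spines are removed by default.
sb.despine(ax=ax_sb)

sides = ("top", "right", "left", "bottom")
print([ax_mpl.spines[s].get_visible() for s in sides])
print([ax_sb.spines[s].get_visible() for s in sides])
```

Both axes end up with only the left and bottom spines visible.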
Overriding the Elements

If you want to customize the Seaborn styles, you can pass a dictionary of parameters to
the set_style() function. The available parameters can be viewed using the axes_style()
function.

Altering the value of any parameter alters the plot style.
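
As a rough sketch of both steps, the snippet below lists the parameters returned by axes_style() and then overrides one of them through set_style(). It assumes that set_style() writes its result into Matplotlib's global rcParams, which is how the override takes effect; the Agg backend line is only there so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import seaborn as sb

# With no arguments, axes_style() returns the current style parameters.
params = sb.axes_style()
for key in sorted(params):
    print(key, "=", params[key])

# Override a single parameter on top of the darkgrid theme.
sb.set_style("darkgrid", {"axes.axisbelow": False})

# The override is now visible in Matplotlib's global settings.
print(matplotlib.rcParams["axes.axisbelow"])
print(matplotlib.rcParams["axes.grid"])
```

Only keys that appear in the axes_style() dictionary are meant to be overridden this way.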

Example

import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set_style("darkgrid", {'axes.axisbelow': False})
sinplot()
sb.despine()
plt.show()

OUTPUT

Scaling Plot Elements

We can also control the scale of the plot elements using the set_context() function.
There are four preset contexts, named in order of relative size:
● paper
● notebook
● talk
● poster
By default, the context is set to notebook, which was used in the plots above.
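
The relative sizes can be checked numerically. The following is a minimal sketch using seaborn's plotting_context() function, which returns the scaling parameters for a named context without applying them; the dictionary names font_sizes and line_widths are illustrative, and the Agg backend is assumed so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import seaborn as sb

contexts = ["paper", "notebook", "talk", "poster"]

# plotting_context(name) returns the parameters without changing global settings.
font_sizes = {name: sb.plotting_context(name)["font.size"] for name in contexts}
line_widths = {name: sb.plotting_context(name)["lines.linewidth"] for name in contexts}

for name in contexts:
    print(f"{name:9s} font.size={font_sizes[name]}  lines.linewidth={line_widths[name]}")
```

Font sizes and line widths grow from paper to poster, which is why the same figure looks heavier in the talk and poster contexts.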

Example

import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set_style("darkgrid", {'axes.axisbelow': False})
# scale fonts and lines up from the default "notebook" context
sb.set_context("talk")
sinplot()
sb.despine()
plt.show()

OUTPUT

Color Palette

Color palettes built with color_palette() fall into three types:

● qualitative
● sequential
● diverging

Another function, seaborn.palplot(), deals with color palettes: it plots a palette
as a horizontal array of colors.
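
Before plotting a palette, it can help to look at what color_palette() returns. The following is a minimal sketch, assuming the named qualitative palette "deep" that ships with seaborn; it shows that a palette behaves like a list of (r, g, b) tuples with components between 0 and 1, and that as_hex() converts it to hex strings.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import seaborn as sb

palette = sb.color_palette("deep")   # a named qualitative palette
hex_codes = palette.as_hex()         # the same colors as hex strings

print(len(palette))      # number of colors in the palette
print(palette[0])        # first color as an (r, g, b) tuple
print(hex_codes[:3])     # first three colors as hex strings
```

palplot() draws exactly these colors, one square per entry.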


Qualitative Color Palettes

Qualitative or categorical palettes are best suited to plotting categorical data.

from matplotlib import pyplot as plt
import seaborn as sb
current_palette = sb.color_palette()
sb.palplot(current_palette)
plt.show()

OUTPUT

Sequential Color Palettes

Sequential palettes are suitable for expressing the distribution of data ranging from
relatively low values to high values within a range.

Appending the character ‘s’ to a color name (for example, "Greens") selects the
corresponding sequential palette.
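
The ordering of a sequential palette can be checked numerically. The following is a rough sketch that uses the mean of the RGB components as a simple stand-in for brightness (an assumption made for illustration, not a perceptual measure). It also uses the "_r" suffix, the Matplotlib colormap naming convention for a reversed map.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import seaborn as sb

greens = sb.color_palette("Greens", 6)      # light to dark
greens_r = sb.color_palette("Greens_r", 6)  # reversed: dark to light

# Mean of the RGB components as a crude brightness measure.
brightness = [sum(c) / 3 for c in greens]
brightness_r = [sum(c) / 3 for c in greens_r]

print([round(b, 2) for b in brightness])
print([round(b, 2) for b in brightness_r])
```

A palette that starts light and ends dark maps low values to pale colors and high values to saturated ones.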

Example

from matplotlib import pyplot as plt
import seaborn as sb
sb.palplot(sb.color_palette("Greens"))
plt.show()

OUTPUT

Diverging Color Palette


Diverging palettes use two different colors. Each color represents variation in the value
ranging from a common point in either direction.

Assume we are plotting data ranging from -1 to 1. The values from -1 to 0 take one color
and the values from 0 to +1 take another color.

A diverging palette is meaningful only when it is centered on the common point (zero in
this example). Plotting functions such as heatmap() accept a center parameter that sets
this value.
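
The structure of a diverging palette can be seen in its numbers. The following is a minimal sketch that samples seven colors from the "BrBG" palette and locates the lightest one, again using the mean of the RGB components as a crude brightness measure (an assumption made for illustration).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import seaborn as sb

brbg = sb.color_palette("BrBG", 7)   # brown on one side, blue-green on the other

# Mean of the RGB components as a crude brightness measure.
brightness = [sum(c) / 3 for c in brbg]
lightest = brightness.index(max(brightness))

print([round(b, 2) for b in brightness])
print(lightest)   # index of the lightest color
```

With an odd number of colors, the lightest entry sits in the middle, which is the color that values at the common point receive.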

from matplotlib import pyplot as plt
import seaborn as sb
sb.palplot(sb.color_palette("BrBG", 7))
plt.show()

OUTPUT
