XII-IP - Data Visualisation
X-Value: Dataset for X-Axis. If no values are given for the X-Axis, the default is [0..N-1].
Dataset may be a 1-D Array, List, Series or Column of a DataFrame.
Y-Value: Dataset for Y-Axis. Number of values for X-Axis and Y-Axis should be equal.
Line/Marker-Color: Defines color code for line and marker symbol.
linewidth=<n>: A number indicating thickness of line.
linestyle=<style>: Line style may be solid, dashed, dashdot or dotted as per
given code.
marker=<style>: Defines marker style code.
markersize=<n>: Defines size of marker as a number.
markeredgecolor=<color code>: Defines color for marker edge.
label=<text>: Defines label text for legend.
Customizing Line Chart: plot( )
❖ Color code for lines/ Marker/ Marker edge:
You can also combine the marker type with the line color code, e.g. 'r+' means
red color with + marker. Used on its own, such a code draws only the markers,
so it creates a scatter graph.
Line color is black (color 'k'), marker type = small diamond ('d'),
marker size = 5 points, marker color = 'red'.
import matplotlib.pyplot as plt
p=[1,2,3,4]
q=[2,4,6,8]
# 'r+' = red color with + marker; linestyle='solid' keeps the connecting line
plt.plot(p,q,'r+',linestyle='solid')
plt.show()
Line color and marker style combined so marker takes same color as
line and style as ‘+’.
import matplotlib.pyplot as plt
p=[1,2,3,4]
q=[2,4,6,8]
# red line with + markers; markeredgecolor='b' draws the markers in blue
plt.plot(p,q,'r+',linestyle='solid', markeredgecolor='b')
plt.show()
X-Value: Dataset for X-Axis. Dataset may be a 1-D Array, List, Series or Column of
a DataFrame.
Y-Value: Dataset for Y-Axis. Number of values for X-Axis and Y-Axis should be
equal.
marker=<style>: Defines marker style code.
s=<size(s)>: Defines size of marker points as a number. You can also give
a list of sizes, one for each point, if required.
c=<color code(s)>: Defines color for markers. You can also give a list of
separate color codes, one for each point.
label=<text>: Defines label text for legend.
Color codes and marker style characters can be used as in the plot() method.
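The per-point sizes and colors described above can be sketched as follows. The year and pass figures are sample data; the sizes and color lists are assumptions chosen only to make each point look different.

```python
import matplotlib.pyplot as plt

year = [2018, 2019, 2020, 2021]
pas = [28, 34, 36, 30]

sizes = [20, 60, 100, 40]      # hypothetical sizes, one per point
colors = ['r', 'g', 'b', 'm']  # hypothetical color codes, one per point

# s and c accept either a single value or a list with one entry per point
points = plt.scatter(year, pas, marker='D', s=sizes, c=colors,
                     label='Pass students')
plt.legend()
```

Call plt.show() afterwards to display the chart.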
Customizing Scatter Chart: scatter ()
❖ Example of scatter chart:
#Scatter chart for year & pass students
import matplotlib.pyplot as plt
year=[2018,2019,2020,2021]
pas= [28,34,36,30]
#plotting graph
plt.scatter(year,pas,marker="D",s=15,c='r')
plt.show()
As we can see, a scatter chart is similar to a line chart, except that in a line chart all
marker points are connected by a line. So, if we disconnect the marker points,
we can get a scatter chart through the plot() function too.
In the plot() function, if we give a marker type with the line color code and no line
style, it creates a scatter graph. For example, 'rd' gives red diamond-shaped marker points.
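A minimal sketch of this behaviour, reusing the year and pass figures from the scatter example as sample data:

```python
import matplotlib.pyplot as plt

year = [2018, 2019, 2020, 2021]
pas = [28, 34, 36, 30]

# 'rd' = red color + diamond marker. No line style is given,
# so the points are drawn without a connecting line.
(markers,) = plt.plot(year, pas, 'rd')
```

Adding linestyle='solid' to the same call would connect the points again and turn the result back into a line chart.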
X-Value: Dataset for X-Axis. Dataset may be a 1-D Array, List, Series or Column of a
DataFrame.
Y-Value: Dataset for Y-Axis. Number of values for X-Axis and Y-Axis should be
equal.
width=<size(s)>: Defines width of bars as a number. You can also give a list of sizes,
one for each bar, if required.
color=<color code(s)>: Defines color for bars. You can also give a list of separate
color codes, one for each bar.
label=<text>: Defines label text for legend.
Color code characters can be used as in the plot() method.
Customizing Bar Chart: bar()
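A minimal sketch of a customized bar chart, assuming the same year and pass figures used for the scatter chart; the width and color values are illustrative only.

```python
import matplotlib.pyplot as plt

year = [2018, 2019, 2020, 2021]
pas = [28, 34, 36, 30]

bars = plt.bar(
    year, pas,
    width=0.5,                    # one width for all bars (a list also works)
    color=['r', 'g', 'b', 'm'],   # one color code per bar
    label='Pass students',        # text shown in the legend
)
plt.legend()
```

Call plt.show() afterwards to display the chart.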
Data-Value: Dataset for the chart. Dataset may be a 1-D Array, List, Series
or Column of a DataFrame.
bins=<number(s)>: Defines the number of bins if given as a number, or the bin
edges (ranges) if given as a list of numbers.
histtype=<type>: Defines the type of plot. Type may be bar, barstacked, step
or stepfilled. Default is bar. The barstacked type is used for multiple data
sets, in which one is stacked on top of the other.
cumulative=<True/False>: Creates a cumulative graph if True. Default is False.
orientation=<'horizontal'|'vertical'>: Defines orientation of the plot as
horizontal bars or vertical bars.
import matplotlib.pyplot as plt
# each value in the first list is counted with its weight instead of 1
plt.hist([5,15,25,35,15,55], bins=[0,10,20,30,40,50,60],
         weights=[20,10,45,33,6,8], edgecolor="red")
plt.show()
# In the interval (bin) 40 to 50 there is no bar, because the first
# argument (list) of hist() contains no value between 40 and 50.
# In the interval 10 to 20 the bar height is 16 (weights 10 + 6 are
# added), because 15 appears twice in the first argument.
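The bar heights of the weighted histogram above can be checked by hand. The following sketch adds up the weights bin by bin in plain Python; it only mirrors the arithmetic and is not how hist() is implemented.

```python
values = [5, 15, 25, 35, 15, 55]
weights = [20, 10, 45, 33, 6, 8]
edges = [0, 10, 20, 30, 40, 50, 60]

totals = [0] * (len(edges) - 1)   # one running total per bin
for value, weight in zip(values, weights):
    for i in range(len(totals)):
        is_last = i == len(totals) - 1
        # every bin is [left, right), except the last one, which also
        # includes its right edge
        if edges[i] <= value < edges[i + 1] or (is_last and value == edges[-1]):
            totals[i] += weight
            break

print(totals)  # [20, 16, 45, 33, 0, 8]
```

The second total is 16 (10 + 6) and the fifth is 0, which matches the missing bar between 40 and 50.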
Customizing Histogram Chart: hist()
❖ Example of Histogram chart
# Histogram for age of students
import matplotlib.pyplot as plt
data=[2,4,5,6,2,6,8,10,12,12,11,
10,4,3,2,5,7,9,11,12,13,14,
15,14,10,9,7,9]
# histogram with 5 bins
plt.hist(data, bins=5)
plt.show()
Cumulative Histogram
import numpy as np
import matplotlib.pyplot as plt
x=np.random.randn(1000)
plt.hist(x, bins=20)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
x=np.random.randn(1000)
plt.hist(x, bins=50)
plt.show()
❖ Example of Histogram chart:
import numpy as np
import matplotlib.pyplot as plt
x=np.random.rand(100)
y=np.random.rand(100)
plt.hist([x, y], histtype='barstacked', cumulative=True)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
x=np.random.rand(100)
plt.hist(x,bins=20,
orientation='horizontal')
plt.show()
Frequency Polygon chart:
❖ Frequency Polygon Chart:
A frequency polygon is also a frequency distribution chart, in which the mid
point of each range or interval is marked and the points are connected with a line.
It is useful for comparing two or more distributions on the same axis.
Basically it is an extension of the histogram, in which an additional line is plotted to
connect the mid point of each frequency bar. So, a frequency polygon can be
visualized as a histogram combined with a line chart.
Pyplot does not provide a function / method for the frequency polygon, but we
can plot one by the following steps:
If we trace the values of the f, edges and mid variables, we get the
following arrays of values:
print(f)            [ 1. 4. 5. 4. 2. ]
print(edges)        [ 0 3 6 9 12 15 ]
print(edges[1:])    [ 3 6 9 12 15 ]
print(edges[:-1])   [ 0 3 6 9 12 ]
print(mid)          [ 1.5 4.5 7.5 10.5 13.5 ]
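One plausible way to arrive at these values is sketched below. The variable names f, edges and mid come from the trace above; the data list is hypothetical, chosen only so that the counts match, and the exact steps of the original are an assumption.

```python
import matplotlib.pyplot as plt

# Hypothetical data, chosen so that the counts match the traced values
data = [1, 3, 4, 5, 5, 6, 7, 7, 8, 8, 9, 10, 11, 11, 12, 14]

# Step 1: draw the histogram; hist() returns the counts and the bin edges
f, edges, _ = plt.hist(data, bins=[0, 3, 6, 9, 12, 15],
                       color='lightgray', edgecolor='k')

# Step 2: mid point of each bin = average of its left and right edge
mid = (edges[1:] + edges[:-1]) / 2

# Step 3: connect the mid points with a line to get the polygon
plt.plot(mid, f, 'r-', marker='o')
```

Call plt.show() afterwards to display the histogram with the polygon drawn over it.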
Pie chart: pie()
❖ pie( ): Creates pie chart for given data set.
Pie chart is a circle chart in which area of whole circle is divided into
sectors or slices to represent a part of the whole in percentage (%). Pie
chart takes single data range only i.e. it shows the share of individual
elements of given data range in respect to whole.
# creating simple pie chart
import matplotlib.pyplot as plt
data= [15,40,35,30,45]
# plotting graph
plt.pie(data)
plt.show()
<pyplot obj>.pie(<Data-Value> [, labels=<list of labels for sectors>]
[, colors=<color code(s)>] [, explode=<explode sequence>]
[, autopct=<format string>])
Data-Value: Dataset for chart. Dataset may be a 1-D Array, List, Series or Column of
a DataFrame.
labels=<list of labels>: Defines texts to be displayed for each sector or partition.
Number of labels should be equal to number of data elements.
colors=<color code(s)>: Defines colors for sectors. You should define a list of
separate color codes, one for each segment.
explode=<explode sequence>: You may pull out sectors to highlight data. Define a list
with 0, or a distance as a number, for each sector to be exploded.
autopct=<format string>: Defines format string for data labels as "%<width>d" or
"%<width>.<precision>f".
Example: "%5d" defines an integer number of width 5.
Example: "%6.2f" defines a number of width 6 with 2 decimal places.
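The pie() parameters can be combined as in the sketch below. It reuses the data list of the simple pie chart; the labels, colors and explode distances are assumptions made only for illustration.

```python
import matplotlib.pyplot as plt

data = [15, 40, 35, 30, 45]
names = ['A', 'B', 'C', 'D', 'E']   # hypothetical sector labels

patches, texts, autotexts = plt.pie(
    data,
    labels=names,                        # one label per sector
    colors=['r', 'g', 'b', 'm', 'c'],    # one color code per sector
    explode=[0, 0.1, 0, 0, 0],           # pull out the second sector only
    autopct='%6.2f',                     # share of each sector, 2 decimals
)
```

When autopct is given, pie() also returns the text objects that hold the formatted shares. Call plt.show() afterwards to display the chart.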
Customizing pie Chart: pie( )
❖ Example of pie chart:
# pie chart for expenses by person on tour
import matplotlib.pyplot as plt
exp=[2500,5000,3000,2500]
head=['Fooding','Lodging','Traveling','Misc']
# plotting graph
plt.pie(exp,labels=head,
        colors=['r','g','b','m'])
plt.show()
You can also suffix a % sign to the data labels by adding %%
to the format string given with autopct.
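The autopct format string follows Python's % string formatting, in which %% stands for a literal percent sign. The sketch below applies such a format string by hand to the tour expenses, to show the text that a setting like autopct='%.1f%%' would place on each sector.

```python
exp = [2500, 5000, 3000, 2500]
total = sum(exp)

# '%.1f' formats the share with one decimal place, '%%' adds the % sign
shares = ['%.1f%%' % (100 * value / total) for value in exp]
print(shares)  # ['19.2%', '38.5%', '23.1%', '19.2%']
```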
Box Plot chart: boxplot()
❖ boxplot(): Creates descriptive graph with 5 descriptions.
The Box Plot chart is a presentation of five descriptive indicators, which
comprise the following:
1. The minimum value of the data set: min()
2. The maximum value of the data set: max()
3. The median value of the data set (Q2): median()
4. The upper quartile: Q3
5. The lower quartile: Q1
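The five indicators can be computed directly, as in the sketch below. The data list is hypothetical. Quartile conventions vary between tools; this sketch uses the 'inclusive' method of Python's statistics module, and a box plot may also mark extreme values as outliers instead of extending the whiskers to the minimum and maximum.

```python
import statistics

data = [2, 4, 5, 6, 8, 10, 12, 14, 15]   # hypothetical data set

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3
q1, q2, q3 = statistics.quantiles(data, n=4, method='inclusive')

summary = {'min': min(data), 'Q1': q1, 'median': q2, 'Q3': q3, 'max': max(data)}
print(summary)
# {'min': 2, 'Q1': 5.0, 'median': 8.0, 'Q3': 12.0, 'max': 15}
```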
Customizing Box Plot: boxplot()
Pyplot's boxplot() function offers various settings to customize a box plot
chart, like orientation, notch and labels etc.
Data-Value: Dataset for chart. Dataset may be a 1-D Array, List, Series or
Column of a DataFrame. Multiple data sets may be given in a list.
notch=<True/False>: Produces notched box plot, if True. Otherwise
create simple box plot. Default is False.
vert=<True/False>: Produces vertical box plot, if True. Default is True.
meanline=<True/ False> : Shows mean line in box, if set True.
showbox=< True/ False> : Shows box, if set True. Default is True.
showmeans=<True/ False> : Shows arithmetical mean, if set True.
patch_artist= <True/ False> : Fills the box with color, if set True.
labels=<list of labels> : Defines labels to be displayed for each boxplot. Used
when multiple boxplots are being plotted on multiple data set.
Customizing Box Plot: boxplot()
❖ Example of box plot chart:
Simple Box plot
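A minimal sketch of a box plot for two data sets. Both lists are hypothetical. The tick labels are set with xticks() here instead of labels=, on the assumption that recent Matplotlib releases prefer a differently named parameter for this purpose.

```python
import matplotlib.pyplot as plt

# Hypothetical marks of two class sections
sec_a = [2, 4, 5, 6, 8, 10, 12, 14, 15]
sec_b = [5, 7, 8, 9, 10, 11, 12, 13, 20]

result = plt.boxplot(
    [sec_a, sec_b],      # multiple data sets are given as a list
    patch_artist=True,   # fill the boxes with color
    showmeans=True,      # mark the arithmetic mean of each data set
)
# the boxes are drawn at positions 1 and 2 on the X-axis
plt.xticks([1, 2], ['Section A', 'Section B'])
```

boxplot() returns a dictionary of the drawn parts (boxes, medians, means and so on). Call plt.show() afterwards to display the chart.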
x=<X-axis data>: Specify DataFrame column to be used for X-axis.
y=<Y-axis data>: Specify DataFrame column to be used for Y-axis.
kind=<graph type>: Specify which graph is to be plotted. Default chart type is line chart,
if no type is given.
Pandas Plot() Method:
❖ Line chart using Pandas plot() method:
import pandas as pd
import matplotlib.pyplot as plt
dct={'Male':[60,65,70,67],
     'Female':[34,55,32,46]}
df=pd.DataFrame(dct,index=['Assam','Tripura',
                           'Meghalaya','Manipur'])
df.plot(kind='line')
plt.show()
import pandas as pd
import matplotlib.pyplot as plt
dct={'Male':[60,65,70,67],
     'Female':[34,55,32,46]}
df=pd.DataFrame(dct,index=['Assam','Tripura',
                           'Meghalaya','Manipur'])
# x and y name the DataFrame columns used for each axis
df.plot(x='Male',y='Female', kind='scatter')
plt.show()
❖ Bar chart using Pandas plot() method:
import pandas as pd
import matplotlib.pyplot as plt
dct={'Male':[60,65,70,67],
     'Female':[34,55,32,46]}
df=pd.DataFrame(dct,index=['Assam','Tripura',
                           'Meghalaya','Manipur'])
df.plot(kind='bar')
plt.show()