
Chapter 2 - part 2 - (Histogram)

The document provides an overview of histograms, including their definition, construction, and various parameters used in the hist() function for plotting in Python. It explains how to create different types of histograms, such as simple, horizontal, step type, and cumulative histograms, along with examples and code snippets. Additionally, it mentions the importance of libraries like numpy, matplotlib, and pandas for data analysis in Python.

Uploaded by

Reshmi Manoj
Copyright
© All Rights Reserved

7/27/2024

Histogram
• A histogram is a powerful technique in data visualization.
• It is an accurate graphical representation of the distribution of numerical data.
• It was first introduced by Karl Pearson.
• It is an estimate of the distribution of a continuous (quantitative) variable.
• It is similar to a bar graph, except that each bar represents an interval of a numerical variable rather than a separate category.
• To construct a histogram:
  ◦ The first step is to "bin" the range of values, that is, divide the entire range of values into a series of intervals, and then count how many values fall into each interval.
  ◦ The bins (whose width is the width of each bar) are usually specified as consecutive, non-overlapping intervals of a variable.
• The bins (intervals) must be adjacent, and are often (but are not required to be) of equal size.
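
The binning step can be sketched in plain Python. The data values and the choice of five equal-width bins are assumptions made only for illustration; hist() does this counting for you.

```python
# Hypothetical data, chosen only to illustrate binning.
values = [12, 14, 15, 21, 22, 29, 34, 38, 41, 47]
num_bins = 5  # assumed number of intervals

low, high = min(values), max(values)
width = (high - low) / num_bins  # equal-width bins

counts = [0] * num_bins
for v in values:
    # Index of the interval that v falls into; the maximum value
    # is placed in the last bin so that it is not left out.
    index = min(int((v - low) / width), num_bins - 1)
    counts[index] += 1

for i, c in enumerate(counts):
    print(f"{low + i * width:g} to {low + (i + 1) * width:g}: {c}")
```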


hist() function – to plot a simple histogram

• The list h, passed as the argument to hist(), contains 20 readings in total.
• By default, there are 10 classes (bins).
• The class width is (180 - 150)/10 = 30/10 = 3.
• The classes 150-153, 153-156, …, 177-180 are shown on the X-axis.
• The number of readings in each class (the frequency) is shown on the Y-axis.
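
A minimal sketch of such a call is given here. The 20 readings in h are hypothetical values chosen to run from 150 to 180, so that the class width works out to 3 as described.

```python
import matplotlib.pyplot as plt

# Hypothetical heights in cm: 20 readings from 150 to 180.
h = [150, 152, 154, 155, 157, 158, 160, 161, 163, 164,
     165, 166, 168, 170, 171, 173, 175, 176, 178, 180]

# With no bins argument, hist() uses 10 equal-width bins.
counts, edges, patches = plt.hist(h)
print(edges)   # bin edges: 150, 153, ..., 180
print(counts)  # frequency of each class
plt.show()
```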

bins parameter of hist()

• The bins parameter of the hist() function lets us specify the number of classes.


facecolor and edgecolor parameters of hist()

To plot a simple histogram with random values & bins=8

• np.random.randn(n) returns an array of n randomly generated numbers (positive or negative), drawn from the standard normal distribution.
• Here, 100 random values are generated.
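
A short sketch of this plot follows. The color names are arbitrary choices: facecolor fills the bars and edgecolor draws their outlines.

```python
import numpy as np
import matplotlib.pyplot as plt

y = np.random.randn(100)  # 100 random values

# bins=8 gives 8 classes; the colors are arbitrary choices.
counts, edges, patches = plt.hist(y, bins=8, facecolor="skyblue", edgecolor="black")
plt.show()
```

Because the values are random, the shape of the histogram changes on every run.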


To plot a simple histogram with 100 random values & 25 bins

Adding Title and Axes Labels
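
A sketch combining both ideas, assuming numpy and pyplot are imported as np and plt. The title and label texts are placeholders, not taken from the original slide.

```python
import numpy as np
import matplotlib.pyplot as plt

y = np.random.randn(100)  # 100 random values
counts, edges, patches = plt.hist(y, bins=25)

plt.title("Histogram of 100 random values")  # placeholder text
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
```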


hist() and its arguments:


• hist() is used to plot a histogram.
• Arguments of hist() include:
  ◦ bins
  ◦ weights
  ◦ label
  ◦ orientation
  ◦ align
  ◦ edgecolor
  ◦ facecolor
• By default, hist() uses bins=10 (i.e., it divides the data into 10 intervals). To change this, use the bins argument.
  E.g.: plt.hist(y, bins=25) makes a histogram that contains 25 intervals.
• The weights parameter gives the weight that each data value contributes to its bin, and so determines the height of each bar.
• orientation='horizontal' gives a horizontal histogram. The default is 'vertical'.


bins parameter
• The bins parameter can be an int or a sequence.
• If bins is an integer, it defines the number of equal-width bins in the range (default: 10).
• If bins is a sequence, it defines the bin edges.
  ◦ In this case, the bins may be unequally spaced.
  ◦ All bins except the last (right-most) one are half-open, i.e., every bin except the last includes its left edge but excludes its right edge.
  ◦ In other words, if bins = [1, 2, 3, 4], then the first bin includes 1 but excludes 2, and the second includes 2 but excludes 3. The last bin, however, includes both 3 and 4.
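
The edge rule can be checked with a small sketch. The data list is a hypothetical choice that places exactly one value on each bin edge.

```python
import matplotlib.pyplot as plt

data = [1, 2, 3, 4]  # one value on each bin edge
counts, edges, patches = plt.hist(data, bins=[1, 2, 3, 4])

# 1 falls in [1, 2), 2 falls in [2, 3), and both 3 and 4 fall in [3, 4].
print(counts)  # [1. 1. 2.]
plt.show()
```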

To plot a simple histogram with attributes and bin intervals

• The number of values in the variables dt and wt should match.


weights parameter
• It is an array of weights, of the same shape as x (where x is the list of data items).
• Each value in x contributes only its associated weight towards the bin count, instead of 1.
• This parameter can be used to draw a histogram of data that has already been binned, by treating each bin as a single point with a weight equal to its count.
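
A sketch of the pre-binned case follows. The names dt and wt follow the variables mentioned earlier, but the values and the bin edges are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical pre-binned data: dt holds one representative point per
# class (the midpoints of 0-10, 10-20, ...), and wt holds the counts.
dt = [5, 15, 25, 35, 45]
wt = [4, 9, 12, 7, 3]  # must have as many values as dt

counts, edges, patches = plt.hist(dt, bins=[0, 10, 20, 30, 40, 50], weights=wt)
print(counts)  # each bar's height equals the weight of its point
plt.show()
```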

align parameter

• The align parameter sets the horizontal alignment of the histogram bars to 'left', 'mid' or 'right'.
  ◦ 'left': bars are centered on the left bin edges.
  ◦ 'mid': bars are centered between the bin edges.
  ◦ 'right': bars are centered on the right bin edges.
• The default value is 'mid'.
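
The effect of align can be seen by reading the bar positions back from the plot. This is a sketch with hypothetical data; get_x() and get_width() are methods of the rectangles that hist() returns.

```python
import matplotlib.pyplot as plt

data = [1, 1, 2, 3, 3, 3]  # hypothetical values
counts, edges, patches = plt.hist(data, bins=[1, 2, 3, 4], align="left")

# With align='left', the center of each bar sits on the bin's left edge.
centers = [p.get_x() + p.get_width() / 2 for p in patches]
print(centers)
plt.show()
```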


To plot a histogram of the number of students and their weights

To plot a histogram of the number of students and their weights, with no values in the range 40-50


To plot a histogram of the number of students and their weights, with no values in the range 40-50. Formatting, a title and axis labels are required.
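
One way such a plot might be written is sketched here. It assumes that "weights" means the students' body weights in kg; the data, colors and label texts are all hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical weights (kg) of 12 students; none lies between 40 and 50.
student_weights = [32, 35, 38, 39, 52, 54, 55, 57, 58, 61, 64, 68]

counts, edges, patches = plt.hist(student_weights, bins=[30, 40, 50, 60, 70],
                                  facecolor="orange", edgecolor="black")
plt.title("Weights of students")
plt.xlabel("Weight (kg)")
plt.ylabel("Number of students")
plt.show()
```

The 40-50 class has a frequency of 0, so it appears as a gap between the bars.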

SAVING FIGURE

• To save the plotted figure, use the following statement:
matplotlib.pyplot.savefig("filenamewithpath")
• For example, to save a figure on the D drive as pchart:
plt.savefig("D:\\pchart.png") OR plt.savefig("D:/pchart.png")
• Alternatively, click the save button on the GUI panel.
• If only a file name is given, without a path, the file is saved in the current working directory.
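
A short sketch of saving a histogram. The data is hypothetical, and the file name has no drive or folder so that the example runs on any system.

```python
import matplotlib.pyplot as plt

plt.hist([1, 2, 2, 3, 3, 3])  # hypothetical data
# Call savefig() before show(); with no folder in the name, the file
# goes to the current working directory.
plt.savefig("pchart.png")
plt.show()
```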


What will be the output of the following code?

Answer :


What change should be made in this program to get the following output?

Answer :


rwidth parameter of hist()

• rwidth = 0.6 means that the bar width is 60% of the bin width. The remaining 40% is left as empty space, split between the two sides of the bar.
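
The relation between rwidth and the bar width can be verified with a sketch like this one, using hypothetical data and bins that are 1 unit wide.

```python
import matplotlib.pyplot as plt

data = [1, 1, 2, 3, 3, 3]  # hypothetical values
counts, edges, patches = plt.hist(data, bins=[1, 2, 3, 4], rwidth=0.6)

# Each bin is 1 unit wide, so each bar is drawn 0.6 units wide.
print([p.get_width() for p in patches])
plt.show()
```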

Types of histograms

1. Simple (vertical) histogram
2. Horizontal histogram [orientation='horizontal']
3. Step type histogram / frequency polygon [histtype='step']
4. Cumulative histogram [cumulative=True]


CUMULATIVE HISTOGRAM
A cumulative histogram is a histogram in which the vertical axis gives not just the count for a single bin, but the total count for that bin and all bins before it.
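
The running-total idea can be seen by plotting the same hypothetical data twice, once normally and once with cumulative=True, and comparing the counts that hist() returns.

```python
import matplotlib.pyplot as plt

data = [1, 1, 2, 3, 3, 3, 4, 5, 5]  # hypothetical values
bins = [1, 2, 3, 4, 5, 6]

plain, _, _ = plt.hist(data, bins=bins)
plt.figure()  # draw the cumulative version on a separate figure
cumulative, _, _ = plt.hist(data, bins=bins, cumulative=True)

print(plain)       # counts per bin
print(cumulative)  # running totals of those counts
plt.show()
```

The last cumulative value equals the total number of data items.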

STEP TYPE HISTOGRAM
A step histogram is a type of histogram in which the bars are not filled with color; only the outline is drawn. The outline looks like the steps of a staircase, which is why it is known as a step histogram. The facecolor argument has no effect.


Qn. 31) Given the following set of data:
78, 72, 69, 81, 63, 67, 65
79, 74, 71, 83, 71, 79, 80

a. Create a simple histogram from the above data.
b. Create a horizontal histogram from the above data.
c. Create a step type histogram from the above data.
d. Create a cumulative histogram from the above data.

a) Simple histogram
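
A minimal sketch of one possible answer, using the data from the question and the default settings of hist():

```python
import matplotlib.pyplot as plt

data = [78, 72, 69, 81, 63, 67, 65, 79, 74, 71, 83, 71, 79, 80]
counts, edges, patches = plt.hist(data)  # default: 10 vertical bars
plt.show()
```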


b) Horizontal histogram
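
One possible answer, sketched with the same data. Only the orientation argument differs from the simple histogram.

```python
import matplotlib.pyplot as plt

data = [78, 72, 69, 81, 63, 67, 65, 79, 74, 71, 83, 71, 79, 80]
# The bars now extend along the X-axis; their lengths are the frequencies.
counts, edges, patches = plt.hist(data, orientation="horizontal")
plt.show()
```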

c) Step type histogram
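
One possible answer, sketched with the same data and histtype set to "step" so that only the outline is drawn:

```python
import matplotlib.pyplot as plt

data = [78, 72, 69, 81, 63, 67, 65, 79, 74, 71, 83, 71, 79, 80]
# histtype="step" draws an unfilled outline instead of separate bars.
counts, edges, patches = plt.hist(data, histtype="step")
plt.show()
```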


d) Cumulative histogram
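
One possible answer, sketched with the same data and cumulative=True:

```python
import matplotlib.pyplot as plt

data = [78, 72, 69, 81, 63, 67, 65, 79, 74, 71, 83, 71, 79, 80]
# Each bar shows the count of its own bin plus all bins before it.
counts, edges, patches = plt.hist(data, cumulative=True)
plt.show()
```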

QN 24)
A list named temp contains the average temperature for each of the seven days of last week. You want to see how the temperature changed over those 7 days. Which chart type will you plot for this, and why?

Ans. Line chart. A line chart is the better choice for tracking continuous change in data.


QN 25) Write the code to produce a chart for the previous question using a line chart.
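
A sketch of one possible answer. The temperatures in temp are hypothetical, since the question does not give any values, and the title and labels are placeholders.

```python
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5, 6, 7]
temp = [31.5, 32.0, 30.8, 29.6, 30.2, 31.1, 32.4]  # hypothetical values (°C)

plt.plot(days, temp, marker="o")
plt.title("Average temperature over the last 7 days")
plt.xlabel("Day")
plt.ylabel("Temperature (°C)")
plt.show()
```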

Qn. 34)
Kritika was asked to write the names of a few libraries in Python used for data analysis and one method of each. Help her write at least 3 libraries and their methods.

Libraries      Methods
numpy          array(), arange()
matplotlib     plot(), bar(), hist()
pandas         Series(), DataFrame()
