Python - Normal Distribution in Statistics
Last Updated: 19 Apr, 2024
A probability distribution describes how likely each outcome of a random variable is. A distribution is either continuous or discrete, depending on the values the random variable can take. Common probability distributions include the normal, uniform, and exponential distributions. In this article, we look at the normal distribution and at how to compute and plot it with Python.
What is Normal Distribution
The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetric about its mean and has a bell-shaped curve. It is one of the most widely used probability distributions. Two parameters characterize it:
- Mean (μ) - It represents the center of the distribution
- Standard Deviation (σ) - It represents the spread of the curve
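The formula for the probability density function of the normal distribution is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)$$
Properties Of Normal Distribution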
- Symmetric distribution - The normal distribution is symmetric about its mean. The distribution is perfectly balanced around the mean, with half of the probability on either side.
- Bell-shaped curve - The graph of a normal distribution is a bell-shaped curve, with most of the points concentrated around the mean. The mean sets the position of the curve and the standard deviation sets its width.
- Empirical Rule - The normal distribution follows the empirical rule: about 68% of the data lies within 1 standard deviation of the mean, about 95% lies within 2 standard deviations, and about 99.7% lies within 3 standard deviations.
Figure: Empirical rule in Normal distribution
- Additive Rule - The sum of two or more independent normally distributed random variables is also normally distributed.
- Central Limit Theorem - It states that if we take the mean of a large number of data points drawn from independent and identically distributed random variables with finite variance, the distribution of these means is approximately normal, regardless of the original distribution.
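The percentages in the empirical rule can be checked numerically. The following is a minimal sketch that uses the cumulative distribution function of the standard normal distribution from scipy.stats. The helper name coverage is chosen here for illustration only.

```python
from scipy.stats import norm

# Probability that a normal variable falls within k standard deviations
# of its mean. The standard normal is used because the result is the same
# for any mean and standard deviation.
def coverage(k):
    return norm.cdf(k) - norm.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage(k) * 100:.2f}%")
```

The printed values are 68.27%, 95.45%, and 99.73%, which the empirical rule rounds to 68%, 95%, and 99.7%.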
Normal Distribution Using Python
Python has several libraries that can be used to plot the normal distribution and to evaluate its probability density function at given data points.
Modules Needed For Plotting and Applying Normal Distribution
- NumPy – A Python library for numerical computation on multidimensional arrays (ndarray). It also provides a large collection of mathematical functions that operate on these arrays.
- Pandas – A Python library built on top of NumPy for working with tabular data in DataFrames. It is used for data cleaning, data merging, data reshaping, and data aggregation.
- Matplotlib – A plotting library for 2D and 3D visualizations. It can save figures in a variety of output formats, such as PNG, PDF, and SVG.
- SciPy - A Python library for scientific computing. It is one of the most used libraries for statistics and calculus functions such as integration and optimization.
We can use these modules to plot the normal distribution curve of data points and to compute probabilities from it.
Calculating the probability density at a single data point using Python
Python3
import numpy as np
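def normal_dist(x, mean, sd):
    # Normal probability density: exp(-0.5 * ((x - mean) / sd)**2) / (sd * sqrt(2 * pi))
    prob_density = np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    return prob_density
mean = 0
sd = 1
x = 1
result = normal_dist(x, mean, sd)
print(round(result, 6))
Output:
0.241971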
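A hand-written density function is easy to get wrong, so it helps to cross-check it against a library implementation. The sketch below evaluates the density formula directly and compares it with scipy.stats.norm.pdf. The function name normal_pdf is a hypothetical name used only in this example.

```python
import numpy as np
from scipy.stats import norm

def normal_pdf(x, mean, sd):
    # Density formula: exp(-0.5 * ((x - mean) / sd)**2) / (sd * sqrt(2 * pi))
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

by_hand = normal_pdf(1, 0, 1)
by_scipy = norm.pdf(1, loc=0, scale=1)

# Both values should agree up to floating-point rounding
print(round(by_hand, 6), round(by_scipy, 6))
print(np.isclose(by_hand, by_scipy))
```

For x = 1 on the standard normal distribution, both approaches give a density of about 0.241971. A density is not a probability, so it can exceed 1 when the standard deviation is small, but the area under the curve always equals 1.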
Python code for plotting Normal Distribution
Python3
import numpy as np
import matplotlib.pyplot as plt
# Mean of the distribution
Mean = 100
# standard deviation of the distribution
Standard_deviation = 5
# number of samples to draw
size = 100000
# drawing random samples from the normal distribution
values = np.random.normal(Mean, Standard_deviation, size)
# plotting a histogram with 100 bins
plt.hist(values, 100)
# plotting mean line
plt.axvline(values.mean(), color='k', linestyle='dashed', linewidth=2)
plt.show()
Output:
Figure: Normal Distribution graph
Normal Distribution Example with Python
Suppose there are 100 students in a class. In one mathematics test, the average mark is 78 and the standard deviation is 25, and the marks follow a normal distribution. We can use this information to answer some questions about the students' marks.
Python Code for Percentage of Students who got less than 60 marks
Here we use norm from the scipy.stats module. The marks have a mean of 78 and a standard deviation of 25, so we first convert the score to a z-score and then evaluate the cumulative distribution function (CDF) of the standard normal distribution at that z-score.
scipy.stats.norm is a normal continuous random variable. As an instance of the rv_continuous class, it inherits a collection of generic methods and completes them with details specific to this distribution. Its methods take the following parameters:
q : lower or upper tail probability
x : quantiles
loc : [optional] location parameter (the mean). Default = 0
scale : [optional] scale parameter (the standard deviation). Default = 1
size : [tuple of ints, optional] shape of random variates
Results : normal continuous random variable
Python3
# import required libraries
from scipy.stats import norm
import numpy as np
# Given information
mean = 78
std_dev = 25
total_students = 100
score = 60
# Calculate z-score for 60
z_score = (score - mean) / std_dev
# Calculate the probability of getting a score less than 60
prob = norm.cdf(z_score)
# Calculate the percentage of students who got less than 60 marks
percent = prob * 100
# Print the result
print("Percentage of students who got less than 60 marks:", round(percent, 2), "%")
Output:
Percentage of students who got less than 60 marks: 23.58 %
This means that about 23.58% of the students scored less than 60 marks in mathematics.
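The z-score step can also be left to SciPy by passing the mean as loc and the standard deviation as scale. The sketch below, which reuses the numbers from this example, computes the probability both ways to show that they agree.

```python
from scipy.stats import norm

mean, std_dev, score = 78, 25, 60

# Route 1: standardize by hand, then use the standard normal CDF
p_z = norm.cdf((score - mean) / std_dev)

# Route 2: let SciPy standardize through loc (mean) and scale (standard deviation)
p_direct = norm.cdf(score, loc=mean, scale=std_dev)

print(round(p_z * 100, 2), round(p_direct * 100, 2))
```

Both routes give 23.58, so the choice between them is a matter of readability.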
Python Code for Percentage of Students who have scored More than 70
To get the percentage of students who scored more than 70, we first find the probability of scoring less than 70 and then subtract it from 1.
Python3
# import required libraries
from scipy.stats import norm
import numpy as np
# Given information
mean = 78
std_dev = 25
total_students = 100
score = 70
# Calculate z-score for 70
z_score = (score - mean) / std_dev
# Calculate the probability of getting less than 70
prob = norm.cdf(z_score)
# Calculate the percentage of students who got more than 70 marks
percent = (1-prob) * 100
# Print the result
print("Percentage of students who got more than 70 marks:", round(percent, 2), "%")
Output:
Percentage of students who got more than 70 marks: 62.55 %
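SciPy also provides the upper-tail probability directly through the survival function norm.sf, which is defined as 1 - cdf. A minimal sketch with the same numbers:

```python
from scipy.stats import norm

mean, std_dev, score = 78, 25, 70

# Upper-tail probability computed as the complement of the CDF
p_complement = 1 - norm.cdf(score, loc=mean, scale=std_dev)

# The same probability from the survival function
p_sf = norm.sf(score, loc=mean, scale=std_dev)

print(round(p_sf * 100, 2))
```

This prints 62.55, the same percentage as the complement approach. The survival function is usually the more accurate choice far out in the upper tail, where 1 - cdf loses precision.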
Python Code for Percentage of Students who have scored More than 75 and less than 85
Python3
# import required libraries
from scipy.stats import norm
import numpy as np
# Given information
mean = 78
std_dev = 25
total_students = 100
min_score = 75
max_score = 85
# Calculate z-score for 75
z_min_score = (min_score - mean) / std_dev
# Calculate z-score for 85
z_max_score = (max_score - mean) / std_dev
# Calculate the probability of getting less than 75
min_prob = norm.cdf(z_min_score)
# Calculate the probability of getting less than 85
max_prob = norm.cdf(z_max_score)
# Calculate the percentage of students who got between 75 and 85 marks
percent = (max_prob-min_prob) * 100
# Print the result
print("Percentage of students who got marks between 75 and 85 is", round(percent, 2), "%")
Output:
Percentage of students who got marks between 75 and 85 is 15.8 %
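A probability can be turned into an expected head count by multiplying it by the class size. The sketch below does this with a frozen distribution, norm(loc, scale), which fixes the mean and standard deviation once. The variable names are chosen for illustration, and the result is an expectation under the normal model rather than an exact count.

```python
from scipy.stats import norm

mean, std_dev, total_students = 78, 25, 100

# Frozen distribution: loc and scale are fixed for all later method calls
marks = norm(loc=mean, scale=std_dev)

# Probability of a mark between 75 and 85
p_between = marks.cdf(85) - marks.cdf(75)

# Expected number of students in that range under the normal model
expected_students = p_between * total_students
print(round(expected_students, 1))
```

With 100 students, about 15.8 students are expected to score between 75 and 85, which matches the percentage above.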