The Cumulative Distribution Function (CDF) is a fundamental concept in probability theory and statistics that describes the distribution of a random variable. It gives the probability that a random variable takes a value less than or equal to a given value. The CDF is a non-decreasing function that ranges from 0 to 1 and captures the entire probability distribution of the random variable.
What is a Cumulative Distribution Function?
The Cumulative Distribution Function (CDF) of a random variable is a mathematical function that provides the probability that the variable will take a value less than or equal to a particular number.
The CDF approaches 0 as x falls below the smallest possible values of X and increases to 1 as x reaches the largest possible values of X. It is a non-decreasing function that provides a complete description of the distribution of the random variable.
For example, if you're looking at the CDF for a test score of 80, and it gives you 0.75, this means there's a 75% chance that a random student's score will be 80 or less.
In short, the CDF helps you understand the likelihood of a random value being within a certain range by summing up probabilities as you go along.
Definition of CDF
The Cumulative Distribution Function F(x) of a random variable X is defined as:
F(x) = P(X \leq x)
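The definition can be illustrated with a short Python sketch that estimates F(x) from a list of observations as the fraction of values that are less than or equal to x. The scores below are invented for illustration, and the function name `empirical_cdf` is an assumption of this example rather than part of any library.

```python
def empirical_cdf(samples, x):
    """Estimate F(x) = P(X <= x) as the fraction of samples that are <= x."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(1 for s in samples if s <= x) / len(samples)

# Hypothetical test scores, made up purely for this example.
scores = [55, 62, 70, 74, 78, 80, 85, 91]

print(empirical_cdf(scores, 80))  # 0.75: six of the eight scores are 80 or less
```

With these made-up scores the estimate at 80 is 0.75, the same kind of reading as in the test-score example above.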
Graphical Representation of Cumulative Distribution Function
The graph of a CDF is a non-decreasing curve that rises from 0 toward 1 as x increases. It is useful for visualizing the probability distribution of a random variable.
Properties of Cumulative Distribution Function
Some of the properties of the Cumulative Distribution Function are:
The CDF is a non-decreasing function. This means that for any two values x_1 and x_2 such that x_1 \leq x_2, the corresponding CDF values satisfy F(x_1) \leq F(x_2).
As x approaches negative infinity, the CDF approaches 0:
\lim_{x \to -\infty} F(x) = 0
As x approaches positive infinity, the CDF approaches 1:
\lim_{x \to +\infty} F(x) = 1
The CDF of a continuous random variable is a continuous function, while the CDF of a discrete random variable is a step function with jumps at the values the variable can take.
The CDF is bounded between 0 and 1, i.e. 0 \leq F(x) \leq 1 for all x.
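These properties can be checked numerically for a concrete CDF. The sketch below assumes an exponential distribution with rate 1, whose CDF is 1 - e^{-x} for x \geq 0; the grid and the variable names are choices made for this example only.

```python
import math

def exp_cdf(x):
    """CDF of an exponential distribution with rate 1: 1 - e^{-x} for x >= 0, else 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

# Evaluate the CDF on an evenly spaced grid from -5 to 20.
grid = [-5 + 0.25 * k for k in range(101)]
values = [exp_cdf(x) for x in grid]

# Non-decreasing: each value is at least as large as the one before it.
is_non_decreasing = all(a <= b for a, b in zip(values, values[1:]))
# Bounded: every value lies between 0 and 1.
stays_in_unit_interval = all(0.0 <= v <= 1.0 for v in values)

print(is_non_decreasing, stays_in_unit_interval)
print(values[0], values[-1])  # near the limits 0 and 1 at the ends of the grid
```

A finite grid can only suggest the limiting behaviour; it does not prove it.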
How to Calculate Cumulative Distribution Function?
The steps to find the cumulative distribution function are given below:
Step 1. Identify the Distribution: Determine whether the random variable follows a discrete or a continuous distribution.
Step 2. Determine the PDF: For a continuous distribution, find the Probability Density Function (PDF) f(x).
Step 3. Integrate the PDF: Integrate the PDF from -\infty to x to find the CDF:
F(x) = \int_{-\infty}^{x} f(t) \, dt
Step 4. Sum the Probabilities: For a discrete distribution, sum the probabilities of all values less than or equal to x.
Using Probability Density Function (PDF) to Find CDF
For continuous random variables, the CDF F(x) is derived from the PDF f(x) by integrating:
F(x) = \int_{-\infty}^{x} f(t) \, dt
This process accumulates the probability from the left up to the point x.
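When the integral has no convenient closed form, the accumulation can be approximated numerically. The following is a minimal sketch using the midpoint rule; the helper name `cdf_from_pdf`, the `lower` cutoff parameter, and the exponential PDF used to check the result are all assumptions of this example.

```python
import math

def cdf_from_pdf(pdf, x, lower, steps=10_000):
    """Approximate F(x) as the integral of pdf from `lower` to x (midpoint rule).

    `lower` should be a point below which the PDF is zero or negligible.
    """
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    return sum(pdf(lower + (k + 0.5) * width) for k in range(steps)) * width

def exp_pdf(t):
    """PDF of an exponential distribution with rate 1."""
    return math.exp(-t) if t >= 0 else 0.0

approx = cdf_from_pdf(exp_pdf, 2.0, lower=0.0)
exact = 1.0 - math.exp(-2.0)  # known closed-form CDF for this PDF
print(approx, exact)
```

The two printed values should agree to several decimal places; a finer grid (larger `steps`) tightens the approximation.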
Types of Cumulative Distribution Functions
CDF of a Discrete Random Variable
For a discrete random variable X with possible values x_1, x_2, \dots, x_n and corresponding probabilities P(X = x_i) = p_i, the CDF is given by:
F(x) = \sum_{x_i \leq x} p_i
The CDF of a discrete random variable increases in steps at the points where the variable takes its possible values.
For example, consider a random variable X that represents the roll of a fair 6-sided die. The CDF F(x) for this random variable is:
F(x) =
\begin{cases}
0 & \text{if } x < 1 \\
\frac{1}{6} & \text{if } 1 \leq x < 2 \\
\frac{2}{6} & \text{if } 2 \leq x < 3 \\
\frac{3}{6} & \text{if } 3 \leq x < 4 \\
\frac{4}{6} & \text{if } 4 \leq x < 5 \\
\frac{5}{6} & \text{if } 5 \leq x < 6 \\
1 & \text{if } x \geq 6
\end{cases}
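The same step function can be produced by summing a probability mass function in code. This sketch uses exact fractions; the helper name `discrete_cdf` is an assumption of this example.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def discrete_cdf(pmf, x):
    """F(x) = sum of p_i over all values x_i <= x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

for x in [0.5, 1, 2.5, 4, 6, 7]:
    print(x, discrete_cdf(pmf, x))
```

Evaluating between faces, for example at 2.5, returns the value of the step just below it, which is what the piecewise definition describes.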
CDF of a Continuous Random Variable
For a continuous random variable X with probability density function (PDF) f(x), the CDF is the integral of the PDF:
F(x) = \int_{-\infty}^{x} f(t)\, dt
Since the CDF is the integral of the PDF, it is a continuous function, and it is differentiable at every point where the PDF is continuous.
For example, consider a continuous random variable X that is uniformly distributed between 0 and 1. The CDF F(x) for this random variable is:
F(x) =
\begin{cases}
0 & \text{if } x < 0 \\
x & \text{if } 0 \leq x \leq 1 \\
1 & \text{if } x > 1
\end{cases}
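The piecewise form translates directly into code. The sketch below defaults to the interval [0, 1] from the example; allowing other endpoints a and b is an extension added here for illustration.

```python
def uniform_cdf(x, a=0.0, b=1.0):
    """CDF of a continuous uniform distribution on [a, b]."""
    if a >= b:
        raise ValueError("require a < b")
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

print(uniform_cdf(-0.2), uniform_cdf(0.3), uniform_cdf(1.5))
```

The three calls cover the three branches: below the interval, inside it, and above it.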
Difference Between Cumulative Distribution Function (CDF) and Probability Density Function (PDF)
Some of the common differences between CDF and PDF are listed in the following table:
Aspect | Cumulative Distribution Function (CDF) | Probability Density Function (PDF) |
---|---|---|
Definition | Gives the probability that a random variable is less than or equal to a certain value. | Describes the probability density of a continuous random variable at a specific value; the probability of any single exact value is 0. |
Representation | A running total of probabilities. | The relative likelihood of the random variable falling near a specific value. |
Mathematical Expression | F(x) = P(X ≤ x) | f(x) such that f(x) ≥ 0 and \int_{-\infty}^{\infty} f(x) \, dx = 1 |
Value Range | Always non-decreasing and ranges from 0 to 1. | Can be any non-negative value, including values greater than 1, as long as the total area is 1. |
Interpretation | CDF gives the cumulative probability up to a certain point. | PDF gives the relative likelihood for a continuous random variable to occur at a specific value. |
Visualization | A non-decreasing curve: continuous for continuous variables, a step function for discrete ones. | A curve where the area under the curve over an interval represents a probability. |
Area Under Curve | Not applicable as it represents cumulative probability. | The total area under the PDF curve equals 1. |
Usage | Used to find the probability that a random variable falls within a certain range. | Used to understand the distribution and likelihood of specific outcomes. |
Derivative Relationship | The derivative of the CDF gives the PDF. | The integral of the PDF gives the CDF. |
When to Use CDF vs. PDF?
- CDF: Use the CDF when you need the probability that a random variable is less than or equal to a specific value. It provides the cumulative probability up to a certain point.
- PDF: Use the PDF when you need the probability density at a specific value. The PDF is the rate of change of the CDF, and integrating it over an interval gives the probability of that interval.
Relationship Between CDF and PDF
For a continuous random variable X with PDF f(x), the relationship between the CDF F(x) and the PDF is given by:
f(x) = \frac{dF(x)}{dx}
This indicates that the PDF is the derivative of the CDF.
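The derivative relationship can be checked numerically with a finite difference. This sketch assumes the CDF F(x) = x^2 on [0, 1], whose derivative is 2x; the helper name `pdf_from_cdf` and the step size `h` are choices made for this example.

```python
def cdf(x):
    """Assumed CDF for the demo: F(x) = x^2 on [0, 1]."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def pdf_from_cdf(cdf, x, h=1e-5):
    """Approximate f(x) = dF/dx with a central difference."""
    return (cdf(x + h) - cdf(x - h)) / (2 * h)

print(pdf_from_cdf(cdf, 0.25))  # close to 2 * 0.25
print(pdf_from_cdf(cdf, 0.8))   # close to 2 * 0.8
```

The approximation is only meaningful at points where the CDF is differentiable, so it should not be applied at the jump points of a discrete distribution.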
Solved Examples: Cumulative Distribution Function
Example 1: Consider a random variable X representing the number of heads obtained in two coin flips. Calculate the CDF F(x) of X.
Solution:
The possible values of X are 0, 1 and 2. The probabilities are:
P(X=0) = \frac{1}{4} \quad \text{(no heads)}
P(X=1) = \frac{1}{2} \quad \text{(one head)}
P(X=2) = \frac{1}{4} \quad \text{(two heads)}
The CDF F(x) is:
F(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\frac{1}{4} & \text{if } 0 \leq x < 1 \\
\frac{3}{4} & \text{if } 1 \leq x < 2 \\
1 & \text{if } x \geq 2 \\
\end{cases}
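The probabilities in Example 1 can be reproduced by enumerating the four equally likely outcomes of two fair flips and accumulating the resulting PMF. This is a sketch for checking the arithmetic; the variable names are chosen for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin flips.
outcomes = list(product("HT", repeat=2))
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# PMF: probability of each possible number of heads.
pmf = {k: Fraction(n, len(outcomes)) for k, n in sorted(heads_counts.items())}

# Accumulate the PMF to get the CDF value at each jump point.
cdf = {}
running_total = Fraction(0)
for k, p in pmf.items():
    running_total += p
    cdf[k] = running_total

print(pmf)
print(cdf)
```

The accumulated values at 0, 1 and 2 heads are 1/4, 3/4 and 1, matching the piecewise CDF above.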
Example 2: Let X be a continuous random variable with PDF f(x) = 2x for 0 \leq x \leq 1. Find the CDF F(x).
Solution:
To find the CDF, integrate the PDF:
F(x) = \int_{-\infty}^{x} f(t) \, dt = \int_{0}^{x} 2t \, dt = x^2 \quad \text{for } 0 \leq x \leq 1
Thus, the CDF is:
F(x) =
\begin{cases}
0 & \text{if } x < 0 \\
x^2 & \text{if } 0 \leq x \leq 1 \\
1 & \text{if } x > 1 \\
\end{cases}
Example 3: Given the CDF F(x) = 1 - e^{-x} for x \geq 0, find the probability that X lies between 1 and 2.
Solution:
The probability that X lies between 1 and 2 is:
P(1 \leq X \leq 2) = F(2) - F(1)
Calculate:
F(2) = 1 - e^{-2} \quad \text{and} \quad F(1) = 1 - e^{-1}
P(1 \leq X \leq 2) = (1 - e^{-2}) - (1 - e^{-1}) = e^{-1} - e^{-2}
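To get a decimal value for the result of Example 3, the difference of CDF values can be evaluated directly. The helper name `prob_between` is an assumption of this sketch.

```python
import math

def exp_cdf(x):
    """F(x) = 1 - e^{-x} for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def prob_between(cdf, a, b):
    """P(a < X <= b) = F(b) - F(a); for a continuous X this equals P(a <= X <= b)."""
    return cdf(b) - cdf(a)

p = prob_between(exp_cdf, 1, 2)
print(round(p, 4))  # 0.2325
```

So e^{-1} - e^{-2} is roughly 0.2325, or about a 23% chance that X falls between 1 and 2.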
Example 4: Let X be a mixed random variable with the following distribution: P(X=0) = 0.2, P(X=1) = 0.3, and the remaining probability of 0.5 is uniformly distributed over [2, 3]. Find the CDF F(x).
Solution:
For x < 0, F(x) = 0.
For 0 \leq x < 1, F(x) = 0.2.
For 1 \leq x < 2, F(x) = 0.5.
For 2 \leq x < 3, the continuous part, which carries the remaining probability 1 - 0.2 - 0.3 = 0.5, is added to the 0.5 already accumulated:
F(x) = 0.5 + 0.5 \times (x - 2) = 0.5 + 0.5x - 1
Thus, F(x) = 0.5x - 0.5 for 2 \leq x < 3.
For x \geq 3, F(x) = 1.
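The pieces of Example 4 can be collected into a single function. This is a sketch of the result derived above, with the function name `mixed_cdf` chosen for this example.

```python
def mixed_cdf(x):
    """CDF for Example 4: point masses 0.2 at 0 and 0.3 at 1,
    plus the remaining 0.5 spread uniformly over [2, 3]."""
    if x < 0:
        return 0.0
    if x < 1:
        return 0.2
    if x < 2:
        return 0.5
    if x < 3:
        return 0.5 + 0.5 * (x - 2)
    return 1.0

for x in [-1, 0, 0.5, 1, 1.5, 2, 2.5, 3]:
    print(x, mixed_cdf(x))
```

The printed values show the two jumps at 0 and 1, the flat stretch between 1 and 2, and the linear rise over [2, 3].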
Example 5: Given the PDF f(x) = \frac{3}{4}(1 - x^2) for -1 \leq x \leq 1, find the CDF F(x).
Solution:
Integrate the PDF to find the CDF:
F(x) = \int_{-\infty}^{x} f(t) \, dt = \int_{-1}^{x} \frac{3}{4}(1 - t^2) \, dt
Solving the integral:
F(x) = \frac{3}{4} \left[t - \frac{t^3}{3}\right]_{-1}^{x} = \frac{3}{4} \left(x - \frac{x^3}{3} + \frac{2}{3}\right)
The CDF F(x) is:
F(x) =
\begin{cases}
0 & \text{if } x < -1 \\
\frac{3}{4} \left(x - \frac{x^3}{3} + \frac{2}{3}\right) & \text{if } -1 \leq x \leq 1 \\
1 & \text{if } x > 1 \\
\end{cases}
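The closed form from Example 5 can be sanity-checked against a numerical integral of the PDF. The sketch below uses a midpoint rule; the function names and the step count are assumptions of this example.

```python
def pdf(t):
    """f(t) = 3/4 * (1 - t^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1 - t * t) if -1 <= t <= 1 else 0.0

def cdf_closed_form(x):
    """Closed-form CDF derived in Example 5."""
    if x < -1:
        return 0.0
    if x > 1:
        return 1.0
    return 0.75 * (x - x**3 / 3 + 2 / 3)

def cdf_numeric(x, steps=20_000):
    """Midpoint-rule approximation of the integral of pdf from -1 to x."""
    if x <= -1:
        return 0.0
    upper = min(x, 1.0)
    width = (upper + 1) / steps
    return sum(pdf(-1 + (k + 0.5) * width) for k in range(steps)) * width

for x in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    print(x, cdf_closed_form(x), cdf_numeric(x))
```

The two columns should agree closely, with the CDF equal to 0 at x = -1, about 0.5 at x = 0 by symmetry, and about 1 at x = 1.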
Practice Questions: Cumulative Distribution Function (CDF)
Question 1. Find the CDF of a random variable X representing the number of heads in three coin flips.
Question 2. Determine the CDF for a continuous random variable X with the PDF f(x) = 3x^2 for 0 \leq x \leq 1.
Question 3. Calculate the probability that X is less than or equal to 4, given the CDF F(x) = 1 - e^{-x} for x \geq 0.
Question 4. Find the CDF for a random variable X uniformly distributed over the interval [2, 5].
Question 5. Given the CDF F(x) for a random variable X, find the PDF by differentiating the CDF.
Question 6. Use the CDF F(x) = 1 - e^{-x} to find the probability that X lies between 2 and 3.
Question 7. For a discrete random variable X taking values 1, 2 and 3 with probabilities 0.2, 0.5 and 0.3 respectively, find the CDF F(x).
Question 8. Calculate the CDF of a normal distribution with mean 0 and standard deviation 1 at x = 1.
Question 9. Given a random variable X with the PDF f(x) = \frac{1}{2} for 0 \leq x \leq 2, find the CDF F(x).
Question 10. Determine the probability that a continuous random variable X with the PDF f(x) = 4x^3 for 0 \leq x \leq 1 is greater than 0.5.
Conclusion
The Cumulative Distribution Function (CDF) is an essential concept in probability and statistics, offering a comprehensive way to describe the distribution of random variables. Understanding the CDF and its properties is fundamental to many statistical analyses and real-world applications. Whether dealing with discrete or continuous variables, the CDF provides a powerful tool for calculating probabilities and making inferences about data.