How to Plot a Log Normal Distribution in R
Last Updated: 26 Sep, 2024
In this article, we explore how to plot a log-normal distribution in R, covering its key properties and the practical steps for visualizing it with the R programming language.
Log-Normal Distribution
The log-normal distribution is a probability distribution of a random variable whose logarithm is normally distributed. This means if you take the natural logarithm (log) of a variable following a log-normal distribution, the result will be normally distributed. This distribution is widely used in finance, biology, and environmental studies where data tends to be positively skewed.
A log-normal distribution has the following properties:
- Its values are always positive.
- It is right-skewed.
- Its shape depends on the mean and standard deviation of the associated normal distribution.
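These properties can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (meanlog = 0, sdlog = 0.5), a sample size, and a seed that are chosen here only for demonstration. It relies on rlnorm(), which is introduced in Step 1.

```r
# Illustrative check of the properties listed above.
# Parameter values, sample size, and seed are assumptions chosen for this demo.
set.seed(42)
x <- rlnorm(5000, meanlog = 0, sdlog = 0.5)

# 1. Every simulated value is strictly positive
all_positive <- all(x > 0)

# 2. Right skew: the long right tail pulls the mean above the median
right_skewed <- mean(x) > median(x)

# 3. Taking logs recovers the associated normal distribution,
#    so the mean and sd of log(x) should be close to meanlog and sdlog
log_x <- log(x)
log_mean <- mean(log_x)
log_sd <- sd(log_x)

print(c(all_positive = all_positive, right_skewed = right_skewed))
print(c(log_mean = log_mean, log_sd = log_sd))

# A normal Q-Q plot of log(x) should fall close to a straight line
qqnorm(log_x, main = "Normal Q-Q Plot of log(x)")
qqline(log_x, col = "red")
```

The exact numbers depend on the seed, so only approximate agreement with meanlog and sdlog should be expected.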
Generating and Plotting a Log-Normal Distribution in R
To plot a log-normal distribution, we can use base R as well as additional packages such as ggplot2 for more polished visualizations. We'll work through several examples, from basic plotting to more advanced visualizations. The basic plot requires two functions: dlnorm() and curve().
dlnorm(x, meanlog = 0, sdlog = 1)
Where,
- x: vector of quantiles
- meanlog: mean of the distribution on the log scale, with a default value of 0.
- sdlog: standard deviation of the distribution on the log scale, with a default value of 1.
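Because the logarithm of a log-normal variable is normal, dlnorm() can be related directly to dnorm(): the log-normal density at x equals the normal density at log(x), divided by x. The sketch below checks this identity at a few arbitrarily chosen points and confirms that the density integrates to 1. The evaluation points are assumptions picked for illustration.

```r
# Evaluate the log-normal density at a few illustrative points
x <- c(0.5, 1, 2, 4)
d <- dlnorm(x, meanlog = 0, sdlog = 1)

# The same values via the normal density of log(x), divided by x
d_manual <- dnorm(log(x), mean = 0, sd = 1) / x

# Should report TRUE (equal up to floating-point tolerance)
print(all.equal(d, d_manual))

# A valid density integrates to 1 over (0, Inf)
area <- integrate(dlnorm, lower = 0, upper = Inf, meanlog = 0, sdlog = 1)$value
print(area)
```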
curve(expr, from = NULL, to = NULL)
Where,
- expr: the name of a function, or a call or an expression written as a function of x, which evaluates to an object of the same length as x.
- from: the start range over which the function will be plotted.
- to: the end range over which the function will be plotted.
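To illustrate how expr, from, and to work together, here is a hedged sketch that overlays several densities on one plot using the add argument of curve(). The sdlog values, colors, and plotting range are assumptions chosen for illustration.

```r
# Overlay densities for several sdlog values (values chosen for illustration)
sdlogs <- c(0.25, 0.5, 1)
cols <- c("darkgreen", "blue", "red")

# The first call opens the plot; from/to set the x-range that is evaluated.
# The narrowest curve has the tallest peak, so drawing it first sets a
# y-axis range that fits the other curves as well.
first <- curve(dlnorm(x, meanlog = 0, sdlog = sdlogs[1]),
               from = 0, to = 5, col = cols[1], lwd = 2,
               xlab = "x", ylab = "Density")

# add = TRUE draws further curves on the existing plot
for (i in 2:3) {
  curve(dlnorm(x, meanlog = 0, sdlog = sdlogs[i]),
        from = 0, to = 5, col = cols[i], lwd = 2, add = TRUE)
}
legend("topright", legend = paste("sdlog =", sdlogs), col = cols, lwd = 2)

# curve() invisibly returns the evaluated points as a list with x and y
print(range(first$x))
```

Larger sdlog values flatten the peak and lengthen the right tail, which is one way to see how the shape depends on the parameters of the associated normal distribution.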
Let us plot a log-normal density with meanlog = 0 and sdlog = 1 over the range 0 to 25 using the curve() and dlnorm() functions.
R
curve(dlnorm(x, meanlog=0, sdlog=1), from=0, to=25)
Output:
Log-normal distribution
Now we will go step by step through how to plot a log-normal distribution and customize the plot with different packages and methods in R.
Step 1: Generate Log-Normal Data
R provides a built-in function, rlnorm(), to generate random numbers from a log-normal distribution.
R
# Set the parameters
set.seed(123) # Setting seed for reproducibility
n <- 1000 # Number of observations
meanlog <- 0 # Mean of the underlying normal distribution
sdlog <- 0.5 # Standard deviation of the underlying normal distribution
# Generate log-normal data
log_normal_data <- rlnorm(n, meanlog = meanlog, sdlog = sdlog)
In this example, rlnorm(n, meanlog, sdlog) generates n log-normal random numbers with the specified mean and standard deviation on the log scale.
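One consequence worth checking is that the mean of the generated values is not meanlog. For a log-normal distribution the theoretical mean is exp(meanlog + sdlog^2 / 2) and the theoretical median is exp(meanlog). The sketch below repeats the Step 1 setup so it runs on its own and compares these values with the sample estimates. The summary_table name is introduced here purely for illustration.

```r
# Same parameters as in Step 1, repeated so this block runs on its own
set.seed(123)
n <- 1000
meanlog <- 0
sdlog <- 0.5
log_normal_data <- rlnorm(n, meanlog = meanlog, sdlog = sdlog)

# Theoretical mean and median on the original (not log) scale
theoretical_mean <- exp(meanlog + sdlog^2 / 2)
theoretical_median <- exp(meanlog)

# Compare with the sample estimates; they should be close, not identical
summary_table <- rbind(
  theoretical = c(mean = theoretical_mean, median = theoretical_median),
  sample = c(mean = mean(log_normal_data), median = median(log_normal_data))
)
print(round(summary_table, 3))
```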
Step 2: Plot the Log-Normal Distribution Using Base R
We can create a histogram of the generated data using the hist() function and then overlay a kernel density estimate with lines().
R
# Plot the histogram
hist(log_normal_data,
     breaks = 30,
     probability = TRUE,
     main = "Log-Normal Distribution",
     xlab = "Value",
     col = "skyblue",
     border = "white")
# Add a density curve
lines(density(log_normal_data), col = "red", lwd = 2)
Output:
Log-Normal Distribution Using Base R
Step 3: Plotting Log-Normal Distribution Using ggplot2
For more advanced visualization, ggplot2 offers powerful tools for plotting the log-normal distribution.
R
library(ggplot2)
# Create a data frame
df <- data.frame(Value = log_normal_data)
# Plot using ggplot2
# after_stat() and linewidth assume ggplot2 >= 3.4.0;
# older versions use ..density.. and size instead
ggplot(df, aes(x = Value)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "skyblue", color = "black", alpha = 0.7) +
  geom_density(color = "red", linewidth = 1.2) +
  labs(title = "Log-Normal Distribution with ggplot2", x = "Value", y = "Density") +
  theme_minimal()
Output:
Plotting Log-Normal Distribution Using ggplot2
Step 4: Overlaying the Theoretical Log-Normal Distribution Curve
We can add the theoretical log-normal probability density function (PDF) on top of the histogram to visualize how the generated data fits the log-normal distribution.
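The same overlay can also be sketched in base R, where curve() with add = TRUE draws the theoretical density directly on top of a histogram. This is only an illustrative alternative, and it regenerates the Step 1 data so that it runs on its own. The title and colors are assumptions.

```r
# Regenerate the Step 1 data so this block runs on its own
set.seed(123)
meanlog <- 0
sdlog <- 0.5
log_normal_data <- rlnorm(1000, meanlog = meanlog, sdlog = sdlog)

# Histogram on the density scale so it is comparable to a PDF
h <- hist(log_normal_data, breaks = 30, probability = TRUE,
          main = "Histogram with Theoretical Log-Normal PDF",
          xlab = "Value", col = "lightblue", border = "white")

# Overlay the theoretical density using the known parameters
curve(dlnorm(x, meanlog = meanlog, sdlog = sdlog),
      add = TRUE, col = "darkblue", lwd = 2, lty = 2)
```

Setting probability = TRUE matters here: without it the histogram shows counts, and the density curve would sit almost flat along the bottom of the plot.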
R
# Create a sequence of values for the x-axis
x_vals <- seq(min(log_normal_data), max(log_normal_data), length.out = 1000)
# Calculate the theoretical PDF and keep it in its own data frame,
# so the curve does not depend on the number of rows in df
pdf_df <- data.frame(Value = x_vals,
                     Density = dlnorm(x_vals, meanlog = meanlog, sdlog = sdlog))
# Plot using ggplot2 with theoretical curve
# (after_stat() and linewidth assume ggplot2 >= 3.4.0)
ggplot(df, aes(x = Value)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "lightblue", color = "black", alpha = 0.6) +
  geom_line(data = pdf_df, aes(y = Density), color = "darkblue", linewidth = 1.2, linetype = "dashed") +
  labs(title = "Log-Normal Distribution with Theoretical Curve",
       x = "Value",
       y = "Density") +
  theme_minimal()
Output:
Overlaying the Theoretical Log-Normal Distribution Curve
Step 5: Comparing Log-Normal Distribution with Normal Distribution
It's often insightful to compare the log-normal distribution with a normal distribution by plotting them together.
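The contrast is already visible in the theoretical densities, before any data are simulated. The following base R sketch draws both curves with the same parameter values used in Step 1. The plotting range, colors, and the names lognormal_curve and normal_curve are assumptions made for illustration.

```r
# Parameters shared by both distributions (values from Step 1)
meanlog <- 0
sdlog <- 0.5

# Log-normal density: zero for x <= 0, with a long right tail.
# It is drawn first because its peak is higher, so the y-axis fits both curves.
lognormal_curve <- curve(dlnorm(x, meanlog = meanlog, sdlog = sdlog),
                         from = -2, to = 5, col = "blue", lwd = 2,
                         xlab = "Value", ylab = "Density",
                         main = "Theoretical Normal vs Log-Normal Densities")

# Normal density: symmetric around its mean and positive for negative values
normal_curve <- curve(dnorm(x, mean = meanlog, sd = sdlog),
                      from = -2, to = 5, col = "red", lwd = 2, add = TRUE)

legend("topright", legend = c("Log-Normal", "Normal"),
       col = c("blue", "red"), lwd = 2)
```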
R
# Generate normal data for comparison
normal_data <- rnorm(n, mean = meanlog, sd = sdlog)
# Create a combined data frame
combined_df <- data.frame(Value = c(log_normal_data, normal_data),
                          Distribution = rep(c("Log-Normal", "Normal"), each = n))
# Plot the comparison
# (after_stat() and linewidth assume ggplot2 >= 3.4.0)
ggplot(combined_df, aes(x = Value, fill = Distribution)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, position = "identity", alpha = 0.4) +
  geom_density(aes(color = Distribution), linewidth = 1.2) +
  labs(title = "Comparison of Log-Normal and Normal Distributions",
       x = "Value",
       y = "Density") +
  theme_minimal() +
  scale_color_manual(values = c("blue", "red")) +
  scale_fill_manual(values = c("blue", "red"))
Output:
Comparing Log-Normal Distribution with Normal Distribution
Conclusion
The log-normal distribution is widely used in different fields and is an essential concept in statistics. In this article, we demonstrated how to generate and plot a log-normal distribution using both base R and ggplot2. We also showed how to overlay a theoretical density curve and how to compare the log-normal distribution with a normal distribution.