Draw Multiple Overlaid Histograms with ggplot2 Package in R
Last Updated: 13 Jun, 2023
In this article, we are going to see how to draw multiple overlaid histograms with the ggplot2 package in the R programming language.
To draw multiple overlaid histograms with the ggplot2 package in R, you can use the geom_histogram() layer multiple times, each with different data and mapping specifications.
Here is an example to create multiple histograms with different fill colors:
# Load ggplot2 (required for ggplot() and geom_histogram())
library(ggplot2)

# Generate three reproducible samples
set.seed(123)
data1 <- rnorm(100, mean = 5, sd = 1)
data2 <- rnorm(100, mean = 7, sd = 2)
data3 <- rnorm(100, mean = 9, sd = 1.5)

# Create a ggplot object with one histogram layer per sample
p <- ggplot() +
  geom_histogram(aes(x = data1, fill = "data1"), alpha = 0.5) +
  geom_histogram(aes(x = data2, fill = "data2"), alpha = 0.5) +
  geom_histogram(aes(x = data3, fill = "data3"), alpha = 0.5) +
  scale_fill_manual(values = c("data1" = "red", "data2" = "green", "data3" = "blue")) +
  labs(title = "Multiple Overlaid Histograms", x = "Value", y = "Frequency")

# Show the plot
p
In this example, the geom_histogram() layer is used three times, each with different sample data (data1, data2, and data3). In each layer the fill aesthetic is mapped to a constant label ("data1", "data2", "data3"), which lets ggplot2 tell the histograms apart and build a legend. The scale_fill_manual() function assigns a fill color to each label. The labs() function adds a title and axis labels to the plot.
You can also adjust the binwidth, color, and transparency of the histograms to achieve the desired visualization.
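As a minimal sketch of those adjustments, the snippet below reuses two of the samples from the example above and sets binwidth, an outline color, and a lower alpha. The values 0.5, "black", and 0.3 are arbitrary choices for illustration, and boundary = 0 is an added assumption that places the bin edges on multiples of the bin width.

```r
library(ggplot2)

set.seed(123)
data1 <- rnorm(100, mean = 5, sd = 1)
data2 <- rnorm(100, mean = 7, sd = 2)

# binwidth sets the bar width, color sets the bar outline,
# alpha sets the transparency (illustrative values, not recommendations)
p_adjusted <- ggplot() +
  geom_histogram(aes(x = data1, fill = "data1"),
                 binwidth = 0.5, boundary = 0, color = "black", alpha = 0.3) +
  geom_histogram(aes(x = data2, fill = "data2"),
                 binwidth = 0.5, boundary = 0, color = "black", alpha = 0.3) +
  scale_fill_manual(values = c("data1" = "red", "data2" = "green"))

p_adjusted
```

Using the same binwidth in every layer keeps the bars comparable across the samples.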
We will draw multiple overlaid histograms using the alpha argument of the geom_histogram() function from the ggplot2 package. In this approach, the user first installs and loads the ggplot2 package in the R console, then calls geom_histogram() on a data frame that holds all groups. Setting the alpha argument to a value between 0 and 1 makes the bars translucent, so the histograms of the different groups remain visible where they overlap on the same plot.
geom_histogram() function: This function is provided by the ggplot2 package.
Syntax: geom_histogram(mapping = NULL, data = NULL, stat = "bin", position = "stack", ...)
Parameters:
- mapping: The aesthetic mapping, usually constructed with aes() (aes_string() is deprecated in current ggplot2 releases). Only needs to be set at the layer level if you are overriding the plot defaults.
- data: A layer-specific dataset - only needed if you want to override the plot defaults.
- stat: The statistical transformation to use on the data for this layer.
- position: The position adjustment to use for overlapping bars on this layer. The default, "stack", stacks the groups on top of each other; "identity" overlays them.
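The effect of the position parameter can be checked on a tiny data set. The sketch below is illustrative only: the toy data frame and the names stacked and overlaid are assumptions made for this example. It uses layer_data() from ggplot2 to read the bar heights that each plot would draw.

```r
library(ggplot2)

# Hypothetical toy data: both groups have values in the bins around 1 and 2
toy <- data.frame(values = c(1, 1, 2, 1, 2, 2),
                  group  = c("A", "A", "A", "B", "B", "B"))

# Default position = "stack": bars of the groups are piled on top of each other
stacked <- ggplot(toy, aes(x = values, fill = group)) +
  geom_histogram(binwidth = 1, center = 1)

# position = "identity": every group's bars start at zero and overlap
overlaid <- ggplot(toy, aes(x = values, fill = group)) +
  geom_histogram(binwidth = 1, center = 1, position = "identity", alpha = 0.5)

# ymax is the top edge of each bar
max(layer_data(stacked)$ymax)   # 3: two observations of one group plus one of the other
max(layer_data(overlaid)$ymax)  # 2: the largest count of a single group in one bin
```

For overlaid histograms the bar tops therefore show each group's own counts, which is why the examples in this article set position = "identity".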
To install and load the ggplot2 package in the R console, run the following commands:
install.packages("ggplot2")
library("ggplot2")
The alpha argument: This graphical parameter is a number from 0 (fully transparent) to 1 (fully opaque) that sets the transparency of the histogram bars.
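To see what alpha changes, the following sketch draws the same two groups once with opaque bars and once with translucent bars. The data frame df, the seed, and the plot names are assumptions introduced only for this illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: a seed is set only to make the sketch reproducible
df <- data.frame(values = c(rnorm(100, mean = 0), rnorm(100, mean = 1)),
                 group  = rep(c("A", "B"), each = 100))

base <- ggplot(df, aes(x = values, fill = group))

# alpha = 1: opaque bars, so the group drawn last hides the other where they overlap
p_opaque <- base +
  geom_histogram(position = "identity", alpha = 1, bins = 30)

# alpha = 0.4: translucent bars, so both groups stay visible in the overlap
p_translucent <- base +
  geom_histogram(position = "identity", alpha = 0.4, bins = 30)

p_opaque
p_translucent
```

Because alpha is set outside aes(), it is a constant applied to every bar rather than a mapping to a column of the data.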
Example 1:
In this example, we take two random samples of 100 values each and draw their histograms on a single plot using the alpha argument of the geom_histogram() function from the ggplot2 package in the R programming language.
R
library("ggplot2")

data <- data.frame(values = c(rnorm(100),
                              rnorm(100)),
                   group = c(rep("A", 100),
                             rep("B", 100)))

ggplot(data, aes(x = values, fill = group)) +
  geom_histogram(position = "identity", alpha = 0.4, bins = 50)
Output:
Multiple Overlaid Histograms with ggplot2 Package in R
 - The ggplot2 library must be loaded before any plot can be produced, so the script starts with library("ggplot2").
 - The data.frame() function is then used to build a dataset named data with two columns, "values" and "group". The "group" column uses rep() to assign the value "A" to the first 100 rows and "B" to the following 100 rows, while the "values" column is created by concatenating two sets of 100 random normal values generated with rnorm().
 - The plot is initialized using the ggplot() function. The dataset data is supplied as the data source, and the aes() function sets the aesthetic mappings: x = values maps the "values" column to the x-axis, and fill = group maps the "group" column to the fill color of the histogram bars.
 - The histograms are produced by adding the geom_histogram() layer. Because of position = "identity", the histograms are not stacked but drawn directly on top of one another. The alpha = 0.4 option makes the bars 40% opaque, and bins = 50 sets the number of bins (intervals).
 - The resulting figure shows the histograms of the two groups overlaid on one another, with the fill colors of the bars identifying the groups. You can modify the transparency and bin settings to suit your needs.
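One way to check what the plot actually draws is to inspect the binned data. The sketch below rebuilds the plot from Example 1 and reads the computed bins with layer_data(); the seed and the variable names p and bins are assumptions added for illustration.

```r
library(ggplot2)

set.seed(42)  # assumption: a seed is added so the sketch is reproducible
data <- data.frame(values = c(rnorm(100), rnorm(100)),
                   group  = rep(c("A", "B"), each = 100))

p <- ggplot(data, aes(x = values, fill = group)) +
  geom_histogram(position = "identity", alpha = 0.4, bins = 50)

# layer_data() returns one row per bar, including its count, x range, and group
bins <- layer_data(p)
head(bins[, c("group", "xmin", "xmax", "count")])

# Each group was binned separately, so the counts per group add up to 100
tapply(bins$count, bins$group, sum)
```

In the returned data frame the groups appear as integers (1 for "A", 2 for "B"), following the order of the levels of the group column.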
Example 2:
In this example, we take three small sets of 10 values each and draw their histograms on a single plot using the alpha argument of the geom_histogram() function from the ggplot2 package in the R programming language.
R
library("ggplot2")

data <- data.frame(values = c(c(6,2,5,4,1,6,1,5,4,7),
                              c(4,1,4,4,5,5,4,6,2,4),
                              c(9,1,5,7,1,10,6,4,1,7)),
                   group = c(rep("A", 10),
                             rep("B", 10),
                             rep("C", 10)))

ggplot(data, aes(x = values, fill = group)) +
  geom_histogram(position = "identity", alpha = 0.4, bins = 50)
Output:
Multiple Overlaid Histograms with ggplot2 Package in R
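The values in Example 2 are whole numbers from 1 to 10, so 50 bins produce narrow bars separated by empty bins. As a possible variation (not part of the original example), the sketch below uses binwidth = 1 with center = 0 so that each bar covers exactly one integer value; the name p_int and the axis breaks are assumptions chosen for illustration.

```r
library(ggplot2)

data <- data.frame(values = c(6,2,5,4,1,6,1,5,4,7,
                              4,1,4,4,5,5,4,6,2,4,
                              9,1,5,7,1,10,6,4,1,7),
                   group = rep(c("A", "B", "C"), each = 10))

# binwidth = 1 with center = 0 gives bins centered on whole numbers
p_int <- ggplot(data, aes(x = values, fill = group)) +
  geom_histogram(position = "identity", alpha = 0.4,
                 binwidth = 1, center = 0) +
  scale_x_continuous(breaks = 1:10)  # one axis label per integer value

p_int
```

With this setting the height of each bar is simply the number of times that value occurs in the group.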
Example 3:
R
# Load the ggplot2 library
library(ggplot2)

# Create a dataset
data <- data.frame(values = c(rnorm(100),
                              rnorm(100)),
                   group = c(rep("A", 100),
                             rep("B", 100)))

# Create overlaid histograms using ggplot2
ggplot(data) +
  geom_histogram(aes(x = values, fill = group),
                 binwidth = 0.5,
                 alpha = 0.5,
                 position = "identity") +
  labs(title = "Overlaid Histograms",
       x = "Values",
       y = "Frequency") +
  scale_fill_manual(values = c("A" = "blue", "B" = "green"))
Output:
Multiple Overlaid Histograms with ggplot2 Package in R
 - geom_histogram(aes(x = values, fill = group), binwidth = 0.5, alpha = 0.5, position = "identity"): This call adds the histogram layer to the plot created by ggplot(data). Its arguments are explained below.
 - aes(x = values, fill = group): The aesthetic mappings are set using the aes() function. x = values maps the "values" column to the x-axis, and fill = group maps the "group" column to the fill color of the histogram bars.
 - binwidth = 0.5: The width of each bin (interval) in the histogram is determined by this parameter. It is set to 0.5 in this instance.
 - alpha = 0.5: The transparency level of the histogram bars is determined by the alpha parameter. With the value set to 0.5, the bars are 50% opaque.
 - position = "identity": Setting the position option to "identity" ensures that the histograms are drawn directly on top of one another instead of being stacked.
 - scale_fill_manual(values = c("A" = "blue", "B" = "green")): The fill color of each group is specified individually using the scale_fill_manual() function.
 - values = c("A" = "blue", "B" = "green"): This named vector sets the fill color of each group. The color "blue" is assigned to group "A", and the color "green" is assigned to group "B".
Together, these arguments let you control the bin width, transparency, position adjustment, title, axis labels, and fill colors for each group.
In the final plot, the histograms for the two groups are displayed overlaid, with the bars having the desired fill colors, a translucent look, and the pertinent labels and titles.
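The separate vectors from the first example (data1, data2, data3) can also be brought into the grouped form used in Examples 1 to 3. The sketch below is one possible way to do this, assuming the same three samples; the name long is introduced only for illustration.

```r
library(ggplot2)

set.seed(123)
data1 <- rnorm(100, mean = 5, sd = 1)
data2 <- rnorm(100, mean = 7, sd = 2)
data3 <- rnorm(100, mean = 9, sd = 1.5)

# Combine the vectors into one long data frame with a group label per row
long <- data.frame(values = c(data1, data2, data3),
                   group  = rep(c("data1", "data2", "data3"), each = 100))

# A single geom_histogram() layer now draws all three histograms
p_long <- ggplot(long, aes(x = values, fill = group)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 0.5) +
  scale_fill_manual(values = c("data1" = "red", "data2" = "green", "data3" = "blue")) +
  labs(title = "Multiple Overlaid Histograms", x = "Value", y = "Frequency")

p_long
```

With a single layer, adding another group only requires more rows in the data frame rather than another geom_histogram() call.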