How to Create Black and White Transparent Overlapping Histograms Using ggplot2 in R?

Last Updated : 27 Sep, 2024

Histograms are a standard way to visualize frequency distributions. When comparing two or more distributions, overlaying their histograms on the same axes makes the differences easy to see. In this article, we’ll learn how to make black-and-white, transparent, overlapping histograms with the ggplot2 package in R, so that the regions where the distributions overlap remain clearly visible.

Introduction to Histograms in ggplot2

ggplot2 is an R package for creating many kinds of plots, including histograms, and it allows extensive customization. When working with histograms of two or more groups, overlaying them helps us compare the groups directly. Adding transparency and restricting the palette to black, white, and gray keeps the overlap readable and gives the plot a clean, polished look.

First, ensure that you have the necessary packages installed and loaded:

# Install ggplot2 if you haven't already
install.packages("ggplot2")

# Load the ggplot2 library
library(ggplot2)

For this tutorial, we'll use the built-in mtcars dataset and compare the distribution of mpg (miles per gallon) across cars grouped by their number of cylinders (cyl).
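Before plotting, it can help to check how many cars fall into each cylinder group, since small groups produce sparse histograms. The following is a minimal sketch using base R only; the variable names cyl_counts and mpg_range are illustrative and not part of the original tutorial.

```r
# mtcars ships with R (datasets package), so nothing extra needs loading
cyl_counts <- table(mtcars$cyl)   # number of cars per cylinder count
print(cyl_counts)

# Range of mpg, useful when choosing a bin width for the histogram
mpg_range <- range(mtcars$mpg)
print(mpg_range)
```

The dataset has 32 cars in three groups (4, 6, and 8 cylinders), so every grouped plot in this article shows three histograms.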

Step 1: Creating a Basic Histogram

We begin with a basic histogram of the variable mpg in the mtcars data frame. Because the goal is to compare groups, the script below already maps fill to the number of cylinders, so it draws one histogram per cylinder group on the same axes.

R
# Create overlapping histograms for different cylinder groups
ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(binwidth = 2, position = "identity") +
  theme_minimal()

Output:

Figure: Creating a Basic Histogram

  • fill = factor(cyl) differentiates between the number of cylinders.
  • position = "identity" ensures the histograms are plotted on top of each other rather than stacked.
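For comparison, a histogram of a single variable with no grouping might look like the sketch below. The gray fill and the name p_basic are illustrative choices, not taken from the original script.

```r
library(ggplot2)

# One variable, no grouping: a single histogram of mpg
p_basic <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, color = "black", fill = "gray") +
  theme_minimal()

print(p_basic)
```

Adding fill = factor(cyl) to aes() is what turns this single histogram into one histogram per cylinder group.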

Step 2: Adding Transparency for Overlapping Histograms

To compare distributions, we place their histograms on top of each other. Here the mtcars data are split by the number of cylinders (cyl), which gives three groups (4, 6, and 8 cylinders) plotted on the same graph. To make the overlap visible, we make the bars semi-transparent using the alpha parameter.

R
# Add transparency to overlapping histograms
ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(binwidth = 2, position = "identity", alpha = 0.5) +
  theme_minimal()

Output:

Figure: Adding Transparency for Overlapping Histograms

We used the fill aesthetic to differentiate the three cylinder groups (4-, 6-, and 8-cylinder cars).

  • position = "identity" ensures that the histograms are drawn on top of each other, rather than being stacked.
  • alpha = 0.5 makes the histograms semi-transparent, allowing the overlap to be visible.
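If only two distributions need to be compared, the data can be filtered before plotting. The sketch below keeps the 4- and 8-cylinder cars; the choice of these two groups and the names two_groups and p_two are assumptions made for illustration.

```r
library(ggplot2)

# Keep only 4- and 8-cylinder cars (an illustrative choice of two groups)
two_groups <- subset(mtcars, cyl %in% c(4, 8))

# Same layers as above, now with just two semi-transparent histograms
p_two <- ggplot(two_groups, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(binwidth = 2, position = "identity", alpha = 0.5) +
  theme_minimal()

print(p_two)
```

With fewer groups there are fewer layers of overlap, so the semi-transparent regions are usually easier to read.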

Step 3: Adjusting the Aesthetic for Black and White

For a clean black-and-white theme, we can use black outlines and fill each histogram with shades of white or gray. Here’s how we can modify the aesthetics:

R
# Create black and white overlapping histograms with transparency
ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(binwidth = 2, position = "identity", color = "black", alpha = 0.5) +
  scale_fill_manual(values = c("white", "gray", "black")) +
  theme_minimal()

Output:

Figure: Adjusting the Aesthetic for Black and White

  • The scale_fill_manual() function assigns white, gray, and black to the three cylinder groups.
  • The color = "black" argument ensures the borders of the histograms are outlined in black for better visibility.
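As an alternative to listing the colors by hand, ggplot2 provides scale_fill_grey(), which generates a grayscale palette for however many groups are present. The sketch below is one possible variant; the start and end values (0.9 and 0.1) are assumptions chosen to run from light to dark gray.

```r
library(ggplot2)

# Grayscale fills generated automatically instead of a manual color vector
p_grey <- ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(binwidth = 2, position = "identity",
                 color = "black", alpha = 0.5) +
  scale_fill_grey(start = 0.9, end = 0.1) +  # light gray to dark gray
  theme_minimal()

print(p_grey)
```

This avoids having to keep the length of a manual color vector in sync with the number of groups.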

Step 4: Customizing Axes and Labels

Customizing labels and titles improves readability and interpretation. Let’s add appropriate axis labels and a title:

R
# Final plot with customized labels and legend position
ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(binwidth = 2, position = "identity", color = "black", alpha = 0.5) +
  scale_fill_manual(values = c("white", "gray", "black")) +
  labs(title = "Overlapping Histogram of MPG by Cylinder Count",
       x = "Miles per Gallon (MPG)", 
       y = "Frequency", 
       fill = "Cylinders") +
  theme_minimal() +
  theme(legend.position = "top")

Output:

Figure: Customizing Axes and Labels

In this final version:

  • labs() adds a title, axis labels, and a legend title for better interpretation.
  • theme(legend.position = "top") moves the legend to the top, providing a cleaner look.
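The steps above can be collected into a small reusable function. This is a minimal sketch, not part of the original tutorial: the function name bw_overlap_hist and its arguments are hypothetical, it swaps the manual color vector for scale_fill_grey() so that any number of groups works, and it relies on the .data pronoun to look up columns by name inside aes().

```r
library(ggplot2)

# Hypothetical helper: black-and-white overlapping histograms of column `x`,
# with one semi-transparent histogram per level of column `group`
bw_overlap_hist <- function(data, x, group, binwidth = 2, alpha = 0.5) {
  ggplot(data, aes(x = .data[[x]], fill = factor(.data[[group]]))) +
    geom_histogram(binwidth = binwidth, position = "identity",
                   color = "black", alpha = alpha) +
    scale_fill_grey(start = 1, end = 0) +   # white through black
    labs(x = x, y = "Frequency", fill = group) +
    theme_minimal() +
    theme(legend.position = "top")
}

# Same plot as above, built from column names given as strings
p_cyl <- bw_overlap_hist(mtcars, "mpg", "cyl")
print(p_cyl)

# The same helper applied to a different grouping column
p_gear <- bw_overlap_hist(mtcars, "mpg", "gear")
print(p_gear)
```

Passing column names as strings keeps the helper simple to call; the bin width and transparency remain adjustable through the binwidth and alpha arguments.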

Conclusion

Creating black-and-white, transparent, overlapping histograms in ggplot2 is straightforward. geom_histogram() with position = "identity" makes it easy to compare distributions, and the right aesthetics make that comparison clearer. Transparency shows where the distributions overlap, and the black-and-white palette gives the plot a clean, professional look. The same approach works whenever your data contain several groups or categories.

