
How to add calibrated axes to PCA biplot in ggplot2 in R?

Last Updated : 09 Jul, 2024

Principal Component Analysis (PCA) is a statistical technique that emphasizes variation and brings out strong patterns in a dataset. It is often used to reduce the dimensionality of a dataset while preserving as much variability as possible. A biplot is a type of plot that displays both the scores and the loadings from a PCA on the same axes, which helps reveal the relationships among the variables and among the observations.

What are calibrated axes?

Calibrated axes are axes on a graph or chart that have been precisely marked and labeled to ensure accurate measurement and representation of data. Calibration involves adjusting the scale and intervals of the axes so that they correctly reflect the range and distribution of the data being visualized.

Key Aspects of Calibrated Axes

The main aspects of calibrated axes are:

  1. Scale Accuracy: The intervals between the axis marks are consistent and proportional. For example, on a calibrated axis, the distance between 1 and 2 is the same as between 2 and 3.
  2. Label Precision: The axis labels accurately reflect the values and units of measurement. This is crucial for understanding the magnitude and trends in the data.
  3. Consistency: Both the x-axis and y-axis (or any other axes in more complex plots) are scaled consistently, ensuring that comparisons and interpretations are valid.
  4. Calibration Methods: In some contexts, such as scientific measurements, axes might need to be calibrated against known standards or reference points to ensure accuracy.
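As a small illustration of the first two points, base R's pretty() returns evenly spaced, round tick values that cover a variable's range. The sketch below is only an example; the choice of Sepal.Length from the built-in iris data and the names x, ticks, and spacing are assumptions made for illustration.

```r
# Evenly spaced, round tick values for one variable (illustrative choice).
x <- iris$Sepal.Length

ticks   <- pretty(x)     # equally spaced "nice" values covering range(x)
spacing <- diff(ticks)   # gaps between consecutive tick values

print(ticks)
print(spacing)
```

Because every gap in spacing is the same, the distance between neighbouring tick marks represents the same change in the variable everywhere along the axis.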

Calibrated axes on a PCA biplot in ggplot2 in R

The PCA biplot can display two key pieces of information:

  • Scores: These represent the observations in the reduced PCA space.
  • Loadings: These represent the contribution of each variable to the principal components.

Calibrated axes can be added to the biplot to show the direction and magnitude of each variable's contribution to the principal components.

Implementation: adding calibrated axes to a PCA biplot in ggplot2 in R

Step 1: Install and load the required packages

First, install (if needed) and load the required packages.

R
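# Install once; skip these two lines if the packages are already installed
install.packages("ggplot2")
install.packages("dplyr")

library(ggplot2)  # plotting
library(dplyr)    # data manipulation (the steps below use only ggplot2)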

Step 2: Perform the PCA

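Next, perform PCA on the built-in iris dataset. Only the four numeric columns are used, and scale. = TRUE standardizes each variable to unit variance before the analysis.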

R
data(iris)
iris_pca <- prcomp(iris[, 1:4], scale. = TRUE)
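The axis labels in Step 4 report the percentage of variance explained by each component. As a hedged sketch of where those numbers come from, the block below repeats the same prcomp() call and derives the proportions from the component standard deviations; the name var_explained is illustrative and not part of the original code.

```r
# Same call as in the article: PCA on the four numeric iris columns.
iris_pca <- prcomp(iris[, 1:4], scale. = TRUE)

# The variance of each component is the square of its standard deviation.
var_explained <- iris_pca$sdev^2 / sum(iris_pca$sdev^2)
names(var_explained) <- colnames(iris_pca$x)

print(round(100 * var_explained, 1))          # percentage per component
print(round(100 * cumsum(var_explained), 1))  # cumulative percentage
```

The same proportions are reported by summary(iris_pca), so this is mainly useful when the values are needed programmatically, for example in axis labels.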

Step 3: Extract the Scores and Loadings

Now extract the scores and loadings that the biplot will draw.

R
scores <- as.data.frame(iris_pca$x)
loadings <- as.data.frame(iris_pca$rotation)

# Add species to scores dataframe
scores$Species <- iris$Species
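To make the relationship between the two objects concrete, the following sketch checks that the scores are simply the standardized data projected onto the loading vectors. It is a verification aid under the same prcomp() settings as above; X_std, manual_scores, and max_diff are illustrative names.

```r
iris_pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Standardize the data with the same centers and scales that prcomp() used.
X_std <- scale(iris[, 1:4], center = iris_pca$center, scale = iris_pca$scale)

# Scores are the standardized data multiplied by the rotation (loading) matrix.
manual_scores <- X_std %*% iris_pca$rotation

# Largest absolute difference from the scores stored in the prcomp object.
max_diff <- max(abs(manual_scores - iris_pca$x))
print(max_diff)
```

A difference at the level of floating-point rounding confirms that each loading column defines one axis of the reduced space in which the observations are plotted.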

Step 4: Create the Biplot with Calibrated Axes

Finally, create the biplot, drawing the loadings as arrows over the scores.

R
# Percentage of variance explained by each component, for the axis labels
pct <- round(iris_pca$sdev^2 / sum(iris_pca$sdev^2) * 100, 1)

# Put the variable names in a column so they can be mapped to labels
loadings$variable <- rownames(loadings)

ggplot(data = scores, aes(x = PC1, y = PC2, color = Species)) +
  geom_point(size = 3) +
  # Loading arrows, multiplied by 5 so they are visible against the scores
  geom_segment(data = loadings, aes(x = 0, y = 0, xend = PC1 * 5, yend = PC2 * 5),
               arrow = arrow(length = unit(0.2, "cm")), color = "red") +
  geom_text(data = loadings, aes(x = PC1 * 5, y = PC2 * 5, label = variable),
            color = "red", vjust = -0.5) +
  theme_minimal() +
  xlab(paste0("PC1 (", pct[1], "%)")) +
  ylab(paste0("PC2 (", pct[2], "%)")) +
  ggtitle("PCA Biplot with Calibrated Axes")

Output:

Add calibrated axes to PCA biplot in ggplot2 in R

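This PCA biplot shows the observations in the space of the first two principal components, with the share of variance each component explains given in the axis labels, and shows how the original variables contribute to these components. The red arrows act as the variable axes: their direction shows how each variable relates to PC1 and PC2, and their length (multiplied by 5 here so that they are visible against the scores) reflects the relative size of the loading. The plot suggests that the Petal measurements are more influential in distinguishing the Iris species than the Sepal measurements.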
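The arrows above carry no tick marks of their own. One common way to calibrate each variable axis, in the sense of prediction biplot axes, is to place tick marks in the variable's original units along its loading direction. The sketch below is an assumption about how this could be done in ggplot2, not the method used in the code above: the names V, ticks, ends, and p are illustrative, and the tick position m * v / sum(v^2) is chosen so that projecting a tick back onto the loading vector v recovers the standardized value m.

```r
library(ggplot2)

iris_pca <- prcomp(iris[, 1:4], scale. = TRUE)
scores <- as.data.frame(iris_pca$x)
scores$Species <- iris$Species
V <- iris_pca$rotation[, 1:2]   # loadings on PC1 and PC2, one row per variable

# For each variable, compute tick positions at round values in original units.
ticks <- do.call(rbind, lapply(rownames(V), function(var) {
  v      <- V[var, ]
  values <- pretty(iris[[var]])   # tick values in original units
  m      <- (values - iris_pca$center[var]) / iris_pca$scale[var]  # standardized
  data.frame(variable = var, value = values,
             PC1 = m * v[1] / sum(v^2),
             PC2 = m * v[2] / sum(v^2))
}))

# Last tick of each axis, used to place the variable name.
ends <- ticks[!duplicated(ticks$variable, fromLast = TRUE), ]

p <- ggplot(scores, aes(x = PC1, y = PC2)) +
  geom_point(aes(color = Species), size = 2) +
  geom_path(data = ticks, aes(group = variable), color = "grey40") +   # axis lines
  geom_point(data = ticks, shape = 3, color = "grey20") +              # tick marks
  geom_text(data = ticks, aes(label = value), size = 3, vjust = -0.8) +
  geom_text(data = ends, aes(label = variable), size = 3, vjust = 1.8) +
  coord_fixed() +   # equal units on both axes so projections are not distorted
  theme_minimal()

print(p)
```

To read an approximate value for an observation, drop a perpendicular from its point onto a variable's axis and read the nearest tick label. The reading is approximate because only the first two components are used.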

Conclusion

Adding calibrated axes to a PCA biplot in ggplot2 gives a clearer understanding of how each variable contributes to the principal components. This enhances the interpretability of the biplot, making it easier to identify patterns and relationships in the data. By following the steps outlined in this article, we can create informative and visually appealing PCA biplots in R.

