Analyzing Data in Subsets Using R
Last Updated: 27 Mar, 2024
In this article, we will explore methods for analyzing data in subsets using the R programming language.
How to analyze data in subsets
Analyzing data means applying methods that extract insights, reveal patterns, and support conclusions about a dataset, such as computing summary statistics, visualizing data, and identifying trends. R offers several ways to restrict an analysis to a subset of the data, so that statistics are computed only on the rows of interest. Two common methods are:
Analyzing data in subsets by using subset() Function
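Syntax: subset(x, subset, select, ...)
Here, x is the object to be subsetted (for example, a data frame), subset is a logical expression that indicates which rows to keep, and select is an optional expression that indicates which columns to keep. In the example below, we create a data frame and split it into two subsets by Category.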
R
# Example data
data <- data.frame(
  ID = 1:10,
  Category = rep(c("A", "B"), each = 5),
  Value = rnorm(10)
)
print(data)
# Subsetting using subset() function
subset_A <- subset(data, Category == "A")
subset_B <- subset(data, Category == "B")
print("Analyzing the data in subsets")
print(subset_A) # Print subsets
print(subset_B)
Output:
ID Category Value
1 1 A 1.5658719
2 2 A 0.3142731
3 3 A -1.4552153
4 4 A 0.9014216
5 5 A -0.2758858
6 6 B 1.3345081
7 7 B -1.0618629
8 8 B 1.1188082
9 9 B -1.3202145
10 10 B 1.2453632
[1] "Analyzing the data in subsets"
ID Category Value
1 1 A 1.5658719
2 2 A 0.3142731
3 3 A -1.4552153
4 4 A 0.9014216
5 5 A -0.2758858
ID Category Value
6 6 B 1.334508
7 7 B -1.061863
8 8 B 1.118808
9 9 B -1.320214
10 10 B 1.245363
The next example applies the same approach to a smaller data frame, splitting it into two subsets by the Name column.
R
# Creating data frame
data <- data.frame(
  ID = 1:6,
  Name = rep(c("X", "Y"), each = 3),
  Value = rnorm(6)
)
print(data)
# Subsetting using subset() function
subset_X <- subset(data, Name == "X")
subset_Y <- subset(data, Name == "Y")
print(" Analyzing the data in subsets")
print(subset_X)
print(subset_Y)
Output:
ID Name Value
1 1 X -0.02737704
2 2 X 0.31270382
3 3 X -0.92980339
4 4 Y 0.43035869
5 5 Y 0.30612408
6 6 Y 0.89034199
[1] " Analyzing the data in subsets"
ID Name Value
1 1 X -0.02737704
2 2 X 0.31270382
3 3 X -0.92980339
ID Name Value
4 4 Y 0.4303587
5 5 Y 0.3061241
6 6 Y 0.8903420
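The select argument shown in the syntax is not exercised by the examples so far. The following is a minimal sketch, assuming fixed values in place of rnorm() so that the result is reproducible. The values and the names positive_X and mean_positive_X are illustrative and not part of the original example. It combines two conditions, keeps only two columns, and then computes a statistic on the subset.

```r
# Illustrative data: fixed values are used instead of rnorm() for reproducibility
data <- data.frame(
  ID = 1:6,
  Name = rep(c("X", "Y"), each = 3),
  Value = c(2.5, -1.0, 4.0, 3.5, 0.5, -2.0)
)

# Keep rows where Name is "X" and Value is positive,
# and return only the ID and Value columns
positive_X <- subset(data, Name == "X" & Value > 0, select = c(ID, Value))
print(positive_X)

# Analyze the subset: mean of the remaining values
mean_positive_X <- mean(positive_X$Value)
print(mean_positive_X)
```

With these illustrative values, rows 1 and 3 are kept, so the mean is 3.25.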
Subsetting the data frame
This method uses logical indexing with square brackets to extract the rows that satisfy a condition. In the example below, we create a data frame of test scores, extract the male students, and summarize their scores.
R
# Sample data frame
df <- data.frame(
  student_id = 1:10,
  test_score = c(80, 85, 90, 75, 95, 82, 78, 88, 92, 70),
  gender = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F")
)
# Subset of male students
male_students <- df[df$gender == "M", ]
print(male_students)
print("Analyzing the data ")
# Summary statistics for male students
summary(male_students$test_score)
Output:
student_id test_score gender
1 1 80 M
3 3 90 M
5 5 95 M
7 7 78 M
9 9 92 M
[1] "Analyzing the data "
Min. 1st Qu. Median Mean 3rd Qu. Max.
78 80 90 87 92 95
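When the same statistic is needed for every group, extracting each subset by hand becomes repetitive. As a hedged sketch of one alternative, base R's tapply() applies a function to a vector split by a grouping variable. The name mean_by_gender is introduced here for illustration only.

```r
# Same sample data frame as in the example above
df <- data.frame(
  student_id = 1:10,
  test_score = c(80, 85, 90, 75, 95, 82, 78, 88, 92, 70),
  gender = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F")
)

# Mean test score within each gender group
mean_by_gender <- tapply(df$test_score, df$gender, mean)
print(mean_by_gender)
```

For this data, the mean is 80 for the "F" group and 87 for the "M" group, which agrees with the summary of the male subset.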
In the example below, we create a data frame of sales transactions and extract the rows that belong to the Electronics category.
R
# Sample sales data
sales_data <- data.frame(
  transaction_id = 1:24,
  product_category = rep(c("Electronics", "Clothing", "Books"), each = 8),
  sales_amount = c(150, 200, 100, 120, 180, 80, 70, 90, 110, 95, 250, 300, 280, 320,
                   270, 40, 60, 50, 55, 45, 65, 78, 89, 34)
)
# Subset of sales data for Electronics category
electronics_sales <- sales_data[sales_data$product_category == "Electronics", ]
# Displaying the subset
print(electronics_sales)
Output:
transaction_id product_category sales_amount
1 1 Electronics 150
2 2 Electronics 200
3 3 Electronics 100
4 4 Electronics 120
5 5 Electronics 180
6 6 Electronics 80
7 7 Electronics 70
8 8 Electronics 90
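The example above stops at displaying the subset. One possible next step, sketched here under the assumption that totals and averages are the quantities of interest, is to compute them on the subset and then use aggregate() to get the same total for every category at once. The names total_electronics, mean_electronics, and category_totals are illustrative.

```r
# Same sample sales data as in the example above
sales_data <- data.frame(
  transaction_id = 1:24,
  product_category = rep(c("Electronics", "Clothing", "Books"), each = 8),
  sales_amount = c(150, 200, 100, 120, 180, 80, 70, 90, 110, 95, 250, 300, 280, 320,
                   270, 40, 60, 50, 55, 45, 65, 78, 89, 34)
)

# Statistics on the Electronics subset only
electronics_sales <- sales_data[sales_data$product_category == "Electronics", ]
total_electronics <- sum(electronics_sales$sales_amount)
mean_electronics <- mean(electronics_sales$sales_amount)
print(total_electronics)
print(mean_electronics)

# Total sales for every category in a single call
category_totals <- aggregate(sales_amount ~ product_category,
                             data = sales_data, FUN = sum)
print(category_totals)
```

For this data, the Electronics total is 990 and the mean is 123.75.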
Conclusion
In conclusion, we covered two ways to analyze data in subsets in R: the subset() function and logical indexing with square brackets. Both isolate the rows of interest so that statistics can be computed on just that part of the data.