How to Split Vector and DataFrame in R
Last Updated: 24 Apr, 2025
R is a programming language and environment designed for data analysis, statistical computing, and graphics. Many data manipulation and analysis tasks require splitting data into batches. In this article, we will discuss some techniques to split vectors into chunks, and data frames into subsets, using the R Programming Language.
Concepts related to the topic
In R, a vector is a fundamental data structure that stores a sequence of elements of the same data type, much like a one-dimensional array in other languages.
A chunk is a portion or segment of data that is processed as a unit, often used to improve efficiency, manage memory usage, or handle data streams.
How to split Vector into chunks
Below are the methods that we will cover in this article:
- Using split()
- Using cut()
- Using a Loop
Using split()
split() is a built-in R function that divides a vector, data frame, or list into subsets based on the factor provided.
Syntax
split(x, f)
Parameters:
x: object to be split.
f: factor or grouping variable indicating how to split x.
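As a small sketch of how the grouping factor f shapes the result, the example below uses a factor that declares a level no value belongs to. The optional drop argument of split() is not part of the syntax shown above, and the variable names are illustrative.

```r
# Values and a grouping factor that declares a level ("c") with no members
values <- c(10, 20, 30, 40)
grp <- factor(c("a", "a", "b", "b"), levels = c("a", "b", "c"))

# By default every factor level gets an element, even an empty one
with_empty <- split(values, grp)

# drop = TRUE discards levels that do not occur in the data
without_empty <- split(values, grp, drop = TRUE)

print(names(with_empty))
print(names(without_empty))
```

This matters when the factor comes from a larger data set: the result may otherwise contain empty groups that later code has to handle.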
Split vector of numeric data
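Here we create a vector of the numbers 1 to 16 and use a chunk size of 4. The expression ceiling(seq_along(my_vect) / chunk_size) assigns each position a chunk number, so positions 1 to 4 fall in chunk 1, positions 5 to 8 in chunk 2, and so on.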
R
# Create a sample vector
my_vect <- 1:16
# printing vector before split
print('Vector before split:')
print(my_vect)
# Define the number of elements in each chunk
chunk_size <- 4
# Split the vector into chunks and store in chunks variable
chunks <- split(my_vect, ceiling(seq_along(my_vect) / chunk_size))
# Wrap the chunks in a named list
chunks_list <- list(chunks=chunks)
# Print the chunks list
print(chunks_list)
Output:
[1] "Vector before split:"
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
$chunks
$chunks$`1`
[1] 1 2 3 4
$chunks$`2`
[1] 5 6 7 8
$chunks$`3`
[1] 9 10 11 12
$chunks$`4`
[1] 13 14 15 16
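The same idea can be wrapped in a reusable function. This is a minimal sketch: split_into_chunks is a hypothetical helper name, not a base R function, and it assumes the last chunk may be shorter when the vector length is not a multiple of the chunk size.

```r
# Hypothetical helper (not part of base R): split x into chunks of chunk_size
split_into_chunks <- function(x, chunk_size) {
  stopifnot(chunk_size >= 1)
  # Element i belongs to chunk number ceiling(i / chunk_size)
  chunk_id <- ceiling(seq_along(x) / chunk_size)
  # unname() drops the "1", "2", ... labels that split() adds
  unname(split(x, chunk_id))
}

# 10 elements with chunk size 4: the last chunk holds the remaining 2 elements
chunks <- split_into_chunks(1:10, 4)
print(lengths(chunks))
```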
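Split vector based on groups
Here the grouping factor is given explicitly: each element of my_vect is assigned to the group at the same position in groups.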
R
# Create a vector
my_vect <- c(1, 2, 3, 4, 5, 6)
# Create a factor to define groups
groups <- factor(c("A", "A", "B", "B", "C", "C"))
# Split the vector based on groups
result <- split(my_vect, groups)
# Print the result
print(result)
Output:
$A
[1] 1 2
$B
[1] 3 4
$C
[1] 5 6
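The grouping factor does not have to be typed by hand. The sketch below derives it from the data with ifelse() and then uses unsplit() to put the pieces back in their original order. The data and the threshold of 15 are made up for illustration.

```r
# Illustrative data
scores <- c(12, 7, 25, 18, 3, 30)

# Derive the grouping factor from the values themselves
level <- factor(ifelse(scores >= 15, "high", "low"))

# Split the scores by the derived groups
by_level <- split(scores, level)
print(by_level)

# unsplit() reverses split() when given the same grouping factor
restored <- unsplit(by_level, level)
print(restored)
```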
Split a Vector into Chunks Using cut()
The cut() function in R divides numeric data into intervals defined by breakpoints. In the example below, seq() generates the breakpoints that determine where the vector is cut. cut() then assigns each index of the vector to the interval it falls in, and split() divides the vector into chunks according to those assignments.
R
# Create sample vector
my_vect <- 1:10
# Define the number of elements you want in each chunk
chunk_size <- 3
# Generate breakpoints starting from 0; the upper limit is padded by
# chunk_size - 1 so the last breakpoint covers the final, possibly shorter, chunk
breakpts <- seq(0, length(my_vect) + chunk_size - 1, by = chunk_size)
# Cut the vector into chunks based on the breakpoints
chunks <- cut(seq_along(my_vect), breaks = breakpts, labels = FALSE)
# Split the vector into chunks based on the cuts
chunks <- split(my_vect, chunks)
# Print the chunks
print(chunks)
Output:
$`1`
[1] 1 2 3
$`2`
[1] 4 5 6
$`3`
[1] 7 8 9
$`4`
[1] 10
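cut() also accepts a single number for breaks, in which case it divides the range into that many equal-width intervals. The sketch below assumes you want a fixed number of chunks rather than a fixed chunk size. n_chunks is an illustrative name and must be at least 2, and the chunk sizes come out only roughly equal when the length is not divisible by n_chunks.

```r
# Create sample vector
my_vect <- 1:10

# Number of chunks wanted (instead of a chunk size)
n_chunks <- 3

# A single number for breaks asks cut() for that many equal-width intervals
chunk_id <- cut(seq_along(my_vect), breaks = n_chunks, labels = FALSE)

# Split the vector according to the interval each index fell into
chunks <- split(my_vect, chunk_id)
print(lengths(chunks))
```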
Using a Loop
In this approach we use a for loop to split the vector into chunks. The loop variable starts at 1 and advances by the chunk size on each iteration, so it always holds the start index of the current chunk. In each iteration we compute the end index of the chunk, using min() to ensure that it does not exceed the length of the vector, and then extract the elements between the start and end indices.
R
# Create sample vector
my_vector <- 1:10
# Define the number of elements you want in each chunk
chunk_size <- 3
# Initialize an empty list to store chunks
chunks <- list()
# Iterate over the vector and extract subsets for each chunk
for (i in seq(1, length(my_vector), by = chunk_size)) {
  # Determine the end index for the current chunk
  end_index <- min(i + chunk_size - 1, length(my_vector))
  # Extract subset for the current chunk
  chunk <- my_vector[i:end_index]
  # Add the chunk to the list
  chunks[[length(chunks) + 1]] <- chunk
}
# Print the chunks
print(chunks)
Output:
[[1]]
[1] 1 2 3
[[2]]
[1] 4 5 6
[[3]]
[1] 7 8 9
[[4]]
[1] 10
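A variation on the loop computes the number of chunks up front and preallocates the result list, which avoids growing the list on every iteration. This is a sketch of one possible alternative, not a replacement for the version above. n_chunks and start_index are illustrative names.

```r
# Create sample vector
my_vector <- 1:10
chunk_size <- 3

# Number of chunks, rounding up so a shorter final chunk is included
n_chunks <- ceiling(length(my_vector) / chunk_size)

# Preallocate a list with one slot per chunk
chunks <- vector("list", n_chunks)

# seq_len() yields an empty sequence for an empty vector, so the loop is skipped
for (k in seq_len(n_chunks)) {
  start_index <- (k - 1) * chunk_size + 1
  end_index <- min(k * chunk_size, length(my_vector))
  chunks[[k]] <- my_vector[start_index:end_index]
}

print(chunks)
```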
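Split data frame in R Using split() Function
split() also works on data frames: passing a column as the grouping variable returns a list with one data frame per distinct value of that column.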
R
# Create a sample data frame
my_data <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Name = c("Jayesh", "Anurag", "Vipul", "Pratham", "Shivang"),
  Age = c(25, 30, 22, 35, 28),
  Score = c(85, 92, 78, 95, 88)
)
my_data
# Split the data frame based on a factor (e.g., Age)
split_data <- split(my_data, my_data$Age)
split_data
Output:
ID Name Age Score
1 1 Jayesh 25 85
2 2 Anurag 30 92
3 3 Vipul 22 78
4 4 Pratham 35 95
5 5 Shivang 28 88
$`22`
ID Name Age Score
3 3 Vipul 22 78
$`25`
ID Name Age Score
1 1 Jayesh 25 85
$`28`
ID Name Age Score
5 5 Shivang 28 88
$`30`
ID Name Age Score
2 2 Anurag 30 92
$`35`
ID Name Age Score
4 4 Pratham 35 95
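The chunking technique used for vectors carries over to data frames by grouping on row position instead of on a column. A minimal sketch, assuming a smaller illustrative data frame. rows_per_chunk is a made-up name.

```r
# Illustrative data frame
my_data <- data.frame(
  ID = 1:5,
  Score = c(85, 92, 78, 95, 88)
)

# Number of rows wanted in each piece
rows_per_chunk <- 2

# Row i belongs to chunk number ceiling(i / rows_per_chunk)
chunk_id <- ceiling(seq_len(nrow(my_data)) / rows_per_chunk)

# Split the rows into a list of smaller data frames
row_chunks <- split(my_data, chunk_id)
print(row_chunks)

# The pieces can be stacked back together with rbind()
recombined <- do.call(rbind, row_chunks)
```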
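Split data frame in R Using subset() Function
subset() returns the rows of a data frame that satisfy a logical condition. Unlike split(), it returns a single data frame rather than a list of pieces.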
R
# Create a sample data frame
my_data <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Name = c("Jayesh", "Anurag", "Vipul", "Pratham", "Shivang"),
  Age = c(25, 30, 22, 35, 28),
  Score = c(85, 92, 78, 95, 88)
)
my_data
# Split the data frame based on a logical condition (e.g., Age greater than 25)
subset_data <- subset(my_data, Age > 25)
subset_data
Output:
ID Name Age Score
1 1 Jayesh 25 85
2 2 Anurag 30 92
3 3 Vipul 22 78
4 4 Pratham 35 95
5 5 Shivang 28 88
ID Name Age Score
2 2 Anurag 30 92
4 4 Pratham 35 95
5 5 Shivang 28 88
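subset() keeps only the rows that match the condition. When both the matching and the non-matching rows are needed, one option is to pass the logical condition to split() as the grouping variable. The sketch below assumes the same Age threshold of 25. The names parts, older, and others are illustrative.

```r
# Illustrative data frame
my_data <- data.frame(
  Name = c("Jayesh", "Anurag", "Vipul", "Pratham", "Shivang"),
  Age = c(25, 30, 22, 35, 28)
)

# A logical grouping variable yields two pieces named "FALSE" and "TRUE"
parts <- split(my_data, my_data$Age > 25)

older <- parts[["TRUE"]]    # rows where Age > 25
others <- parts[["FALSE"]]  # all remaining rows

print(older)
print(others)
```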
Conclusion
In conclusion, R offers several ways to split a vector into chunks, each with its own advantages:
- Using split(): Convenient for splitting based on a factor or grouping variable.
- Using cut(): Useful for cutting numeric data into intervals and splitting based on breakpoints.
- Using a loop: Provides control over the splitting process, allowing for customization if needed.
Data frames can be divided into groups with split() as well, or filtered by a logical condition with subset().