How to split DataFrame in R
Last Updated: 23 Sep, 2021
In this article, we will discuss how to split the dataframe in R programming language.
A dataframe can be split into subsets of rows or columns, and those rows or columns may be contiguous or chosen at random. Rows and columns can be referenced by index as well as by name. Multiple rows or columns can be selected at once by combining their indexes or names with the c() function in base R.
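As a minimal sketch of the c() usage described above, the snippet below selects non-adjacent rows by index and columns by name in one step. The dataframe df and its columns id, score, and label are hypothetical names chosen only for this illustration.

```r
# hypothetical dataframe used only for this illustration
df <- data.frame(id = 1:5,
                 score = c(10, 20, 30, 40, 50),
                 label = letters[1:5])

# rows 1, 3 and 5 (by index) and columns "id" and "label" (by name)
subset_df <- df[c(1, 3, 5), c("id", "label")]
print(subset_df)
```

The part before the comma selects rows and the part after it selects columns; leaving either part empty keeps all rows or all columns.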
Splitting dataframe by row
Splitting dataframe by row indexes
The dataframe cells can be referenced using the row and column names and indexes.
Syntax:
data-frame[start-row-num:end-row-num,]
The row numbers are retained in the final output dataframe.
Example: Splitting dataframe by row
R
# create first dataframe
data_frame1<-data.frame(col1=c(rep('Grp1',2),
rep('Grp2',2),
rep('Grp3',2)),
col2=rep(1:3,2),
col3=rep(1:2,3)
)
print("Original DataFrame")
print(data_frame1)
# extracting first four rows
data_frame_mod <- data_frame1[1:4,]
print("Modified DataFrame")
print(data_frame_mod)
Output:
[1] "Original DataFrame"
col1 col2 col3
1 Grp1 1 1
2 Grp1 2 2
3 Grp2 3 1
4 Grp2 1 2
5 Grp3 2 1
6 Grp3 3 2
[1] "Modified DataFrame"
col1 col2 col3
1 Grp1 1 1
2 Grp1 2 2
3 Grp2 3 1
4 Grp2 1 2
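To divide the dataframe into two parts rather than extract a single piece, a negative index can be used to take the complement of a row range. The sketch below assumes the same data_frame1 as in the example; idx, first_part, and second_part are names introduced here for illustration.

```r
# same dataframe as in the example above
data_frame1 <- data.frame(col1 = c(rep('Grp1', 2), rep('Grp2', 2), rep('Grp3', 2)),
                          col2 = rep(1:3, 2),
                          col3 = rep(1:2, 3))

# row indexes that go into the first part
idx <- 1:4

first_part  <- data_frame1[idx, ]   # rows 1 to 4
second_part <- data_frame1[-idx, ]  # every row except 1 to 4

print(first_part)
print(second_part)
```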
Example: Extracting a single row by index
R
# create first dataframe
data_frame1<-data.frame(col1=c(rep('Grp1',2),
rep('Grp2',2),
rep('Grp3',2)),
col2=rep(1:3,2),
col3=rep(1:2,3)
)
print("Original DataFrame")
print(data_frame1)
# extracting the sixth row
data_frame_mod <- data_frame1[6,]
print("Modified DataFrame")
print(data_frame_mod)
Output:
[1] "Original DataFrame"
col1 col2 col3
1 Grp1 1 1
2 Grp1 2 2
3 Grp2 3 1
4 Grp2 1 2
5 Grp3 2 1
6 Grp3 3 2
[1] "Modified DataFrame"
col1 col2 col3
6 Grp3 3 2
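The retained row numbers are visible in the output above, where the single extracted row is still labelled 6. If sequential numbering is preferred in the subset, one option is to reset the row names afterwards. This is a sketch that assumes the same data_frame1; it is not part of the original example.

```r
# same dataframe as in the example above
data_frame1 <- data.frame(col1 = c(rep('Grp1', 2), rep('Grp2', 2), rep('Grp3', 2)),
                          col2 = rep(1:3, 2),
                          col3 = rep(1:2, 3))

# take the last two rows; they keep their original labels
data_frame_mod <- data_frame1[5:6, ]
print(rownames(data_frame_mod))

# assigning NULL resets the row names to the default sequence
rownames(data_frame_mod) <- NULL
print(data_frame_mod)
```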
Splitting dataframe rows randomly
The dataframe rows can also be selected at random. The set.seed() function fixes the seed of R's random number generator, so the same rows are selected on every run. A vector of random values, one per row, is then generated with a random number function (rbinom() in the example below), and the rows are extracted by comparing those values against a chosen value.
Example: Splitting dataframe by rows randomly
R
# create first dataframe
data_frame1<-data.frame(col1=c(rep('Grp1',2),
rep('Grp2',2),
rep('Grp3',2)),
col2=rep(1:3,2),
col3=rep(1:2,3),
col4 = letters[1:6]
)
print("Original DataFrame")
print(data_frame1)
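# fix the seed so the random draw is reproducible
set.seed(99999)
rows <- nrow(data_frame1)
# draw one binomial value (0, 1 or 2) per row
rand <- rbinom(rows, 2, 0.5)
# keep only the rows whose draw equals 0
data_frame_mod <- data_frame1[rand == 0, ]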
print("Modified DataFrame")
print(data_frame_mod)
Output:
[1] "Original DataFrame"
col1 col2 col3 col4
1 Grp1 1 1 a
2 Grp1 2 2 b
3 Grp2 3 1 c
4 Grp2 1 2 d
5 Grp3 2 1 e
6 Grp3 3 2 f
[1] "Modified DataFrame"
col1 col2 col3 col4
5 Grp3 2 1 e
6 Grp3 3 2 f
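When a fixed number of rows is wanted in each part, a common alternative is to draw row indexes with sample() instead of rbinom(). This is a swapped-in technique, not the method used in the example above; the seed value and the names picked, part_a, and part_b are assumptions made for illustration.

```r
# same dataframe as in the example above
data_frame1 <- data.frame(col1 = c(rep('Grp1', 2), rep('Grp2', 2), rep('Grp3', 2)),
                          col2 = rep(1:3, 2),
                          col3 = rep(1:2, 3),
                          col4 = letters[1:6])

# any seed works; it only makes the draw reproducible
set.seed(123)

# draw 4 distinct row indexes out of 1..nrow(data_frame1)
picked <- sample(nrow(data_frame1), size = 4)

part_a <- data_frame1[picked, ]   # the randomly chosen rows
part_b <- data_frame1[-picked, ]  # the remaining rows

print(part_a)
print(part_b)
```

Because sample() draws without replacement by default, the two parts never share a row and together cover the whole dataframe.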
Splitting dataframe by column
Splitting dataframe by column names
The dataframe columns can also be referenced by name. Multiple columns can be selected by passing their names as strings inside c(). The selected columns do not need to be adjacent in the dataframe.
Syntax:
data-frame[,c("col1", "col2",...)]
Example: splitting dataframe by column names
R
# create first dataframe
data_frame1<-data.frame(col1=c(rep('Grp1',2),
rep('Grp2',2),
rep('Grp3',2)),
col2=rep(1:3,2),
col3=rep(1:2,3),
col4 = letters[1:6]
)
print("Original DataFrame")
print(data_frame1)
# extracting columns col2 and col4
data_frame_mod <- data_frame1[,c("col2","col4")]
print("Modified DataFrame")
print(data_frame_mod)
Output:
[1] "Original DataFrame"
col1 col2 col3 col4
1 Grp1 1 1 a
2 Grp1 2 2 b
3 Grp2 3 1 c
4 Grp2 1 2 d
5 Grp3 2 1 e
6 Grp3 3 2 f
[1] "Modified DataFrame"
col2 col4
1 1 a
2 2 b
3 3 c
4 1 d
5 2 e
6 3 f
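Two details are worth a sketch when selecting by name: the columns that were not selected can be obtained with setdiff() on names(), and a single column comes back as a plain vector unless drop = FALSE is given. The snippet assumes the same data_frame1; keep, selected, remaining, one_col_vec, and one_col_df are illustrative names.

```r
# same dataframe as in the example above
data_frame1 <- data.frame(col1 = c(rep('Grp1', 2), rep('Grp2', 2), rep('Grp3', 2)),
                          col2 = rep(1:3, 2),
                          col3 = rep(1:2, 3),
                          col4 = letters[1:6])

# names of the columns that go into the first part
keep <- c("col2", "col4")

selected  <- data_frame1[, keep]
# setdiff() returns the column names that are not in keep
remaining <- data_frame1[, setdiff(names(data_frame1), keep)]

# selecting one column returns a vector by default
one_col_vec <- data_frame1[, "col2"]
# drop = FALSE keeps the result as a one-column dataframe
one_col_df  <- data_frame1[, "col2", drop = FALSE]

print(names(selected))
print(names(remaining))
```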
Splitting dataframe by column indices
The dataframe can also be referenced using the column indices. Individual, as well as multiple columns, can be extracted from the dataframe by specifying the column position.
Syntax:
data-frame[,start-col-num:end-col-num]
Example: Split dataframe by column indices
R
# create first dataframe
data_frame1<-data.frame(col1=c(rep('Grp1',2),
rep('Grp2',2),
rep('Grp3',2)),
col2=rep(1:3,2),
col3=rep(1:2,3),
col4 = letters[1:6]
)
print("Original DataFrame")
print(data_frame1)
# extracting last two columns
data_frame_mod <- data_frame1[,c(3:4)]
print("Modified DataFrame")
print(data_frame_mod)
Output:
[1] "Original DataFrame"
col1 col2 col3 col4
1 Grp1 1 1 a
2 Grp1 2 2 b
3 Grp2 3 1 c
4 Grp2 1 2 d
5 Grp3 2 1 e
6 Grp3 3 2 f
[1] "Modified DataFrame"
col3 col4
1 1 a
2 2 b
3 1 c
4 2 d
5 1 e
6 2 f
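The same index syntax can divide the columns into two dataframes by computing the split position from ncol(). The following is a sketch under the assumption that the columns should be cut roughly in half; n_col, half, left_cols, and right_cols are names introduced here for illustration.

```r
# same dataframe as in the example above
data_frame1 <- data.frame(col1 = c(rep('Grp1', 2), rep('Grp2', 2), rep('Grp3', 2)),
                          col2 = rep(1:3, 2),
                          col3 = rep(1:2, 3),
                          col4 = letters[1:6])

n_col <- ncol(data_frame1)
# integer division gives the position of the last column in the first part
half <- n_col %/% 2

# drop = FALSE keeps each part a dataframe even if it has one column
left_cols  <- data_frame1[, 1:half, drop = FALSE]
right_cols <- data_frame1[, (half + 1):n_col, drop = FALSE]

print(names(left_cols))
print(names(right_cols))
```

This sketch assumes the dataframe has at least two columns; with a single column, half would be 0 and the ranges would no longer describe a valid split.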