Row wise operation in R using Dplyr
Last Updated :
26 Apr, 2025
The dplyr package in R is used to manipulate and transform data held in data frames. It can be installed into the working environment using the following command:
install.packages("dplyr")
Creating a data frame and applying rowwise()
A data frame created with tibble() arranges rows and columns in a tabular structure, and printing it shows the data type of each column. The following code creates a tibble with three columns and ten rows, then passes it to rowwise() so that later operations are evaluated one row at a time.
R
library("dplyr")
data = tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
              col2 = c("a", "b", "a", "c", "b",
                       "b", "b", "a", "c", "a"),
              col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))
data %>% rowwise()
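As a minimal sketch of what rowwise() does to the tibble, the snippet below checks the class of the result. The variable names rw, is_rowwise and is_rowwise_after are illustrative and not part of the original example.

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

rw <- data %>% rowwise()

# rowwise() leaves the values untouched; it only tags the tibble as row-wise
is_rowwise <- inherits(rw, "rowwise_df")

# ungroup() removes the row-wise tag again
is_rowwise_after <- inherits(ungroup(rw), "rowwise_df")

print(c(is_rowwise, is_rowwise_after))
```

The row-wise tag only matters once a verb such as mutate() or summarise() is applied afterwards.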
Application of the mutate method
The mutate() method is applied with the pipe operator to add new columns to the data. Each new column is computed from the expression supplied to it.
Syntax: mutate(new-col-name = func)
Arguments :
- new-col-name – The new column to be added to the data
- func – The expression or function call used to compute the new column.
The following code snippet calculates the mean of the col1 and col3 values. Because the data is neither row-wise nor grouped, mean() is evaluated once over all 20 values, so the same result, 4, is returned in every row.
R
data %>% mutate(mean = mean(c(col1, col3)))
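The single overall mean can be checked with a short sketch. It assumes the same data as above; the name res is illustrative.

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

# Without rowwise(), c(col1, col3) is the full set of 20 values
res <- data %>% mutate(mean = mean(c(col1, col3)))

# col1 sums to 44 and col3 to 36, so the mean is 80 / 20 = 4 in every row
print(unique(res$mean))
```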
Using a combination of rowwise() and mutate() methods
In the following code snippet, the rowwise() method is used in combination with the mutate() method, so the mean of the col1 and col3 values is calculated for each row individually. For instance, the mean of 1 and 3 in row 1 of the table is 2, which is displayed in the mean column for that row.
R
data %>% rowwise() %>% mutate(mean = mean(c(col1, col3)))
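The sketch below repeats the row-wise mean using c_across(), dplyr's helper for selecting columns inside a row-wise operation, and compares it with a plain vectorised expression. The names by_row and vectorised are illustrative; for a simple two-column mean the vectorised form gives the same values without needing rowwise().

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

# Row-wise: mean() sees one value of col1 and one value of col3 at a time
by_row <- data %>%
  rowwise() %>%
  mutate(mean = mean(c_across(c(col1, col3)))) %>%
  ungroup()

# Vectorised alternative: arithmetic on whole columns, no rowwise() required
vectorised <- data %>% mutate(mean = (col1 + col3) / 2)

print(all.equal(by_row$mean, vectorised$mean))
```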
Using summarise method
The summarise() method reduces the data to summary values. It is typically used with group_by(), in which case the output contains one row for each group of the grouping column. On row-wise data every row is treated as its own group, so the output contains one row per input row. The method has the following syntax:
Syntax: summarise(new-col-name = fun())
Arguments: fun – any aggregate function that may be applied over the rows
In the following code snippet, a new column sum is created which contains the sum of the col1 and col3 values of each row. The sum() aggregate function is used to calculate the totals, and only the sum column appears in the output.
R
data %>%
  rowwise() %>%
  summarise(sum = sum(c(col1, col3)))
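To illustrate the difference between summarise() and mutate() on row-wise data, here is a small sketch. The names summarised and mutated are illustrative.

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

# summarise() keeps only the computed summary column
summarised <- data %>%
  rowwise() %>%
  summarise(sum = sum(c(col1, col3)))

# mutate() computes the same values but keeps the original columns
mutated <- data %>%
  rowwise() %>%
  mutate(sum = sum(c(col1, col3))) %>%
  ungroup()

print(names(summarised))
print(names(mutated))
```

If the original columns are needed alongside the row sums, mutate() is usually the more convenient choice.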
Using summarise in combination with group_by
To apply a function to every group in the data, the data must first be grouped by the values of one or more columns. The group_by() method in the dplyr package divides the data into such groups. It has the following syntax:
Syntax: group_by(col1, col2..)
Arguments : col1, col2,.. – The columns to group the data by
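In the following code snippet, the group_by() method is combined with the summarise() method to calculate, for each distinct col3 value, the sum of the col1 and col3 values in that group. Because group_by() replaces the row-wise grouping set by rowwise(), the rowwise() call has no effect on the result.
For example, the value 4 appears 3 times in col3 but is returned only once in the output, with a sum of 29 (2 + 9 + 6 from col1 plus 4 + 4 + 4 from col3).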
R
data %>%
  rowwise() %>%
  group_by(col3) %>%
  summarise(sum = sum(c(col1, col3)))
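The following sketch checks the grouped sums and compares the pipeline with and without the rowwise() step. The names with_rowwise and without_rowwise are illustrative.

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

# group_by() overrides the earlier rowwise() call
with_rowwise <- data %>%
  rowwise() %>%
  group_by(col3) %>%
  summarise(sum = sum(c(col1, col3)))

# Same pipeline without rowwise()
without_rowwise <- data %>%
  group_by(col3) %>%
  summarise(sum = sum(c(col1, col3)))

# One row per distinct col3 value in both cases
print(with_rowwise)
print(identical(with_rowwise$sum, without_rowwise$sum))
```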
Using across method
The across() method applies an operation to a selection of columns. Combined with where(), it can select columns using predicate functions such as is.numeric. In the following code, across(where(is.numeric)) selects the numeric columns col1 and col3, and rowSums() adds them for each row. The sum of the col1 and col3 values is therefore displayed for each row. Since rowSums() already works across rows, rowwise() is not needed here.
R
data %>% mutate(sum = rowSums(across(where(is.numeric))))
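As a sketch, the same row sums can also be obtained with rowwise() and c_across(), which is the row-wise counterpart of across(). The names vectorised and by_row are illustrative.

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

# Vectorised: rowSums() over the numeric columns selected by across()
vectorised <- data %>%
  mutate(sum = rowSums(across(where(is.numeric))))

# Row-wise: sum() over the numeric values of one row at a time
by_row <- data %>%
  rowwise() %>%
  mutate(sum = sum(c_across(where(is.numeric)))) %>%
  ungroup()

print(all.equal(vectorised$sum, by_row$sum))
```

On larger data the rowSums() form is generally the faster of the two, since it avoids evaluating the expression once per row.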
Using do() with the head method
The do() method applies an arbitrary expression to each group of the data. Here it is combined with head() to return a subset of the data frame: head(., 1) returns the first row of every group created by group_by().
R
data %>% group_by(col3) %>% do(head(., 1))
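The dplyr documentation marks do() as superseded since version 1.0.0. A possible alternative, sketched here and not taken from the original article, is slice_head(). Note that do() returns rows in group order while slice_head() keeps the original row order, so both results are sorted by col3 before comparing. The names first_do and first_slice are illustrative.

```r
library(dplyr)

data <- tibble(col1 = c(1, 4, 2, 5, 6, 9, 5, 3, 6, 3),
               col2 = c("a", "b", "a", "c", "b",
                        "b", "b", "a", "c", "a"),
               col3 = c(3, 2, 4, 2, 1, 4, 8, 6, 4, 2))

# First row of each col3 group using do() and head()
first_do <- data %>%
  group_by(col3) %>%
  do(head(., 1)) %>%
  ungroup() %>%
  arrange(col3)

# Same rows using slice_head()
first_slice <- data %>%
  group_by(col3) %>%
  slice_head(n = 1) %>%
  ungroup() %>%
  arrange(col3)

print(first_slice)
```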
