Filtering or subsetting rows in R using dplyr
Last Updated: 28 Jul, 2021
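In this article, we are going to filter the rows of a dataframe in the R programming language using the dplyr package.
Dataframe in use: every example below builds the same dataframe with three columns (id, department, and salary) and eight rows.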

Method 1: Subset or filter a row using filter()
To filter or subset rows, we use the filter() function. It keeps the rows for which the condition is TRUE.
Syntax:
filter(dataframe, condition)
Here, dataframe is the input dataframe, and condition is a logical expression on its columns that decides which rows are kept.
Example: R program to filter the data frame
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# Keep only the rows whose department is "sales"
print(filter(data, department == "sales"))
Output:

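The same call can also be written with the pipe operator, which the later methods in this article rely on. The sketch below is illustrative rather than definitive: it rebuilds the article's dataframe, and the names high_paid and high_paid_piped and the 50000 threshold are assumptions chosen only for this example.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# Direct call: keep rows whose salary is at least 50000 (hypothetical threshold)
high_paid <- filter(data, salary >= 50000)

# Piped form: data is passed as the first argument of filter()
high_paid_piped <- data %>% filter(salary >= 50000)

print(high_paid)

# Both forms select the same rows
print(all(high_paid$id == high_paid_piped$id))
```

Both forms do the same work; the piped form tends to read better once several steps are chained.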
Method 2: Filter dataframe with multiple conditions
We again use the filter() function, this time with more than one condition. Conditions can be combined with the & (AND) and | (OR) operators, or passed as separate arguments, in which case a row must satisfy all of them.
Syntax:
filter(dataframe, condition1, condition2, ..., condition n)
Here, dataframe is the input dataframe and the conditions are the logical expressions used to filter its rows.
Example: R program to filter rows with the AND operator
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# Keep rows that are in "sales" AND have a salary above 27000
print(filter(data, department == "sales" & salary > 27000))
Output:

Example: Filter rows by OR operator
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# Keep rows that are in "IT" OR have a salary above 27000
print(filter(data, department == "IT" | salary > 27000))
Output:

Example: R program to filter using both AND and OR
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# & is evaluated before |, so this keeps rows that are
# ("sales" AND salary above 27000) OR salary below 5000
print(filter(data, department == "sales" & salary > 27000 | salary < 5000))
Output:

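How the conditions above combine can be checked with a small sketch. It assumes the article's dataframe; the variable names and the IT/HR condition with its 50000 threshold are hypothetical and chosen only so that the grouping visibly matters on this data.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# Comma-separated conditions behave like a logical AND
by_comma <- filter(data, department == "sales", salary > 27000)
by_and <- filter(data, department == "sales" & salary > 27000)
print(all(by_comma$id == by_and$id))

# Without parentheses, & is evaluated before |:
# IT rows, plus HR rows with a salary above 50000
implicit <- filter(data, department == "IT" | department == "HR" & salary > 50000)

# Parentheses change the grouping:
# IT or HR rows, but only those with a salary above 50000
regrouped <- filter(data, (department == "IT" | department == "HR") & salary > 50000)

print(implicit$id)
print(regrouped$id)
```

Writing the parentheses explicitly whenever & and | are mixed avoids relying on the reader remembering operator precedence.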
Method 3: Using slice_head() function
This function returns the first n rows of the dataframe, in their existing order.
Syntax:
dataframe %>% slice_head(n = rows)
where dataframe is the input dataframe, %>% is the pipe operator, which passes the dataframe as the first argument of slice_head(), and rows is the number of rows to return. The n argument must be given by name.
Example: R program that uses slice_head() to filter rows
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# First 3 rows
data %>% slice_head(n = 3)
print("==========================")

# First 5 rows
data %>% slice_head(n = 5)
print("==========================")

# First row only
data %>% slice_head(n = 1)
Output:

Method 4: Using slice_tail() function
This function returns the last n rows of the dataframe, in their existing order.
Syntax:
dataframe %>% slice_tail(n = rows)
where dataframe is the input dataframe, %>% is the pipe operator, which passes the dataframe as the first argument of slice_tail(), and rows is the number of rows to return, counted from the end. The n argument must be given by name.
Example: R program to filter last rows by using slice_tail() method
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# Last 3 rows
data %>% slice_tail(n = 3)
print("==========================")

# Last 5 rows
data %>% slice_tail(n = 5)
print("==========================")

# Last row only
data %>% slice_tail(n = 1)
Output:

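As a quick check on slice_head() and slice_tail(), the sketch below pulls the ids of the selected rows so the selection is easy to verify by eye. It assumes the article's dataframe; the variable names are hypothetical, and the prop argument (a proportion of rows instead of a count) is shown as an alternative to n.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# First and last three rows; both keep the original row order
first_three <- data %>% slice_head(n = 3)
last_three <- data %>% slice_tail(n = 3)
print(first_three$id)
print(last_three$id)

# prop selects a proportion of the rows: 0.25 of 8 rows is 2 rows
first_quarter <- data %>% slice_head(prop = 0.25)
print(first_quarter$id)
```

Neither function sorts the data, so the result depends entirely on the current row order of the dataframe.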
Method 5: Using top_n() function
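This function returns the n rows with the largest values of a ranking column. The column is given by the wt argument; if wt is omitted, the last column of the dataframe is used. Since dplyr 1.0.0, top_n() is superseded by slice_max() and slice_min(), but it still works.
Syntax:
dataframe %>% top_n(n, wt)
Example: R program that filters rows using the top_n() function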
R
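library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# Rank by salary explicitly; without wt, top_n() falls back to the
# last column (also salary here) and prints a message saying so
data %>% top_n(n = 3, wt = salary)
print("==========================")
data %>% top_n(n = 5, wt = salary)
print("==========================")
data %>% top_n(n = 1, wt = salary)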
Output:

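How top_n() relates to slice_max() can be seen in a short side-by-side sketch. It assumes the article's dataframe, and the variable names are hypothetical. Both calls pick the same three highest-paid rows, but only slice_max() returns them sorted by salary.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# top_n() keeps the selected rows in their original order
top3_topn <- data %>% top_n(n = 3, wt = salary)

# slice_max() returns the selected rows from highest to lowest salary
top3_slice <- data %>% slice_max(salary, n = 3)

print(top3_topn$id)
print(top3_slice$id)

# Same set of rows either way
print(setequal(top3_topn$id, top3_slice$id))
```

For new code, slice_max() is usually the clearer choice because the ranking column is always stated explicitly.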
Method 6: Using slice_sample() function
Here, we filter rows using the slice_sample() function, which returns n rows chosen at random. Because the selection is random, the rows returned differ from run to run.
Syntax:
dataframe %>% slice_sample(n = rows)
Example: R program to filter rows using the slice_sample() function
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# 3 randomly chosen rows
data %>% slice_sample(n = 3)
print("==========================")

# 5 randomly chosen rows
data %>% slice_sample(n = 5)
print("==========================")

# 1 randomly chosen row
data %>% slice_sample(n = 1)
Output:

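Because slice_sample() draws rows at random, a result can only be reproduced by fixing R's random seed first. The sketch below illustrates this with set.seed(); it assumes the article's dataframe, and the seed value 42 and the variable names are arbitrary choices for the example.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# Fix the seed, then sample
set.seed(42)
first_draw <- data %>% slice_sample(n = 3)

# Resetting the same seed reproduces the same sample
set.seed(42)
second_draw <- data %>% slice_sample(n = 3)

print(first_draw)
print(all(first_draw$id == second_draw$id))
```

Without the set.seed() calls, the two draws would generally contain different rows.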
Method 7: Using slice_max() function
This function returns the n rows with the largest values in a given column, sorted from largest to smallest.
Syntax:
dataframe %>% slice_max(column, n = rows)
where dataframe is the input dataframe, column is the column whose values are used to rank the rows, and rows is the number of rows to return. The n argument must be given by name. By default tied values are all kept, so more than n rows can be returned.
Example: R program to filter using slice_max() function
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# 3 rows with the highest salaries
print(data %>% slice_max(salary, n = 3))
print("==========================")

# Rows with the last department names in sort order;
# ties are kept, so more than 5 rows may be returned
print(data %>% slice_max(department, n = 5))
print("==========================")
Output:

Method 8: Using slice_min() function
This function returns the n rows with the smallest values in a given column, sorted from smallest to largest.
Syntax:
dataframe %>% slice_min(column, n = rows)
where dataframe is the input dataframe, column is the column whose values are used to rank the rows, and rows is the number of rows to return. The n argument must be given by name. By default tied values are all kept, so more than n rows can be returned.
Example: R program to filter using slice_min()
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# 3 rows with the lowest salaries
print(data %>% slice_min(salary, n = 3))
print("==========================")

# Rows with the first department names in sort order;
# ties are kept, so more than 5 rows may be returned
print(data %>% slice_min(department, n = 5))
print("==========================")
Output:

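One detail of slice_min() and slice_max() that is easy to miss is how ties are handled. In the article's dataframe two employees share the lowest salary of 25000, so the sketch below asks for a single row and compares the default behaviour with with_ties = FALSE. The variable names are hypothetical.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# Default (with_ties = TRUE): both rows with the lowest salary are returned
lowest_with_ties <- data %>% slice_min(salary, n = 1)
print(lowest_with_ties)

# with_ties = FALSE: exactly n rows are returned
lowest_no_ties <- data %>% slice_min(salary, n = 1, with_ties = FALSE)
print(lowest_no_ties)
```

Setting with_ties = FALSE is useful when downstream code expects exactly n rows.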
Method 9: Using sample_frac() function
The sample_frac() function selects a random fraction of the rows of a dataframe (or table). The first parameter is the dataframe, and the second parameter is the fraction of rows to select, given as a number between 0 and 1.
Syntax:
sample_frac(dataframe, n)
where dataframe is the input dataframe and n is the fraction of rows to return, for example 0.2 for 20% of the rows.
Example: R program to filter data using sample_frac() function
R
library(dplyr)

# Sample dataframe of employee ids, departments, and salaries
data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)
print(data)
print("==========================")

# Randomly sample 20% of the rows
print(sample_frac(data, 0.2))
print("==========================")

# Randomly sample 40% of the rows
print(sample_frac(data, 0.4))
print("==========================")

# Randomly sample 70% of the rows
print(sample_frac(data, 0.7))
print("==========================")
Output:

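Since dplyr 1.0.0, sample_frac() is superseded by slice_sample() with the prop argument, although the older function still works. The sketch below places the two side by side on the article's dataframe; the fraction 0.5, the seed, and the variable names are assumptions made for the example.

```r
library(dplyr)

data <- data.frame(
  id = c(7058, 7059, 7060, 7089, 7072, 7078, 7093, 7034),
  department = c("IT", "sales", "finance", "IT", "finance", "sales", "HR", "HR"),
  salary = c(34500.00, 560890.78, 67000.78, 25000.00, 78900.00, 25000.00, 45000.00, 90000)
)

# Fix the seed so the run can be repeated
set.seed(1)

# Older style: half of the 8 rows
half_old <- sample_frac(data, 0.5)

# Newer style: the same request expressed with slice_sample()
half_new <- data %>% slice_sample(prop = 0.5)

print(nrow(half_old))
print(nrow(half_new))
```

Both calls return the same number of rows here, but each makes its own random draw, so the rows themselves will usually differ.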