Filtering rows that contain a certain string using dplyr in R
Last Updated :
28 Jul, 2021
In this article, we will learn how to filter rows that contain a certain string using the dplyr package in the R programming language.
Functions Used
The two main functions used for this task are:
- filter(): dplyr's filter() function keeps the rows of a data frame that satisfy a condition.
Syntax: filter(df, condition)
Parameters:
- df: The data frame object
- condition: The condition to filter the data upon
- grepl(): grepl() returns TRUE for each element of a vector in which the specified pattern is found and FALSE for each element in which it is not.
Syntax: grepl(pattern, x, ignore.case = FALSE)
Parameters:
- pattern: regular expression pattern
- x: character vector to be searched
- ignore.case: whether to ignore case in the search. This parameter is optional and is set to FALSE by default.
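To make the return value of grepl() concrete, here is a minimal sketch in base R. The vector `roles` and the result names are chosen only for illustration; the values are taken from the data frame used in this article.

```r
# Illustrative character vector (values borrowed from the article's data)
roles <- c("Software Eng.", "Software Dev", "Data Analyst")

# grepl() returns one logical value per element of the vector
has_dev <- grepl("Dev", roles)
print(has_dev)

# The search is case-sensitive by default, so "dev" matches nothing here
has_dev_lower <- grepl("dev", roles)
print(has_dev_lower)

# ignore.case = TRUE lets the same pattern match "Dev"
has_dev_any_case <- grepl("dev", roles, ignore.case = TRUE)
print(has_dev_any_case)
```

Because the result is a logical vector with one entry per element, it can be used directly as the condition inside filter().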
Dataframe in Use:
marks | age | roles
20.1 | 21 | Software Eng.
30.2 | 22 | Software Dev
40.3 | 23 | Data Analyst
50.4 | 24 | Data Eng.
60.5 | 25 | FrontEnd Dev
Filtering rows that contain the given string
Pass the string to search for, and the column to search in, to grepl(). It returns TRUE or FALSE for each row, and filter() keeps only the rows for which it returns TRUE.
Syntax: df %>% filter(grepl('Pattern', column_name))
Parameters:
- df: Dataframe object
- grepl(): finds the pattern string
- 'Pattern': pattern (string) to be found
- column_name: column in which the pattern (string) will be searched
Example:
R
library(dplyr)

# Sample data frame
df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = 21:25,
                 roles = c('Software Eng.', 'Software Dev',
                           'Data Analyst', 'Data Eng.',
                           'FrontEnd Dev'))

# Keep the rows whose roles value contains 'Dev'
df %>% filter(grepl('Dev', roles))
Output:
marks age roles
1 30.2 22 Software Dev
2 60.5 25 FrontEnd Dev
Filtering rows that do not contain the given string
The only difference from the previous approach is the NOT operator (!). It inverts the output of grepl(), turning TRUE into FALSE and vice versa, so filter() keeps only the rows that do not contain the pattern and drops the rows that do.
Syntax: df %>% filter(!grepl('Pattern', column_name))
Parameters:
- df: Dataframe object
- grepl(): finds the pattern string
- 'Pattern': pattern (string) to be found
- column_name: column in which the pattern (string) will be searched
Example:
R
library(dplyr)

# Sample data frame
df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = 21:25,
                 roles = c('Software Eng.', 'Software Dev',
                           'Data Analyst', 'Data Eng.',
                           'FrontEnd Dev'))

# Drop the rows whose roles value matches 'Eng.'
# (in a regular expression, '.' matches any single character)
df %>% filter(!grepl('Eng.', roles))
Output:
marks age roles
1 30.2 22 Software Dev
2 40.3 23 Data Analyst
3 60.5 25 FrontEnd Dev
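Because grepl() treats the pattern as a regular expression, the dot in 'Eng.' matches any single character, not just a literal period. The sketch below illustrates the difference using grepl()'s `fixed = TRUE` argument, which matches the pattern as a literal string. The data frame `df2` and the role "English Tutor" are hypothetical and are not part of the article's data.

```r
library(dplyr)

# Hypothetical data: "English Tutor" is not in the article's data frame
df2 <- data.frame(roles = c("Software Eng.", "English Tutor", "Data Analyst"))

# As a regular expression, "Eng." also matches "Engl" in "English Tutor",
# so that row is dropped as well
regex_kept <- df2 %>% filter(!grepl("Eng.", roles))
print(regex_kept)

# fixed = TRUE treats the pattern as a literal string,
# so only "Software Eng." is dropped
literal_kept <- df2 %>% filter(!grepl("Eng.", roles, fixed = TRUE))
print(literal_kept)
```

With the article's own data both forms give the same result, since every value matching 'Eng.' there ends in a literal period.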
Filtering rows containing multiple patterns (strings)
This approach is similar to the ones above. The only difference is that multiple patterns (strings) are passed to grepl() as a single string, separated by the OR operator (|). Do not put spaces around |, because any space becomes part of the pattern. filter() then keeps every row that contains at least one of the patterns.
Syntax:
df %>% filter(grepl('Patt.1|Patt.2', column_name))
Example:
R
library(dplyr)

# Sample data frame
df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = 21:25,
                 roles = c('Software Eng.', 'Software Dev',
                           'Data Analyst', 'Data Eng.',
                           'FrontEnd Dev'))

# Keep the rows whose roles value matches 'Dev' or 'Eng.'
df %>% filter(grepl('Dev|Eng.', roles))
Output:
marks age roles
1 20.1 21 Software Eng.
2 30.2 22 Software Dev
3 50.4 24 Data Eng.
4 60.5 25 FrontEnd Dev
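When the search terms are stored in a vector, the alternation pattern can be built rather than typed by hand. This is a minimal sketch, assuming a hypothetical vector `terms`; paste() with `collapse = "|"` joins its elements into a single pattern string.

```r
library(dplyr)

# Same sample data frame as in the article
df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = 21:25,
                 roles = c("Software Eng.", "Software Dev",
                           "Data Analyst", "Data Eng.",
                           "FrontEnd Dev"))

# Hypothetical vector of search terms
terms <- c("Dev", "Analyst")

# Join the terms with "|" to get the pattern "Dev|Analyst"
pattern <- paste(terms, collapse = "|")

# Keep the rows whose roles value matches any of the terms
matched <- df %>% filter(grepl(pattern, roles))
print(matched)
```

This assumes the terms contain no regular-expression metacharacters; terms that do would need escaping first.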
Filtering rows that do not contain any of multiple patterns (strings)
This approach is similar to the previous one, except that it uses the NOT operator (!). The operator inverts the output of grepl(), turning TRUE into FALSE and vice versa, so filter() keeps only the rows that contain none of the specified patterns and drops the rows that contain any of them.
Syntax:
df %>% filter(!grepl('Patt.1|Patt.2', column_name))
Example:
R
library(dplyr)

# Sample data frame
df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = 21:25,
                 roles = c('Software Eng.', 'Software Dev',
                           'Data Analyst', 'Data Eng.',
                           'FrontEnd Dev'))

# Drop the rows whose roles value matches 'Data' or 'Front'
df %>% filter(!grepl('Data|Front', roles))
Output:
marks age roles
1 20.1 21 Software Eng.
2 30.2 22 Software Dev
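As a cross-check, the same exclusion can be written as two separate negated conditions, since filter() combines comma-separated conditions with AND. This is only a sketch of an equivalent formulation; the names `combined` and `separate` are illustrative.

```r
library(dplyr)

# Same sample data frame as in the article
df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = 21:25,
                 roles = c("Software Eng.", "Software Dev",
                           "Data Analyst", "Data Eng.",
                           "FrontEnd Dev"))

# One negated alternation pattern
combined <- df %>% filter(!grepl("Data|Front", roles))

# Two negated conditions; a row is kept only if both are TRUE
separate <- df %>% filter(!grepl("Data", roles), !grepl("Front", roles))

# Compare the rows kept by each approach
print(combined)
print(separate)
```

The single-pattern form is shorter, while separate conditions can be easier to read when each pattern is long.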