How to find duplicate values in a list in R
Last Updated: 12 Apr, 2024
In this article, we will see how to find duplicate values in a list in the R Programming Language in different scenarios.
Finding duplicate values in a List
In R, the duplicated() function identifies duplicate elements in R objects such as vectors and lists. It returns a logical vector (TRUE/FALSE values) with one entry per element: TRUE if an identical element has already appeared earlier in the list, and FALSE otherwise. The first occurrence of a value is therefore always marked FALSE.
Syntax:
duplicated(List_name)
Here, List_name is the input list.
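As a minimal sketch of this behaviour, the snippet below uses a small made-up list (the name x and its values are assumptions chosen for illustration). It also shows the fromLast argument of duplicated(), which scans from the end so that the last occurrence, rather than the first, is treated as the original.

```r
# Hypothetical example list: "a" appears three times
x <- list("a", "b", "a", "c", "a")

# Default: the first "a" is FALSE, later copies are TRUE
first_kept <- duplicated(x)

# fromLast = TRUE scans from the end, so the last "a" is FALSE
last_kept <- duplicated(x, fromLast = TRUE)

print(first_kept)
print(last_kept)
```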
Let's create a list with 10 values and find the duplicates.
R
# Create a List
List_data <- list(1, 2, 3, 4, 5, 6, 7, 5, 4, 3)
print(List_data)
# Find duplicates in the above List
print(duplicated(List_data))
Output:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 5
[[9]]
[1] 4
[[10]]
[1] 3
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE
We can see that the last three elements of the list (5, 4 and 3) repeat earlier elements, so TRUE is returned for them.
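If the positions or the values of the duplicates are needed rather than the logical vector, one possible approach is to index the list with the result of duplicated(). The sketch below reuses the same list; the variable names dup_flags, dup_positions and dup_values are assumptions introduced for illustration.

```r
# Same list as in the example above
List_data <- list(1, 2, 3, 4, 5, 6, 7, 5, 4, 3)

# Logical vector marking the repeated elements
dup_flags <- duplicated(List_data)

# Positions of the repeated elements
dup_positions <- which(dup_flags)

# The repeated values themselves, flattened into a vector
dup_values <- unlist(List_data[dup_flags])

print(dup_positions)
print(dup_values)
```

Here which() converts the TRUE entries into their indices, and unlist() is used only because every element of this list is a single number.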
Let's create a list that holds two lists and find the duplicates in each of them separately.
R
# Create a List with 2 lists
List_data <- list(list1 = list(100, 200, 300, 300, 300),
                  list2 = list("Java", "HTML", "PHP", "JSP", "Statistics"))
print(List_data)
# Find duplicates in list1 from List_data
print(duplicated(List_data$list1))
# Find duplicates in list2 from List_data
print(duplicated(List_data$list2))
Output:
$list1
$list1[[1]]
[1] 100
$list1[[2]]
[1] 200
$list1[[3]]
[1] 300
$list1[[4]]
[1] 300
$list1[[5]]
[1] 300
$list2
$list2[[1]]
[1] "Java"
$list2[[2]]
[1] "HTML"
$list2[[3]]
[1] "PHP"
$list2[[4]]
[1] "JSP"
$list2[[5]]
[1] "Statistics"
[1] FALSE FALSE FALSE TRUE TRUE
[1] FALSE FALSE FALSE FALSE FALSE
There are two duplicates in list1 (the second and third occurrences of 300) and none in list2.
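When a list holds many sublists, calling duplicated() on each one by hand becomes repetitive. A possible alternative, sketched here with the same data, is to let lapply() apply duplicated() to every sublist; the name dup_by_sublist is an assumption used for illustration.

```r
# Same nested list as in the example above
List_data <- list(list1 = list(100, 200, 300, 300, 300),
                  list2 = list("Java", "HTML", "PHP", "JSP", "Statistics"))

# Apply duplicated() to every sublist; the result keeps the sublist names
dup_by_sublist <- lapply(List_data, duplicated)

print(dup_by_sublist)
```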
Let's create a List having three vectors and find the duplicates in each vector.
R
# Create a List with 3 vectors
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"),
                  Marks = c(100, 89, 78, 69, 80))
print(List_data)
# Find duplicates in the Id
duplicated(List_data$Id)
# Find duplicates in the Subject
duplicated(List_data$Subject)
# Find duplicates in the Marks
duplicated(List_data$Marks)
Output:
$Id
[1] 1 2 3 4 5 4 5
$Subject
[1] "Java" "HTML" "HTML" "Python"
$Marks
[1] 100 89 78 69 80
[1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE
[1] FALSE FALSE TRUE FALSE
[1] FALSE FALSE FALSE FALSE FALSE
- Id holds two duplicate values, i.e., 4 and 5.
- Subject holds one duplicate value, i.e., "HTML".
- There are no duplicates in the Marks vector.
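To list the repeated values themselves for every vector in one step, one option is to combine duplicated() with unique() inside lapply(). This is only a sketch on the same data; repeated_values is a name introduced for illustration.

```r
# Same list of vectors as in the example above
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"),
                  Marks = c(100, 89, 78, 69, 80))

# For each vector, keep the elements flagged as duplicates,
# then drop repeats so each value is reported once
repeated_values <- lapply(List_data, function(v) unique(v[duplicated(v)]))

print(repeated_values)
```

A vector without duplicates, such as Marks, yields an empty vector.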
Let's create a List having 2 vectors and return the total number of duplicate elements in each. To do this, we pass the result of duplicated() to the sum() function. Since TRUE counts as 1 and FALSE as 0, the sum is the number of duplicate elements.
R
# Create a List with 2 vectors
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"))
print(List_data)
# Count the duplicates in Id
sum(duplicated(List_data$Id))
# Count the duplicates in Subject
sum(duplicated(List_data$Subject))
Output:
$Id
[1] 1 2 3 4 5 4 5
$Subject
[1] "Java" "HTML" "HTML" "Python"
[1] 2
[1] 1
There are 2 duplicates in the Id vector and one duplicate in the Subject vector.
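The same counts can be computed for all vectors at once. The sketch below uses sapply() for this, and also base R's anyDuplicated(), which returns the index of the first duplicate or 0 if there is none. The names dup_counts and has_dups are assumptions introduced for illustration.

```r
# Same list of vectors as in the example above
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"))

# Number of duplicate elements per vector, returned as a named vector
dup_counts <- sapply(List_data, function(v) sum(duplicated(v)))

# TRUE for every vector that contains at least one duplicate
has_dups <- sapply(List_data, function(v) anyDuplicated(v) > 0)

print(dup_counts)
print(has_dups)
```

Note that sum(duplicated(v)) counts every extra occurrence, so a value that appears three times contributes 2 to the total.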
Conclusion
In conclusion, identifying duplicate values in a list in R is an important step in data cleaning and quality assurance. The duplicated() function, on its own or combined with sum(), lets us detect and count duplicate values in lists and in the vectors they contain.