
Remove Duplicate Dictionaries from Nested Dictionary - Python

Last Updated : 29 Jan, 2025

We are given a nested dictionary whose values are lists of dictionaries, and we need to remove the duplicate dictionaries from each list. For example, given d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}, the duplicate {'a': 1} should be removed so that the output is {'key1': [{'a': 1}, {'b': 2}], 'key2': [{'x': 3}, {'y': 4}]}. We can use sets, list comprehensions, or pandas for this.

Using a Set

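A simple way is to convert each list of dictionaries to a set of frozensets (a set cannot contain duplicate elements) and then convert it back to a list of dictionaries. Dictionaries themselves are not hashable, so each one is represented by a frozenset of its items; this requires the dictionary values to be hashable.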

Python
d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}

# Remove duplicate dictionaries by converting to frozensets and back
for key, value in d.items():
    d[key] = [dict(t) for t in {frozenset(item.items()) for item in value}]
    
print(d)

Output
{'key1': [{'b': 2}, {'a': 1}], 'key2': [{'y': 4}, {'x': 3}]}

Explanation:

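  • Convert each dictionary to a frozenset of its items so that it can be stored in a set, which discards duplicates.
  • Convert the frozensets back to dictionaries to restore the structure.
  • Sets are unordered, so the dictionaries may come back in a different order, as with 'key1' in the output above.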
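
If the original order matters, one possible variation (a sketch, not part of the method above) is to replace the set with dict.fromkeys, which keeps the first occurrence of each key in insertion order. The name result is only illustrative. Note that the key order inside each rebuilt dictionary still comes from iterating a frozenset, so it is not guaranteed for dictionaries with several keys.

```python
d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}

# dict.fromkeys keeps only the first occurrence of each frozenset,
# in the order the frozensets were first seen
result = {
    key: [dict(fs) for fs in dict.fromkeys(frozenset(item.items()) for item in value)]
    for key, value in d.items()
}

print(result)
# {'key1': [{'a': 1}, {'b': 2}], 'key2': [{'x': 3}, {'y': 4}]}
```

This version builds a new dictionary instead of modifying d in place.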

Using a Loop and a Set

We can manually track seen dictionaries using a set to avoid duplicates.

Python
d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}

for key, value in d.items():
  
    # empty set to track seen dictionaries
    seen = set()  
    
    # list to store unique dictionaries
    u = []  
    
    for item in value:
      
        # Convert the dictionary to a frozenset for comparison
        item_f = frozenset(item.items())  
        
        # If the frozenset is not in the seen set, it's a unique dictionary
        if item_f not in seen:  
          
            # Add the frozenset to the seen set
            seen.add(item_f)  
            
            # Append the unique dictionary to the result list
            u.append(item)  
            
    # Update the key in the dictionary with the list of unique dictionaries
    d[key] = u  

print(d)

Output
{'key1': [{'a': 1}, {'b': 2}], 'key2': [{'x': 3}, {'y': 4}]}

Explanation:

  • Track unique dictionaries: for each list, convert every dictionary to a frozenset and keep a set of the frozensets already seen, so that only the first occurrence of each dictionary is appended to the result. This preserves the original order.
  • Update the dictionary: after processing each list, replace the value stored under that key with the list of unique dictionaries.
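
The same loop can be wrapped in a small helper so that it is reusable for any list of dictionaries. This is a minimal sketch; the names dedupe_dicts and cleaned are hypothetical and do not come from any library.

```python
def dedupe_dicts(items):
    """Return the dictionaries in items without duplicates, keeping first-seen order."""
    seen = set()   # frozensets of the dictionaries already kept
    unique = []    # result list, in original order
    for item in items:
        fingerprint = frozenset(item.items())
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(item)
    return unique


d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}

# Build a new nested dictionary and leave d untouched
cleaned = {key: dedupe_dicts(value) for key, value in d.items()}

print(cleaned)
# {'key1': [{'a': 1}, {'b': 2}], 'key2': [{'x': 3}, {'y': 4}]}
```

Returning a new structure rather than assigning into d is a design choice here; assigning back with d[key] = dedupe_dicts(value) would work just as well.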

Using List Comprehension and a Set

This approach uses a list comprehension together with a set to remove duplicates from each list of dictionaries. Each dictionary is converted to a frozenset so that it can be compared with the ones already kept, and only dictionaries that have not appeared before are added to a new list, which then replaces the original list in the dictionary.

Python
d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}

for key, value in d.items():

    # Frozensets of the dictionaries already kept for the current key
    seen = set()

    # Keep an item only the first time its frozenset appears.
    # seen.add() returns None, so "not seen.add(f)" records f and is always True.
    d[key] = [item for item in value
              if (f := frozenset(item.items())) not in seen and not seen.add(f)]

print(d)

Output
{'key1': [{'a': 1}, {'b': 2}], 'key2': [{'x': 3}, {'y': 4}]}

Explanation:

  • Loop through the dictionary: each key and its associated list of dictionaries is processed in turn.
  • List comprehension: each dictionary in the list is converted to a frozenset and checked against the dictionaries already kept. Only dictionaries that have not appeared before are added to the new list, so the original order is preserved.
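
The frozenset-based approaches above raise a TypeError when a dictionary value is itself unhashable, such as a list or another dictionary. One possible workaround, sketched here under the assumption that every value is JSON-serializable, is to use json.dumps with sort_keys=True as the comparison key. The helper name dedupe_unhashable and the sample data are hypothetical.

```python
import json


def dedupe_unhashable(items):
    """Remove duplicate dictionaries whose values may be lists or dicts.

    Assumes every value can be serialized by json.dumps.
    """
    seen = set()
    unique = []
    for item in items:
        # sort_keys=True makes dictionaries with the same content
        # serialize to the same string regardless of key order
        fingerprint = json.dumps(item, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(item)
    return unique


d = {'key1': [{'a': [1, 2]}, {'b': 2}, {'a': [1, 2]}],
     'key2': [{'x': {'n': 3}}, {'x': {'n': 3}}]}

cleaned = {key: dedupe_unhashable(value) for key, value in d.items()}

print(cleaned)
# {'key1': [{'a': [1, 2]}, {'b': 2}], 'key2': [{'x': {'n': 3}}]}
```

Because the comparison is done on the JSON text, values that serialize identically are treated as equal, which may not match Python equality in every case.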

Using pandas

If we are working with large datasets, we can use pandas to convert each list of dictionaries into a DataFrame and remove duplicate rows with drop_duplicates().

Python
import pandas as pd

d = {'key1': [{'a': 1}, {'b': 2}, {'a': 1}], 'key2': [{'x': 3}, {'y': 4}]}

for key, value in d.items():
  
    # Convert the list of dictionaries to a DataFrame, remove duplicates, and convert it back to a list of dictionaries
    d[key] = pd.DataFrame(value).drop_duplicates().to_dict(orient='records')

print(d)

Output
{'key1': [{'a': 1.0, 'b': nan}, {'a': nan, 'b': 2.0}], 'key2': [{'x': 3.0, 'y': nan}, {'x': nan, 'y': 4.0}]}

Explanation:

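  • Convert to a DataFrame and remove duplicates: each list of dictionaries is converted to a pandas DataFrame, and duplicate rows are removed with the drop_duplicates() method.
  • Convert back to a list of dictionaries: the DataFrame is converted back with the to_dict(orient='records') method, and the result replaces the original list.
  • Every row of a DataFrame has every column, so when the dictionaries in a list do not share the same keys, the missing entries are filled with NaN and the integer values become floats, as the output above shows. This method fits best when all dictionaries in a list have the same keys.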
