Python | Remove all duplicate words from a given sentence
Last Updated :
30 Dec, 2024
The goal is to process a sentence so that all duplicate words are removed, leaving only the first occurrence of each word. The final output should keep the words in the order in which they appeared in the original sentence. Let's look at several ways to achieve this:
Using set with join()
A set in Python is a collection that automatically discards duplicate values, so converting the list of words to a set is a simple and fast way to remove duplicates. However, a set does not maintain order, so this method is not suitable when the original word order must be preserved.
Python
s1 = "Geeks for Geeks"
s2 = s1.split() # Split the sentence into words
# Convert the list to a set and back to a list to remove duplicates
s3 = list(set(s2))
# Join the list back into a sentence
s4 = ' '.join(s3)
print(s4)
Explanation:
- The sentence s1 = "Geeks for Geeks" is split into individual words using the split() method. This results in the list s2 = ["Geeks", "for", "Geeks"].
- The list s2 is converted into a set using set(s2). This automatically removes duplicates because sets do not allow repeated elements. The set will be {"Geeks", "for"}.
- The set is then converted back into a list with list(set(s2)), which gives a list such as s3 = ["Geeks", "for"]. The order of elements is not guaranteed when converting from a set back to a list, so the printed result may be "Geeks for" or "for Geeks".
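If the set-based approach is still preferred, one possible way to recover the original order is to sort the unique words by the position of their first occurrence. The following is only a minimal sketch: the function name dedupe_with_set is hypothetical, and because list.index() scans the list on every call, it is meant for short sentences.

```python
def dedupe_with_set(sentence):
    """Remove duplicate words via a set, then restore first-occurrence order."""
    words = sentence.split()
    # set() drops duplicates but gives no ordering guarantee
    unique = set(words)
    # words.index(w) returns the position of the first occurrence of w
    ordered = sorted(unique, key=words.index)
    return ' '.join(ordered)


print(dedupe_with_set("Geeks for Geeks"))  # Geeks for
```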
The other methods below also remove duplicate words from a sentence, and they preserve the original word order:
Using List Comprehension with Set
This method uses a list comprehension, which is a concise way to create lists, together with a set that tracks the words already seen. This removes duplicates efficiently while maintaining the original order of the words in the sentence.
Python
s1 = "Geeks for Geeks"
a = s1.split() # Split the sentence into words
seen = set() # Set to track unique words
# Use list comprehension to filter out duplicates while maintaining order
res = [word for word in a if not (word in seen or seen.add(word))]
# Join the list back into a sentence
s2 = ' '.join(res)
print(s2)
Explanation:
- An empty set seen is created to keep track of the words that have already been encountered.
- A list comprehension iterates over a. Because seen.add(word) returns None, the condition not (word in seen or seen.add(word)) is True only the first time a word appears, and that word is recorded in seen as a side effect. Each word is therefore kept once, in its original position.
- The unique words are joined back into a string with ' '.join(res), resulting in s2 = "Geeks for".
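To see how the filter condition behaves, the sketch below evaluates it word by word. The helper is_first_occurrence and the variable tracker are hypothetical names introduced only for this illustration.

```python
def is_first_occurrence(word, seen):
    """Return True only the first time word is met; record it as a side effect."""
    # If word is already in seen, the `or` short-circuits and add() is skipped.
    # Otherwise seen.add(word) runs and returns None, which is falsy.
    return not (word in seen or seen.add(word))


tracker = set()
flags = [is_first_occurrence(w, tracker) for w in "Geeks for Geeks".split()]
print(flags)  # [True, True, False]
```

The third flag is False because "Geeks" was already recorded in tracker when the first word was processed.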
Using dict.fromkeys()
This method uses a dictionary. In Python 3.7 and later, dictionaries remember the order in which keys are inserted, so dict.fromkeys() removes duplicates while keeping the words in their original order.
Python
s1 = "Geeks for Geeks"
s2 = s1.split() # Split the sentence into words
# Use a dictionary to remove duplicates and preserve order
s3 = list(dict.fromkeys(s2))
# Join the list back into a sentence
s4 = ' '.join(s3)
print(s4)
Explanation:
- dict.fromkeys(s2) creates a dictionary where each word in s2 becomes a key. Since dictionaries do not allow duplicate keys, this automatically removes any duplicates while preserving the order. Converting the dictionary back to a list with list() gives s3 = ["Geeks", "for"].
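As a small illustration of the intermediate step, the sketch below prints the dictionary that dict.fromkeys() builds before it is converted to a list. The variable names words and unique are chosen here for the example only.

```python
words = "Geeks for Geeks".split()

# Every word becomes a key; the value defaults to None.
# A repeated key keeps the position of its first insertion.
unique = dict.fromkeys(words)

print(unique)        # {'Geeks': None, 'for': None}
print(list(unique))  # ['Geeks', 'for']
```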
Using Simple Loop
This is the most basic way to remove duplicates, but it can be slower for longer sentences because every membership check scans the result list. It loops through each word, checks whether it has already been added to the result list, and adds it only if it has not appeared before.
Python
s1 = "Geeks for Geeks"
s2 = s1.split() # Split the sentence into words
res = [] # List to store unique words
# Loop through words and add only unique ones to the result
for word in s2:
    if word not in res:
        res.append(word)
# Join the list back into a sentence
s3 = ' '.join(res)
print(s3)
Explanation:
- An empty list res is initialized to store the unique words.
- The for loop iterates through each word in s2. If the word is not already in res, it is appended to res. This ensures that only the first occurrence of each word is added.
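For longer sentences, one possible variation keeps the same loop but tracks seen words in a companion set, so each membership check no longer scans the result list. This is a sketch rather than part of the method above; the function name remove_duplicate_words is hypothetical.

```python
def remove_duplicate_words(sentence):
    """Keep the first occurrence of each word, preserving order."""
    res = []      # unique words in original order
    seen = set()  # same words, stored for fast membership checks
    for word in sentence.split():
        if word not in seen:
            seen.add(word)
            res.append(word)
    return ' '.join(res)


text = "Geeks for Geeks A Computer Science portal for Geeks"
print(remove_duplicate_words(text))  # Geeks for A Computer Science portal
```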