Bigram formation from a given Python list
Last Updated :
21 Apr, 2023
Text classification and other natural language processing tasks often require bigrams, that is, pairs of adjacent words. When no suitable library is available, it is useful to know how to form them with plain Python. Let's discuss a few ways in which this can be achieved.
Method #1 : Using list comprehension + enumerate() + split()
The combination of the above three functions can be used to achieve this task. split() breaks each string into words, enumerate() supplies the index of each word so that it can be paired with the word that follows it, and the list comprehension collects the pairs into a single list.
Python3
# Python3 code to demonstrate
# Bigram formation
# using list comprehension + enumerate() + split()
# initializing list
test_list = ['geeksforgeeks is best', 'I love it']
# printing the original list
print("The original list is : " + str(test_list))
# using list comprehension + enumerate() + split()
# for Bigram formation
res = [(x, i.split()[j + 1]) for i in test_list
       for j, x in enumerate(i.split()) if j < len(i.split()) - 1]
# printing result
print("The formed bigrams are : " + str(res))
Output :
The original list is : ['geeksforgeeks is best', 'I love it']
The formed bigrams are : [('geeksforgeeks', 'is'), ('is', 'best'), ('I', 'love'), ('love', 'it')]
Time Complexity: at least O(n*m), where n is the number of strings in test_list and m is the maximum number of words in a string. The repeated i.split() calls inside the comprehension add further work for every word.
Auxiliary Space: O(n*m), since res holds one tuple per bigram formed.
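The same index-based pairing can be written so that each string is split only once. The following is a minimal sketch, not the article's original code; the helper name bigrams_by_index is hypothetical and chosen only for illustration.

```python
def bigrams_by_index(sentences):
    """Form bigrams by pairing each word with the next one (hypothetical helper)."""
    res = []
    for sentence in sentences:
        words = sentence.split()  # split once per string and reuse the list
        # words[:-1] skips the last word, which has no successor
        for j, x in enumerate(words[:-1]):
            res.append((x, words[j + 1]))
    return res


test_list = ['geeksforgeeks is best', 'I love it']
print("The formed bigrams are : " + str(bigrams_by_index(test_list)))
```

A string with fewer than two words contributes no bigrams, because words[:-1] is then empty.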
Method #2 : Using zip() + split() + list comprehension
The pairing that enumerate() performed in the above method can also be done by zip(). Zipping the word list without its last element against the word list without its first element yields the adjacent pairs directly, with no index arithmetic.
Python3
# Python3 code to demonstrate
# Bigram formation
# using zip() + split() + list comprehension
# initializing list
test_list = ['geeksforgeeks is best', 'I love it']
# printing the original list
print("The original list is : " + str(test_list))
# using zip() + split() + list comprehension
# for Bigram formation
res = [i for j in test_list
       for i in zip(j.split(" ")[:-1], j.split(" ")[1:])]
# printing result
print("The formed bigrams are : " + str(res))
Output :
The original list is : ['geeksforgeeks is best', 'I love it']
The formed bigrams are : [('geeksforgeeks', 'is'), ('is', 'best'), ('I', 'love'), ('love', 'it')]
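Two details of the zip() approach may be worth a sketch. zip() stops at the shorter of its arguments, so the [:-1] slice is optional, and split() with no argument treats runs of whitespace as one separator, whereas split(" ") produces empty strings for repeated spaces. The helper name bigrams_by_zip below is hypothetical, and the input with a double space is an assumed example, not from the original article.

```python
def bigrams_by_zip(sentences):
    """Form bigrams by zipping each word list with itself shifted by one (hypothetical helper)."""
    res = []
    for sentence in sentences:
        words = sentence.split()  # no argument: any run of whitespace separates words
        # zip stops when words[1:] is exhausted, so no [:-1] slice is needed
        res.extend(zip(words, words[1:]))
    return res


# assumed example input containing a double space
sample = ['geeksforgeeks  is best', 'I love it']
print("The formed bigrams are : " + str(bigrams_by_zip(sample)))
```

With split(" ") the double space in the first string would yield an empty word and therefore bigrams such as ('geeksforgeeks', '').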
Method #3 : Using reduce():
Algorithm:
- Initialize the input list "test_list".
- Print the original list "test_list".
- Call reduce() over the input list with an empty list as the initial accumulator.
- For each string, use a list comprehension and enumerate() to form its bigrams, and concatenate them onto the accumulator to build the result list "res".
- Print the formed bigrams in the list "res".
Python3
from functools import reduce
# initializing list
test_list = ['geeksforgeeks is best', 'I love it']
# printing the original list
print("The original list is : " + str(test_list))
# using reduce() method to form bigrams
res = reduce(lambda acc, s: acc + [(w, s.split()[i+1]) for i, w in enumerate(s.split()) if i < len(s.split())-1], test_list, [])
# printing result
print("The formed bigrams are : " + str(res))
#This code is contributed by Jyothi pinjala.
Output :
The original list is : ['geeksforgeeks is best', 'I love it']
The formed bigrams are : [('geeksforgeeks', 'is'), ('is', 'best'), ('I', 'love'), ('love', 'it')]
Time complexity:
Let n be the number of strings in the input list and m the maximum number of words in any string. The result contains O(n*m) bigrams, so the running time is at least O(n*m). As written, the code does more work than that: s.split() is called again for every word, and acc + [...] builds a new list at each step, copying all the bigrams collected so far. On large inputs the running time therefore grows faster than O(n*m).
Space complexity:
The space complexity is O(n*m), where n is the number of strings in the input list and m is the maximum number of words in any string. The result list "res" stores every bigram formed, and a string with k words contributes k - 1 bigrams, so the size of "res" is bounded by the total number of words.
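One way to keep reduce() while avoiding the repeated list copies is to extend the accumulator in place. This is a sketch under that assumption rather than the contributor's original approach; the names add_bigrams and bigrams_by_reduce are hypothetical.

```python
from functools import reduce


def add_bigrams(acc, sentence):
    """Append the bigrams of one sentence to acc and return acc (hypothetical helper)."""
    words = sentence.split()           # split once per string
    acc.extend(zip(words, words[1:]))  # in-place extend, no copy of acc
    return acc


def bigrams_by_reduce(sentences):
    # a fresh list is passed as the initial accumulator on every call
    return reduce(add_bigrams, sentences, [])


test_list = ['geeksforgeeks is best', 'I love it']
print("The formed bigrams are : " + str(bigrams_by_reduce(test_list)))
```

Mutating the accumulator is a deliberate trade-off here: it departs from the purely functional style usually associated with reduce(), in exchange for doing work proportional to the number of bigrams.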