How to read large text files in Python?
Last Updated :
13 Sep, 2022
In this article, we will look at how to read a large text file in Python quickly and with low memory usage.
To read large text files in Python, we can use the file object as an iterator and process the file one line at a time. Since the iterator yields a single line per step and does not need an additional data structure to hold the whole file, memory consumption stays low. It also avoids repeatedly appending lines to a growing list, so it is time-efficient as well. Files are iterable in Python, so iterating over them directly is the advisable approach.
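The iterator behaviour described above can be seen in a minimal sketch. The temporary file and the variable names (`sample_path`, `first_line`, `remaining`) are made up for illustration; the sketch only assumes that a text file opened with open() is its own iterator.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

with open(sample_path) as file:
    # A file object is its own iterator, so no extra container is created.
    is_own_iterator = iter(file) is file
    # next() pulls exactly one line; the rest of the file is not loaded.
    first_line = next(file)
    # The remaining lines are consumed one at a time and then discarded.
    remaining = sum(1 for _ in file)

os.remove(sample_path)
print(is_own_iterator, repr(first_line), remaining)
```

Because each line is discarded once it has been processed, the amount of memory in use depends on the length of the longest line rather than on the size of the file.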
Problem with the readlines() method for reading large text files
In Python, files are often read using the readlines() method. The readlines() method returns a list in which each item is one complete line of the file. This method is useful when the file is small. Since readlines() appends every line to the list and only then returns the entire list, it is time-consuming when the file is extremely large, say several GB. The list also holds the whole file in memory at once, which can exhaust the available memory and raise a MemoryError.
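As a rough sketch of the difference, the snippet below reads the same file twice: once with readlines() and once by iterating. The generated file and the names `all_lines`, `line_count` and `longest` are hypothetical; a 10,000-line file is used only so that the example runs quickly.

```python
import os
import tempfile

# Generate a throwaway file with 10,000 short lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(10_000):
        tmp.write(f"line {i}\n")
    sample_path = tmp.name

# readlines(): every line is stored in one list before any work starts.
with open(sample_path) as file:
    all_lines = file.readlines()
list_length = len(all_lines)

# Iteration: only the current line is held while the file is scanned.
line_count = 0
longest = 0
with open(sample_path) as file:
    for line in file:
        line_count += 1
        longest = max(longest, len(line))

os.remove(sample_path)
print(list_length, line_count, longest)
```

Both versions visit the same lines, but only the first one keeps all of them alive at the same time, which is what becomes a problem for files of several GB.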
Read large text files in Python using fileinput
In this method, we will import the fileinput module. The input() function of the fileinput module can be used to read large files. It takes a single filename or a list of filenames; if no files are passed, it reads the files named in sys.argv[1:] and falls back to stdin when that is empty. It returns an iterator that yields individual lines from the files being scanned.
Note: We will also use the time module to measure how long it takes to read the file.
Python3
# import modules
import fileinput
import time

# note the time at the start of the program
start = time.time()

# keep track of the number of lines in the file
count = 0
for line in fileinput.input(['sample.txt']):
    # each line already ends with a newline, so suppress print's own
    print(line, end="")
    count = count + 1

# note the time at the end of program execution
end = time.time()

# total time taken to print the file
print("Execution time in seconds: ", (end - start))
print("No. of lines printed: ", count)
The fastest way to read a large text file using the iterator of a file object
Here, the only difference is that we will use the iterator of a file object. The open() function returns a file object for the file; it does not load the file's contents into memory. We then iterate over the file object to get its lines one at a time. We open the file in a 'with' block because it automatically closes the file as soon as the block finishes executing.
Python3
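import time

# note the time at the start of the program
start = time.time()
count = 0

with open("sample.txt") as file:
    for line in file:
        # each line already ends with a newline, so suppress print's own
        print(line, end="")
        count = count + 1

# note the time at the end of program execution
end = time.time()
print("Execution time in seconds: ", (end - start))
print("No. of lines printed: ", count)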
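The second approach takes comparatively less time than the first. Exact timings depend on the file, the disk and the machine, and in both programs most of the measured time is spent printing the lines.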
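To compare the two approaches on your own machine without the cost of printing, a sketch along the following lines can be used. The helper names (`count_with_fileinput`, `count_with_file_iterator`, `timed`) and the generated 50,000-line file are assumptions made for illustration; time.perf_counter() is used instead of time.time() because it is intended for measuring short durations.

```python
import fileinput
import os
import tempfile
import time

# Generate a throwaway file; swap in a real large file to get meaningful numbers.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(50_000):
        tmp.write(f"line {i}\n")
    sample_path = tmp.name


def count_with_fileinput(path):
    """Count lines using fileinput.input()."""
    count = 0
    with fileinput.input(files=[path]) as stream:
        for _ in stream:
            count += 1
    return count


def count_with_file_iterator(path):
    """Count lines by iterating over the file object directly."""
    count = 0
    with open(path) as file:
        for _ in file:
            count += 1
    return count


def timed(func, path):
    """Return (result, elapsed_seconds) for a single call of func(path)."""
    start = time.perf_counter()
    result = func(path)
    return result, time.perf_counter() - start


fileinput_count, fileinput_seconds = timed(count_with_fileinput, sample_path)
iterator_count, iterator_seconds = timed(count_with_file_iterator, sample_path)
os.remove(sample_path)

print("fileinput:     ", fileinput_count, "lines in", fileinput_seconds, "seconds")
print("file iterator: ", iterator_count, "lines in", iterator_seconds, "seconds")
```

A single run is noisy, so it is worth repeating the measurement a few times before drawing conclusions.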