
What is the Efficient Way of Reading a Huge Text File?

Last Updated : 21 Apr, 2024

In C++, reading a large text file efficiently requires a careful approach to ensure optimal performance in terms of memory usage and processing speed. In this article, we will learn how to read a huge text file efficiently in C++.

Read a Large Text File Efficiently in C++

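An efficient way to read a large text file is to read it in fixed-size chunks with std::ifstream::read, rather than line by line or one character at a time, and to use std::istringstream to split each chunk into lines. Reading in large blocks reduces the number of read calls made on the stream, which can improve overall performance. Note that std::ifstream already buffers its input internally, so the actual gain depends on the platform and is worth measuring. Because a chunk boundary can fall in the middle of a line, the incomplete line at the end of one chunk must be joined with the start of the next chunk before it is processed.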
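For comparison, the line-by-line alternative mentioned above can be sketched as follows. This is a minimal baseline and not part of the original program: the function name count_non_empty_lines is hypothetical, and it accepts any std::istream so that it can be tried on an in-memory string as well as on a file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream> // only needed for the in-memory usage described below
#include <string>

// Hypothetical helper: counts the non-empty lines of a stream by
// reading one line at a time with std::getline.
std::size_t count_non_empty_lines(std::istream& in)
{
    std::size_t count = 0;
    std::string line; // reused on every iteration to limit reallocations
    while (std::getline(in, line)) {
        if (!line.empty()) {
            ++count;
        }
    }
    return count;
}
```

Passing a std::ifstream opened on the file, or a std::istringstream built from a test string, gives a simple reference point against which a chunked reader can be checked and timed.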

Approach

  • Create a std::ifstream object and open the file by passing its path to the constructor.
  • Read the file in chunks of a fixed size, BUFFER_SIZE, and process each chunk.
  • Use std::istringstream to parse each chunk into lines.
  • Print the non-empty lines returned by istringstream.
  • Process any remaining data in the last chunk to ensure no data is left unprocessed.
  • Close the file after all data in the file has been processed.

C++ Program to Read a Huge Text File

The following program illustrates how to read a huge text file efficiently in C++.

C++
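// C++ Program to read a huge text file efficiently

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Number of bytes to read from the file per chunk
const int BUFFER_SIZE = 1024;

int main()
{
    ifstream file("huge_file.txt");

    if (!file.is_open()) {
        cerr << "Error: Could not open file 'huge_file.txt'."
             << endl;
        return 1;
    }

    vector<char> buffer(BUFFER_SIZE);
    // Holds an incomplete line that continues in the next chunk
    string leftover;

    // read() reports failure on the final, shorter chunk, so
    // also check gcount() to avoid skipping that data
    while (file.read(buffer.data(), BUFFER_SIZE)
           || file.gcount() > 0) {
        leftover.append(buffer.data(),
                        static_cast<size_t>(file.gcount()));

        // Only the text up to the last newline forms complete
        // lines; the rest waits for the next chunk
        size_t last_newline = leftover.rfind('\n');
        if (last_newline == string::npos) {
            continue;
        }

        // Parse the complete lines of this chunk
        istringstream iss(leftover.substr(0, last_newline));
        leftover.erase(0, last_newline + 1);

        string line;
        while (getline(iss, line)) {
            if (!line.empty()) {
                // '\n' avoids flushing cout on every line
                cout << line << '\n';
            }
        }
    }

    // Process any remaining data: a final line that has no
    // trailing newline
    if (!leftover.empty()) {
        cout << leftover << '\n';
    }

    // close the file
    file.close();
    return 0;
}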


Output

Hello World
GeeksforGeeks
C++ Java Python
GoLang Rust JavaScript

Time Complexity: O(N), where N is the total number of characters in the text file.
Auxiliary Space: O(B + M), where B is BUFFER_SIZE and M is the length of the longest line in the text file.
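To reuse or test the chunked approach without a file on disk, the logic can be factored into a function that works on any std::istream. The following is a minimal sketch under stated assumptions: the name for_each_line_chunked and its callback parameter are hypothetical, lines are assumed to end with '\n', and empty lines are skipped as in the approach above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream> // only needed for the in-memory usage described below
#include <string>
#include <vector>

// Hypothetical helper: reads `in` in chunks of `chunk_size` bytes and calls
// `on_line` once per non-empty line, including lines that span two chunks.
void for_each_line_chunked(
    std::istream& in, std::size_t chunk_size,
    const std::function<void(const std::string&)>& on_line)
{
    assert(chunk_size > 0);
    std::vector<char> buffer(chunk_size);
    std::string pending; // incomplete line carried over from earlier chunks

    // read() reports failure on the final short chunk, so also check gcount()
    while (in.read(buffer.data(),
                   static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0) {
        // Bytes already in `pending` contain no '\n', so search only new data
        std::size_t search_from = pending.size();
        pending.append(buffer.data(),
                       static_cast<std::size_t>(in.gcount()));

        std::size_t start = 0;
        std::size_t newline;
        while ((newline = pending.find('\n', search_from))
               != std::string::npos) {
            if (newline > start) {
                on_line(pending.substr(start, newline - start));
            }
            start = newline + 1;
            search_from = start;
        }
        pending.erase(0, start); // keep only the unfinished tail
    }

    // A final line with no trailing newline is still pending here
    if (!pending.empty()) {
        on_line(pending);
    }
}
```

Calling it with a std::istringstream and a deliberately small chunk size, such as 4 bytes, is a quick way to confirm that lines crossing chunk boundaries are delivered whole. Searching only the newly appended bytes keeps the total work proportional to the file size even when one line is much longer than a chunk.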

