How to Print the Longest Line(s) in a File in Linux
Last Updated: 02 Jan, 2023
Text files are frequently processed on the Linux command line. This article shows how to determine which lines in a file are the longest, using commands such as awk and grep to print them. The need often arises when working with enormous log files: each of the hundreds of thousands of lines in such a file may be a complete JSON document rendered as a single line of text. If some of these lines are unusually large, it might be necessary to process them through a proxy server before the file(s) can be rerouted to a target server, such as an Elasticsearch server. With very large input, egrep sometimes fails and reports that “the regex is too long.” That is where the awk command comes into play.
First, have a look at both of these commands.
1. awk command:
awk is a scripting language that is useful on the command line and is commonly used for processing text. An awk script scans one or more files for lines that match a pattern and, when it finds a match, carries out the action associated with that pattern.
Here we use the awk command for printing every line that fits a particular pattern.
Syntax:
$ awk options 'selection_criteria {action}' input-file > output-file
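As a minimal sketch of this pattern-action form, the example below prints only the lines that are longer than a given number of characters. The helper name lines_longer_than and the sample lines are assumptions made for illustration, not part of awk itself.

```shell
#!/bin/sh
# Hypothetical helper: print every line of FILE longer than N characters.
# Usage: lines_longer_than N FILE
lines_longer_than() {
    # "length($0) > n" is the selection criterion; "{ print }" is the action.
    awk -v n="$1" 'length($0) > n { print }' "$2"
}

# Demo on a throwaway file (the sample content is made up).
tmp=$(mktemp)
printf '%s\n' 'short' 'a somewhat longer line' 'mid size' > "$tmp"
lines_longer_than 8 "$tmp"    # prints: a somewhat longer line
rm -f "$tmp"
```

The threshold is passed in with -v so that the awk program itself can stay inside single quotes.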
2. grep command:
The grep (global regular expression print) command is one of the most powerful and commonly used Linux command-line tools. You give grep a search pattern, and it looks for that pattern in the given file. When a match is found, it prints every line of the file that matches the pattern.
Syntax:
$ grep "string" file_name
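A short sketch of this usage follows. It wraps grep in a helper that also shows the line number of each match; the name find_lines and the log-style sample lines are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show the lines of FILE that match PATTERN,
# each prefixed with its line number.
# Usage: find_lines PATTERN FILE
find_lines() {
    # -n prints the line number before each matching line.
    grep -n -- "$1" "$2"
}

# Demo on a throwaway file (the sample content is made up).
tmp=$(mktemp)
printf '%s\n' 'INFO start' 'ERROR disk full' 'INFO done' > "$tmp"
find_lines 'ERROR' "$tmp"    # prints: 2:ERROR disk full
rm -f "$tmp"
```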
Create the text file:
Run the command listed below to create a text file using the command line:
$ touch file_name.txt
Then add text to the file using any text editor of your choice (we’ll be using the nano editor here).
$ nano file_name.txt
Use the cat command along with the file name to view the file.
$ cat file_name.txt
Our document has been made, and the content has been added.
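If you prefer not to open an editor, the same kind of file can be written directly from the shell. The sketch below is one way to do that; the helper name make_sample and the file content are made up for illustration and are not the sample file used in this article. One line deliberately contains a TAB character.

```shell
#!/bin/sh
# Hypothetical helper: write a small sample file without opening an editor.
# Usage: make_sample FILE
make_sample() {
    # Three ordinary lines, one per argument.
    printf '%s\n' \
        'Linux is fun.' \
        'awk and grep are handy text tools.' \
        'short' > "$1"
    # Append a fourth line that contains a TAB character.
    printf 'col1\tcol2\n' >> "$1"
}

# Demo: write to a temporary file and display it.
sample=$(mktemp)
make_sample "$sample"
cat "$sample"
rm -f "$sample"
```

To create the file used in the rest of the article instead, the call would be make_sample file_name.txt.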
Method 1: Find the Longest Line in a File Using the awk Command
Let’s prepend the size of each line with a one-liner in awk to help us determine which lines are the longest:
$ awk '{printf "%2d| %s\n",length,$0}' file_name.txt
In our sample file, the longest line has a length of 52.
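Reading the maximum off a list of prefixed lines becomes impractical for large files. As a sketch, awk can track the maximum itself and print only that number. The helper name max_line_length and the demo content are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the length of the longest line in FILE.
# Usage: max_line_length FILE
max_line_length() {
    # Remember the largest length seen so far.
    # "max + 0" makes an empty file print 0 instead of an empty string.
    awk 'length($0) > max { max = length($0) } END { print max + 0 }' "$1"
}

# Demo on a throwaway file (the sample content is made up).
tmp=$(mktemp)
printf '%s\n' 'ab' 'abcde' 'abc' > "$tmp"
max_line_length "$tmp"    # prints: 5
rm -f "$tmp"
```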
The Pitfall of Using the wc Command
- We can print the maximum line length using the wc command’s -L (--max-line-length) option. However, if the input contains TAB characters, wc -L will catch us off guard.
- The reason is that, despite the long option’s name, wc -L outputs the maximum display width rather than the maximum number of characters in a line.
- The wc command advances each TAB to the next tab stop, which falls every 8 columns, so a single TAB can count for up to 8 columns. There is currently no way to change this.
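The difference is easy to see on a single line that contains a TAB. The sketch below compares the width reported by wc -L with the character count reported by awk. The helper names wc_width and awk_length are hypothetical, and the expected values assume GNU wc.

```shell
#!/bin/sh
# Hypothetical helpers; both read standard input.

# Maximum display width as reported by wc -L (a GNU extension).
wc_width() {
    wc -L | tr -d ' '
}

# Maximum number of characters per line as counted by awk.
awk_length() {
    awk 'length($0) > max { max = length($0) } END { print max + 0 }'
}

# The sample line is "a", one TAB, "b": three characters.
printf 'a\tb\n' | wc_width                 # prints 9 with GNU wc (TAB jumps to column 8)
printf 'a\tb\n' | awk_length               # prints 3
printf 'a\tb\n' | tr '\t' ' ' | wc_width   # prints 3 (TAB replaced by one space)
```

Replacing each TAB with a single space before calling wc -L, as in the last line, makes the two counts agree for plain ASCII text.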
Method 2: Combine the wc and grep Commands
To locate all of the longest lines, we can combine the wc -L and grep commands. The -L option makes wc print the maximum line length, and grep then uses that number in a regular expression that matches only lines of exactly that length. Because of the TAB pitfall described above, tr first replaces every TAB with a space so that wc -L counts each TAB as a single character.
$ grep -E "^.{$(tr '\t' ' ' <file_name.txt | wc -L)}$" file_name.txt
This prints the line(s) with the greatest length.
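The same combination can be split into two steps, which makes it easier to see what each command contributes. The sketch below assumes GNU wc (for -L) and plain ASCII text; the function name longest_lines_grep is hypothetical. As the introduction notes, grep may reject the pattern when the lines are extremely long, so this is only suitable for moderate line lengths.

```shell
#!/bin/sh
# Hypothetical helper: print every line of FILE whose length equals the maximum.
# Usage: longest_lines_grep FILE
longest_lines_grep() {
    # Step 1: replace TABs with spaces so wc -L counts each TAB as one column.
    max=$(tr '\t' ' ' < "$1" | wc -L | tr -d ' ')
    # Step 2: match lines made of exactly $max characters.
    grep -E "^.{$max}\$" "$1"
}

# Demo on a throwaway file (the sample content is made up).
# The third line contains a TAB and is 5 characters long.
tmp=$(mktemp)
printf 'ab\nabcde\na\tbcd\nabc\n' > "$tmp"
longest_lines_grep "$tmp"    # prints the second and third lines
rm -f "$tmp"
```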
Benchmarking Performance:
With the help of the time command, we’ll compare how well the wc & grep solution and an awk solution perform.
- grep and wc command benchmark:
$ time grep -E "^.{$(tr '\t' ' ' <file_name.txt | wc -L)}$" file_name.txt > /dev/null
- awk command benchmark:
$ time awk '{ln=length}ln>max{delete result; max=ln}
ln==max{result[NR]=$0} END{for(i in result) print result[i] }' file_name.txt > /dev/null
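One detail of the awk program above: the order in which for (i in result) visits array elements is unspecified, so when several lines tie for the longest they may not come out in file order. A possible variant, sketched below under the assumption that file order matters, stores the ties under a running counter instead. The function name longest_lines_awk is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the longest line(s) of FILE in file order.
# Usage: longest_lines_awk FILE
longest_lines_awk() {
    awk '
        { len = length($0) }
        len > max  { max = len; n = 0 }     # new maximum: forget earlier ties
        len == max { result[++n] = $0 }     # remember every line of maximal length
        END { for (i = 1; i <= n; i++) print result[i] }
    ' "$1"
}

# Demo on a throwaway file (the sample content is made up).
tmp=$(mktemp)
printf '%s\n' 'ab' 'abcde' 'x' 'vwxyz' 'abc' > "$tmp"
longest_lines_awk "$tmp"    # prints abcde, then vwxyz
rm -f "$tmp"
```

Like the original one-liner, this reads the file only once; resetting the counter n is enough to discard earlier ties, because only the entries from 1 to n are printed.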
Conclusion:
In this article, we discussed approaches for identifying the longest lines in an input file. We also benchmarked the awk technique against the wc + grep strategy; the awk technique was substantially faster. In addition, we looked more closely at a pitfall of the wc command that we need to be careful of when using the -L option.