Text Manipulation
This is a beginner’s guide to searching for specific text in a bunch of paragraphs or files
using the command line in Linux.
There are 4 prominent tools that come to mind when I think about text manipulation in
Linux.
1. grep
2. sed
3. cut
4. awk
Let’s check them one by one and see their basic usage.
1. grep
It searches text files for the occurrence of a given regular expression and outputs any line
containing a match to the standard output.
ls | grep php
First, we can use the simple expression above to list all the files and folders that have “php”
in their name.
You can use any string in place of “php” for searching.
Next, we can use the “-v” option to invert the match, so that lines containing a specific string
are left out of the output.
ls | grep -v secret.txt
The expression above excludes “secret.txt” from the output.
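Both forms can be tried without depending on the contents of the current folder. The sketch below is only an illustration: the file names are made up and printf stands in for ls. It also uses -F, which makes grep treat the pattern as a literal string, so the “.” in “secret.txt” is not read as “any character”.

```shell
# Hypothetical file names standing in for the output of ls.
listing=$(printf '%s\n' index.php notes.txt secret.txt upload.php)

# Keep only the lines that contain "php".
matches=$(printf '%s\n' "$listing" | grep php)
echo "$matches"

# Drop the line for secret.txt; -F matches the pattern literally.
others=$(printf '%s\n' "$listing" | grep -vF secret.txt)
echo "$others"
```

The first echo prints index.php and upload.php; the second prints every name except secret.txt.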
2. sed
It is a powerful stream editor. At a very high level, sed performs text editing on a stream of text,
read either from a set of specific files or from standard input.
Its most common use is substitution: the expression s/noob/pro/ replaces the word “noob” with
the word “pro”.
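A minimal sketch of that substitution, assuming a made-up sample sentence piped in with echo. By default sed replaces only the first match on each line; the g flag at the end of the expression replaces every match.

```shell
# The sample sentences are invented for illustration.
result=$(echo "I am a noob at Linux" | sed 's/noob/pro/')
echo "$result"

# With the g flag, every occurrence on the line is replaced.
all=$(echo "noob noob" | sed 's/noob/pro/g')
echo "$all"
```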
3. cut
The cut command is simple, but often comes in quite handy. It is used to extract a section of
text from a line and output it to the standard output. Some of the most commonly used switches
include -f for the field number we are cutting and -d for the field delimiter.
echo "we are providing web, android, IOT pentesting" | cut -f 3 -d ","
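The command above splits the string on the “,” delimiter and prints the 3rd field, which is “ IOT
pentesting” (including its leading space) in our case.
We can use the same approach to extract usernames from the “/etc/passwd” file, where the
username is the first “:”-separated field.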
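As a sketch of the username extraction, the example below runs cut over two made-up lines in the /etc/passwd format, so the output is the same on any machine. The accounts shown are assumptions, not real entries.

```shell
# Two invented lines in /etc/passwd format (fields separated by ":").
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Field 1 is the username.
users=$(printf '%s\n' "$sample" | cut -d ":" -f 1)
echo "$users"

# On a real system the same extraction reads the file directly:
# cut -d ":" -f 1 /etc/passwd
```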
4. awk
AWK is a programming language designed for text processing and is typically used as a data
extraction and reporting tool. It is also extremely powerful and can be quite complex.
A commonly used switch with awk is -F, which sets the field separator, together with the print
command, which outputs the result text.
For example, we can echo a line and pipe it to awk to extract the first ($1) and third ($3) fields
using :: as the field separator.
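A minimal sketch of such a command, assuming a made-up input line. The comma in the print statement joins the two fields with a single space.

```shell
# The input line is invented for illustration; "::" separates its fields.
fields=$(echo "hello::there::friend" | awk -F "::" '{print $1, $3}')
echo "$fields"
```

This prints “hello friend”, skipping the second field.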
The most prominent difference between the cut and awk examples is that cut can only
accept a single character as a field delimiter, while awk is much more flexible. As a general rule
of thumb, when a command starts to involve multiple cut operations, you may want to
consider switching to awk.
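To illustrate the rule of thumb, the sketch below pulls the first name out of the fifth field of a made-up /etc/passwd-style line, first with two chained cut commands and then with a single awk command. The line and the name are assumptions chosen for the example.

```shell
# Invented line in /etc/passwd format; field 5 holds the full name.
line="alice:x:1000:1000:Alice Smith:/home/alice:/bin/bash"

# Two cut operations: take field 5, then the first space-separated word.
with_cut=$(printf '%s\n' "$line" | cut -d ":" -f 5 | cut -d " " -f 1)
echo "$with_cut"

# One awk command: split field 5 on whitespace and print the first piece.
with_awk=$(printf '%s\n' "$line" | awk -F ":" '{split($5, name, " "); print name[1]}')
echo "$with_awk"
```

Both commands print “Alice”. The awk version keeps the logic in one place, which tends to be easier to extend as the extraction grows.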