awk is one of the most powerful utilities in the Unix world when it comes to text parsing. The general syntax of awk is:
awk 'pattern {action}' file
where the pattern indicates the condition on which the action is to be executed for every line matching the pattern. If the pattern is not present, the action is executed for every line of the file. If the action part is not present, the default action of printing the line is performed. Let us see some examples:
$ cat file1
Name Domain
Deepak Banking
Neha Telecom
Vijay Finance
Guru Migration
This file has 2 fields in it. The first field is the name of a person, and the second field is their domain of expertise; the first line is the header record.
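1. To print only the first column, the names:
$ awk '{print $1}' file1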
Name
Deepak
Neha
Vijay
Guru
The above awk command does not have any pattern or condition, hence the action is executed on every line of the file. The action statement reads "print $1". awk, while reading a file, splits the line into columns which are accessible as $1, $2, $3 and so on; the first column is accessible using $1, the second using $2, etc. Hence the above command prints all the names, which happen to be the first column in the file.
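2. Similarly, to print the second column, the domain:
$ awk '{print $2}' file1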
Domain
Banking
Telecom
Finance
Migration
3. In the first example, the list of names got printed along with the header record. How to omit the
header record and get only the names printed?
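$ awk 'NR!=1{print $1}' file1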
Deepak
Neha
Vijay
Guru
The above awk command uses a special variable NR. NR denotes the line number, ranging from 1 to the actual line count. The condition 'NR!=1' indicates not to execute the action part for the first line of the file, and hence the header record gets skipped.
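4. To print the complete contents of the file:
$ awk '{print $0}' file1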
Name Domain
Deepak Banking
Neha Telecom
Vijay Finance
Guru Migration
$0 stands for the entire line. And hence when we do "print $0", the whole line gets printed.
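5. The same can be achieved with a pattern alone:
$ awk '1' file1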
Name Domain
Deepak Banking
Neha Telecom
Vijay Finance
Guru Migration
The above awk command has only the pattern or condition part, no action part. The '1' in the
pattern indicates "true" which means true for every line. As said above, no action part denotes just to
print which is the default when no action statement is given, and hence the entire file contents get
printed.
Let us now consider a file with a delimiter. The delimiter used here is a comma. A comma separated file is called a CSV file. Assuming the file contents to be:
$ cat file1
Name,Domain,Expertise
Deepak,Banking,MQ Series
Neha,Telecom,Power Builder
Vijay,Finance,CRM Expert
Guru,Migration,Unix
This file contains 3 fields; the new field is the expertise of the respective person.
6. Let us try to print the first column of this csv file using the same method as mentioned in Point
1.
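$ awk '{print $1}' file1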
Name,Domain,Expertise
Deepak,Banking,MQ
Neha,Telecom,Power
Vijay,Finance,CRM
Guru,Migration,Unix
The output looks weird, doesn't it? We expected only the first column to get printed, but it printed a little more, and not a definitive column either. If you notice carefully, it printed every line up to the first space encountered. awk, by default, uses white space as the delimiter, which could be a single space, a tab or a series of spaces. Hence our original file was split into fields based on spaces. Since our requirement now involves dealing with a file which is comma separated, we need to specify the delimiter:
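$ awk -F"," '{print $1}' file1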
Name
Deepak
Neha
Vijay
Guru
awk has a command line option "-F" with which we can specify the delimiter. Once the delimiter is specified, awk splits the file on the basis of the delimiter specified, and hence we got the names by printing the first column $1.
7. awk has a special variable called "FS", which stands for field separator. In place of the command line option "-F", we can also use "FS".
$ awk '{print $1,$3}' FS="," file1
Name Expertise
Deepak MQ Series
Neha Power Builder
Vijay CRM Expert
Guru Unix
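8. Similarly, to print the second column using FS:
$ awk '{print $2}' FS="," file1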
Domain
Banking
Telecom
Finance
Migration
9. To print the first and third columns, ie., the name and the expertise:
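$ awk -F"," '{print $1,$3}' file1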
Name Expertise
Deepak MQ Series
Neha Power Builder
Vijay CRM Expert
Guru Unix
10. The output shown above is not easily readable since the third column has more than one word. It would have been better had the fields been displayed with a delimiter. Say, let us use a comma to separate the output. Also, let us discard the header record:
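$ awk -F"," 'NR!=1{print $1,$3}' OFS="," file1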
Deepak,MQ Series
Neha,Power Builder
Vijay,CRM Expert
Guru,Unix
OFS is another awk special variable. Just like how FS is used to separate the input fields, OFS
(Output field separator) is used to separate the output fields.
In one of our earlier articles, we saw how to read a file in awk. At times, we might have some
requirements wherein we need to pass some arguments to the awk program or to access a shell
variable or an environment variable inside awk. Let us see in this article how to pass and access
arguments in awk:
$ cat file1
24
12
34
45
$ echo $x
3
Now, say we want to add every value with the shell variable x.
1. awk provides a "-v" option to pass arguments. Using this, we can pass the shell variable to it.
$ awk -v val=$x '{print $0+val}' file1
27
15
37
48
As seen above, the shell variable $x is assigned to the awk variable "val". This variable "val" can
directly be accessed in awk.
2. awk provides another way of passing argument to awk without using -v. Just before specifying the
file name to awk, provide the shell variable assignments to awk variables as shown below:
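One way to do it, again assigning the shell variable x to the awk variable val:
$ awk '{print $0,val}' OFS=, val=$x file1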
24,3
12,3
34,3
45,3
3. How to access environment variables in awk? Unlike shell variables, awk provides a way to
access the environment variables without passing it as above. awk has a special variable ENVIRON
which does the needful.
$ echo $x
3
$ export x
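$ awk '{print $0,ENVIRON["x"]}' OFS=, file1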
24,3
12,3
34,3
45,3
Sometimes we might have a requirement wherein we have to quote the file contents. Assume you
have a file which contains the list of database tables. And for your requirement, you need to quote
the file contents:
$ cat file
CUSTOMER
BILL
ACCOUNT
4. To single quote the contents: pass a variable to awk which contains the single quote (enclosed within double quotes in the shell), and print the quote, the line and the quote:
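$ awk -v q="'" '{print q $0 q}' file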
'CUSTOMER'
'BILL'
'ACCOUNT'
5. Similarly, to double quote the contents, pass the variable within single quotes:
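$ awk -v q='"' '{print q $0 q}' file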
"CUSTOMER"
"BILL"
"ACCOUNT"
In one of our earlier articles on the awk series, we had seen the basic usage of awk or gawk. In this, we will see mainly how to search for a pattern in a file using awk: searching for the pattern in the entire line, or in a specific column.
Let us consider a csv file with the following contents. The data in the csv file is a kind of expense report. Let us see how to use awk to filter data from the file.
$ cat file
Medicine,200
Grocery,500
Rent,900
Grocery,800
Medicine,600
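1. To print only the lines containing the pattern Rent:
$ awk '$0 ~ /Rent/{print $0}' file
Rent,900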
~ is the symbol used for pattern matching. The / / symbols are used to specify the pattern. The
above line indicates: If the line($0) contains(~) the pattern Rent, print the line. 'print' statement by
default prints the entire line. This is actually the simulation of grep command using awk.
2. awk, while doing pattern matching, by default does on the entire line, and hence $0 can be left off
as shown below:
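$ awk '/Rent/{print}' file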
Rent,900
3. Since awk prints the line by default on a true condition, print statement can also be left off.
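$ awk '/Rent/' file
Rent,900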
In this example, whenever the line contains Rent, the condition becomes true and the line gets
printed.
4. In the above examples, the pattern matching is done on the entire line, however, the pattern we
are looking for is only on the first column. This might lead to incorrect results if the file contains the
word Rent in other places. To match a pattern only in the first column($1):
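$ awk -F, '$1 ~ /Rent/' file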
Rent,900
The -F option in awk is used to specify the delimiter. It is needed here since we are going to work
on the specific columns which can be retrieved only when the delimiter is known.
5. The above pattern match will also match if the first column contains "Rents". To match exactly
for the word "Rent" in the first column:
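$ awk -F, '$1=="Rent"' file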
Rent,900
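6. To print only the amount (the 2nd column) of the Medicine records:
$ awk -F, '$1=="Medicine"{print $2}' file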
200
600
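7. To print the lines matching either of the patterns Rent or Medicine:
$ awk '/Rent|Medicine/' file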
Medicine,200
Rent,900
Medicine,600
8. Similarly, to match for this above pattern only in the first column:
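$ awk -F, '$1 ~ /Rent|Medicine/' file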
Medicine,200
Rent,900
Medicine,600
9. What if the first column contains the word "Medicines"? The above example will match it as well. In order to exactly match only for Rent or Medicine:
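$ awk -F, '$1 ~ /^Rent$|^Medicine$/' file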
Medicine,200
Rent,900
Medicine,600
The ^ symbol indicates beginning of the line, $ indicates the end of the line. ^Rent$ matches
exactly for the word Rent in the first column, and the same is for the word Medicine as well.
10. To print the lines which do not contain the pattern Medicine:
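$ awk '!/Medicine/' file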
Grocery,500
Rent,900
Grocery,800
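11. Similarly, to print the lines whose first column does not contain the pattern Medicine:
$ awk -F, '$1 !~ /Medicine/' file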
Grocery,500
Rent,900
Grocery,800
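12. To print the lines in which the amount is greater than 500:
$ awk -F, '$2>500' file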
Rent,900
Grocery,800
Medicine,600
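13. To print the Medicine record only if it is the first record:
$ awk -F, 'NR==1 && $1=="Medicine"' file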
Medicine,200
This is how the logical AND(&&) condition is used in awk. The record is retrieved only if it is the first record(NR==1) and it is a Medicine record.
14. To print all those Medicine records whose amount is greater than 500:
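$ awk -F, '$1=="Medicine" && $2>500' file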
Medicine,600
15. To print all the Medicine records and also those records whose amount is greater than 600:
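$ awk -F, '$1=="Medicine" || $2>600' file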
Medicine,200
Rent,900
Grocery,800
Medicine,600
In one of our earlier articles, we had discussed joining all lines in a file and also joining every 2 lines in a file. In this article, we will see how we can join lines based on a pattern, or join lines on encountering a pattern, using awk or gawk.
Let us assume a file with the following contents. There is a line with START in-between. We have to
join all the lines following the pattern START.
$ cat file
START
Unix
Linux
START
Solaris
Aix
SCO
1. Join the lines following the pattern START without any delimiter.
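$ awk '/START/{if (NR!=1)print "";next}{printf $0}END{print "";}' file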
UnixLinux
SolarisAixSCO
Basically, what we are trying to do is: Accumulate the lines following the START and print them
on encountering the next START statement. /START/ searches for lines containing the pattern
START. The command within the {} will work only on lines containing the START pattern. Prints a
blank line if the line is not the first line(NR!=1). Without this condition, a blank line will come in the
very beginning of the output since it encounters a START in the beginning.
The next command prevents the remaining part of the command from getting executed for the
START lines. The second part of braces {} works only for the lines not containing the START. This
part simply prints the line without a terminating new line character(printf). And hence as a result, we
get all the lines after the pattern START in the same line. The END label is put to print a newline at
the end without which the prompt will appear at the end of the last line of output itself.
2. Join the lines following the pattern START with space as delimiter.
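$ awk '/START/{if (NR!=1)print "";next}{printf "%s ",$0}END{print "";}' file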
Unix Linux
Solaris Aix SCO
This is same as the earlier one except it uses the format specifier %s in order to accommodate an
additional space which is the delimiter in this case.
3. Join the lines following the pattern START with comma as delimiter.
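$ awk '/START/{if (x)print x;x="";next}{x=(!x)?$0:x","$0;}END{print x;}' file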
Unix,Linux
Solaris,Aix,SCO
Here, we form a complete line and store it in a variable x and print the variable x whenever a new
pattern starts. The command: x=(!x)?$0:x","$0 is like the ternary operator in C or Perl. It means if x is
empty, assign the current line($0) to x, else append a comma and the current line to x. As a
result, x will contain the lines joined with a comma following the START pattern. And in the END
label, x is printed since for the last group there will not be a START pattern to print the earlier group.
4. Join the lines following the pattern START with comma as delimiter with also the pattern
matching line.
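$ awk '/START/{if (x)print x;x="";}{x=(!x)?$0:x","$0;}END{print x;}' file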
START,Unix,Linux
START,Solaris,Aix,SCO
The difference here is the missing next statement. Because next is not there, the commands
present in the second set of curly braces are applicable for the START line as well, and hence it also
gets concatenated.
5. Join the lines following the pattern START with comma as delimiter with also the pattern
matching line. However, the pattern line should not be joined.
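$ awk '/START/{if (x)print x;print;x="";next}{x=(!x)?$0:x","$0;}END{print x;}' file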
START
Unix,Linux
START
Solaris,Aix,SCO
In this, instead of forming START as part of the variable x, the START line is printed. As a result, the
START line comes out separately, and the remaining lines get joined.
In this article of the awk series, we will see the different scenarios in which we need to split a file into multiple files using awk. A file can be split into multiple files either based on a condition, or based on a pattern, or because the file is big and hence needs to be split into smaller files.
Sample File1:
Let us consider a sample file with the following contents:
$ cat file1
Item1,200
Item2,500
Item3,900
Item2,800
Item1,600
1. Split the file into 3 different files, one for each item. i.e, All records pertaining to Item1 into a
file, records of Item2 into another, etc.
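$ awk -F, '{print > $1}' file1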
$ cat Item1
Item1,200
Item1,600
$ cat Item3
Item3,900
$ cat Item2
Item2,500
Item2,800
This looks so simple, right? print prints the entire line, and the line is printed to a file whose name
is $1, which is the first field. This means, the first record will get written to a file named 'Item1', and
the second record to 'Item2', third to 'Item3', 4th goes to 'Item2', and so on.
2. Split the files by having an extension of .txt to the new file names.
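$ awk -F, '{print > $1".txt"}' file1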
The only change here from the above is concatenating the string ".txt" to the $1 which is the first
field. As a result, we get the extension to the file names. The files created are below:
$ ls *.txt
Item1.txt  Item2.txt  Item3.txt
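3. Split the files with only the second field (the price) in the new files:
$ awk -F, '{print $2 > $1".txt"}' file1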
The print command prints the entire record. Since we want only the second field to go to the
output files, we do: print $2.
$ cat Item1.txt
200
600
4. Split the files so that all the items whose value is greater than 500 are in the file "500G.txt",
and the rest in the file "500L.txt".
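$ awk -F, '{if($2<=500)print > "500L.txt";else print > "500G.txt";}' file1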
$ cat 500L.txt
Item1,200
Item2,500
$ cat 500G.txt
Item3,900
Item2,800
Item1,600
Check the second field($2). If it is lesser or equal to 500, the record goes to "500L.txt", else to
"500G.txt".
Another way to achieve the same thing is using the ternary operator in awk:
$ awk -F, '{x=($2<=500)?"500L.txt":"500G.txt"; print > x}' file1
The condition for greater or lesser than 500 is checked and the appropriate file name is assigned
to variable x. The record is then written to the file present in the variable x.
Sample File2:
Let us consider another file with a different set of contents. This file has a pattern 'START' at
frequent intervals.
$ cat file2
START
Unix
Linux
START
Solaris
Aix
SCO
5. Split the file into multiple files at every occurrence of the pattern START .
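$ awk '/START/{x="F"++i;}{print > x;}' file2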
This command contains 2 sets of curly braces: The control goes to the first set of braces only on
encountering a line containing the pattern START. The second set will be encountered by every line
since there is no condition and hence always true.
On encountering the pattern START, a new file name is created and stored. When the first START comes, x will contain "F1"; the control then goes to the next set of braces, and the record is written to F1. The subsequent records go to the file "F1" till the next START comes. On encountering the next START, x will contain "F2", and the subsequent lines go to "F2" till the next START, and it continues.
$ cat F1
START
Unix
Linux
$ cat F2
START
Solaris
Aix
SCO
6. Split the file into multiple files at every occurrence of the pattern START. But the line
containing the pattern should not be in the new files.
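$ awk '/START/{x="F"++i;next}{print > x;}' file2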
The only difference in this from the above is the inclusion of the next command. Due to the next command, a line containing START enters the first set of curly braces, and awk immediately starts reading the next line. As a result, the START lines never reach the second set of curly braces, and hence START does not appear in the split files.
$ cat F1
Unix
Linux
$ cat F2
Solaris
Aix
SCO
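7. Split the file at every occurrence of the pattern START, and write a header record to every new file. One way to do this, consistent with the output below:
$ awk '/START/{x="F"++i;print "ANY HEADER" > x;next}{print > x;}' file2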
$ cat F1
ANY HEADER
Unix
Linux
$ cat F2
ANY HEADER
Solaris
Aix
SCO
Sample File3:
Let us consider a file with the sample contents:
$ cat file3
Unix
Linux
Solaris
AIX
SCO
8. Split the file into multiple files at every 3rd line . i.e, First 3 lines into F1, next 3 lines into F2
and so on.
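$ awk 'NR%3==1{x="F"++i;}{print > x}' file3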
$ cat F1
Unix
Linux
Solaris
$ cat F2
AIX
SCO
Sample File4:
Let us update the above file with a header and trailer:
$ cat file4
HEADER
Unix
Linux
Solaris
AIX
SCO
TRAILER
9. Split the file at every 3rd line without the header and trailer in the new files.
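One way to do it, skipping the HEADER and TRAILER lines and starting a new file every 3rd data line:
$ awk '/HEADER|TRAILER/{next}NR%3==2{x="F"++i;}{print > x}' file4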
$ cat F1
Unix
Linux
Solaris
$ cat F2
AIX
SCO
10. Split the file at every 3rd line, retaining the header and trailer in every file.
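$ awk 'BEGIN{getline f;}NR%3==2{x="F"++i;a[i]=x;print f > x;}{print > x;}END{for(j=1;j<i;j++)print > a[j];}' file4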
This one is a little tricky. Before the file is processed, the first line is read using getline into the variable f. NR%3 is checked with 2 instead of 1 as in the earlier case because, the first line being a header, we need to split the file at the 2nd, 5th, 8th lines, and so on. All the file names are stored in the array "a" for later processing.
Without the END label, all the files will have the header record, but only the last file will have the
trailer record. So, the END label is to precisely write the trailer record to all the files other than the
last file.
$ cat F1
HEADER
Unix
Linux
Solaris
TRAILER
$ cat F2
HEADER
AIX
SCO
TRAILER
In this article of awk series, we will see how to use awk to read or parse text or CSV files containing
multiple delimiters or repeating delimiters. Also, we will discuss about some peculiar delimiters and
how to handle them using awk.
Let us consider a sample file. This colon separated file contains item, purchase year and a set of
prices separated by a semicolon.
$ cat file
Item1:2010:10;20;30
Item2:2012:12;29;19
Item3:2014:15;50;61
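1. To print the 3rd column, which contains the price components:
$ awk -F: '{print $3}' file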
10;20;30
12;29;19
15;50;61
This is straight forward. By specifying colon(:) in the option with -F, the 3rd column can be retrieved
using the $3 variable.
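2. To print the second part of the price component, i.e., the 4th column:
$ awk -F '[:;]' '{print $4}' file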
20
29
50
What did we do here? We specified multiple delimiters, one being ':' and the other ';'. How does awk parse the file? It is simple. First, it looks at the delimiters, the colon(:) and the semi-colon(;). While reading the line, as and when the delimiter : or ; is encountered, the part read so far is stored in $1. Continuing further, on encountering the next delimiter, the part read is stored in $2, and this continues till the end of the line is reached. In this way, $4 contains the second part of the price component above.
Note: Always keep in mind. While specifying multiple delimiters, it has to be specified inside
square brackets( [;:] ).
3. To sum the individual components of the 3rd column and print it:
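$ awk -F '[:;]' '{$3=$3+$4+$5;print $1,$2,$3}' OFS=":" file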
Item1:2010:60
Item2:2012:60
Item3:2014:126
The individual components of the price($3) column are available in $3, $4 and $5. Simply, sum
them up and store in $3, and print all the variables. OFS (output field separator) is used to specify
the delimiter while printing the output.
Note: If we do not use the OFS, awk will print the fields using the default output delimiter which is
space.
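4. To print a separate record for every price component:
$ awk -F '[:;]' '{for(i=3;i<=5;i++)print $1,$2,$i}' OFS=":" file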
Item1:2010:10
Item1:2010:20
Item1:2010:30
Item2:2012:12
Item2:2012:29
Item2:2012:19
Item3:2014:15
Item3:2014:50
Item3:2014:61
The requirement here is: New records have to be created for every component of the price
column. Simply, a loop is run on from columns 3 to 5, and every time a record is framed using the
price component.
$ cat file
123;abc[202];124
125;abc[203];124
127;abc[204];124
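5. To print only the value present within the square brackets:
$ awk -F '[][]' '{print $2}' file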
202
203
204
At first sight, the delimiter used in the above command might be confusing. It is simple: 2 delimiters are to be used in this case, one is [ and the other is ]. Since the delimiters themselves are square brackets, which have to be placed within square brackets, it looks tricky at the first instance.
Note: If square brackets are delimiters, they should be put in this way only, meaning first ] followed by [. Using a delimiter like -F '[[]]' will give a different interpretation altogether.
6. To print the first value, the value within brackets, and the last value:
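$ awk -F '[][;]' '{print $1,$3,$5}' OFS=";" file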
123;202;124
125;203;124
127;204;124
$ cat file
123;;;202;;;203
124;;;213;;;203
125;;;222;;;203
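7. Let us try to print the 2nd column by specifying the 3 semi-colons within square brackets:
$ awk -F '[;;;]' '{print $2}' file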
Blank output !!! The above delimiter, though specified as 3 semi-colons, is as good as a single delimiter, a semi-colon(;), since within square brackets they are all the same. Due to this, $2 will be the value between the first and the second semi-colon, which in our case is blank, and hence no output.
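Now, let us specify the same 3 semi-colons without the square brackets:
$ awk -F ';;;' '{print $2}' file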
202
213
222
The expected output !!! No square brackets are used, and we got the output which we wanted.
Difference between using square brackets and not using it : When a set of delimiters are
specified using square brackets, it means an OR condition of the delimiters. For example, -F
'[;:]' means to separate the contents either on encountering ':' or ';'. However, when a set of
delimiters are specified without using square brackets, awk looks at them literally to separate the
contents. For example, -F ':;' means to separate the contents only on encountering a colon followed
by a semi-colon. Hence, in the last example, the file contents are separated only when a set of 3
continuous semi-colons are encountered.
$ cat file
123;;;202;;;;203
124;;;213;;;;203
125;;;222;;;;203
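8. The file now has 3 semi-colons separating the 2nd column and 4 semi-colons separating the 3rd. To treat any run of semi-colons as a single delimiter:
$ awk -F ';+' '{print $2,$3}' file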
202 203
213 203
222 203
The '+' is a regular expression quantifier. It indicates one or more of the previous character. ';+' indicates one or more semi-colons, and hence both the 3 semi-colons and the 4 semi-colons get matched.
$ cat file
123Unix203
124Unix203
125Unix203
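9. To print the values present on either side of the word "Unix":
$ awk -F 'Unix' '{print $1,$2}' file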
123 203
124 203
125 203
In this case, we use the word "Unix" as the delimiter, and hence $1 and $2 contain the appropriate values. Keep in mind, it is not just special characters which can be used as delimiters; alphabets and even whole words can also be used as delimiters.
P.S: We will discuss the awk split function and how to use it in these types of multiple-delimiter files.
In one of our earlier articles, we discussed how to access or pass shell variables to awk. In this, we will see the reverse: how to access awk variables in the shell, or in other words, how to access awk variables as shell variables. Let us see the different ways in which we can achieve this.
$ cat file
Linux 20
Solaris 30
HPUX 40
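1. To access the 2nd column of the Solaris record in the shell, one way is to capture awk's output using command substitution:
$ x=$(awk '$1=="Solaris"{print $2}' file)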
$ echo $x
30
This approach is fine as long as we want to access only one value. What if we have to access
multiple values in shell?
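2. One way is to make awk print shell assignment statements and collect them in a variable, as below:
$ z=$(awk '$1=="Linux"{print "y="$2}$1=="Solaris"{print "x="$2}' file)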
$ echo "$z"
y=20
x=30
$ eval $z
$ echo $x
30
$ echo $y
20
awk sets the values of the awk variables "x" and "y" and prints them, and the output is collected in the shell variable "z". The eval command evaluates the variable, meaning it executes the commands present in the variable. As a result, "x=30" and "y=20" get executed, and they become the shell variables x and y with the appropriate values.
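3. Another way is to redirect the awk output to a temporary file and source it:
$ awk '$1=="Linux"{print "y="$2}$1=="Solaris"{print "x="$2}' file > f1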
$ source f1
$ echo $x
30
$ echo $y
20
Here, instead of collecting the output of awk command in a variable, it is re-directed to a temporary
file. The file is then sourced or in other words executed in the same shell. As a result, "x" and "y"
become shell variables.
Note: Depending on the shell being used, the appropriate way of sourcing has to be done. The
"source" command is used here since the default shell is bash.
How to manipulate a text / CSV file using awk/gawk? How to insert/add a column between columns,
remove columns, or to update a particular column? Let us discuss in this article.
$ cat file
Unix,10,A
Linux,30,B
Solaris,40,C
Fedora,20,D
Ubuntu,50,E
1. To insert a new column (say serial number) before the 1st column
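$ awk -F, '{$1=++i FS $1;}1' OFS=, file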
1,Unix,10,A
2,Linux,30,B
3,Solaris,40,C
4,Fedora,20,D
5,Ubuntu,50,E
$1=++i FS $1 => Space is used to concatenate columns in awk. This expression concatenates a
new field(++i) with the 1st field along with the delimiter(FS), and assigns it back to the 1st field($1).
FS contains the file delimiter.
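2. To insert a new column (the serial number) at the end, as the last column:
$ awk -F, '{$(NF+1)=++i;}1' OFS=, file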
Unix,10,A,1
Linux,30,B,2
Solaris,40,C,3
Fedora,20,D,4
Ubuntu,50,E,5
$NF indicates the value of the last column. Hence, by assigning something to $(NF+1), a new field is inserted at the end automatically.
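3. Similarly, to add two columns, the serial number and the letter "X", at the end (one way to do it):
$ awk -F, '{$(NF+1)=++i FS "X";}1' OFS=, file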
Unix,10,A,1,X
Linux,30,B,2,X
Solaris,40,C,3,X
Fedora,20,D,4,X
Ubuntu,50,E,5,X
The explanation given for the above 2 examples holds good here.
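4. To insert a column before the 2nd last column:
$ awk -F, '{$(NF-1)=++i FS $(NF-1);}1' OFS=, file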
Unix,1,10,A
Linux,2,30,B
Solaris,3,40,C
Fedora,4,20,D
Ubuntu,5,50,E
NF-1 points to the 2nd last column. Hence, concatenating the serial number at the beginning of $(NF-1) ends up inserting a column before the 2nd last.
5. Update 2nd column by adding 10 to the variable:
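$ awk -F, '{$2+=10;}1' OFS=, file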
Unix,20,A
Linux,40,B
Solaris,50,C
Fedora,30,D
Ubuntu,60,E
$2 is incremented by 10.
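6. To convert the 1st column to uppercase:
$ awk -F, '{$1=toupper($1);}1' OFS=, file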
UNIX,10,A
LINUX,30,B
SOLARIS,40,C
FEDORA,20,D
UBUNTU,50,E
Using the toupper function of the awk, the 1st column is converted from lowercase to uppercase.
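7. To retain only the first 3 characters of the 1st column:
$ awk -F, '{$1=substr($1,1,3);}1' OFS=, file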
Uni,10,A
Lin,30,B
Sol,40,C
Fed,20,D
Ubu,50,E
Using the substr function of awk, a substring of only the first few characters can be retrieved.
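8. To empty the 2nd column:
$ awk -F, '{$2="";}1' OFS=, file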
Unix,,A
Linux,,B
Solaris,,C
Fedora,,D
Ubuntu,,E
Set the variable of 2nd column($2) to blank(""). Now, when the line is printed, $2 will be blank.
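9. To remove the 2nd column altogether. A sketch of the approach described below, with the column number passed in the awk variable x:
$ awk -F, -v x=2 '{f="";for(i=1;i<=NF;i++)if(i!=x)f=(!f)?$i:f FS $i;print f;}' file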
Unix,A
Linux,B
Solaris,C
Fedora,D
Ubuntu,E
By just emptying a particular column, the column stays as is with an empty value. To remove a column, all the subsequent columns from that position need to be advanced one position ahead. The for loop loops over all the fields. Using the ternary operator, every column is concatenated to the variable "f", provided it is not the 2nd column, using FS as the delimiter. At the end, the variable "f", which contains the updated record, is printed. The column to be removed is passed through the awk variable "x", and hence, just by setting the appropriate number in x, any specific column can be removed.
10. Join the 3rd column with the 2nd column using ':' and remove the 3rd column:
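One way to do it, following the same approach as above:
$ awk -F, -v x=3 '{$(x-1)=$(x-1)":"$x;f="";for(i=1;i<=NF;i++)if(i!=x)f=(!f)?$i:f FS $i;print f;}' file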
Unix,10:A
Linux,30:B
Solaris,40:C
Fedora,20:D
Ubuntu,50:E
Almost same as last example expcept that first the 3rd column($3) is concatenated with 2nd
column($2) and then removed.
gawk provides the following three functions for working with date and time:
systime
strftime
mktime
Let us see in this article how to use these functions:
systime:
This function is equivalent to the Unix date (date +%s) command. It gives the Unix time, total
number of seconds elapsed since the epoch(01-01-1970 00:00:00).
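$ awk 'BEGIN{print systime()}'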
1358146640
strftime:
A very common function used in gawk to format the systime into a calendar format. Using this
function, from the systime, the year, month, date, hours, mins and seconds can be separated.
Syntax:
strftime (<format specifiers>,unix time);
1. Printing current date time using strftime:
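$ awk 'BEGIN{print strftime("%d-%m-%y %H-%M-%S",systime())}'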
14-01-13 12-37-45
strftime takes format specifiers which are same as the format specifiers available with the date
command. %d for date, %m for month number (1 to 12), %y for the 2 digit year number, %H for the
hour in 24 hour format, %M for minutes and %S for seconds. In this way, strftime converts Unix time
into a date string.
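2. strftime with the Unix time argument omitted:
$ awk 'BEGIN{print strftime("%d-%m-%y %H-%M-%S")}'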
14-01-13 12-38-08
Both the arguments of strftime are optional. When the timestamp is not provided, it takes the
systime by default.
strftime without the format specifiers provides the output in the default output format as the Unix
date command.
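3. strftime with no arguments at all (the output will resemble that of the date command):
$ awk 'BEGIN{print strftime()}'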
mktime:
mktime function converts any given date time string into a Unix time, which is of the systime
format.
Syntax:
mktime(date time string) # where the date time string should contain at least 6 components in the following order: YYYY MM DD HH MM SS
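$ awk 'BEGIN{print mktime("2012 12 21 0 0 0")}'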
1356028200
This gives the Unix time for the date 21-Dec-12.
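$ awk 'BEGIN{print strftime("%d-%m-%Y",mktime("2012 12 21 0 0 0"))}'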
21-12-2012
The output of mktime can be validated by formatting the mktime output using the strftime function
as above.
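$ awk 'BEGIN{print strftime("%d-%m-%Y",mktime("2012 12 -1 0 0 0"))}'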
29-11-2012
mktime can take negative values as well. -1 in the date position indicates one day before the date
specified which in this case leads to 29th Nov 2012.
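$ awk 'BEGIN{print strftime("%d-%m-%Y %H-%M-%S",mktime("2012 12 3 -2 0 0"))}'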
02-12-2012 22-00-00
-2 in the hours position indicates 2 hours before the specified date time which in this case leads to
"2-Dec-2012 22" hours.
The requirement is to find the time consumed by the process which is the difference between the
start and the end times.
1. File in which the date and time component are separated by a space:
$ cat file
P1,2012 12 4 21 36 48,2012 12 4 22 26 53
P2,2012 12 4 20 36 48,2012 12 4 21 21 23
P3,2012 12 4 18 36 48,2012 12 4 20 12 35
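$ awk -F, '{d1=mktime($2);d2=mktime($3);print $1","d2-d1,"secs";}' file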
P1,3005 secs
P2,2675 secs
P3,5747 secs
Using mktime function, the Unix time is calculated for the date time strings, and their difference
gives us the time elapsed in seconds.
$ cat file
Note: This file has the start time and end time in different formats
Difference in seconds:
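A command of the same shape works here, with gsub normalizing the timestamps first (assuming timestamps of the form 2012-12-4 21:36:48):
$ awk -F, '{gsub(/[-:]/," ",$2);gsub(/[-:]/," ",$3);d1=mktime($2);d2=mktime($3);print $1","d2-d1,"secs";}' file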
P1,3005 secs
P2,2675 secs
P3,5747 secs
Using gsub function, the '-' and ':' are replaced with a space. This is done because the mktime
function arguments should be space separated.
Difference in minutes:
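$ awk -F, '{gsub(/[-:]/," ",$2);gsub(/[-:]/," ",$3);d1=mktime($2);d2=mktime($3);print $1","(d2-d1)/60,"mins";}' file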
P1,50.0833 mins
P2,44.5833 mins
P3,95.7833 mins
$ cat file
P1,2012-12-4,2012-12-6
P2,2012-12-4,2012-12-8
P3,2012-12-4,2012-12-5
Note: The start and end time has only the date components, no time components
Difference in seconds:
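$ awk -F, '{gsub(/-/," ");d1=mktime($2" 0 0 0");d2=mktime($3" 0 0 0");print $1","d2-d1,"secs";}' file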
P1,172800 secs
P2,345600 secs
P3,86400 secs
In addition to replacing the '-' and ':' with spaces, 0's are appended to the date field since the mktime
requires the date in 6 column format.
Difference in days:
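$ awk -F, '{gsub(/-/," ");d1=mktime($2" 0 0 0");d2=mktime($3" 0 0 0");print $1","(d2-d1)/86400,"days";}' file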
P1,2 days
P2,4 days
P3,1 days
A day has 86400(24*60*60) seconds, and hence by dividing the duration in seconds by 86400, the
duration in days can be obtained.