AWK Scripting

The document provides an introduction to AWK, a programming language used for manipulating data and generating reports. It describes the structure of an AWK program, which consists of patterns and associated actions. Patterns are used to match input lines and trigger actions. Common features of AWK like variables, operators, functions and control flow statements are summarized. Examples are provided to demonstrate how to search and extract data from files using AWK.

Uploaded by

AnandhababuS
Copyright: © All Rights Reserved

AWK

Mohamed Mukthar Ahmed

CONTENTS

Introduction
Structure of awk Program
First awk Program
Searching For a String
Records and Fields
Awk Relational and Logical Operators
Formatted Output
BEGIN and END Statements
Awk Defined Variables
User Defined Variables
Regular Expressions
  - Match Operators
  - Concatenation Operator
Combining Patterns
  - Pattern Range
Awk Defined Variables Again
Built In Arithmetic Functions
Built In String Functions
Control Flow Statements
  - next Statement
Arrays in Awk
for Statement
User Defined Functions
Output to File
  - Output to Pipes
Awk Input
  - getline( ) Function

Introduction

- A powerful pattern matching and scanning tool.
- Special-purpose language for line-oriented pattern processing.
- Typically used to scan an input string, grab certain portions of the string, then output the information in another format.
- Developed by Alfred Aho, Peter Weinberger, and Brian Kernighan (hence the name AWK).
- There are several variants of awk: standard awk (awk), GNU awk (gawk), and new awk (nawk) are the most common.
- AWK is a UNIX/Linux programming language used for manipulating data and generating reports.
- AWK can be used at the command line for simple operations.
- It can also be written into programs or scripts for larger applications.

Structure of AWK Program

- AWK scans a file (or any input source) line by line, searching for lines that match a certain pattern (regular expression) or condition.
- For each pattern, an action is specified. The action is performed when the pattern matches.
- In short, an awk program consists of a number of patterns and associated actions.
- Actions are enclosed in curly braces, and the statements inside an action are separated by semicolons.
- If there is an action without a pattern, the action is performed for every input line.
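
The pattern-action structure described above can be sketched with a short shell session. The sample data and the variable names (data, passed, scores) are assumptions made for illustration and do not come from the slides.

```shell
# Hypothetical sample data: a name and a score on each line.
data='alice 90
bob 45
carol 72'

# Pattern with an action: print the name only where the score exceeds 50.
passed=$(printf '%s\n' "$data" | awk '$2 > 50 { print $1 }')

# Action without a pattern: the action runs for every input line.
scores=$(printf '%s\n' "$data" | awk '{ print $2 }')

printf '%s\n' "$passed" "$scores"
```

The first command prints two names, the second prints all three scores.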

First AWK Program

- Consider the following simple awk program.

    { print $0 }

- The action prints field 0 (the entire line).

    $ awk -f myawk1 /etc/group

- The general syntax of awk is as follows:

    awk -Fchar '/pattern/{action}' input_file
    awk -Fchar -f awkscript input_file

- The default delimiting character is Tab or Space. However, if it is other than tab or space, we need to explicitly state it using the -F option.

    $ awk -F: -f myawk1 /etc/passwd

Searching For a String

- To search for a string in an input line, specify it as a pattern.
- Patterns are enclosed in forward slashes.

    # Searching for a pattern mukthar
    /mukthar/ { print $0 }

    $ awk -f myawk2 /etc/passwd
    $ who | awk -f myawk2

Records and Fields

- awk sees input as a table of rows and columns.
- The table consists of rows that represent records. The entire row is identified by $0.
- Columns in the table are the fields. The columns are identified as $1, $2, $3, and so on.
- awk expects the fields to be delimited by spaces or tabs. We can change the delimiter by using the -F option.

    # Searching for a condition
    $3 == 501 { print $0 }
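
As a minimal sketch of records and fields, the session below feeds awk two made-up records in the /etc/passwd layout. The record contents and the variable names are hypothetical; only -F, $1, $3 and NF are taken from the slides.

```shell
# Hypothetical records in /etc/passwd layout (fields separated by colons).
records='root:x:0:0:root:/root:/bin/sh
mukthar:x:501:505:Mukthar:/home/mukthar:/bin/sh'

# -F: sets the field separator; $1 is the login name, $3 the UID,
# and NF the number of fields in the record.
fields=$(printf '%s\n' "$records" | awk -F: '$3 == 501 { print $1, $3, NF }')

echo "$fields"
```

Only the second record satisfies the condition, so a single line with the login name, the UID and the field count is printed.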

AWK Relational and Logical Operators

- awk uses the following relational and logical operators.

Relational Operators
    Operator    Meaning
    ==          Equal to
    !=          Not equal to
    >           Greater than
    >=          Greater than or equal to
    <           Less than
    <=          Less than or equal to
    ~           Matches
    !~          Does not match

Logical Operators
    Operator    Meaning
    &&          AND
    ||          OR
    !           NOT

Formatted Output

- awk uses printf( ) for formatted output.
- It is similar to printf( ) in C programming.

    # Formatted output
    /mukthar/ { printf("UID = %d\tGID = %d\n", $3, $4) }
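
The following sketch shows a few common printf conversions (left-justified string, padded integer, zero-padded integer). The input line and the chosen widths are assumptions for illustration.

```shell
# Hypothetical colon-separated record: login, password field, UID, GID.
line='mukthar:x:501:505'

# %-10s left-justifies in 10 columns, %5d right-justifies in 5 columns,
# %05d pads with leading zeros to 5 digits.
formatted=$(echo "$line" | awk -F: '{ printf("%-10s|%5d|%05d\n", $1, $3, $4) }')

echo "$formatted"
```

Unlike print, printf adds no newline of its own, which is why the format string ends with \n.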

BEGIN and END Statements

- The keywords BEGIN and END are used to perform specific actions relative to the program's execution.
- BEGIN: the action is performed before the first input line is read.
- END: the action is performed after all input lines have been processed.

    # myawk4
    # Searching for a pattern mukthar
    BEGIN { print "Locating User mukthar" }
    /mukthar/ { printf("UID = %d\tGID = %d\n", $3, $4) }
    END { print "End of Report" }

    $ awk -F: -f myawk4 /etc/passwd

AWK Defined Variable

- awk supports a number of pre-defined variables.

Pre-defined Variables
    Variable    Meaning
    NR          Number of the current input line (records read so far).
    NF          Number of fields in the input line.

    # myawk5
    # Number of valid users
    END { print "There are", NR, "users" }

    $ awk -F: -f myawk5 /etc/passwd
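
A small sketch of how NR and NF change from line to line, combined with BEGIN and END. The two input lines and the variable name report are made up for this example.

```shell
# Two hypothetical input lines with three and two fields.
report=$(printf 'a b c\nd e\n' | awk '
BEGIN { print "start" }          # runs before any input is read
      { print NR, NF }           # line number and field count per line
END   { print "lines:", NR }     # NR still holds the last line number
')

echo "$report"
```

In the END action NR keeps its final value, which is why it can be used to report the total number of lines.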

User Defined Variable

- awk supports the use of variables.
- There is no need to explicitly initialize a variable to zero; awk does this by default.

    # myawk5b
    # Counting users of training group
    $4 == 505 { training++ }
    END { print "The number of users in";
          print "training group are", training
    }

    $ awk -F: -f myawk5b /etc/passwd
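
The default initialization can be observed directly. In this sketch the input values and the variable names (training, other) are assumptions; an unset variable behaves as 0 in arithmetic and as an empty string in string context.

```shell
counts=$(printf '505\n100\n505\n' | awk '
$1 == 505 { training++ }     # training starts out as 0 without any declaration
END {
    # "other" was never assigned: 0 as a number, empty as a string.
    print training + 0, other + 0, "[" other "]"
}')

echo "$counts"
```

Adding 0 forces numeric context, which is a common way to make sure a counter that was never incremented still prints as 0.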

Regular Expressions

- awk provides more comprehensive pattern matching. These patterns are called regular expressions.
- They are similar to those supported by the UNIX/Linux grep command.

    # myawk6
    # Searching for user1 to user5
    BEGIN { print "Locating User1 to User5" }
    /^user[1-5]/ { print $1, $3, $4 }
    END { print "End of Report" }

    $ awk -F: -f myawk6 /etc/passwd

Regular Expressions

- Use the ~ or !~ match operators for matching.

    # myawk7
    # Searching for UIDs 501-505
    # The anchors ^ and $ keep UIDs such as 1501 or 5011 from matching.
    BEGIN { print "Locating UIDs 501-505" }
    $3 ~ /^50[1-5]$/ { print $1, $3, $4 }
    END { print "End of Report" }

    Pattern     Meaning
    .           (dot) any one character
    \           despecialize (escape) the following character
    [ ]         any one character in the list (character class)
    [^ ]        any one character not in the list
    ^           beginning of line (anchor)
    $           end of line (anchor)

Regular Expressions

- We can match a repeating pattern by adding a modifier or repetition operator.
- Regular expressions can have any of the following modifiers.

    Modifier    Meaning
    ?           Match the preceding character at most once
    *           Zero or more occurrences of the preceding character
    +           One or more occurrences of the preceding character
    {n}         Match the preceding character exactly n times
    {n,}        Match the preceding character at least n times
    {n,m}       Match the preceding character at least n times but not more than m times

- Alternative patterns can be specified with the alternation separator | (pipe).
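
The modifiers ?, * and + and the | separator can be compared side by side. The word list below is hypothetical, and the {n,m} forms are left out of this sketch because some awk variants do not accept them.

```shell
# Hypothetical word list, one word per line.
words='color
colour
colouur
cat
dog'

# A pattern with no action prints every matching line.
optional=$(printf '%s\n' "$words" | awk '/^colou?r$/')   # u at most once
star=$(printf '%s\n' "$words" | awk '/^colou*r$/')       # u zero or more times
plus=$(printf '%s\n' "$words" | awk '/^colou+r$/')       # u one or more times
either=$(printf '%s\n' "$words" | awk '/^(cat|dog)$/')   # alternation

printf '%s\n' "$optional" "$star" "$plus" "$either"
```

The parentheses in the last pattern group the alternatives so that the anchors apply to both of them.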

Concatenation Operator

- In pattern matching, the plus (+) symbol matches one or more consecutive occurrences of the preceding character, so a run of repeated characters can be matched as a unit.

    # myawk8
    # Searching for a pattern $unix
    BEGIN { print "Locating $Unix or $unix" }
    $1 ~ /\$+[Uu]nix/ { print $0 }
    END { print "End of Report" }

- awk interprets any string or variable on the right side of ~ or !~ as a regular expression.
- Thus, the regular expression can be assigned to a variable, and the variable can be used in pattern matching.
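
Storing a regular expression in a variable might look like the sketch below. The variable name re and the input lines are assumptions for illustration.

```shell
matched=$(printf 'user1\nadmin\nuser7\n' | awk '
BEGIN { re = "^user[1-5]$" }   # regular expression kept as an ordinary string
$0 ~ re { print }              # the string is interpreted as a regular expression
')

echo "$matched"
```

Because the expression is a string, it can also be built at run time or passed in from the command line with the -v option.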

Combining Patterns

- Patterns can be combined to provide more powerful and complex matching.

    # myawk9
    # Combining patterns
    BEGIN { print "Combining Patterns" }
    $1 == 486 && $5 > 250 { print $0 }
    END { print "End of Report" }

- An awk pattern range can be specified by having two patterns separated by a comma.
- The action is performed for each input line from a line matching the first pattern through the next line matching the second pattern, both included.

    # myawk9b
    /user1/,/user8/ { print $0 }
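
A pattern range can be tried on a few lines of made-up input. The START and END markers are hypothetical; the point is that both boundary lines are part of the range.

```shell
range=$(printf 'a\nSTART\nb\nc\nEND\nd\n' |
    awk '/START/,/END/ { print NR ": " $0 }')

echo "$range"
```

Lines 2 through 5 are printed; line 1 and line 6 fall outside the range.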

AWK Defined Variable - Again

- awk supports a number of pre-defined variables.

Pre-defined Variables
    Variable    Meaning
    NR          Number of the current input line (records read so far).
    NF          Number of fields in the input line.
    FS          Input field separator.
    FILENAME    Name of current input file.
    FNR         Record number in current input file.
    OFS         Output field separator.
    ORS         Output record separator.
    ARGC        Number of command line arguments.
    ARGV        Array of command line arguments.
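
The sketch below sets FS and OFS inside a BEGIN action instead of using -F. The input lines are hypothetical. Assigning a field to itself is a common idiom that makes awk rebuild $0 with the output separator.

```shell
joined=$(printf 'a:b:c\nd:e\n' | awk '
BEGIN { FS = ":"; OFS = "-" }
{
    print NR, NF, $NF    # $NF is the last field of the record
    $1 = $1              # forces $0 to be rebuilt using OFS
    print
}')

echo "$joined"
```

The commas in print insert OFS between the values, so both output lines of each record use the dash.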

Examples

Examples on using pre-defined variables of awk.

    # myawk10
    # Print the first five input lines.
    FNR == 1, FNR == 5 { print $0 }

    # myawk11
    # Print each input line with line number.
    # Print the heading with file name.
    # FILENAME is not yet set inside BEGIN, so the heading is printed
    # when the first line of each file is read.
    FNR == 1 { print "File :", FILENAME }
    { print NR, ":\t", $0 }

    # myawk12
    BEGIN { print "There are", ARGC, "parameters on the command line";
            print "The first argument is", ARGV[0];
            print "The second argument is", ARGV[1];
    }

AWK Built In Arithmetic Functions

- Following is a summary of awk's built-in arithmetic functions.
- All operations are done in floating-point format.

Arithmetic Functions
    Name        Description
    int(x)      Integer part of x
    sqrt(x)     Square root of x
    rand( )     Random number between 0 and 1 (takes no argument)
    srand(x)    x is a new seed for rand( )
    exp(x)      Exponential function of x
    log(x)      Natural logarithm of x
    sin(x)      Sine of x, with x in radians
    cos(x)      Cosine of x, with x in radians
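
A few of these functions can be evaluated in a BEGIN action without any input file. The argument values and the seed 42 are arbitrary choices for this sketch.

```shell
math=$(awk 'BEGIN {
    srand(42)        # fixed seed; the actual random value is not printed
    r = rand()
    # int() truncates toward zero; the last item is 1 when 0 <= r < 1.
    print int(7.9), int(-7.9), sqrt(16), exp(0), log(1), (r >= 0 && r < 1)
}')

echo "$math"
```

Note that int( ) drops the fractional part rather than rounding, so int(-7.9) is -7.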

Examples

Examples on built-in functions of awk.

    # myawk13
    # Print the square root of input value
    { print sqrt( $1 ) }

    $ awk -f myawk13
    2
    1.41421
    3
    1.73205
    4
    2

- If no data file is specified, awk reads from standard input (i.e. the keyboard).

AWK Built In String Functions

- Following is a summary of awk's built-in string functions.
- Strings are enclosed within double quotes (" ").

String Functions
    Name                    Description
    length(s)               Returns length of s
    substr(s,p)             Returns substring of s from position p
    substr(s,p,n)           Returns substring of s from position p of length n
    index(s,t)              Returns position of string t in string s
    match(s,t)              Returns position in s where regular expression t matches
    split(s,a)              Splits s into elements of a defined by FS
    split(s,a,fs)           Splits s into elements of a defined by fs
    gsub(r,s)               Substitutes s in place of r in $0 globally. Returns the number of substitutions made.
    gsub(r,s,t)             Substitutes s in place of r in t globally. Returns the number of substitutions made.
    sub(r,s)                Substitutes s for first r in $0.
    sub(r,s,t)              Substitutes s for first r in t.
    sprintf(fmt,expr-lst)   Returns expression list formatted according to format string specified by fmt.
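
The string functions can be exercised together in a BEGIN action. The sample string and the variable names (s, t, parts, changed) are assumptions for this sketch. String positions in awk start at 1.

```shell
strings=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                  # number of characters in s
    print substr(s, 7)               # from position 7 to the end
    print substr(s, 1, 5)            # five characters from position 1
    print index(s, "wor")            # position of a plain string
    print match(s, /o+/)             # position of a regular expression match
    n = split("a:b:c", parts, ":")   # n is the number of pieces
    print n, parts[2]
    t = s
    changed = gsub(/o/, "0", t)      # replace every o in t
    print changed, t
    sub(/l/, "L", t)                 # replace only the first l in t
    print t
    print sprintf("%03d", 7)         # format into a string
}')

echo "$strings"
```

sub and gsub modify their third argument in place and return a count, which is why the result is read from t rather than from the return value.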

Control Flow Statements

- awk provides constructs to implement selection and iteration.
- awk control flow statements are similar to C language constructs.

    # myawk15
    # Finding biggest disk
    {
        if (disksize < $5)
        {
            disksize = $5;
            computer = $0
        }
    }
    END { print computer }

    $ awk -f myawk15 comp_data

Control Flow Statements - Examples

    # myawk16
    # To print out each second field for 286 computers
    BEGIN { printf("Type\tLoc\tDisk\n"); }
    /286/ { field = 1;
            while (field <= NF)
            {
                printf("%s\t", $field);
                field += 2;
            }
            print "";
    }

    $ awk -f myawk16 comp_data

Control Flow Statements - 2

- We are already familiar with break and continue statements.
- The next statement skips to the next input line, then restarts from the first pattern-action statement.

    # myawk17
    # Print out computer type 286 using next
    {
        if ($1 != 286)
            next;
        print $0
    }

    $ awk -f myawk17 comp_data
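
A common use of next is to discard unwanted lines early so that later rules never see them. The input below, with a comment line and a blank line, is hypothetical.

```shell
kept=$(printf '# comment\n286 a\n\n486 b\n' | awk '
/^#/    { next }     # skip comment lines
NF == 0 { next }     # skip blank lines
        { print $1 } # reached only by the remaining lines
')

echo "$kept"
```

Once next runs, the remaining rules are not tried for that line.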

Arrays in AWK

- awk provides single-dimensional arrays.
- Arrays need not be declared; they are created when first used.
- Arrays are heterogeneous.
- Array subscripts are strings.

    # Array Examples
    ARRAY["e001"] = 100
    print ARRAY["e001"]

- ARRAY["num"] and ARRAY[num] are not the same: the first uses the string "num" as the subscript, the second uses the value of the variable num.
- ARRAY["1"] and ARRAY[1] are the same.

Arrays - Examples

    # myawk18
    # diskspace[] holds the sum of the disk space for all computers
    # computers[] holds number of computers of specific type
    $1 == "486" { computers["486"]++ }
    $5 > 0 { diskspace[0] += $5 }
    END {
        print "Number of 486 computers :", computers["486"];
        print "Total disk space :", diskspace[0];
    }

    $ awk -f myawk18 comp_data

Arrays in AWK

- awk array checking.
- To check for a subscript in an array, awk provides the in operator.
- To remove an array element, awk provides the delete statement.

    # Array Examples
    if ("e001" in ARRAY)
        delete ARRAY["e001"]
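
The in operator and delete can be tried in a BEGIN action. The array name stock and its subscripts are made up for this sketch, which also shows why in is preferable to comparing an element with the empty string.

```shell
checks=$(awk 'BEGIN {
    stock["e001"] = 100
    if ("e001" in stock) print "present"
    delete stock["e001"]
    if (!("e001" in stock)) print "deleted"
    # Caution: merely referencing an element creates it.
    if (stock["e002"] == "") print "empty"
    if ("e002" in stock) print "created by reference"
}')

echo "$checks"
```

Testing with in does not create the element, whereas the comparison in the third test does.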

for Statement

- awk provides a for construct for iterating over arrays.

    Syntax:
    for ( var in array ) statement(s)

- The var gets one subscript of the array at a time, and the statement is executed until all subscripts have been visited. The order in which subscripts are visited is not defined.

for - Examples

    # myawk19
    # Counting number of computers of all types
    { computers[$1]++ }
    END {
        for (name in computers)
            print "Number of", name, "computers is", computers[name]
    }

    $ awk -f myawk19 comp_data

User Defined Functions

- awk supports user-defined functions.

    Syntax:
    function name(arg_list)
    {
        statements
    }

- There must be NO space between the function name and the left parenthesis of the argument list when the function is called.
- The return statement is used to return a value from the function.

UDF - Examples

    # myawk20
    # Finding factorial
    function factorial(n) {
        if (n <= 1) return 1
        else return n * factorial(n-1)
    } # End of function
    {
        print "Factorial of", $1, "is", factorial($1)
    }

    $ awk -f myawk20

Output To File

- awk output generated by print or printf can be redirected to a file by using redirection.
- The name of the file MUST be in quotes.

    # myawk21
    # Output to file
    $1 == "486" {
        print "Type = ", $1, "Location = ", $3 > "comp486.dat"
    }

    $ awk -f myawk21 comp_data

- The output of awk programs can be piped into a UNIX/Linux command. This is termed Output To Pipes.
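
Output to a pipe might look like the sketch below, where print sends each line to the sort command. The input words are hypothetical; like a file name, the command string must be in quotes.

```shell
sorted=$(printf 'pear\napple\nfig\n' | awk '
    { print $1 | "sort" }    # every line goes to one running sort process
END { close("sort") }        # closing the pipe lets sort finish and print
')

echo "$sorted"
```

The same command string always refers to the same pipe, so sort sees all three lines before it produces any output.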

AWK Input - getline Function

- awk getline( ) function reads input from the following:
  - Current Pipe
  - Current File
  - Specific File
  - Internal Pipe

    # myawk22
    # Using getline function
    {
        if ($1 == "486") {
            FIRSTLINE = $0;
            getline;
            SECONDLINE = $0
        }
    }

    $ awk -f myawk22 comp_data

AWK Input - getline Function

- getline reads the next line and sets $0 and NF. Moreover, it increments NR and FNR.

    # myawk23
    {
        print NR, $0
        getline;
        print NR, $0
    }

    $ echo 100 200 300 400 500 600 | awk -f myawk23
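
The effect of a plain getline on $0 and NR is easier to see when the input has several lines. In this sketch the four input values and the variable name first are assumptions; each cycle consumes two lines.

```shell
pairs=$(printf '100\n200\n300\n400\n' | awk '{
    first = $0                   # line read by the main input loop
    if ((getline) > 0)           # read the following line into $0
        print NR ": " first " " $0
}')

echo "$pairs"
```

getline returns 1 when a line was read, 0 at end of input and -1 on error, so the test against 0 guards the case of an odd number of lines.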

AWK Input - getline Function

- getline var: Read next line into var, increment NR and FNR.
- getline < file: Read next line from file. Set $0, NF.

    # myawk24
    # Using getline function
    # The loop stops at end of file (0) or on error (-1).
    BEGIN {
        while ((getline < "data") > 0)
            print $0;
    }

    $ awk -f myawk24

AWK Input - getline Function

- getline var < file: Read next line from file into var. NR and FNR are not changed.
- cmd | getline: Read next line from cmd. Set $0, NF and increment NR.
- cmd | getline var: Read next line from cmd into var and increment NR.

    # myawk25
    # Using getline function
    # In who output, field 1 is the user name and field 2 the terminal.
    BEGIN {
        while (("who" | getline) > 0)
            print "user", $1, "tty", $2;
    }

    $ awk -f myawk25
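
Reading command output into a variable can be sketched as follows. The command string (two echo commands) and the variable names cmd, line and n are assumptions chosen so that the output is predictable.

```shell
listing=$(awk 'BEGIN {
    cmd = "echo one; echo two"            # hypothetical command to read from
    while ((cmd | getline line) > 0)      # each line lands in "line", $0 is untouched
        print ++n, line
    close(cmd)                            # release the pipe when done
}')

echo "$listing"
```

Keeping the command in a variable makes sure the string passed to close( ) is exactly the one used to open the pipe.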
