0% found this document useful (0 votes)
229 views

Awk Tutorial - Learn Awk (Linux) Programming

1) Awk is a text processing language useful for manipulating text files. It recognizes concepts of files (made of records), records (made of fields by default separated by spaces/tabs), and fields. 2) Awk operates on one record at a time. Variables like $0, $1, $2 refer to the whole record and individual fields. Built-in variables include NR (line number) and NF (number of fields). 3) Awk scripts can contain BEGIN and END blocks to perform actions before/after processing lines. Conditionals, loops, arrays can be used. Regular expressions can match patterns.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
229 views

Awk Tutorial - Learn Awk (Linux) Programming

1) Awk is a text processing language useful for manipulating text files. It recognizes concepts of files (made of records), records (made of fields by default separated by spaces/tabs), and fields. 2) Awk operates on one record at a time. Variables like $0, $1, $2 refer to the whole record and individual fields. Built-in variables include NR (line number) and NF (number of fields). 3) Awk scripts can contain BEGIN and END blocks to perform actions before/after processing lines. Conditionals, loops, arrays can be used. Regular expressions can match patterns.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

1/20/2015

awktutorial|learnawk(linux)programming

awktutorial
Whyawk?

Examples
Printeverylineaftererasingthe2ndfield

TheAwktextprocessingprogramminglanguageandisausefultoolfor awk'{$2="";print}'file
manipulatingtext.
Printhi48times
Awkrecognizestheconceptsof"file","record",and"field".
Afileconsistsofrecords,whichbydefaultarethelinesofthefile.One yes|head48|awk'{print"hi"}'
linebecomesonerecord.
Printhi.0010tohi.0099
Awkoperatesononerecordatatime.
Arecordconsistsoffields,whichbydefaultareseparatedbyany
yes|head90|awk'{printf("hi00%2.0fn",NR+9)}'
numberofspacesortabs.
printwhencolumn3is<1900(inmyfile.txt).Ifavalueisnot
Fieldnumber1isaccessedwith$1,field2with$2,andsoforth.$0
numeric,itdoesn'tcomplain
referstothewholerecord.
catmyfile.txt|awk'{if($3<1900)print$3,"",$5,$7,$8}'
[awkuser@p3nlh096~]$awkhelp
Usage:awk[POSIXorGNUstyleoptions]fprogfile[]file...
Countnumberoflineswherecol3>col1
Usage:awk[POSIXorGNUstyleoptions][]'program'file...
POSIXoptions:GNUlongoptions:
awk'$3>$1{printi+"1";i++}'file
fprogfilefile=progfile
Ffsfieldseparator=fs
printonlylinesoflessthan65characters
vvar=valassign=var=val
m[fr]val
awk'length<64'
Wcompatcompat
Wcopyleftcopyleft
Wcopyrightcopyright
Wdumpvariables[=file]dumpvariables[=file]
Wexec=fileexec=file
Wgenpogenpo
Whelphelp
Wlint[=fatal]lint[=fatal]
Wlintoldlintold
Wnondecimaldatanondecimaldata
Wprofile[=file]profile[=file]
Wposixposix
Wreintervalreinterval
Wsource=programtextsource=programtext
Wtraditionaltraditional
Wusageusage
Wversionversion
Toreportbugs,seenode`Bugs'in`gawk.info',whichis
section`ReportingProblemsandBugs'intheprintedversion.
gawkisapatternscanningandprocessinglanguage.
Bydefaultitreadsstandardinputandwritesstandardoutput.
Examples:
gawk'{sum+=$1};END{printsum}'file
gawkF:'{print$1}'/etc/passwd

Now,foranexplanationofthe{print}codeblock.Inawk,curlybraces
areusedtogroupblocksofcodetogether,similartoC.Insideourblock
ofcode,wehaveasingleprintcommand.Inawk,whenaprintcommand
appearsbyitself,thefullcontentsofthecurrentlineareprinted.
$awk'{print$0}'/etc/passwd
output

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
...

Inawk,the$0variablerepresentstheentirecurrentline,soprintand
print$0doexactlythesamething.
$awk'{print""}'/etc/passwd

$awk'{print"hello"}'/etc/passwd

Runningthisscriptwillfillyourscreenwithhello's.
http://www.cse.iitb.ac.in/~br/courses/cs699autumn2013/refs/awktutorial.html

1/4

1/20/2015

awktutorial|learnawk(linux)programming

AWKVariables
awkvariablesareinitializedtoeitherzeroortheemptystringthefirst
timetheyareused.
Variables
Variabledeclarationisnotrequired
Maycontainanytypeofdata,theirdatatypemaychangeoverthelife
oftheprogram
Mustbeginwithaletterandcontinuingwithletters,digitsand
underscores
Arecasesenstive
Someofthecommonlyusedbuiltinvariablesare:
NRThecurrentline'ssequentialnumber
NFThenumberoffieldsinthecurrentline
FSTheinputfieldseparatordefaultstowhitespaceandisreset
bytheFcommandlineparameter
/test$catcalc
356
56789
/test$awk'{d=($2($14));s=($2+$1);printd/sqrt(s),d*d/s}'calc
7.4207755.0678
18.5066342.494
/test$

inaboveexamplewehaveafilecalcwithtworowsandtwocolumns.
Notethatthefinalstatement,a"print"inthiscase,doesnotneeda
semicolon.Itdoesn'thurttoputitin,though.
Integervariablescanbeusedtorefertofields.Ifonefieldcontains
informationaboutwhichotherfieldisimportant,thisscriptwillprint
onlytheimportantfield:
$awk'{imp=$1;print$imp}'calc

ThespecialvariableNFtellsyouhowmanyfieldsareinthisrecord.This
scriptprintsthefirstandlastfieldfromeachrecord,regardlessofhow
manyfieldsthereare:
ifnowcalcfileis
356abd
56789xyz
$awk'{print$1,$NF}'calc
3abd
567xyz

BeginandEnd
AnyactionassociatedwiththeBEGINpatternwillhappenbeforeany
linebylineprocessingisdone.ActionswiththeENDpatternwill
happenafteralllinesareprocessed.
1.Oneistojustmashthemtogether,likeso:
>
awk'BEGIN{print"fee"}$1=="foo"{print"fi"}
END{print"fofum"}'filename

AWKArrays
awkhasarrays,buttheyareonlyindexedbystrings.Thiscanbevery
useful,butitcanalsobeannoying.Forexample,wecancountthe
frequencyofwordsinadocument(ignoringtheickypartaboutprinting
themout):
$awk'{for(i=1;i<=NF;i++)freq[$i]++}'filename

Thearraywillholdanintegervalueforeachwordthatoccurredinthe
file.Unfortunately,thistreats"foo","Foo",and"foo,"asdifferentwords.
http://www.cse.iitb.ac.in/~br/courses/cs699autumn2013/refs/awktutorial.html

2/4

1/20/2015

awktutorial|learnawk(linux)programming

Ohwell.Howdoweprintoutthesefrequencies?awkhasaspecial"for"
constructthatloopsoverthevaluesinanarray.Thisscriptislongerthan
mostcommandlines,soitwillbeexpressedasanexecutablescript:
#!/usr/bin/awkf
{for(i=1;i<=NF;i++)freq[$i]++}
END{for(wordinfreq)printword,freq[word]
}

AWKRegularexpressionsandblocks

awk'/pattern_to_match/{actions}'input_file
awk'/foo/{print}'abc.txt
catabc.txt|awk'/[09]+.[09]*/{print}'

Expressionsandblocks
fredprint
$1=="fred"{print$3}
root
$5~/root/{print$3}
AWKConditionalstatements
awk'{
if($1~/root/)
{
print$1
}
}'/etc/passwd

Bothscriptsfunctionidentically.Inthefirstexample,theboolean
expressionisplacedoutsidetheblock,whileinthesecondexample,the
blockisexecutedforeveryinputline,andweselectivelyperformthe
printcommandbyusinganifstatement.Bothmethodsareavailable,and
youcanchoosetheonethatbestmesheswiththeotherpartsofyour
script.
if
{

if($1=="foo"){

if($2=="foo"){

print"uno"

}else{

print"one"

}
}elseif($1=="bar"){

print"two"
}else{

print"three"
}

if
!/matchme/{print$1$3$4}
{

if($0!~/matchme/){

print$1$3$4

}
}

Bothscriptswilloutputonlythoselinesthatdon'tcontainamatchme
charactersequence.Again,youcanchoosethemethodthatworksbest
foryourcode.Theybothdothesamething.
($1=="foo")&&($2=="bar"){print}
Thisexamplewillprintonlythoselineswherefieldoneequalsfooand
fieldtwoequalsbar.
http://www.cse.iitb.ac.in/~br/courses/cs699autumn2013/refs/awktutorial.html

3/4

1/20/2015

awktutorial|learnawk(linux)programming

contactus|privacy|terms

http://www.cse.iitb.ac.in/~br/courses/cs699autumn2013/refs/awktutorial.html

4/4

You might also like