
CAD SCRIPTING LANGUAGES

A collection of Perl, Ruby, Python, Tcl, and SKILL® scripts

CAD SCRIPTING LANGUAGES PROGRAMMING GUIDE 4 VLSI ... Quan Nguyen 1


by

Quan Nguyen

• A Programming Guide for VLSI Chip Design Engineers


• A Quick Reference for Various Contemporary EDA Scripting Languages
• An Introduction to GUI Graphic Programming with Tk
• An Introduction to Cadence™ SKILL® Programming for Custom IC Chip Design



Author's Biography
Quan Nguyen was born in Laos in 1958 and raised there. He studied at the Lycee de Vientiane until he
came to the United States at the end of the Vietnam War in 1975. He received a BS in Electrical
Engineering from the University of Michigan at Ann Arbor, Michigan, USA in 1982 and an MSEE from
the University of Vermont in 1986. He has designed IC chips for 25 years in the area
of high-performance circuit design for DRAM, SRAM, cache memory, microprocessor and ASIC
chips.

Limit of Liability and Disclaimer of Liability

This book is published on an "as is" basis and the information is provided "as is". The author and the publisher make no
warranty of any kind, expressed or implied, with regard to these programs or the documentation contained in this
book.

This book is for instructional purposes only and should not be used in any commercial programs. The author and the
publisher shall not be liable in the event of incidental or consequential damages in connection with, or arising out of,
the furnishing, performance, or the use of the programs, associated instructions, and/or claims of productivity gains.

CopyRight© 2009 Protection

Copyright© 2009 by Quan Nguyen. All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, electronic, photocopying or scanning, without the express written
consent of the author and the publisher. The information contained herein is for personal use and may not be
incorporated in any commercial programs, other books, databases, or any kind of software without the written consent
of the author and the publisher. Making copies of this book or any portion for any purpose other than your own use is
a violation of the United States copyright laws.

ISBN: 978-0-9777812-2-5
ISBN-10: 0-9777812-2-4

Library of Congress Control Number: 2007942133

United States Copyright Office: Certificate of Registration #TX 6-517-329


RAMACAD Publishing
Printed in The United States of America.
Preface
"For the last fifteen years a fundamental change has been occurring in the way people write computer
programs. The change is a transition from system programming languages such as C or C++ to scripting
languages such as Perl or Tcl." That is a quote from John K. Ousterhout, Berkeley professor and inventor
of Tcl, in his 1998 IEEE Computer magazine paper "Scripting: Higher Level Programming
for the 21st Century." Professor Ousterhout concluded with: "I hope that the programming language
research community will shift some of its attention to scripting languages and help develop even more
powerful scripting languages for the future. Raising the level of programming should be the single most
important goal for language designers, since it has the greatest effect on programmer productivity..."

That prediction has proved remarkably accurate. Programming practice keeps moving from
conventional system languages like C/C++ toward higher-level scripting languages. The
migration trades runtime speed for higher productivity, and the loss of speed has fortunately been
offset by the tremendous improvement in computer hardware over the last two decades. Large,
speed-critical tasks still need to be coded in conventional system languages, while the interfaces
and most other tasks, including the ever-popular extension language of a particular software tool,
are handled with scripting languages. Many modern EDA tools like PhysicalCompiler®,
Nanotime®, FirstEncounter®, etc. have core Tcl as the base interface language, plus a specially tailored
Tcl extension language for the particular CAD tool. Heavy algorithmic tasks are coded in C/C++ for
speed. The proper balance of scripting and system languages for a particular
programming task enhances both the overall system speed and the productivity of the users.

It is worth noting that both software and hardware are steadily moving toward "system
integration" at a higher level of abstraction, for higher productivity, modular design, design reuse and
rapid Turn-Around Time (TAT) of a project. Software is moving toward scripting languages for higher-level
programming, and hardware is moving toward Electronic System Level (ESL) design, with abstraction
above RTL (Register Transfer Level), perhaps in SystemC. Software uses Object-Oriented
Programming (OOP) class blueprints while hardware uses IP (Intellectual Property) blocks to
generate component objects or instances, to provide a better "gluing" interface, and to ease verification
of the whole system.

One of the reasons why "scripting is on the rise" is that scripting gives casual programmers the
chance to join the professional programmer community. For casual programmers like hardware
engineers, programming is not their main function. They are impatient when it comes time to code.
They have neither the will nor the luxury to spend weeks or months learning the detailed syntax of a
particular programming language. They just want to spend enough time to learn to write simple but
useful scripts and then move on to other tasks. For this reason, a scripting language is better suited than a
system language to the engineer's desire to get tasks done in a minimal amount of time.
On average, a line of code in a scripting language does a lot more work than a line of code in a
system language. With a scripting language, the user can create a rapid application prototype in much less
time because scripting languages "hide" many underlying details from the user and facilitate interfaces
between programming modules. For example, many scripting languages have Tk as the default GUI
engine. With only a few lines of code, the user can build a simple graphical interface, literally
within hours if not minutes. This is very unlikely if the interface is implemented in a system programming
language like C/C++.

On the "even more powerful scripting languages for the future" front, many scripting languages were
born during the hyperactive internet explosion of the 1990s. Many languages were created quickly and
quietly disappeared. Some have survived the test of time. In general, modern scripting
languages must provide easy support for Object-Oriented Programming (OOP). Perl does provide
support for OOP and is still the dominant scripting language. Python with OOP is gaining momentum
due to its simplicity and elegance, with the motto "There is only one way to do it" vs. Perl's "There is more
than one way to do it". Ruby has been an OOP language since its inception, with the motto "There are better
ways to do it". Ruby is a powerful modern scripting language with many features inspired by Perl.
Coupled with the explosion of Ruby-on-Rails (RoR) for web applications, Ruby is the likely
candidate to replace the entrenched Perl, if that ever happens. Core Tcl does not support
OOP. There are many "fragmented" Tcl derivatives, like "incr Tcl" or Tclpp, but there is no standard
OOP extension for Tcl. Tcl is currently the de facto standard interface language
for most EDA tools.

In today's integrated-circuit (IC) digital design environment, it is ironic that hardware designers may
spend as many hours as software designers in front of the terminal screen "doing" software
emulation of chip hardware. Chip design is so complex that hardware designers
no longer use any "hand" tools but employ Electronic Design Automation (EDA) tools to perform all tasks
in the design steps, from the initial inception of the design to the final chip hardware debugging process.
Designers who are not proficient in programming are at a disadvantage.

EDA software tools for chip design are complex and are not well integrated, even within a particular
CAD company. Each EDA vendor may have its own platform. And sometimes, a software vendor has a
particular platform for a particular tool. The IC design industry is always talking about creating a common
design platform for a seamless design flow that is easier for everyone to integrate. But this is
easier said than done. In the meantime, designers must deal with the complexity of multiple platforms.
To be proficient, designers need to know many scripting languages.

It's no secret that engineers with proficiency in programming are far more productive than those
without it. That's one reason why I wrote this book: to encourage practicing hardware designers to get
more involved in programming. Spend the time and pay the price of learning to get over the initial
threshold and you will start to experience the joy of programming as well as the rewards from it.
Hopefully, this book will make it a little bit easier to get over the initial hump. Programming becomes
more critical as we progress toward the SOC (System-On-Chip) design methodology revolution, where
multi-tasking ability is the key to surviving and thriving.



Why This Book

"CAD Scripting Languages" is neither a guide nor a complete reference book. It's more like a cheat
sheet full of examples. The assumption is that the reader has already mastered the basics of programming
and has forgotten them, through the lapse of time or through only occasional use. This book is intended to
be used as a reminder or quick reference. It is written from the point of view of a technician, instead of
an architect. It will cover only the most frequently used commands and subjects. If you are new to
programming, you may need other full reference books. The primary target audience of this book is the
practicing engineer who occasionally needs programming for productivity reasons and writes code for
his/her own or group use.

The book concentrates on the processing of data in real-world engineering applications.
There are tons of books about Perl/Ruby/Tcl/Python out there. Why this book? Well, this book is
different in several respects. It is a collection of many modern scripting languages in the "Rosetta
Stone" format, where the scripts of different languages are listed side by side. It tries to "cross-pollinate"
between the scripting languages. If you have experience with one of the scripting
languages, the book's goal is to help lower the cost and time of migrating to the other languages.
Most examples in the book are coded in at least two scripting languages for line-by-line
comparison. For every line of Perl code, our goal is to have an associated line in another language
for cross-reference checking.

If you are working on custom IC chip design using Cadence™ software tools and you are looking for a
SKILL® programming language guide book to supplement the documentation manuals from
Cadence™, this book is definitely for you. You do not have many alternatives to
"CAD Scripting Languages" since, as far as the author knows, there is no other
commercial book on SKILL programming at the time of this writing. On the other hand, with the coming
of the open-source Open-Access (OA) database, CAD engineers may need to master other modern
languages like Ruby besides SKILL.

The predecessors of this book are "The Perl Connection" and "The CAD Connection". "The Perl
Connection" is a programming guide for Sed, Awk, Unix, Perl and Tcl for general hardware engineers.
"The CAD Connection" discusses Cadence™ SKILL® programming for designers who are working on
IC Custom Analog or digital designs at transistor or mask layout level. "CAD Scripting Languages"
covers the two modern scripting languages: Ruby and Python. Throughout this book, we sometimes use
the phrase "The Perl Connection", "The CAD Connection", "CAD Scripting Languages" or "Scripting
Languages by Examples" interchangeably.

Scope of the Book

Each scripting language by itself can require a whole separate volume if all subjects are covered in
detail. Since "CAD Scripting Languages" attempts to cover multiple languages in a single book, it is
neither possible nor the author's desire for the book to go over all important topics. Doing so would overwhelm
novice readers with mundane details. We will not cover many important topics like OOP, modules,
packages, system interface, exceptions, etc. Practical programming makes extensive use of packages
or pre-built modules to speed up programming time. We assume that if you can search for, install and make
use of package modules, your programming skill level is probably beyond the scope of this book. OOP
is also beyond the scope of this book. The book will not teach you all the programming tricks to
write great code or to become a great programmer. Its humble goal is to provide you with sufficient
information to get a good start or to raise your programming skill enough to code one-liner,
throw-away or "quick and dirty" prototype scripts. And let's see how things roll from here.

Book Organization

The author strongly believes in the concept of learning through examples (i.e. the technician's way), so
every chapter is filled with real-world examples. No theory is given. Most of the time, the book just describes
how things work. Over 300 examples of code snippets are presented throughout the book. It is the intent
of the author to limit most of the example scripts to less than a page. Longer scripts tend to lose the
interest of the readers quickly. Good examples are required to produce a good book. The author believes
that readers judge the quality of a book by the examples in it.

It is said that a capable or productive programmer always produces good code after a short period of
immersion in a new language, no matter what language it is. The key to success is to
concentrate more on problem solving than on detailed syntax. For that reason, this book tries to
focus on problem solving and keeps the discussion of language syntax short. The
web is a much better place to search for command syntax. However, we do explore the syntax and
techniques of the programming languages during the dissection of the examples.

We will split the book into three parts. The first part discusses text and data processing with all five
languages: Awk, Tcl, Perl, Ruby and Python. The second part deals with Tk graphics in Tcl/Tk,
Perl/Tk, Ruby/Tk and Python/Tk, to allow users to convert text data into graphic format for better
visual evaluation or to create a GUI to invoke other processes with ease. The third part deals with
SKILL programming for the chip design process, namely the manipulation and design of the layout and
schematic database.

It is the author's observation that a good way to introduce a new language to readers is first to go
through the data variables of that language, namely "string", "array" and "hash", and their
manipulation. Once the programmer is capable of "organizing" the data in the proper format, half of the
battle has been won. The appendix at the end of the book provides a quick tour of the scripting
languages Ruby and Python. If the reader is new to either of these scripting languages, it is
recommended to take a brief tour of the language.



A. A Quick Tour of Awk

Awk is one of the powerful Unix utility commands for data manipulation. If you are in hardware design,
the ability to program Awk proficiently can make your daily engineering tasks easier.
Among all the programming languages, I consider Awk the "optimum trade-off point"
between learning effort and work efficiency. Awk has been called "The Little Language". Awk
can be amazingly "simple" for certain data calculation and reporting tasks.

In addition to text processing like Sed, Awk also deals with data manipulation, mathematical calculation
and text reformatting. Awk is designed to handle data files in tabular spreadsheet form, which is
very common in engineering data collection.

Sed and Awk can be very handy to generate throw-away or one-liner on-the-fly scripts to pipe the output
directly to other tools. For example, we use Sed and Awk in gnuplot execution:

gnuplot> plot "<sed -n '/bob/p' data.txt | awk '/load/{print $2,$1}'"

The name Awk comes from the initials of its authors (Aho, Weinberger, and Kernighan). There are
many versions of the Awk language, like awk, nawk, tawk, mawk, etc. We will discuss GNU "Gawk".
We have extensive coverage of Awk in the regular chapters. The topics in this chapter are:

• A.1 Fields and Records


• A.2 Arithmetic Calculation
• A.3 Array
• A.4 Import Variables
• A.5 Function
• A.6 Regular Expression
• A.7 Awk Potpourri

If you are a Windows user and would like to run Unix commands like Awk, you do not need a Unix or Linux
operating system installed on your computer. You can use the "Cygwin" software tool to emulate Unix
commands under a DOS terminal. Cygwin provides the look and feel of Unix but with limited support of
the Unix kernel. You can run basic Unix commands like vi, sed, awk, cut, ls, less, etc. You can also
pipe commands, as in ls | sed -n "1p". The Cygwin package can be downloaded and installed from the web.
For more information, check http://www.cygwin.com.



A.1 Fields and Records

Awk is designed explicitly to handle lines and tokens. When you execute an Awk script, each line is
automatically read in and stored in the special variable "$0". The line number is stored in the special variable "NR".
The line is also automatically split into fields: $1, $2 ... $NF. The variable "NF" holds the number
of fields in the line. In the table below, we can refer to the last field as $5 or $NF.

$0: whole line
$1: first field
$NF: last field

Line        aa  bb  cc  dd  ee
Fields      $1  $2  $3  $4  $NF
Whole Line  $0
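
For readers coming from Python, the same record and field bookkeeping can be sketched as below. This is only an illustration and not part of Awk: the helper name parse_record and the dictionary keys are made up for this sketch, and plain whitespace splitting is assumed to stand in for Awk's default field separator.

```python
def parse_record(line, nr):
    """Mimic Awk's per-line bookkeeping for one input line (hypothetical helper)."""
    fields = line.split()          # split on whitespace, like Awk's default
    return {
        "$0": line,                # the whole line
        "NR": nr,                  # current line number, starting at 1
        "NF": len(fields),         # number of fields in this line
        "fields": fields,          # fields[0] plays the role of $1
    }

rec = parse_record("aa bb cc dd ee", 1)
first_field = rec["fields"][0]               # $1
last_field = rec["fields"][rec["NF"] - 1]    # $NF
print(rec["NF"], first_field, last_field)
```

Note that Python lists are indexed from 0, so field $1 lives at index 0 and field $NF at index NF-1.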

The most basic command construct of Awk is the "test-action" pair. Every line in the input
data file goes through the test-action pair. If there are twenty lines in the input file, there
will be twenty executions of the test-action pair. If the "test" condition is true, then the "action" is taken. If
the "test" condition is omitted, the "action" is taken for every line.

• awk ' test { action } ' file

For example, the following script would print every line that contains the string "Jeff".

• awk ' /Jeff/ { print $0 } ' myfile

When the above script is executed, Awk opens the file "myfile" and reads one line at a time until the
end of file. The construct "/Jeff/" tells Awk to search for the string "Jeff" in the line. If "Jeff" is found, the
action is taken. In this case, we print "$0", the entire line. The forward-slash "/../" pair in Awk means
a regular-expression search. We can rewrite the script in a more familiar form:

• awk '{ if ($0 ~ /Jeff/) { print $0 } }' myfile
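
The test-action idea can also be sketched in Python, in the side-by-side spirit of this book. The function awk_like and its arguments are hypothetical names chosen for this illustration only; the sketch collects the results in a list instead of printing them directly.

```python
import re

def awk_like(lines, test, action):
    """Apply a test-action pair to every line (hypothetical helper)."""
    results = []
    for line in lines:             # one pass per input line
        if test(line):             # an omitted Awk test would be "always true"
            results.append(action(line))
    return results

lines = ["John Lee : 78 83 94 95", "Jeff Bush : 60 66 79 56"]
matched = awk_like(
    lines,
    test=lambda line: re.search(r"Jeff", line) is not None,  # like /Jeff/
    action=lambda line: line,                                 # like { print $0 }
)
print(matched)
```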

> cat stud.txt

John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92
Jeff Bush : 60 66 79 56
George Bush : 56 75 80 90

> awk ' /Jeff/ { print $0 } ' stud.txt #print lines with Jeff

Jeff Bush : 60 66 79 56

* perl -ne '/Jeff/ && print $_ ' stud.txt

> awk ' /Jeff/ { print } ' stud.txt #default print: $0

Jeff Bush : 60 66 79 56

* perl -ne '/Jeff/ && print ' stud.txt

> awk ' /Jeff/ ' stud.txt #no action: print

Jeff Bush : 60 66 79 56

* perl -ne 'print if /Jeff/' stud.txt

> awk ' $0 ~ /Jeff/ { print } ' stud.txt #whole line search

Jeff Bush : 60 66 79 56

* perl -ne '$_ =~ /Jeff/ && print' stud.txt

> awk ' !/Jeff/ { print } ' stud.txt #print lines without Jeff

John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92
George Bush : 56 75 80 90

* perl -ne '!/Jeff/ && print' stud.txt

> awk ' /Jeff/ {next} {print} ' stud.txt #print lines without Jeff

John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92
George Bush : 56 75 80 90

* perl -ne '$_ =~ /Jeff/ && next; print' stud.txt

> awk ' $2 ~ /Bus/ { print } ' stud.txt #field matching ~

Jeff Bush : 60 66 79 56
George Bush : 56 75 80 90

* perl -nae '$F[1] =~ /Bus/ && print' stud.txt

> awk ' $2 !~ /Bus/ { print } ' stud.txt #not equal !~

John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92

* perl -nae '$F[1] !~ /Bus/ && print' stud.txt

> awk ' $2 == "Bush" { print } ' stud.txt #field compare ==

Jeff Bush : 60 66 79 56
George Bush : 56 75 80 90

* perl -nae '$F[1] eq "Bush" && print' stud.txt

> awk ' $2 == "Bus" { print } ' stud.txt #exact compare: no match

(none) #field 2 is Bush, not Bus

* perl -nae 'if ($F[1] eq "Bus") {print}' stud.txt

> awk ' /Jeff/ { print NR,$0 } ' stud.txt #print line number

3 Jeff Bush : 60 66 79 56

* perl -nae '/Jeff/ && print $.," ",$_' stud.txt

> awk ' /Bush/ { print NR,$1,$4 } ' stud.txt #print fields

3 Jeff 60
4 George 56

* perl -nae '/Bush/ && print $.," ","@F[0,3]\n"' stud.txt

> awk ' NR==2 { print NR,":",$4 } ' stud.txt #row:2 col:4

2 : 98

* perl -nae '$. == 2 && print "$. : $F[3]\n"' stud.txt

> awk ' NR==2,NR==3 { print } ' stud.txt #line range

Vi Nguyen : 98 94 97 92
Jeff Bush : 60 66 79 56

* perl -nae '$.>=2 && $.<=3 && print' stud.txt
* perl -nae 'print if 2..3' stud.txt

> awk ' /Bus/ && ($4 > 58) { print } ' stud.txt #test anding

Jeff Bush : 60 66 79 56

* perl -nae '/Bus/ && $F[3] > 58 && print' stud.txt

> awk ' { print $2} ' stud.txt #extract field

Lee
Nguyen
Bush
Bush

* perl -nae ' print "$F[1]\n"' stud.txt

> awk '{printf("%9s %-9s : %d %5.2f\n",$1,$2,$4,$5)}' stud.txt

     John Lee       : 78 83.00
       Vi Nguyen    : 98 94.00
     Jeff Bush      : 60 66.00
   George Bush      : 56 75.00

* perl -nae 'printf("%9s %-9s : %d %5.2f\n",@F[0..1,3,4])' stud.txt
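
The printf conversions used in the last listing carry over to Python's % operator almost unchanged. The sketch below is a hedged illustration: the variable STUD and the function format_record are hypothetical names, and the data is embedded as a string so that the snippet runs without the file stud.txt.

```python
STUD = """John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92
Jeff Bush : 60 66 79 56
George Bush : 56 75 80 90"""

def format_record(line):
    """Format one record like the awk printf line above (hypothetical helper)."""
    f = line.split()               # f[0] is $1, f[3] is $4, f[4] is $5
    # %9s right-justifies in 9 columns, %-9s left-justifies in 9 columns
    return "%9s %-9s : %d %5.2f" % (f[0], f[1], int(f[3]), float(f[4]))

formatted = [format_record(line) for line in STUD.splitlines()]
for row in formatted:
    print(row)
```

Unlike Awk, Python does not convert strings to numbers automatically, so the fields are passed through int() and float() before formatting.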

• awk ' /Jeff/ { print } ' stud.txt


When we call the "print" command without any argument, Awk will print the contents of the current
line.

• awk ' /Jeff/ ' stud.txt


When we do not specify the "action", Awk will print the contents of the current line. Note that this
construct emulates "grep":
• grep "Jeff" stud.txt

• awk ' $0 ~ /Jeff/ { print } ' stud.txt


$0 represents the whole line. The statement "$0 ~ /Jeff/" searches for Jeff in the whole line. This
is redundant since the default search space is the whole line.

• awk ' $2 ~ /Bus/ { print } ' stud.txt


Here we do not want to search the whole line but only a certain field in the line. The statement "$2 ~ /Bus/" asks:
does field 2 contain the string "Bus"?

• awk ' $2 !~ /Bus/ { print } ' stud.txt


This is the reverse of previous example. If field 2 does not contain the string "Bus", then we take the
action.

• awk ' $2 == "Bush" { print } ' stud.txt


The difference between the "~" and "==" operators is that "==" requires an exact match of the whole field, while "~"
matches the pattern anywhere in the field. We can rewrite the script with the "~" operator and an anchored regular expression:
• awk ' $2 ~ /^Bush$/ { print } ' stud.txt

• awk ' $2 == "Bus" { print } ' stud.txt


We have "Bush", not "Bus", on the second field of some lines. So there is no match for this script.
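
The contrast between "~" and "==" can be sketched in Python with re.search and plain string comparison. The helper names field_contains and field_equals are invented for this illustration.

```python
import re

def field_contains(field, pattern):
    """Like Awk's  $2 ~ /pattern/ : the pattern may match anywhere in the field."""
    return re.search(pattern, field) is not None

def field_equals(field, text):
    """Like Awk's  $2 == "text" : the whole field must be identical."""
    return field == text

print(field_contains("Bush", "Bus"))      # pattern found inside the field
print(field_equals("Bush", "Bus"))        # whole-field compare fails
print(field_contains("Bush", "^Bush$"))   # anchored pattern acts like ==
```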

• awk ' { print $2} ' stud.txt


This script does not have "test" code in the "test-action" pair. So the "test" condition is assumed to be
true. And the action will always be taken.

A.2 Arithmetic Calculation

In this example, we have a file that contains the students' test scores. There are four test scores per student. We would like
to add all four test scores to get a total score per student (i.e. per line). We place the total score
at the end of the line.

Adding Row

>cat student.txt               (desired output)

John Lee : 78 83 94 95         John Lee : 78 83 94 95 => 350
Vi Nguyen : 98 94 97 92        Vi Nguyen : 98 94 97 92 => 381
Jeff Bush : 60 66 79 56        Jeff Bush : 60 66 79 56 => 261
George Bush : 56 75 80 90      George Bush : 56 75 80 90 => 301
Bob Hoff : 86 78 39 52         Bob Hoff : 86 78 39 52 => 255
Quang Nguyen : 66 99 85 92     Quang Nguyen : 66 99 85 92 => 342
Tim Lay : 56 54 65 54          Tim Lay : 56 54 65 54 => 229



awk '{
tot=0; # Reset tot
for(i=4;i<=NF;i++) {tot += $i} ; # Add fields
print $0 " => " tot # Print
}
' student.txt

perl -nae 'chomp($_);$t=0;map { $t+=$F[$_]} 3..$#F;
printf("$_ => $t\n")' student.txt

• for(i=4;i<=NF;i++) {tot += $i}

This is the "for" loop structure. We start with field 4 (i.e. "i=4") and continue to the last field (i.e. i<=NF).
Since every line in this file has seven fields, we can replace the above statement with:
• tot = $4 + $5 + $6 + $7

• tot += $i
The above statement is the short notation for
tot = tot + $i
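
The row sum can be sketched in Python as follows. The function name add_row_total is hypothetical; the slice fields[3:] is assumed to play the role of the Awk loop from field 4 to field NF.

```python
def add_row_total(line):
    """Append the sum of the score fields to one record (hypothetical helper)."""
    fields = line.split()
    total = sum(int(x) for x in fields[3:])   # fields[3:] covers $4 .. $NF
    return "%s => %d" % (line, total)

print(add_row_total("John Lee : 78 83 94 95"))
```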

• Add columns with exact number of fields

In the next example, we will add up the columns of a datafile. In terms of programming, the big difference between
adding a row and adding a column is that the row sum is complete as soon as the row has been read, while
a column sum is not complete until the end of the file. Hence, we will not know the
column totals until the end of file. In the meantime, we have to keep running totals.

Adding Column

>cat student.txt               (desired output)

John Lee : 78 83 94 95         John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92        Vi Nguyen : 98 94 97 92
Jeff Bush : 60 66 79 56        Jeff Bush : 60 66 79 56
George Bush : 56 75 80 90      George Bush : 56 75 80 90
Bob Hoff : 86 78 39 52         Bob Hoff : 86 78 39 52
Quang Nguyen : 66 99 85 92     Quang Nguyen : 66 99 85 92
Tim Lay : 56 54 65 54          Tim Lay : 56 54 65 54
                               =============================
                               500 549 539 531

awk '
BEGIN {t[4]=0;t[5]=0;t[6]=0;t[7]=0} # Do first
{for( i=4; i<= NF; i++) { t[i]+= $i};print} # Add columns
END {print "===============\n" t[4],t[5],t[6],t[7]} # Do last
' student.txt

perl -nae 'map { $t[$_]+=$F[$_]} 3..$#F; print;
END {printf("==================\n%d %d %d %d\n",@t[3..6])}
' student.txt

Note that the program consists of three parts: Begin, Body and End. When you execute an Awk script, Awk will
execute the code inside the "BEGIN" section first. Normally, the code in the "BEGIN" section is used to
initialize variables and perform other upfront tasks. Similarly, Awk will execute the code inside the
"END" section last. The body in between is executed once for every line in the datafile.

BEGIN { .. }   executed once, first
(body)         executed as many times as the number of lines in the datafile
END { .. }     executed once, last

• for( i=4 ; i<= NF ; i++ ) { t[i] += $i }


If we expand out, the equivalent statements would be:

t[4] = t[4] + $4
t[5] = t[5] + $5
t[6] = t[6] + $6
t[7] = t[7] + $7

After 1st line, t[4]=78 t[5]=83 t[6]=94 t[7]=95



After 2nd line, t[4]=176 t[5]=177 t[6]=191 t[7]=187
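
The Begin, Body and End phases map naturally onto a Python function, as sketched below. The names STUDENT and column_totals are hypothetical, and the data is embedded as a string so that the snippet is self-contained.

```python
STUDENT = """John Lee : 78 83 94 95
Vi Nguyen : 98 94 97 92
Jeff Bush : 60 66 79 56
George Bush : 56 75 80 90
Bob Hoff : 86 78 39 52
Quang Nguyen : 66 99 85 92
Tim Lay : 56 54 65 54"""

def column_totals(text):
    """Sum each score column over all records (hypothetical helper)."""
    totals = [0, 0, 0, 0]                      # BEGIN: initialize once
    for line in text.splitlines():             # body: runs once per line
        scores = line.split()[3:]              # $4 .. $NF
        for i, score in enumerate(scores):
            totals[i] += int(score)            # running total per column
    return totals                              # END: final only after the last line

print(column_totals(STUDENT))
```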

• Adding scores by names

In the next example, we will add up a column of a datafile per name. The names in the file may be
repeated. We will use the Awk hash array structure. Note that Bob and Bush have multiple records. We
need to add the two scores together for Bush and for Bob.

Adding Column per name

>cat student.txt     (desired output)

John : 78            John : 78
Bob : 98             Bush : 116
Bush : 60            Bob : 184
Bush : 56
Bob : 86

awk '
{ arr[$1] += $3 } # Add columns
END { for (i in arr) { printf "%6s : %d\n", i, arr[i] } } # Print
' student.txt

perl -nae '$h{$F[0]} += $F[2];
END { map { printf("%6s : %d\n", $_, $h{$_}) } keys %h; }
' student.txt

• arr[$1] += $3
The above statement creates an array element if it does not exist yet. $1 is the name in field 1 and $3 is the
score for that name. If the name does not exist in array "arr" yet, Awk will create a new
element for it in array "arr".

If the element already exists, then we just add field 3 to the array element.

After line 1: arr["John"]=78
After line 2: arr["John"]=78 arr["Bob"]=98
After line 3: arr["John"]=78 arr["Bob"]=98 arr["Bush"]=60
After line 4: arr["John"]=78 arr["Bob"]=98 arr["Bush"]=116
After line 5: arr["John"]=78 arr["Bob"]=184 arr["Bush"]=116
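
A Python dictionary can play the role of the Awk hash array. The sketch below uses the hypothetical names RECORDS and totals_by_name; dict.get with a default of 0 is used here to imitate Awk's automatic creation of a missing element.

```python
RECORDS = """John : 78
Bob : 98
Bush : 60
Bush : 56
Bob : 86"""

def totals_by_name(text):
    """Accumulate the score in field 3 per name in field 1 (hypothetical helper)."""
    totals = {}
    for line in text.splitlines():
        name, _, score = line.split()          # $1, $2 (the colon), $3
        # a missing key starts at 0, like a new Awk array element
        totals[name] = totals.get(name, 0) + int(score)
    return totals

print(totals_by_name(RECORDS))
```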

A.3 Array

We have an extensive discussion of Awk arrays in the regular chapters. We repeat the discussion here for
completeness of the chapter.

As mentioned before, Awk has hash arrays only. The key index can be any string or number.
You have to assign each array element individually, one at a time, unless
you use the split function. Setting an element is similar to assigning a variable in Awk, with the [ ] bracket
following the array name. Even when you use an integer as the index, like data[1], Awk still treats the index as the
string "1".

#----Assign Awk array elements

champ[1999]="France" #define champ[1999]
champ["soccer"]="Germany" #define champ["soccer"]
champ["swim"]="France" #define champ["swim"]
champ["Italy"]=2001 #define champ["Italy"]

y="England" #define variable y
champ[y]=2000 #define champ["England"]

#----Accessing an element in Awk array

print champ[1999] #print "France"

France #champ[1999]="France"

x="Italy" #assign variable x
print champ[x] #replace x in champ[x] first

2001 #same as print champ["Italy"]

if ("Italy" in champ) print champ["Italy"] #Key "Italy" exists in champ?

2001 #Yes, key "Italy" exists

if (x in champ) print champ[x] #same as above with x="Italy"

2001

for (i in champ) print i " => " champ[i] #Looping with for-in

Italy => 2001 #unpredictable sequence
1999 => France
England => 2000
swim => France
soccer => Germany

n=asorti(champ,keys) #gawk: sorted keys go into array keys
for (j=1;j<=n;j++) print keys[j] " => " champ[keys[j]] #sorted key sequence

1999 => France
England => 2000
Italy => 2001
soccer => Germany
swim => France

• champ["soccer"]="Germany"
When you create an element in a hash array, you need to quote a string key. Otherwise Awk treats the key as a
variable name and substitutes the value of the variable first. A key that contains digits only needs no quotes.

• if ("Italy" in champ) { print champ["Italy"] }

This command checks whether the key index "Italy" exists in hash "champ". Does champ["Italy"]
exist? If it exists, then the value of champ["Italy"] is printed.

• for (i in champ) { print i " => " champ[i] }

Awk has no "foreach" keyword; the loop over a hash is written "for (i in array)". When we use this loop
to go through all elements of a hash, there is no predictable order in which the keys are processed. The
key order is implementation-dependent, as with Tcl and Perl hashes. In gawk, we can get a predictable
sequence by sorting the keys with asorti, which stores the sorted keys in a new array and returns their
count ("asort" sorts the values instead and discards the original keys):

• max=asorti(champ,champ_sorted)
• for (i=1;i<=max;i++) { print champ_sorted[i] " => " champ[champ_sorted[i]] }
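Where gawk's sorting functions are not available, one portable option is to pipe the loop output through the external sort command. The sketch below assumes a POSIX awk and sort; the function name sorted_keys is hypothetical, and LC_ALL=C is set only to make the ordering independent of the locale.

```shell
# Sketch: sorted key output in any POSIX awk by piping print into sort.
sorted_keys() {
  awk 'BEGIN {
    champ[1999] = "France";  champ["soccer"] = "Germany"
    champ["Italy"] = 2001;   champ["England"] = 2000
    cmd = "LC_ALL=C sort"                  # external sort does the ordering
    for (k in champ) print k " => " champ[k] | cmd
    close(cmd)                             # flush and wait for sort to finish
  }'
}
sorted_keys
```

The hash itself stays unordered; only the printed lines are sorted.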

• Create array with split command

In the previous example, we define hash one element a time. In this example, we create an array of many
elements in a single split command.

Create array elements using split

echo | awk '

BEGIN {max=split("VN FR USA",states," ");states[4]="Laos"}; #Define array

END {
for(i=1;i<=max;++i) print "states["i"] -> "states[i]; #Loop with "max"

for(i in states) print "states["i"] => "states[i]; #Loop with "for"


}
'

states[1] -> VN
states[2] -> FR
states[3] -> USA

states[4] => Laos


states[1] => VN
states[2] => FR
states[3] => USA

perl -e '
BEGIN {@states=split(/\s+/,"VN FR USA");$states[3]="Laos"};
END {
for($i=0;$i<=$#states;++$i) {printf("states[$i] -> $states[$i]\n")};
foreach (0..$#states) {printf("states[$_] => $states[$_]\n")};
}
'

• max=split("VN FR USA",states," ")


We create an array called "states" from the string "VN FR USA" by using the split command. The array
"states" then contains three elements. Variable max is set to 3, the number of elements that split
created. The above statement is equivalent to:
states[1]="VN"
states[2]="FR"
states[3]="USA"

• for(i=1;i<=max;++i)
Use a for loop to process the elements of array "states". Only three (not four) elements are printed, because
max was set by split before we defined states[4]="Laos". To print all elements, we would need to increase max by one:
• for(i=1;i<=max+1;++i)

• for (i in states)
Another way to loop over all elements of the array, including states[4]. Note that the order of the keys is
not guaranteed. In gawk, use "asorti(states,new_states)" to get the keys in sorted order. With the
index loop, states[1] is the first element and states[max] is the last element created by split.
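As a small sketch of the same idea, the return value of split can be kept up to date when an element is appended, so that the index loop covers every element in order. The separator ":" and the function name split_demo are chosen only for illustration.

```shell
# Sketch: split returns the element count; keep it current when appending.
split_demo() {
  awk 'BEGIN {
    n = split("VN:FR:USA", states, ":")    # n is 3
    states[++n] = "Laos"                   # append and bump the count to 4
    for (i = 1; i <= n; i++)
      printf "%s%s", states[i], (i < n ? "," : "\n")
  }'
}
split_demo
```

Looping by index keeps the elements in their numeric order, which "for (i in states)" does not promise.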

• Create array with data from file

In this example, we create an array with the data from a file. First we need to read the data in one line
at a time and split its fields into array elements.

>cat country.txt
vietnam 80
france 50
usa 270

(desired output: sorted keys)

china => 1200
france => 50
usa => 270
vietnam => 80

Vietnam ~> 80

gawk '

FILENAME == "country.txt" { #Process file

split($0,F," ") #Create array

states[F[1]]=F[2] #Assign hash

next #Read next line

}

BEGIN { states["china"]=1200} #Define

END { n=asorti(states,keys) #Sort keys (gawk)

for (i=1;i<=n;i++) {print "states["keys[i]"] => "states[keys[i]]}; #Loop

if ("vietnam" in states) print "Vietnam ~> "states["vietnam"] #Exist in array?

}

' country.txt

perl -ne '


BEGIN { $states{china}=1200;
open(FH,"country.txt");
while (<FH>) { @F=split(/\s+/,$_);$states{$F[0]}=$F[1] };
}
END {foreach $k (sort keys %states) {printf("states{$k} => $states{$k}\n") };
if (exists($states{vietnam})) {printf("Vietnam ~> $states{vietnam}\n" )}
}
' country.txt

• FILENAME == "country.txt" {....next }


In addition to reading a file in the conventional way, Awk provides the built-in variable FILENAME, which
holds the name of the current input file. Using it as a pattern restricts a rule to the lines of that file.
Here we use it to create a hash, so the whole file can be loaded into an array before the data is processed.

• split($0,F," ")
• states[F[1]]=F[2]
We split the input line $0 into fields and store them in array F. We then use "states[F[1]]=F[2]"
to create a hash element whose key is the first field and whose value is the second field.

• BEGIN{states["china"]=1200}
We add one more array element before processing the file.

• if ("vietnam" in states)
We check whether array "states" has a key called "vietnam".
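Since Awk already splits every input line into $1, $2, and so on, the same hash can be built without an explicit split. The following is a minimal sketch under that assumption; the function name load_demo, the array name pop, and the temporary file are illustrative only.

```shell
# Sketch: load "name value" lines into a hash using the automatic fields.
load_demo() {
  tmp=$(mktemp)
  printf 'vietnam 80\nfrance 50\nusa 270\n' > "$tmp"
  awk '{ pop[$1] = $2 }                    # key is field 1, value is field 2
       END {
         if ("vietnam" in pop) print "vietnam ~> " pop["vietnam"]
         if (!("laos" in pop)) print "laos not found"
       }' "$tmp"
  rm -f "$tmp"
}
load_demo
```

The explicit split into F is still useful when the separator differs from the field separator FS.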

Fig.4 Summary of Awk array commands

champ["soccer"]="Germany"                Assign array element
max=split($0,my_array," ")               Create my_array with split;
                                         max is the total number of elements
for(i=1;i<=max;++i) print my_array[i]    Use max to loop through the array
for (i in champ) print ..                Access array elements
if ("Italy" in champ) print ..           Check existence
n=asort(data)                            Sort the values of data in place (gawk)
n=asort(data,dest)                       data unaffected; sorted values in new array dest (gawk)

A.4 Import Variables

There are many programming cases where we mix and match Unix utilities. It is very useful for
Awk to be able to import shell variables.

We will show several ways to import variables into Awk for further processing.

>date
Mon Nov 28 15:30:17 PST 2001

>echo `date | cut -d" " -f6`


2001

#---- Awk '{command}' var=$value file

year=`date | cut -d" " -f6`

echo|awk '{print "The",yr+2,"is the year of tiger!"}' yr=$year

The 2003 is the year of tiger!

#---- Awk '....' $var '....' file

year=`date | cut -d" " -f6`

echo|awk '{print "The",'$year+2',"is the year of tiger!"}'

The 2003 is the year of tiger!

#---- awk -f scriptfile var=$value file

>cat add
{print "The", yr+2, "is the year of tiger!" }

year=`date | cut -d" " -f6`

echo | awk -f add yr=$year

The 2003 is the year of tiger!

• awk '{print "The", yr+2 , "is the year of tiger!" }' yr=$year
Here we import the shell variable "$year" into Awk under the name "yr". The assignment is placed on the
command line after the script code.

• awk '{print "The", '$year+2' , "is the year of tiger!" }'

Here the script is written as two single-quoted pieces with $year between them: '....'$year+2'....'.
Because "$year" is outside the quotes, the shell substitutes its value before Awk is invoked.

• awk -f add yr=$year

If the Awk commands are stored in a file, we use the "-f" option to call the awk script. We assign variables at
the end of the command line.
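One practical difference between the import styles is when the variable becomes visible. A variable set with the -v option is already available in the BEGIN block, while an assignment placed after the script is only made when Awk reaches that argument, which is after BEGIN has run. The sketch below illustrates this; the function name import_demo is hypothetical.

```shell
# Sketch: -v assignment vs. trailing assignment, as seen from BEGIN.
import_demo() {
  year=2001
  # -v: yr is set before BEGIN runs
  echo | awk -v yr="$year" 'BEGIN { print "BEGIN sees yr=" yr } { print yr + 2 }'
  # trailing assignment: yr is still empty in BEGIN, set before input is read
  echo | awk 'BEGIN { print "BEGIN sees yr=" yr } { print yr + 2 }' yr="$year"
}
import_demo
```

Both forms print 2003 from the main rule; only the BEGIN lines differ.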

A.5 Function

A function is an Awk subroutine. There are built-in functions such as match, index, length, split,
sub, sprintf, tolower, toupper, substr, int, sqrt, exp, etc. You can create your own subroutine with the
"function" keyword:

• awk ' function name( parameter-list [local variables] ) { ... body ... }

echo | awk ' function add(a,b) {return a+b}; { print add(4,8) }'

12

echo | awk ' function pr(x) {printf("%7.4f\n",x)}; { pr(63.5) }'

63.5000

In Awk, all variables are globally scoped unless you declare them as local variables. The "parameter-list"
consists of the subroutine arguments as well as the local variable declarations. It is conventional to
place some extra spaces between the parameters and the local variables.


echo | gawk '

BEGIN { x=3; y=7; a=10; b=11; c=12; d=13; e=14 }

function add(a,b,c, d) { c=c*2; d=d*3; e=e*4; return a+b+c }

{
printf("a=%s b=%s c=%s d=%s e=%s\n",a,b,c,d,e)

print add(x,y)

printf("a=%s b=%s c=%s d=%s e=%s\n",a,b,c,d,e)

print add(x,y,50)

printf("a=%s b=%s c=%s d=%s e=%s\n",a,b,c,d,e)


}
'

a=10 b=11 c=12 d=13 e=14


10
a=10 b=11 c=12 d=13 e=56

110
a=10 b=11 c=12 d=13 e=224

• function add(a,b,c, d) {c=c*2; d=d*3; e=e*4; return a+b+c}


The function "add" has four names in its heading, which consist of parameters and local variables. When the
function is called, the caller provides up to four operands, and the names not supplied act as local variables.

• print add(x,y)
Here the function is called with two operands to be added. Inside the function, "a" is assigned the value
of "x" or 3; and "b" the value of "y" or 7. The result is 10. Variables "c" and "d" are assumed to be local
variables.

• print add(x,y,50)
Here we perform the operation with three operands. Initially, "c" is assigned the value of 50, and
"c=c*2" turns it into 100. The result of "return a+b+c" is 3+7+100 or 110. "d" is treated as a local
variable. "e" is a global variable since the function heading does not have "e" in the argument list.

Notice that the value of "e" is quadrupled every time the function is called. The first function call
changes "e" from 14 to 56, and the second call changes it from 56 to 224. "c" and "d" are
declared in the function heading as local variables. Their values are changed inside the function, but not
outside.
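The scoping rule can be reduced to a very small sketch: a name listed in the function heading is local, and any other name is global. The names bump, tmp, and counter, and the wrapper scope_demo, are made up for this illustration.

```shell
# Sketch: extra names in the function heading act as local variables.
scope_demo() {
  awk '
    function bump(n,    tmp) {   # tmp is local; counter is global
      tmp = n * 2
      counter++
      return tmp
    }
    BEGIN {
      tmp = "outer"; counter = 0
      r = bump(5)
      bump(6)
      print r, tmp, counter      # global tmp is untouched, counter was changed
    }'
}
scope_demo
```

The global "tmp" keeps its value "outer" because the function only modifies its own local copy, while "counter" is incremented on each call.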

In the next example, we will "add scores in two files" using "function" call. Note that in the subroutine,
two variables "i" and "line" are declared as local variables. All other undeclared variables in the
"function" are considered as global variables.

>cat filea (desired output: average)

Bob 70 80 90
Jim 65 85 75
Tom 87 90 92
Bob 50 60 70

>cat fileb

Tom 89 98 92
Jim 85 95 75

Desired output (name, total, count, average, highest score in fileb):

Tom 548 6 91.3333 98
Bob 420 6 70
Jim 480 6 80 95

gawk '

BEGIN{ for(i in num) {delete num[i];delete tot[i]}}

function add(infilename, i,line)


{
if ( (getline line < infilename) > 0 )
for(i=2;i<=NF;i++) {tot[$1]+=$i; num[$1]++ }
close(infilename);
}

function highest(infilename, i,line)


{ if ( (getline line < infilename) > 0 )
for(i=2;i<=NF;i++) { if ($i > high[$1]) high[$1]=$i }
close(infilename);
}

FILENAME=="filea" { add("filea") }
FILENAME=="fileb" { add("fileb");highest("fileb") };

END{ for(i in num) print i,tot[i],num[i],tot[i]/num[i],high[i] }

' fileb filea



A.6 Regular Expression

In the book, we have devoted the whole of Section III to the "Regular Expression" (RE) topic because of its
importance in programming. Without mastering RE to a sufficient level, your programming
skill will be limited.

Basics of Regular Expression

RE      Explain                  Example      Results
.       any character            /h.me/       home hime h*me h,me
*       zero or more             /ho*me/      hme home hoome hooome
                                 /.*/         greedy; matches everything
+       one or more              /ho+me/      home hoome hooome
?       zero or one              /ho?me/      hme home
                                 /h[ao]?t/    ht hat hot
(..)    grouping; memory         /(home)+/    home homehome
[..]    any char inside          [abc]        a or b or c; same as /a|b|c/
[^..]   any char except inside   [^abc]       any char except a, b, c
^       start anchor             /^begin/     starts with "begin"
$       end anchor               /done$/      ends with "done"
                                 /^done$/     whole line is "done"
|       or operator              /him|her/    him or her

short-notation

[:alnum:]   Alphanumeric characters     [a-zA-Z0-9]
[:alpha:]   Alphabetic characters       [a-zA-Z]
[:blank:]   Tab and space characters
[:digit:]   Numeric characters          [0-9]
[:space:]   White-space characters

Examples

sub(/bad/,"good")                                "bad" -> "good"
sub(/bad/,"& and good")                          "bad" -> "bad and good"
sub(/bad/,"good",$3)                             perform for field 3
$4 ~ /^(bad|good)$/                              or operator
substr($2,5,2) substr($2,1,2) substr($2,3,2)     mmddyy to yymmdd
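As a quick sketch of the last row, the following combines a regular-expression test with substr to rearrange a date. The sample value 112801 and the function name date_demo are assumptions made for the illustration.

```shell
# Sketch: check that field 2 is six digits, then reorder mmddyy to yymmdd.
date_demo() {
  echo "id 112801" | awk '$2 ~ /^[0-9][0-9][0-9][0-9][0-9][0-9]$/ {
    # yy is at position 5, mm at position 1, dd at position 3
    print substr($2, 5, 2) substr($2, 1, 2) substr($2, 3, 2)
  }'
}
date_demo
```

Writing the strings next to each other concatenates them, so no separator appears in the result.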

Gawk

\w    [a-zA-Z0-9_]           Word character
\W    [^a-zA-Z0-9_] [^\w]    Non-word character

Case Sensitivity

IGNORECASE=1               Case insensitive (gawk)
tolower(x) ~ tolower(y)    Case insensitive; slower

sub gsub gensub substr

There are three substitution commands in Awk: sub, gsub and gensub. Commands "sub" and "gsub" are
legacies from the early days of Awk. Gensub, available in gawk, is for general-purpose substitution tasks.

sub(r,s,t)           sub(/bad/,"good",$0)      change once
gsub(r,s,t)          gsub(/./,"*")             change globally
x=gensub(r,s,n,t)                              general substitution
x=gensub(r,s,1)      equivalent to sub(r,s)    change once
x=gensub(r,s,"g")    equivalent to gsub(r,s)   change globally
r                    regular expression
s                    replacement
t                    target (default $0)
n                    match number: 1,2,..,"g"
&                    matched pattern
\\0                  matched pattern
\\1 .. \\9           back-reference

Some advanced features of "gensub" that are not in "sub" or "gsub":

1. No change in the original string
2. Specify match number
3. Back-reference with \\0 \\1 \\2 .. \\9

First, gensub leaves the original target string unchanged and returns the modified string. Second, you can
specify which match to replace. If the match number is "g", the substitution is equivalent to a global
gsub. If the match number is 1, the substitution is equivalent to sub. Third, you can use back-references
with a backslash as in sed: \\1 \\2 .. \\9. \\0 represents the whole match and is equivalent to the "&"
back-reference.
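For comparison, "sub" and "gsub" modify their target in place and return the number of replacements made. In an Awk without gensub, copying the string first is one way to keep the original unchanged. The sketch below uses only POSIX features; the function name subst_demo and the sample string are illustrative.

```shell
# Sketch: sub/gsub change the target in place and return a count.
subst_demo() {
  awk 'BEGIN {
    s = "bad cop, bad day"
    t = s
    n1 = sub(/bad/, "good", t)       # first match only, works on the copy t
    u = s
    n2 = gsub(/bad/, "[&]", u)       # every match; & is the matched text
    print n1 ": " t
    print n2 ": " u
    print s                          # original string is unchanged
  }'
}
subst_demo
```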

A.7 Awk Potpourri

Awk : Basic

awk '{command}' var=$value file
awk -f myscript var=$value
awk -v var="$value" -f myscript
awk '/'$1'/ {command}' file
awk -F"\t" '{command}' file
BEGIN{FS="\n";RS=""}        -
END{..}                     -
$0                          Current line
$1 .. $NF                   Fields
sub(r,s,t)                  sub(/bad/,"good",$0)
gsub(r,s,t)                 gsub(/bad/,"good and &",$0)
gensub(r,s,h,t)             Gawk general substitute
sprintf(format,..)          Formatted string
match($0,/^[a-z]$/)         -
tolower                     if (tolower($1) ~ "yes") { ... }
toupper                     Convert to capital
system                      system("cat " $2)
getline                     getline < "file2"; read next line from file2
NF                          Number of fields in current line
NR                          Current record (line) number
FS                          Field separator; awk 'BEGIN{FS=","};......'
RS                          Record separator
OFS                         Output field separator
ORS                         Output record separator
&&                          Logical AND
||                          Logical OR
if                          if (.. && ..) g=1;else if (a>80) g=2;else g=3
for                         for(i=1;i<=4;i++) print $i
?:                          s=(i>0||x=="f") ? "pos" : "neg"
split                       a=split($1,new_array," ")
substr                      a=substr($0,2,4); get 4 letters from 2nd position
length                      length($2); length of a string
index                       index($0,"start"); returns a position number
next                        Read next input line & start from top
continue                    Skip to the next iteration of the loop
break                       Break loop; no more iterations
do { } while (..)           -
match($0,/regexp/) {..}     -
flag = !flag                Invert flag
int                         Return integer portion
==                          Equal?
!=                          Not equal?
~                           Match regular expression?
!~                          Not match regular expression?
FILENAME                    Name of current input file
blank                       Concatenation

POSIX
[:alnum:]                   Alphabetic and numeric characters
[:alpha:]                   Alphabetic characters
[:blank:]                   Tab and space characters
[:digit:]                   Numeric characters
[:space:]                   White-space characters
[:xdigit:]                  Hexadecimal characters

gawk
IGNORECASE=1                Turn on case insensitivity
\w                          [[:alnum:]_]
\W                          [^[:alnum:]_]

Awk : Grep Line

awk '/hello/'                        Print lines with "hello"
awk '!/hello/ {print}'               Print all lines, except lines with "hello"
awk '/hello/ {next};{print}'         Do not print lines with "hello"
awk '/hello/ || /hi/ {print}'        Print lines with "hello" or "hi"
awk '{if (!/hello|hi/) {print}}'     Print all lines, except lines with "hello" or "hi"
awk 'NR > 10 {print}'                Print line 11 to the end of file
awk 'NR >= 3 && NR <= 7 {print}'     Print lines 3 to 7
awk '/start/,/end/ {print}'          Print from "start" to "end"
awk '/hello/ {print NR,$0}'          Print line number and line with "hello"

Awk : Grep Field

awk 'NF >= 6 {print}'                Print lines with number of fields >= 6
awk '$1 ~ /^[0-9]/ {print}'          Print if field 1 starts with a digit
awk '$2 !~ /^[0-9]+$/ {print}'       Print if field 2 is not an unsigned integer
awk '$2 == 3 , $2 == 8 {print}'      Print from line with field 2 = 3 to line with field 2 = 8

Awk : Misc

awk '/hello/{count++};END{print count}'      Count lines with "hello"
awk '$7 == "hello" || $3 > $2 {print}'       Print if field 7 = "hello" or field 3 > field 2
awk '/hello/{print;getline;print}'           Print lines with "hello" & next line
awk '$1>$2 {x++};END{print x}'               Find total lines where field 1 > field 2
awk '{$NF="";print}'                         Delete last field
awk '{print $1,$NF}'                         Print first and last fields

Awk : Substitution

awk '{sub("bad","good");print}'                          Change bad to good
awk '/cop/ {sub("bad","good")};{print}'                  s/bad/good/ for lines with cop
awk '/cop/ {sub(/bad/,"& and good",$3)};{print}'         Change field 3 for lines with cop
awk '{for (i in arr) gsub(arr[i],new[i])}'               Substitute each pattern in arr with new
awk '{gsub(/[.,;:?]/,"");print}'                         Remove all punctuation
gawk '{print gensub(/([^ ]+) +([^ ]+)/,"\\2 \\1",1)}'    Swap the first two fields
gawk '{if (sub(/bad/,"good",$3)>0) print $0}'            Substitute & print lines with "bad" in field 3

Awk : Processing

awk '{for(i=0;i<NF;i++) printf("%s ",$(NF-i));print ""}'    Reverse the fields
awk '{tot += $2};END{print "Total",tot}'                    Print the sum of field 2
awk '$2 == tot' tot=$1 ave=$average file                    Transfer data into awk
awk 'NR==1 {n=$2;p=$2};$2<n {n=$2};$2>p {p=$2}'             Find min and max of field 2

Awk : Multi-Line

awk '{gr=($1>90)?"a":($1>80)?"b":"c"}'               ?: constructs
awk 'BEGIN{while ((getline < "myfile") > 0) . . .    -
awk 'BEGIN{while (("who" | getline) > 0) { . .}      -
awk '/asian/ {pop["asian"] += $3}'                   -
awk 'num=sprintf("%.2f",tot)...                      -
awk 'BEGIN{ORS="\n\n";FS="\n"};{print}'              Double space
awk '/start/,/stop/'                                 Print range of lines
s1='start'
s2='stop'                                            Pass shell variables into awk
awk "/$s1/,/$s2/"

Awk : Array

gawk 'BEGIN{split("a,b,c,d",num,",")};{..num[2]..}'    Create num array
gawk '{z=split($2,num," ");. num[2] .}'                Create array using split
gawk '/asian/ {pop["asian"] += $3}'                    Accumulate into array element
gawk '{a[$1]++};END{for(i in a) print i,a[i]}'         Print field 1 and count
gawk '{n=asort(arr,dest)}'                             Sort array
gawk '{if (i in arr) {..} }'                           Check element existence
gawk '{for (i in arr) {..} }'                          Array looping

Awk : Multi-Dimension Array

arr[i,j]                                               Create 2D array
if ((i,j) in arr) {..}                                 Check element existence
for (i in arr) {                                       2D array looping
  split(i,f,SUBSEP);row[f[1]]=1;col[f[2]]=1 }
for (i in col) { .. for (j in row) { .. arr[j,i] } }
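A two-dimensional key such as arr[i,j] is stored as a single string in which the two subscripts are joined by SUBSEP, which is why split(i,f,SUBSEP) recovers them. A minimal sketch, with the function name grid_demo and the array rowsum chosen only for illustration:

```shell
# Sketch: 2D keys are "i SUBSEP j" strings; split recovers the parts.
grid_demo() {
  awk 'BEGIN {
    arr[1, "a"] = 10; arr[2, "a"] = 20; arr[1, "b"] = 30
    for (k in arr) {
      split(k, f, SUBSEP)              # f[1] is the row, f[2] is the column
      rowsum[f[1]] += arr[k]
    }
    has = ((2, "b") in arr) ? "yes" : "no"
    print rowsum[1], rowsum[2], has
  }'
}
grid_demo
```

The membership test uses the parenthesized form (i,j) in arr, which does not create the element.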
Awk : BEGIN & END

awk 'BEGIN{OFMT="%.2f";print 1232.32e-3}'    Format as %.2f

Awk : File

nawk 'BEGIN{while ((getline < "myfile") > 0) {i++}}'                 Process myfile first
nawk 'BEGIN{ "ls" | getline i;split(i,arr);print arr[1]}'            Get data through pipe
nawk 'BEGIN{ "ls" | getline i; print i}'                             Get data through pipe
nawk 'BEGIN{getline i < "/usr/home/quan";print "Your age is:",i}'    Read a line from a file
nawk '{print $1,$2 | "sort -k 2"}'                                   Sort after print
nawk '/hello/ {print $1 > "file"}'                                   Write to file
nawk '/hello/ {print $1 >> "file"}'                                  Append to file
nawk 'FILENAME == "filea" {..};
      FILENAME == "fileb" {..}' filea fileb                          Read multiple files
Awk : Regular Expression

awk '/^[0-9][0-9][0-9]$/'                                  Print lines consisting of exactly 3 digits
awk '$4 ~ /^(big|small)$/'                                 Print lines where field 4 is "big" or "small"
awk '!($2 < $3 && $2 > $5)'                                Negation
print gensub(/([0-9]+)([a-zA-Z]+)/,"\\2\\1 ","g",$0)       Change "12ab34cd" to "ab12 cd34 "
print gensub(/([0-9]+)([[:alpha:]]+)/,"\\2\\1 ","g",$0)    Change "12ab34cd" to "ab12 cd34 "
Awk : Misc

{for(i=1;i<=NF;i++) {$i=($i<0)?-$i:$i}}                     Replace fields by their absolute values
$NF != previous {print};{previous=$NF}                      Print if last field differs from previous line
{if ($4>max) {max=$4;champ=$1}};END{print champ}            Find highest score
{line[NR]=$0};END{for(i=NR;i>0;i--) {print line[i]}}        Print lines in reverse order
{tot[$3] += $2};END{for (i in tot) {print i,tot[i]}}        Array
BEGIN{digit="^[0-9]+$"};$1 ~ digit                          Print if field 1 is all digits
BEGIN{sign="[-+]?";decimal="[0-9]+[.]?[0-9]*"}              -
BEGIN{frac="[.][0-9]+";expo="([eE]" sign "[0-9]+)?"}        -
BEGIN{number="^" sign "(" decimal "|" frac ")" expo "$"}    -
$1 ~ number                                                 Using coded number pattern
BEGIN{number="^[+-]?([0-9]+[.]?[0-9]*|[.][0-9]+)$"}         -
{if (sub(/^\+/," ")) {printf("\n")};printf("%s",$0)}        -
{for(i=1;i<=NF;i++) {if ($i == "-s") print $(i+1)}}         Print word after -s
{if ($1>90) gr="a";else if ($1>80) gr="b";else gr="c"}      if-then-else
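The number pattern above is built by concatenating smaller regular-expression strings and is then used as a dynamic regular expression with the "~" operator. The following sketch puts the pieces together in one script. The variable is named expo here because exp is the name of a built-in function, and the wrapper num_demo and the sample inputs are assumptions for the illustration.

```shell
# Sketch: build a number pattern from string pieces and test each input line.
num_demo() {
  printf '%s\n' 12 -3.5 .5 1e6 abc 1.2.3 | awk '
    BEGIN {
      sign    = "[-+]?"
      decimal = "[0-9]+[.]?[0-9]*"
      frac    = "[.][0-9]+"
      expo    = "([eE]" sign "[0-9]+)?"
      number  = "^" sign "(" decimal "|" frac ")" expo "$"
    }
    { print $1, ($1 ~ number ? "number" : "not a number") }'
}
num_demo
```

Keeping the pieces in separate variables makes the final pattern easier to read and to adjust than the single long string.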
Awk : String

index substr sub gsub toupper tolower length

substr(string,start_position,length)
sub( RE , replace_with , $3 )
split( string, output_array , separator )

awk 'length > 80'                                              Print lines with length > 80
awk '{if (length($2) > 6) print}'                              Print lines if length of field 2 > 6
{x=$0;while (length(x)>80) {print substr(x,1,80);              Split lines longer than 80
   x=substr(x,81)};print x}
/\\$/ {x=$0;getline;print substr(x,1,length(x)-1) $0;next}     Join line ending with \
S1=substr($1,5,2) substr($1,1,2) substr($1,3,2)                mmddyy to yymmdd
{for(i=1;i<=NF;i++) {if ($i ~ /^a=/) {print substr($i,3)}}}    Look for a=; "a=4 x=34" prints 4

for

for(i=1;i<=NF;i++) {..}                     next break continue
for(i in arr) {...}                         array
while (..) {..}

if

if (..) {..}                                if
if (..) {..} else {..}                      if-else
if (..) {..} else if (..) {..} else {..}    if-else if-else
num = (i > 0) ? num : -num                  "?:" operator

system

system("sed s/bob/BOB/ file1 > file2")      system call


Awk : Sed

gawk '/hello/' myfile                                        sed -n '/hello/p' myfile
gawk '!/hello/' myfile                                       sed '/hello/d' myfile
gawk '{if (NR==2) {sub(/bob/,"BOB") ; print } }' myfile      sed -n '2s/bob/BOB/p' myfile
gawk 'NR==2 {sub(/bob/,"BOB") ; print }' myfile              sed -n '2s/bob/BOB/p' myfile
gawk 'NR==2 {sub(/bob/,"BOB")}; {print}' myfile              sed '2s/bob/BOB/' myfile
gawk '{if (NR!=2) {sub(/bob/,"BOB") } ; print }' myfile      sed '2!s/bob/BOB/' myfile
gawk '/hello/ {sub(/bob/,"BOB")} ; {print}'                  sed '/hello/s/bob/BOB/'
gawk '!/hello/ {sub(/bob/,"BOB")} ; {print}'                 sed '/hello/!s/bob/BOB/'
gawk '{if (NR>=2 && NR<=4) {sub(/bob/,"BOB") ; print } }'    sed -n '2,4s/bob/BOB/p'
gawk '{if (NR>=2 && NR<=4) {sub(/bob/,"BOB") } ; print }'    sed '2,4s/bob/BOB/'
gawk 'NR==2,NR==4 {print $0 > "file2"}' myfile               sed -n '2,4w file2' myfile
gawk '/start/,/stop/ {print}'                                sed -n '/start/,/stop/p'
gawk '{sub("bad","good and &");print}'                       sed 's/\(bad\)/good and \1/'

Awk : Perl

gawk '/hello/' myfile                                        perl -ne '/hello/ and print' myfile
gawk '!/hello/' myfile                                       perl -ne '/hello/ or print' myfile
gawk '{if (NR==2) {sub(/bob/,"BOB") ; print } }' myfile      perl -ne 'if ($.==2) {s/bob/BOB/;print}' myfile
gawk 'NR==2 {sub(/bob/,"BOB") ; print }' myfile              perl -ne '$.==2 and do {s/bob/BOB/;print}' myfile
gawk '/hello/ {sub(/bob/,"BOB")} ; {print}'                  perl -pe '/hello/ and s/bob/BOB/'
gawk '/hi/ {print}'                                          perl -ne '/hi/ && print'
gawk '{print NR,$0}'                                         perl -ne 'print "$. $_"'
gawk 'NR==2,NR==4 {print}'                                   perl -ne '$.>=2 && $.<=4 && print'
gawk '{sub("hi","ho");print}'                                perl -pe 's/hi/ho/'
gawk 'NR==2 {sub("hi","ho")};{print}'                        perl -pe '$.==2 and s/hi/ho/'
gawk '{print $1}'                                            perl -lane 'print $F[0]'
gawk '{print $NF}'                                           perl -lane 'print $F[-1]'

match(var,"RE")       $var =~ /RE/
$0                    $_
$1                    $F[0]
$NF                   $F[-1]
NR                    $.
/RE/ {print}          /RE/ and print
getline               $_=<>
split($0,arr)         @arr=split(" ",$_)
substr($0,2,5)        substr($_,1,5)
