Unix
Prof. Sreedhar
Operating systems may be classified by both how many tasks they can
perform `simultaneously' and by how many users can be using the system
`simultaneously'. That is: single-user or multi-user and single-task or multi-
tasking. A multi-user system must clearly be multi-tasking.
Here the system is such that many users can work at a time. There is
one large CPU and a high-capacity storage medium enclosed in what
is called the system unit, and different terminals are attached to it.
Each user works on a separate terminal and utilizes the CPU's
resources.
Each user's programs and other files are stored on the system unit's
storage media. Thus the CPU is one and many users are using it.
Therefore there is a need for an OS that will effectively divide the
resources of the CPU among all users. Such an OS is called a multi-
user OS.
1. Multi Processing
As many users are working at a time, every user will run their
own program. When one program is run by a user, it is a
process. When the same program is run by another user, it is
another process. If there are different users running different
programs, there are many processes undergoing execution. A
user should not have to wait until other users' programs finish execution.
The same program can be shared by many users at a time and run
concurrently. This ability of the OS to run several processes
together is called multi-processing.
2. Time Sharing
The CPU can execute only one instruction at a time. Since there
are several users running their programs, the OS divides the CPU
time among the users. It allots each user a definite time interval, called a
time slice, within which that user's program is executed. Once the time
slice is over, the CPU switches to the next user and executes that
user's program.
3. Memory Management
4. Multi Tasking
The Shell: The shell acts as an interface between the user and the
machine; it interprets every command given by the user and
advises the kernel to act accordingly.
A single user OS will have only one shell devoted entirely to the user
whereas in a multi user OS every user will have a separate shell.
Kernel: The Kernel is the part of OS that interacts directly with the
hardware of the Computer system.
During the past 25 years the UNIX Operating System has evolved into a
powerful, flexible, and versatile operating system. It serves as the Operating
System for all types of computers, including single user personal computers
and engineering workstations, multi-user microcomputers, minicomputers,
mainframes and supercomputers, as well as special purpose devices, with
approximately 20 million computers now running UNIX and more than 100
million people using these systems. This rapid growth is expected to continue.
The success of UNIX is due to many factors, including its portability to a wide
range of machines, its adaptability and simplicity, the wide range of tasks that
it can perform, its multi-user and multi tasking nature, and its suitability for
networking, which has become increasingly important as the Internet has
blossomed. What follows is a description of the features that have made the
UNIX system so popular.
Understanding UNIX:
The features that made UNIX a hit from the start are:
Multitasking capability
Multi-user capability
Portability
Cooperative Tools and Utilities
Excellent Networking capability
Open Source Code
Multitasking
Many computers do just one thing at a time, as anyone who uses a PC or
laptop can attest. Try logging onto your company's network while opening
your browser while opening a word processing program. Chances are the
processor will freeze for a few seconds while it sorts out the multiple
instructions.
UNIX, on the other hand, lets a computer do several things at once, such as
printing out one file while the user edits another file. This is a major feature for
users, since users don't have to wait for one application to end before starting
another one.
Multi-user
The same design that permits multitasking permits multiple users to use the
computer. The computer can take the commands of a number of users --
determined by the design of the computer -- to run programs, access files,
and print documents at the same time.
The computer can't tell the printer to print all the requests at once, but it does
prioritize the requests to keep everything orderly. It also lets several users
access the same document by compartmentalizing the document so that the
changes of one user don't override the changes of another user.
Portability
A major contribution of the UNIX system was its portability, permitting it to
move from one brand of computer to another with a minimum of code changes.
UNIX comes with hundreds of programs that are divided into two classes:
Integral utilities that are absolutely necessary for the operation of the
computer, such as the command interpreter, and
Tools that aren't necessary for the operation of UNIX but provide the
user with additional capabilities, such as typesetting capabilities and e-
mail.
Examples of such programs include man, dc, mail, calendar, fsck, nroff, and vi.
Tools can be added or removed from a UNIX system, depending upon the
applications required.
UNIX has provision for protecting data and communicating with other users.
The source code (Open Source) for the UNIX system has been made
available to users and programmers.
History of UNIX:
1965 Bell Laboratories joins with MIT and General Electric in the
development effort for the new operating system, Multics, which would
provide multi-user, multi-processor, and multi-level (hierarchical) file
system, among its many forward-looking features.
1969 AT&T, unhappy with the progress, drops out of the Multics
project. Some of the Bell Labs programmers who had worked on this project,
Ken Thompson, Dennis Ritchie, Rudd Canaday, and Doug McIlroy
designed and implemented the first version of the Unix File System on a PDP-
7 along with a few utilities. It was given the name UNIX by Brian Kernighan as
a pun on Multics.
1971 The system now runs on a PDP-11, with 16Kbytes of memory, including
8Kbytes for user programs and a 512Kbyte disk.
Its first real use is as a text processing tool for the patent department at Bell
Labs. That utilization justified further research and development by the
programming group. UNIX caught on among programmers because it was
designed with these features:
Programmers environment
Simple user interface
Simple utilities that can be combined to perform powerful functions
Hierarchical file system
Simple interface to devices consistent with file format
Multi-user, multi-process system
Architecture independent and transparent to the user.
By 1977, the fifth and sixth editions had been released; these contained many
new tools and utilities, and the number of machines running the UNIX System grew.
UNIX System III, based on the Seventh Edition, became AT&T's first
commercial release of the UNIX System in 1982. After System III
was released, AT&T, through its Western Electric manufacturing subsidiary,
continued to sell versions of the UNIX system. UNIX System III, the various
research editions, and experimental versions were distributed to colleagues at
universities and other research laboratories.
1995 Solaris 2.5 (Motif supported) and HP-UX 10.0 (CDE supported) receive
the X/Open mark for systems registered under the Single UNIX Specification.
Today, the UNIX leaders include Solaris, Linux, HP-UX, AIX, and SCO.
One of the most significant points of UNIX is the availability of source code for
the system. (For those new to software, source code contains the
programming elements that, when passed through a compiler, will produce a
binary program, which can be executed.) The binary program contains
specific computer instructions, which tell the system "what to do." When the
source code is available, it means that the system (or any subcomponent) can
be modified without consulting the original author of the program. Access to
the source code is a very positive thing and can result in many benefits. For
example, if software defects (bugs) are found within the source code, they can
be fixed right away— without perhaps waiting for the author to do so.
Another great reason is that new software functions can be integrated into the
source code, thereby increasing the usefulness and the overall functionality of
the software. Having the ability to extend the software to the user's
requirements is a massive gain for the end user and the software industry as
a whole. Over time, the software can become much more useful. One
downside to having access to the source code is that it can become hard to
manage, because it is possible that many different people could have
modified the code in unpredictable (and perhaps negative) ways. However,
this problem is typically addressed by having a "source code maintainer,"
who reviews the source code changes before the modifications are
incorporated into the original version.
Another downside to source code access is that individuals may use this
information with the goal in mind of compromising system or component
security. The Internet Worm of 1988 is one such popular example. The
author, who was a graduate student at Cornell University at the time, was able
to exploit known security problems within the UNIX system to launch a
software program that gained unauthorized access to systems and was able
to replicate itself to many networked computers. The Worm was so successful
in attaching and attacking systems that it caused many of the computers to
crash due to the amount of resources needed to replicate. Although the Worm
didn't actually cause significant permanent damage to the systems it infected,
it opened the eyes of the UNIX community to the dangers of source code
access and security on the Internet as a whole.
Flexible Design
GNU
The GNU project, started in the early 1980s, was intended to act as a
counterbalance to the widespread activity of corporate greed and the adoption of
restrictive license agreements for computer software. The "GNU is Not UNIX" project
was responsible for producing some of the world's most popular UNIX
software.
This includes the Emacs editor and the gcc compiler. They are the
cornerstones of the many tools that a significant number of developers use
every day.
Open Software
Programming Environment
There are tools to handle many system administration tasks that you might
encounter. Also, there are tools for development, graphics manipulation, text
processing, database operations— just about any user- or system-related
requirement. If the basic operating system version doesn't provide a particular
tool that you need, chances are that someone has already developed the tool
and it will be available via the Internet.
System Libraries
Well Documented
UNIX is well documented, with both online manuals and many reference
books and user guides from publishers. Unlike some operating systems, UNIX
provides online man page documentation for all tools that ship with the
system.
Further, the UNIX community provides journals and magazine articles about
UNIX, tools, and related topics of interest.
UNIX is a layered operating system. The innermost layer is the hardware that
provides the services for the OS. The operating system, referred to in UNIX
as the kernel, interacts directly with the hardware and provides the services
to the user programs. These user programs don't need to know anything
about the hardware. They just need to know how to interact with the kernel,
and it's up to the kernel to provide the desired service. One of the big appeals
of UNIX to programmers has been that most well-written user programs are
independent of the underlying hardware, making them readily portable to new
systems.
Note: The core of the UNIX system is the Kernel. The kernel controls the
computer's resources, allotting them to different users and to different
tasks.
User programs interact with the kernel through a set of standard system
calls. These system calls request services to be provided by the kernel. Such
services include accessing a file (open, close, read, write, link, or
execute a file); starting or updating accounting records; changing ownership of
a file or directory; changing to a new directory; creating, suspending, or killing
a process; enabling access to hardware devices; and setting limits on system
resources.
Apart from the utilities that are provided as part of the UNIX operating system,
more than a thousand UNIX-based application programs, such as database
management systems, word processors, and accounting software, are available.
The basic unit used to organize information in the UNIX System is called a
file. The UNIX file system provides a logical method for organizing, storing,
retrieving, manipulating, and managing information.
The Shell reads your commands and interprets them as requests to execute
a program or programs, which it then arranges to have carried out. Because
the shell plays this role, it is called a command interpreter. Besides being a
command interpreter, the shell is also a programming language. As a
programming language, it permits you to control how and when commands
are carried out. For each user working with UNIX at any time, a separate shell
program is running. There may be several shells running in memory, but
only one kernel.
2. The C Shell
The C shell, csh, was originally developed as part of BSD UNIX. csh
introduced a number of important enhancement to sh, including the concept
of a command history list and job control.
The Korn shell, ksh, builds on the sh and extends it by adding many features
from the C shell.
Each of these shells has its own respective prompt. The Bourne shell has
the $ prompt. So when you log in, it is the Bourne shell that is established for
you, and the stage is set for you to work on the machine.
Features of Shell:
Shell Variables: The user can control the behavior of the shell, as well
as of other programs and utilities, by storing data in variables.
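A brief sketch of this feature (the variable names and values here are purely illustrative):

```shell
# Store data in a shell variable; note there are no spaces around '='.
backup_dir="/tmp/backups"

# Read the value back with the $ prefix.
echo "Backups go to: $backup_dir"

# Variables also control the behavior of the shell itself;
# PS1, for example, holds the prompt string.
PS1="myhost-> "
```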
Each node is either a file or a directory of files, where the latter can contain
other files and directories. You specify a file or directory by its path name,
either the full, or absolute, path name or the one relative to a location. The full
path name starts with the root, /, and follows the branches of the file system,
each separated by /, until you reach the desired file, e.g.:
/home/Sreedhar/source/xntp
A relative path name specifies the path relative to another directory, usually the
current working directory that you are at. Two special directory entries should be
introduced now:
● . (the current directory)
● .. (the parent directory)
For example, ../Sreedhar/source/xntp names the file relative to the parent of the
current directory.
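The difference can be sketched with a throwaway directory tree (the paths under /tmp are created only for this illustration):

```shell
# Build a small tree to navigate.
mkdir -p /tmp/demo/home/sreedhar/source

# An absolute path starts at the root, /.
cd /tmp/demo/home/sreedhar/source

# Relative paths start from the current working directory.
cd ..            # .. moves up to /tmp/demo/home/sreedhar
cd ./source      # . is the current directory; this moves back down
pwd              # the shell reports the full absolute path again
```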
Every directory and file is listed in its parent directory. In the case of the root
directory, that parent is itself. A directory is a file that contains a table listing
the files contained within it, mapping file names to the inode numbers in the list.
An inode is a special file designed to be read by the kernel to learn the
information about each file. It specifies the permissions on the file, ownership,
date of creation and of last access and change, and the physical location of
the data blocks on the disk containing the file.
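ls can show part of what the inode records (a sketch; the file is created just for the example):

```shell
touch /tmp/inode_demo        # create an empty file to inspect

# -i prefixes each name with its inode number.
ls -i /tmp/inode_demo

# -l prints attributes kept in the inode: permissions, link count,
# owner, group, size, and time of last modification.
ls -l /tmp/inode_demo
```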
The system does not require any particular structure for the data in the file
itself. The file can be ASCII or binary or a combination, and may represent
text data, a shell script, compiled object code for a program, directory table,
junk, or anything you would like.
Unix Programs
The shell is a command line interpreter. The user interacts with the kernel
through the shell. You can write ASCII (text) scripts to be acted upon by a
shell.
System programs are usually binary, having been compiled from C source
code. These are located in places like /bin, /usr/bin, /usr/local/bin, /usr/ucb,
etc.
Another powerful feature of the UNIX shell is the ability to support the
development and execution of custom shell scripts. The shell contains a mini
programming language that provides a lightweight way to develop new tools
and utilities without having to be a heavyweight software programmer. A UNIX
shell script is a combination of internal shell commands, regular UNIX
commands, and some shell programming rules.
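A minimal sketch of such a script (the file name count_files.sh and its location are invented for this example):

```shell
# Write a small script combining shell variables, regular UNIX
# commands (ls, wc), and a built-in shell construct (if/then/else).
cat > /tmp/count_files.sh <<'EOF'
#!/bin/sh
dir=${1:-.}                  # first argument, defaulting to the current directory
count=$(ls "$dir" | wc -l)   # regular UNIX commands do the real work
if [ "$count" -gt 0 ]; then
    echo "$dir holds $count entries"
else
    echo "$dir is empty"
fi
EOF

sh /tmp/count_files.sh /tmp   # run it against a directory
```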
UNIX supports a large number of different shells, and also many of the
popular ones are freely available on the Internet. Also, many versions of UNIX
come with one or more shells and as the system administrator, you can install
additional shells when necessary and configure the users of the system to use
different shells, depending on specific preferences or requirements. The table
below lists many of the popular shells and a general description of each.
Once a user has logged into the system, the default shell prompt appears and
the shell simply waits for input from the user. Thus, logging into a Solaris
system as the root user, for example, the standard Bourne shell prompt will be
the # character. The system echoes this prompt to signal that it is ready to
receive input from the keyboard. At this point, the user is free to type in any
standard UNIX command, application, or custom script name, and the system
will attempt to execute or run the command. The shell assumes that the first
argument given on the command line is the name of the command to be executed.
bash: GNU Bourne-Again shell; includes elements from the Korn
shell and the C shell.
ksh: The Korn shell; combines the best features of the Bourne and C
shells and includes powerful programming tools.
zsh: Korn shell like, but also provides many more features, such as
built-in spell correction and programmable command completion.
The configuration you use to access your UNIX System can be based on one
of two basic models: using a multi-user computer or a single-user computer.
On a multi-user system, you use your own terminal device to access the UNIX
system. The computer you access can be a workstation, a microcomputer, a
mainframe computer, or even a supercomputer.
Single-user systems are personal computers on which you can directly run the
UNIX OS. (UnixWare 7.1 from SCO, Solaris 7 from SunSoft, public domain
versions of UNIX, and the popular variant of UNIX known as Linux can be used
on a single-user system.)
Your display can be character-based, or it can be bit-mapped. It may display a
single window or multiple windows, as in the X Window System.
UNIX System from a Terminal: If your terminal has not been set to work with
a UNIX System, you must have its options set appropriately. Setting options is
done in different ways on different terminals.
Selecting a LOGIN : Every UNIX System has at least one person, called the
System Administrator, whose job is to maintain the system, and make it
available to its users. The system administrator is also responsible for adding
new users to the system and setting up their initial work environment on the
computer.
Login name must be more than two characters long, and if it is longer
than eight, only the first eight characters are relevant.
Your logname should not have any symbols or spaces in it, and it must
be unique for each user. Some lognames are reserved customarily for
certain uses. For example, the root normally refers to the system
administrator or superuser who is responsible for the whole system.
Dial-in Access: You may have to dial into the computer using a modem
before you are connected. Use your emulator or dial function to dial the UNIX
System access number. When the system answers the call, you will hear a
high-pitched tone and should see some characters appear on screen. Then
you will get the UNIX system login prompt.
Logging In:
As a multi-user system, the UNIX System first requires that you identify
yourself before you can access the system.
When you first log into a UNIX System, you will have either no password at all
(a null password) or an arbitrary password assigned by the system
administrator. These are only intended for temporary use. Neither offers any
real security. A null password gives anyone access to your account; one
assigned by the system administrator is likely to be easily guessed by
someone. Officially assigned passwords often consist of simple combinations
of your initials and your student, employee, or social security number. If your
password is simply your employee number and the letter X, anyone with
access to this information has access to all of your computer files. Sometimes
random combinations of letters and numbers are used. Such passwords are
difficult to remember, and consequently users will be tempted to write them
down in a convenient place. (Resist this temptation!)
You change your password by using the passwd command. When you issue
this command, the system checks to see if you are the owner of the login.
This prevents someone from changing your password and locking you out of
your own account. passwd first announces that it is changing the password,
and then it asks for your (current) old password, like this:
$ passwd
Old password:
New password:
The system asks for a new password and asks for the password to be verified
(you do this by retyping it). The next time you log in, the new password is
effective. Although you can ordinarily change your password whenever you
want, on some systems after you change your password you must wait a
specific period of time before you can change it again.
On some systems, you will be required to change your password the first time
you log in. This will work as described previously and will look like this:
login: sreedhar
Password:
Your password has expired.
Choose a new one.
Old password:
New password:
Re-enter new password:
Password Aging
To ensure the secrecy of your password, you will not be allowed to use the
same password for long stretches of time. On UNIX Systems, passwords age.
When yours gets to the end of its lifespan, you will be asked to change it. The
length of time your password will be valid is determined by your system
administrator. However, you can view the status of your password on most
UNIX systems. Generally, the -s option to the passwd command shows you
the status of your password, like this:
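The exact layout differs between UNIX versions; on a System V style system the output might look roughly like this (the login name, date, and values are invented for illustration):

```
$ passwd -s
sreedhar  PS  01/15/96  7  60  14
```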
The first field contains your login name; the next fields list the status of your
password, the date it was last changed, and the minimum and maximum days
allowed between password changes; and the last field is the number of days
before your password will need to be changed. Note that this is simply an
example; on your system, you may not be allowed to read all of these fields.
An Incorrect Login
If you make a mistake in typing either your login or your password, the UNIX
System will respond this way:
login: sreedhar
Password:
Login Incorrect
login:
You will receive the "Password:" prompt even if you type an incorrect or
nonexistent login name. Because any login attempt results in the
"Password:" prompt, an intruder cannot learn which login names are valid
by watching for the one that yields the prompt.
If you repeatedly type your login or password incorrectly (three to five times,
depending on how your system administrator has set the default), the UNIX
System will disconnect your terminal if it is connected via modem or LAN. On
some systems, the system administrator will be notified of erroneous login
attempts as a security measure. If you do not successfully log in within some
time interval (usually a minute), you will be disconnected.
If you have problems logging in, you might also check to make sure that your
CAPS LOCK key has not been set. If it has been set, you will inadvertently enter
an incorrect logname or password, because in UNIX uppercase and
lowercase letters are treated differently. (Note that unlike in some other
environments, your account will not get locked if you enter your password
incorrectly some number of times, you will just get disconnected.)
login: sreedhar
Password:
UNIX System V/386/486 Release 4.0 Version 3.0
minnie
Copyright (c) 1984, 1986, 1987, 1988, 1989, 1990 AT&T
Copyright (C) 1987, 1988 Microsoft Corp.
Copyright (C) 1990, NCR Corp.
All Rights Reserved
Last login: Mon January 29 19:55:17 on term/17
You first see the UNIX System announcement that tells you the particular
version of UNIX you are using. Next you see the name of your system, minnie
in this case. This is followed by the copyright notice.
Finally, you see a line that tells you when you logged in last. This is a security
feature. If the time of your last login does not agree with when you remember
logging in, call your system administrator. This discrepancy could be an
indication that someone has broken into your system and is using your login.
After this initial announcement, the UNIX System presents system messages
and news.
Because every user has to log in, the login sequence is the natural place to
put messages that need to be seen by all users. When you log in, you will first
see a message of the day (MOTD). Because every user must see this MOTD,
the system administrator (or root) usually reserves these messages for
comments of general interest.
After you log in, you will see the UNIX System command prompt at the far left
side of the current line. The default system prompt (for most UNIX Systems) is
the dollar sign:
This $ is the indication that the UNIX System is waiting for you to enter a
command.
In the examples in this book, you will see the $ at the beginning of a line as it
would be seen on the screen, but you are not supposed to type it.
The UNIX System enables you to define a prompt string, PS1, which is used
as a command prompt. The symbol PS1 is a shell variable (see Chapter 7)
that contains the string you want to use as your prompt. To change the
command prompt, set PS1 to some new string. For example,
$ PS1="UNIX:> "
changes your primary prompt string from whatever it currently is to the string
"UNIX:> ". From that point, whenever the UNIX System is waiting for you to
enter a command, it will display this new prompt at the beginning of the line.
You can change your prompt to any string of characters you want. You can
use it to remind yourself which system you are on, like this:
$ PS1="MyUnix-> "
MyUnix->
If you redefine your prompt, it stays effective until you change it or until you
log off. Later in this chapter, you will learn how to make these changes
automatically when you first log in.
The UNIX System makes a large number of programs available to the user.
To run one of these programs you issue a command. For example, when you
type news or passwd, you are really instructing the UNIX System command
interpreter to execute a program with the name news or passwd, and to
display the results on your screen.
The UNIX system offers several file- and directory-related commands, which
users can use according to their requirements.
Commands are case sensitive. command and Command are not the same.
Options are generally preceded by a hyphen (-), and for most commands,
more than one option can be strung together, in the form:
command -[option][option][option]
e.g.: ls -alR
will perform a long list on all files in the current directory and recursively
perform the list through all sub-directories.
For most commands you can separate the options, preceding each with a
hyphen, as in:
ls -a -l -R
Options and syntax for a command are listed in the man page for the
command.
UNIX Commands:
UNIX comes with a large number of commands that fall under each of the
categories listed above for both the generic user and the system
administrator. It is quite hard to list and explain all of the available UNIX
functions and/or commands in a single book. Therefore, a review of some of
the more important user-level commands and functions has been provided
and subsequent modules provide a more in-depth look at system-level
commands. All of the commands discussed below can be run by generic
users and of course by the system administrator. However, one or more
subfunctions of a command may be available only to the system
administrator.
The standard commands, which are available across many different versions
of UNIX, are listed below. For example, if we wanted to get a listing of all the
users that are currently logged into the system, the who command can be
used.
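As a sketch:

```shell
# who prints one line per logged-in session: login name,
# terminal, and login time.
who

# Piping its output into wc -l counts the active sessions.
who | wc -l
```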
The metacharacters have special meaning to the shell; they should not
normally be used as any part of a file name.
The "-" symbol can usually be used in a filename provided it is not the first
character. For example, if we had a file called -l then issuing the command ls
-l would give you a long listing of the current directory because the ls
command would think the l was an option rather than -l being a file name
argument. Some UNIX commands provide facilities to overcome this problem.
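Two common escapes are sketched below: many commands treat -- as the end of their options, and prefixing the name with ./ stops it from looking like an option at all.

```shell
mkdir -p /tmp/dashdemo && cd /tmp/dashdemo
touch -- -l        # -- tells touch the next word is a file name, not an option

ls -- -l           # lists the file called -l instead of doing a long listing
ls ./-l            # ./-l cannot be mistaken for an option

rm ./-l            # remove it the same way
```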
The shell offers certain special characters, called wild-card characters, that
help us to specify patterns. The shell matches the pattern against file names,
selects all the files whose names match, and applies the specified command
to them. The wild-card characters are as follows.
Note that the UNIX shell performs these expansions (including any filename
matching) on a command's arguments before the command is executed.
Example
*.c
includes all files ending with '.c', because * stands for any number of
any characters, e.g. new.c, ptr.c, str.c, etc.
A command like rm *.c will therefore delete all files ending with '.c'. The
other files, which do not end with '.c', will be retained. The pattern
specifies that the files must necessarily end with '.c'.
Example
cat ab?xy
The above command will display the contents of all files whose name starts
with ab followed by any one character followed by xy.
This wild card specifies any one of the characters listed within the [ ].
Example
rm ab[efg]yz
The above command will delete all the files that begin with ab, followed by
either e, f, or g, followed by yz.
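All three wild cards can be watched in action with echo, which shows what the shell expands a pattern into (the files are created only for this sketch):

```shell
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch new.c ptr.c str.c abcxy abexy abfyz

echo *.c          # * matches any run of characters: new.c ptr.c str.c
echo ab?xy        # ? matches any single character: abcxy abexy
echo ab[ef]??     # [ef] matches one listed character: abexy abfyz
```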
PIPES: UNIX offers a provision whereby the output of one program can be
made the input of another program. The two programs are separated by the |
symbol.
Example
$ cat fil.c | pg
These files are referred to as standard input, standard output and standard
error.
Standard output (stdout) and standard error (stderr) are where the command
expects to put its output, usually the screen.
Any command to the right of the pipe must take its input from standard input.
The example on the visual shows that the output of who is passed as input to
wc -l, which gives us the number of active users on the system.
The output of the grep command is then piped to the wc -l command. The
result is that the command is counting the number of directories. In this
example, the grep command is acting as a filter.
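Both pipelines can be typed exactly as described (the directory being counted is created just for this sketch):

```shell
# Count the users currently logged in: the output of who
# becomes the input of wc -l.
who | wc -l

# Count directories: ls -l marks each directory with a leading 'd',
# grep passes only those lines through, and wc -l counts them.
mkdir -p /tmp/pipedemo/sub1 /tmp/pipedemo/sub2
ls -l /tmp/pipedemo | grep '^d' | wc -l    # prints 2
```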
Do not confuse the continuation prompt > with the redirection character >. The
secondary prompt will not form part of the completed command line. If you
require a redirection character you must type it explicitly.
UNIX can run a number of different processes at the same time as well as
many occurrences of a program (such as vi) existing simultaneously in the
system.
To identify the running processes, execute the command ps, which will be
covered later in this course. For example, ps -u team01 shows all running
processes from user team01.
The -f option displays, in addition to the default information provided by ps,
the user name, PPID, and start time for each process (that is, a FULL listing).
The -l option displays, in addition to the default information provided by ps,
the user ID, PPID, and priority for each process (that is, a LONG listing).
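As a sketch, a full listing for one user might look roughly like this (the user name, PIDs, and times are invented):

```
$ ps -fu team01
     UID   PID  PPID  C    STIME TTY       TIME CMD
  team01  2101  2099  0 09:15:02 pts/3     0:00 -ksh
  team01  2147  2101  0 09:21:44 pts/3     0:00 vi notes
```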
Background processes are most useful with commands that take a long time
to run.
Notes: The <ctrl-c> may not always work. A Shell script or program can trap
the signal a <ctrl-c> generates and ignore its meaning.
To find out what suspended/background jobs you have, issue the jobs
command.
The bg, fg, and kill commands can be used with a job number. For instance,
to kill job number 3, you can issue the command kill %3. The jobs command
does not list jobs that were started with the nohup command if the user has
logged off and then logged back into the system. On the other hand, if a user
invokes a job with the nohup command and then issues the jobs command
without logging off, the job will be listed.
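A minimal sketch of controlling a background job from a script; here the job is addressed by its PID via $!, while interactively you could use a job number such as %1 with fg, bg, or kill:

```shell
sleep 30 &                         # start a long-running background job
pid=$!                             # $! holds the PID of the last background job
kill "$pid"                        # terminate it (interactively: kill %1)
wait "$pid" 2> /dev/null || true   # collect its exit status
echo "background job $pid terminated"
```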
When a shell script is executed, the shell reads the file one line at a time and
processes the commands in sequence.
Any UNIX command can be run from within a shell script. There are also a
number of built-in shell facilities which allow more complicated functions to be
performed. These will be illustrated later.
To execute this script, start the program ksh and pass the name of the shell
script as argument:
$ ksh hello
This shell reads the commands from the script and executes all commands
line by line.
The .profile contains a sequence of commands that help you customize your
environment. Because the .profile is read each time you start a new Korn
shell, the commands you put in this file to customize your environment will be
executed each time you start a new ksh.
These commands can include, but are certainly not limited to, the following:
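As a sketch, a .profile might contain lines like these (all of the values shown are hypothetical choices, not requirements):

```shell
PATH=$PATH:$HOME/bin     # add a private bin directory to the search path
PS1='$ '                 # primary prompt string
EDITOR=/usr/bin/vi       # preferred editor
export PATH PS1 EDITOR   # make the settings available to child processes
```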
The first file that the operating system uses at login is the /etc/environment
file. This file contains variables specifying the basic environment for all
processes and can only be changed by the system administrator.
The second file that the operating system uses at login time is the /etc/profile
file. This file controls system-wide default variables such as the mail
messages and terminal types.
The .profile file is the third file read at login time. It resides in a user's login
directory and enables a user to customize their individual working
environment. The .profile file overrides commands run and variables set and
exported by the /etc/profile file.
Ensure that newly created variables do not conflict with standard variables
such as MAIL, PS1, PS2 and so forth.
The .profile file is read only when the user logs in.
Be aware that your .profile file may not be read if you are accessing the
system through CDE (the Common Desktop Environment). By default, CDE
instead uses a file called .dtprofile. In the CDE environment, if you wish to
use the .profile file, it is necessary to uncomment the DTSOURCEPROFILE
variable assignment at the end of the .dtprofile file.
The C shell provides an easy way to abbreviate the pathname of your home
directory. When the tilde symbol (~) appears at the beginning of a word in
your command line, the shell replaces it with the full pathname of your login
directory.
Example:
% mv file ~/newfile
% mv file $home/newfile
The whence command can be used to determine exactly where the command
you specify is located. For instance, it may be a command located on the disk
drive, it may be an alias, or it may be built-in to the Korn shell. whence reports
the proper location.
whence
alias name='value'
The difference between .profile and .kshrc is that .kshrc is read each time a
subshell is spawned, whereas .profile is read once at login.
EDITOR=/usr/bin/vi
export EDITOR
It will do the same thing that the set -o vi command does as shown in the
example.
The alias command invoked with no arguments prints the list of aliases in the
form name=value on standard output.
To carry down the value of an alias to subsequent subshells, the ENV variable
has to be modified. The ENV variable is normally set to $HOME/.kshrc in the
.profile file (although you can set ENV to any shell script). By adding the alias
definition to the .kshrc file (by using one of the editors) and invoking the
.profile file, the value of the alias will be carried down to all subshells, because
the .kshrc file is run every time a Korn shell is explicitly invoked.
The file pointed to by the ENV variable should contain Korn shell specifics.
The unalias command will cancel the alias named. The names of the aliases
specified with the unalias command will be removed from the alias list.
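For example (the alias name ll is arbitrary):

```shell
alias ll='ls -l'     # define an alias in name=value form
alias                # with no arguments, prints the list of aliases
unalias ll           # remove ll from the alias list
```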
LOCPATH is the full path name of the location of National Language Support
information, part of this being the National Language Support Table.
NLSPATH is the full path name for messages.
This unit covers only a subset of the vi functions. It is a very powerful editor.
Refer to the online documentation for additional functions.
vi does its editing in a buffer. When a session is initiated, one of two things
happens:
User Variables
It is important to note that you must NOT precede or follow the equal sign with
a space or TAB character.
Sample Session:
$person=Sreedhar
This sample session indicates that person does not represent the string
Sreedhar. The string person is echoed as person. The BourneShell will only
do the substitution of the value of the variable when the name of the variable
is preceded with a dollar sign ($).
Sample Session:
$echo person
person
$echo $person
Sreedhar
$
Sample Session:
$person='Sreedhar and Venkatesh'
$echo $person
Sreedhar and Venkatesh
$
All shell variable names are case sensitive. For example, HOME and home
are not the same.
As a convention uppercase names are used for the standard variables set by
the system and lowercase is used for the variables set by the user.
The echo command displays the string of text to standard out (by default to
the screen).
To set a variable, use the = with NO SPACES on either side. Once the
variable has been set, to refer to the value of that variable precede the
variable name with a $. There must be NO SPACE between the $ and the
variable name.
To eliminate the need for a space after the variable name, the curly braces { }
are used.
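For example, without braces the shell would look for a variable named fruits; the braces mark where the variable name ends:

```shell
fruit=apple
echo "${fruit}s"     # braces delimit the name; prints "apples"
```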
The backquotes are supported by the Bourne shell, C shell, and Korn shell.
The use of $(command) is specific to the Korn shell.
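Both forms capture a command's output into a variable:

```shell
now=`date`       # backquotes: Bourne, C, and Korn shells
today=$(date)    # the $(command) form
echo "$now"
echo "$today"
```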
The contents of the user variables and the shell variables can be modified by
the user. It is possible to assign a new value to them. The new value can be
assigned from the dollar ($) prompt or from inside a BourneShell script.
Read-only variables are different. The value of read-only variables can not be
changed.
The variable must be initialized to some value; and then, by entering the
following command, it can be made read only.
$person=Sreedhar
$readonly person
$echo $person
Sreedhar
$person=Venkatesh
person: is read only
$
The readonly command given without any arguments will display a list of all
the read-only variables.
Sample Session:
$person=Sreedhar
$readonly person
$example=Venkatesh
$readonly example
$readonly
readonly person
readonly example
$
The read-only shell variables are similar to the read-only user variables;
except the value of these variables is assigned by the shell, and the user
CANNOT modify them.
The shell will store the name of the command you used to call a program in
the variable named $0.
It has the number zero because it appears before the first argument on the
command line.
Sample Session:
$cat name_ex
echo 'The name of the command used'
echo 'to execute this script was' $0
$name_ex
The name of the command used
to execute this script was name_ex
Arguments
The BourneShell will store the first nine command line arguments in the
variables named $1, $2, ..., $9. These variables appear in this section
because you cannot change them using the equal sign. It is possible to
modify them using the set command.
Sample Session:
$cat arg_ex
echo 'The first five command line'
echo 'arguments are' $1 $2 $3 $4 $5
$arg_ex Sreedhar Venkatesh Santhosh
The first five command line
arguments are Sreedhar Venkatesh Santhosh
$
The script arg_ex will display the first five command-line arguments. The
variables representing $4 and $5 have a null value.
Sample Session:
$cat display_all
echo $*
$display_all Sreedhar venkatesh Santhosh
Sreedhar venkatesh Santhosh
$
Sample Session:
$cat num_args
echo 'This script was called with'
echo $# 'arguments'
$num_args Sreedhar venkatesh Santhosh
This script was called with
3 arguments
$
Within a process, you can declare, initialize, read, and modify variables. The
variable is local to that process. When a process forks a child process, the
parent process does not automatically pass the value of the variable to the
child process.
Sample Session:
$cat no_export
car=mercedes # set the variable
echo $0 $car $$ # $0 = name of file executed
# $car =value of variable car
# $$ = PID number (process id)
inner # execute another BourneShell script
echo $0 $car $$ # display same as above
$cat inner
echo $0 $car $$ # display variables for this process
$chmod a+x no_export
$chmod a+x inner
$no_export
no_export mercedes 4790
inner 4792
no_export mercedes 4790
$
Can the value be passed from parent to child process? Yes, by using the
export command. Let's look at an example.
Sample Session:
$cat export_it
car=mercedes
export car
echo $0 $car $$
inner1
echo $0 $car $$
$cat inner1
echo $0 $car $$
car=chevy
Exporting variables is only valid from the parent to the child process. The
child process cannot change the parent's variable.
The BourneShell script can read user input from standard input. The read
command will read one line from standard input and assign the line to one or
more variables. The following example shows how this works.
Sample Session:
$cat read_script
echo "Please enter a string of your choice"
read a
echo $a
$
This simple script will read one line from standard input (keyboard) and assign
it to the variable a.
Sample Session:
$read_script
Please enter a string of your choice
Here it is
Here it is
$
Sample Session:
$cat reads
echo "Please enter three strings"
read a b c
echo $a $b $c
echo $c
echo $b
echo $a
$
This time, we will turn on the trace mechanism and follow the execution of this
BourneShell script.
Sample Session:
$sh -x reads
+ echo Please enter three strings
Please enter three strings
+ read a b c
this is more than three strings
+ echo this is more than three strings
this is more than three strings
+ echo more than three strings
more than three strings
+ echo is
is
+ echo this
this
$
It is interesting to note that the spaces separate the values for the variables
a,b, and c. For example, the variable a was assigned the string this, the
variable b was assigned the string is, and the remainder of the line was
assigned to c (including the spaces).
Sample Session:
$cat read_ex
echo 'Enter line: \c'
read line
echo "The line was: $line"
$
Sample Session:
$read_ex
Enter line: All's well that ends well
The line was: All's well that ends well
$
POSITIONAL PARAMETERS
Let's look at an example BourneShell script to see how these are used.
Sample Session:
$cat neat_shell
echo $1 $2 $3
echo $0 is the name of the shell script
echo "There were $# arguments."
echo $*
$
Now, if we type the name of the BourneShell script with no arguments, we get
the following results.
Sample Session:
$neat_shell
The special variable $0 represents the name of the executing program. The
following shell script, if called script.sh, would output "This program is called
script.sh.":
#!/bin/sh
echo This program is called $0.
exit 0
The first parameter to the shell is known as $1, the second as $2, etc. The
collection of ALL parameters is known as $*.
#!/bin/sh
echo the first parameter is $1
echo the second parameter is $2
echo the collection of ALL parameters is $*
exit 0
Though we have compared the positional parameters with variables, they are
in essence quite different. For instance, you can't assign values to $1, $2, and
so on. Saying a=10 or b=alpha is fine, but $1=dollar or $2=100 is simply not done.
There is one way to assign values to the positional parameters: using the set
command.
$ set Friends come and go, but enemies accumulate
$ echo $1 $2 $3 $4 $5 $6 $7
Friends come and go, but enemies accumulate
When a large number of parameters (more than 9) are passed to the shell,
shift can be used to read those parameters. If the number of parameters to be
read is known, say three, a program similar to the following could be written:
#!/bin/sh
echo The first parameter is $1.
shift
echo The second parameter is $1.
shift
echo The third parameter is $1.
exit 0
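Combining shift with a loop handles any number of parameters; a minimal sketch, using set to supply sample arguments:

```shell
set -- alpha beta gamma      # assign sample positional parameters
while [ $# -gt 0 ]           # $# shrinks as parameters are consumed
do
    echo "$1"
    shift                    # discard $1 and move the rest down
done
```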
Regular expressions are used when you want to search for specific lines of
text containing a particular pattern. Most of the UNIX utilities operate on ASCII
files a line at a time. Regular expressions search for patterns on a single line,
and not for patterns that start on one line and end on another.
Regular expressions confuse people because they look a lot like the file
matching patterns the shell uses. They even act the same way--almost. The
square brackets are similar, and the asterisk acts similarly to, but not
identically to, the asterisk in a regular expression. In particular, the Bourne
shell, C shell, find, and cpio use file name matching patterns and not regular
expressions.
There are three important parts to a regular expression. Anchors are used to
specify the position of the pattern in relation to a line of text. Character Sets
match one or more characters in a single position. Modifiers specify how
many times the previous character set is repeated. A simple example that
demonstrates all three parts is the regular expression "^#*." The caret "^" is
an anchor that indicates the beginning of the line. The character "#" is a
simple character set that matches the single character "#." The asterisk is a
modifier. In a regular expression it specifies that the previous character set
can appear any number of times, including zero. This is a useless regular
expression, as you will see shortly.
There are also two types of regular expressions: the "Basic" regular
expression, and the "extended" regular expression. A few utilities like awk and
egrep use the extended expression. Most use the "regular" regular
expression.
Here is a table of the Solaris (around 1991) commands that allow you to
specify regular expressions:
Most UNIX text facilities are line oriented. Searching for patterns that span
several lines is not easy to do. You see, the end of line character is not
included in the block of text that is searched. It is a separator. Regular
expressions examine the text between the separators. If you want to search
for a pattern that is at one end or the other, you use anchors. The character
"^" is the starting anchor, and the character "$" is the end anchor. The regular
expression "^A" will match all lines that start with a capital A. The expression
"A$" will match all lines that end with the capital A. If the anchor characters
are not used at the proper end of the pattern, then they no longer act as
anchors. That is, the "^" is only an anchor if it is the first character in a regular
expression. The "$" is only an anchor if it is the last character. The expression
"$1" does not have an anchor. Neither does "1^." If you need to match a "^" at the
beginning of the line, or a "$" at the end of a line, you must escape the special
characters with a back slash. Here is a summary:
The use of "^" and "$" as indicators of the beginning or end of a line is a
convention other utilities use. The vi editor uses these two characters as
commands to go to the beginning or end of a line. The C shell uses "!^" to
specify the first argument of the previous line, and "!$" is the last argument on
the previous line.
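The anchor behavior can be seen with grep (the sample file and its lines are hypothetical):

```shell
printf 'Apple\nbanana\nCherry A\n' > fruits.txt   # hypothetical sample lines
starts=$(grep '^A' fruits.txt)    # "^A": lines starting with capital A
ends=$(grep 'A$' fruits.txt)      # "A$": lines ending with capital A
echo "$starts"                    # Apple
echo "$ends"                      # Cherry A
rm fruits.txt
```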
The character "." is one of those special meta-characters. By itself it will match
any character, except the end-of-line character. The pattern that will match a
line with a single character is
^.$
If you want to match specific characters, you can use the square brackets to
identify the exact characters you are searching for. The pattern that will match
any line of text that contains exactly one number is
^[0123456789]$
This is verbose. You can use the hyphen between two characters to specify a
range:
^[0-9]$
You can intermix explicit characters with character ranges. This pattern will
match a single character that is a letter, number, or underscore:
[A-Za-z0-9_]
Character sets can be combined by placing them next to each other. For
example, "^T[a-z]" matches a line that starts with a capital T followed by a
lower case letter.
You can easily search for all characters except those in square brackets by
putting a "^" as the first character after the "[." To match all characters except
vowels use "[^aeiou]."
Like the anchors in places that can't be considered an anchor, the characters
"]" and "-" do not have a special meaning if they directly follow "[." Here are
some examples:
The third part of a regular expression is the modifier. It is used to specify how
many times you expect to see the previous character set. The special character
"*" matches zero or more copies. That is, the regular expression "0*"
matches zero or more zeros, while the expression "[0-9]*" matches zero or
more numbers.
This explains why the pattern "^#*" is useless, as it matches any number of
"#'s" at the beginning of the line, including zero. Therefore this will match
every line, because every line starts with zero or more "#'s."
At first glance, it might seem that starting the count at zero is stupid. Not so.
Looking for an unknown number of characters is very important. Suppose you
wanted to look for a number at the beginning of a line, and there may or may
not be spaces before the number. Just use "^ *" to match zero or more spaces
at the beginning of the line. If you need to match one or more, just repeat the
character set. That is, "[0-9]*" matches zero or more numbers, and
"[0-9][0-9]*" matches one or more numbers.
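For example, with grep (the sample lines are hypothetical):

```shell
printf '42\nabc\n7x\n' > nums.txt         # hypothetical sample lines
digits=$(grep '^[0-9][0-9]*$' nums.txt)   # whole lines of one or more digits
echo "$digits"                            # 42
rm nums.txt
```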
You can continue the above technique if you want to specify a minimum
number of character sets. You cannot specify a maximum number of sets with
the "*" modifier. There is a special pattern you can use to specify the minimum
and maximum number of repeats. This is done by putting those two numbers
between "\{" and "\}." The back slashes deserve a special discussion.
Normally a backslash turns off the special meaning for a character. A period
is matched by a "\." and an asterisk is matched by a "\*."
If a backslash is placed before a "<," ">," "{," "}," "(," ")," or before a digit, the
back slash turns on a special meaning. This was done because these special
functions were added late in the life of regular expressions. Changing the
meaning of "{" would have broken old expressions. This is a horrible crime
punishable by a year of hard labor writing COBOL programs. Instead, adding
a back slash added functionality without breaking old programs. Rather than
complain about the asymmetry, view it as evolution.
Having convinced you that "\{" isn't a plot to confuse you, an example is in
order. The regular expression to match 4, 5, 6, 7 or 8 lower case letters is
[a-z]\{4,8\}
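This can be tested with grep, which uses basic regular expressions (the sample words are hypothetical):

```shell
printf 'abc\nabcdef\nabcdefghij\n' > words.txt   # 3, 6, and 10 letters
match=$(grep '^[a-z]\{4,8\}$' words.txt)         # 4 to 8 lower case letters
echo "$match"                                    # abcdef
rm words.txt
```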
Any number between 0 and 255 can be used. The second number may be
omitted, which removes the upper limit. If the comma and the second number
are omitted, the pattern must be repeated exactly the number of times
specified by the first number.
You must remember that modifiers like "*" and "\{1,5\}" only act as modifiers if
they follow a character set. If they were at the beginning of a pattern, they
would not be a modifier. Here is a list of examples, and the exceptions:
Searching for a word isn't quite as simple as it at first appears. The string "the"
will match the word "other." You can put spaces before and after the letters
and use this regular expression: " the ." However, this does not match words
at the beginning or end of the line. And it does not match the case where
there is a punctuation mark after the word.
There is an easy solution. The characters "\<" and "\>" are similar to the "^"
and "$" anchors, as they don't occupy the position of a character. They do
"anchor" the expression to only match if it is on a word boundary.
The pattern to search for the word "the" would be "\<[tT]he\>." The character
before the "t" must be either a new line character, or anything except a letter,
number, or underscore. The character after the "e" must also be a character
other than a number, letter, or underscore or it could be the end of line
character.
Backreferences let a pattern recall an earlier match: "\(...\)" remembers what it
matched, and "\1", "\2", and so on recall it. For example, the following
expression matches five-letter palindromes such as "radar":
\([a-z]\)\([a-z]\)[a-z]\2\1
Potential Problems
The "\<" and "\>" characters were introduced in the vi editor. The other
programs didn't have this ability at that time. Also the "\{min,max\}" modifier is
new and earlier utilities didn't have this ability. This made it difficult for the
novice user of regular expressions, because it seemed each utility had a
different convention. Sun has retrofitted the newest regular expression library
to all of their programs, so they all have the same ability. If you try to use
these newer features on other vendor's machines, you might find they don't
work the same way.
The other potential point of confusion is the extent of the pattern matches.
Regular expressions match the longest possible pattern. That is, the regular
expression
A.*B
matches the longest string beginning with "A" and ending with "B" on the line,
even if shorter matches are present.
Two programs use the extended regular expression: egrep and awk. With
these extensions, those special characters preceded by a back slash no
longer have the special meaning: "\{," "\}," "\<," "\>," "\(," "\)" as well as the
"\digit." There is a very good reason for this, which I will delay explaining to
build up suspense.
By now, you are wondering why the extended regular expressions are even
worth using. Except for two abbreviations, there are no advantages, and a lot
of disadvantages. Therefore, examples would be useful.
The three important characters in the expanded regular expressions are "(,"
"|," and ")." Together, they let you match a choice of patterns. As an example,
you can use egrep to print all From: and Subject: lines from your incoming
mail, here saved in a hypothetical file named mailbox:
egrep '^(From|Subject): ' mailbox
All lines starting with "From:" or "Subject:" will be printed. There is no easy
way to do this with the Basic regular expressions. You could try
"^[FS][ru][ob][mj]e*c*t*: " and hope you don't have any lines that start with
"Sromeet:." Extended expressions don't have the "\<" and "\>" characters. You
can compensate by using the alternation mechanism. Matching the word "the"
in the beginning, middle, end of a sentence, or end of a line can be done with
the extended regular expression:
(^| )the([^a-z]|$)
There are two choices before the word, a space or the beginning of a line.
After the word, there must be something besides a lower case letter or else
the end of the line. One extra bonus with extended regular expressions is the
ability to use the "*," "+," and "?" modifiers after a "(...)" grouping. The
following will match "a simple problem," "an easy problem," as well as "a
problem:"
a[n]? ((simple|easy) )?problem
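As a sketch with egrep, using one such pattern with "?" applied to a "(...)" group (the sample phrases and the file name are hypothetical):

```shell
printf 'a problem\nan easy problem\na hard problem\n' > phrases.txt
hits=$(egrep 'a[n]? ((simple|easy) )?problem' phrases.txt)
echo "$hits"      # "a problem" and "an easy problem" match; "a hard problem" does not
rm phrases.txt
```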
I promised to explain why the back slash characters don't work in extended
regular expressions. Well, perhaps the "\{...\}" and "\<...\>" could be added to
the extended expressions. These are the newest addition to the regular
expression family. They could be added, but this might confuse people if
those characters are added and the "\(...\)" are not. And there is no way to add
that functionality to the extended expressions without changing the current
usage. Do you see why? It's quite simple. If "(" has a special meaning, then
"\(" must be the ordinary character. This is the opposite of the Basic regular
expressions, where "(" is ordinary, and "\(" is special. The usage of the
parentheses is incompatible, and any change could break old programs.
If the extended expression used "( ..|...)" as regular characters, and "\(...\|...\)"
for specifying alternate patterns, then it is possible to have one set of regular
expressions that has full functionality. This is exactly what GNU emacs does,
by the way.
Regular Expression   Class      Type            Meaning
.                    all        Character Set   A single character (except newline)
^                    all        Anchor          Beginning of line
$                    all        Anchor          End of line
[...]                all        Character Set   Range of characters
*                    all        Modifier        Zero or more duplicates
\<                   Basic      Anchor          Beginning of word
\>                   Basic      Anchor          End of word
\(..\)               Basic      Backreference   Remembers pattern
\1..\9               Basic      Reference       Recalls pattern
+                    Extended   Modifier        One or more duplicates
?                    Extended   Modifier        Zero or one duplicate
\{M,N\}              Basic      Modifier        M to N duplicates
(...|...)            Extended   Anchor          Shows alternation
\(...\|...\)         EMACS      Anchor          Shows alternation
\w                   EMACS      Character set   Matches a letter in a word
\W                   EMACS      Character set   Opposite of \w
This visual shows another way of invoking a shell script. This method relies on
the user first making the script an executable file with the chmod command.
Note that the shell uses the PATH variable to find executable files. If you get
an error message like the following,
$ hello
ksh: hello: not found
check your PATH variable. The directory in which the shell script is stored
must be defined in the PATH variable.
If you invoke a shell script with a . (dot), it runs in the current shell. Variables
defined in this script (dir1, dir2) are therefore defined in the current shell.
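A sketch of the difference (the script name setdirs and the directory values are hypothetical, echoing the dir1/dir2 example above):

```shell
cat > setdirs << 'EOF'
dir1=/tmp
dir2=/usr
EOF
. ./setdirs            # run with . (dot): executes in the current shell
echo "$dir1" "$dir2"   # the variables survive in the current shell
rm setdirs
```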
Every process gives back an exit status to its parent process. Per convention
0 is given back when the process ended successfully and not equal 0 in all
other cases.
$ date
$ echo $?
0
$_
This shows successful execution of the date command. The visual shows an
example for an unsuccessful execution of a command.
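An unsuccessful command, by contrast, returns a non-zero exit status; the path below is deliberately nonexistent:

```shell
ls /no/such/directory 2> /dev/null   # fails: the directory does not exist
status=$?                            # $? holds the exit status of the last command
echo "$status"                       # non-zero
```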
CONTROL CONSTRUCTS:
The BourneShell control constructs can alter the flow of control within the
script. The BourneShell provides simple two-way branch if statements and
multiple-branch case statements, plus for, while, and until statements.
You can negate any criterion by preceding it with an exclamation mark (!).
Parentheses can be used to group criteria. If there are no parentheses, the -a
(logical AND operator) takes precedence over the -o (logical OR operator).
The test utility will evaluate operators of equal precedence from left to right.
The test utility will work from the command line but it is more often used in a
script to test input or verify access to a file.
Another way to do the test evaluation is to surround the expression with left
and right brackets. A space character must appear after the left bracket and
before the right bracket.
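A sketch combining the bracket form with the -a operator (the value of x is arbitrary):

```shell
x=5
if [ "$x" -gt 3 -a "$x" -lt 10 ]   # note the space after [ and before ]
then
    result="in range"
else
    result="out of range"
fi
echo "$result"                     # in range
```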
Test expressions can be in many different forms. The expressions can appear
as a set of evaluation criteria. The general form for testing numeric values is:
int1 op int2
This criterion is true if the integer int1 has the specified algebraic relationship
to integer int2.
-eq equal
-ne not equal
-gt greater than
-ge greater than or equal
-lt less than
-le less than or equal
string1 op string2
Sample Session:
$cat test_string
number=1
numero=0001
if test $number = $numero
then echo "String vals for $number and $numero are ="
else echo "String vals for $number and $numero not ="
fi
if test $number -eq $numero
then echo "Numeric vals for $number and $numero are ="
else echo "Numeric vals for $number and $numero not ="
fi
$sh -x test_string
number=1
numero=0001
+ test 1 = 0001
+ echo String vals for 1 and 0001 not =
String vals for 1 and 0001 not =
+ test 1 -eq 0001
+ echo Numeric vals for 1 and 0001 are =
Numeric vals for 1 and 0001 are =
$test_string
String vals for 1 and 0001 not =
Numeric vals for 1 and 0001 are =
The test utility can be used to determine information about file types. All of
the criteria can be found in Appendix B. A few of them are listed here:
Example:
$test -d new_dir
If new_dir is a directory, this criterion will evaluate to true. If it does not exist,
then it will be false.
The if statement evaluates the expression and then returns control based on
this status. The fi statement marks the end of the if; notice that fi is if spelled
backward.
Sample Session:
$cat check_args
if (test $# = 0)
then echo 'Please supply at least 1 argument'
exit
fi
echo 'Program is running'
$
Sample Session:
$check_args
Please supply at least 1 argument
$check_args xyz
Program is running
$
The else part of this structure makes the single-branch if statement into a two-
way branch. If the expression returns a true status, the commands between
the then and the else statement will be executed. After these have been
executed, control will start again at the statement after the fi.
If the expression returns false, the commands following the else statement will
be executed.
Sample Session:
$cat test_string
number=1
numero=0001
if test $number = $numero
then echo "String values of $number and $numero are equal"
else echo "String values of $number and $numero not equal"
fi
if test $number -eq $numero
then echo "Numeric values of $number and $numero are equal"
else echo "Numeric values of $number and $numero not equal"
fi
$
The elif construct combines the else and if statements and allows you to
construct a nested set of if then else structures.
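A sketch of a nested structure using elif (the value of n is arbitrary):

```shell
n=5                     # hypothetical value to classify
if [ "$n" -lt 0 ]
then
    kind=negative
elif [ "$n" -eq 0 ]
then
    kind=zero
else
    kind=positive
fi
echo "$kind"            # positive
```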
Sample Session:
$cat case_ex
echo 'Enter A, B, or C: \c'
read letter
case $letter in
A) echo 'You entered A' ;;
B) echo 'You entered B' ;;
C) echo 'You entered C' ;;
*) echo 'You did not enter A, B, or C' ;;
esac
$chmod a+x case_ex
$case_ex
This example uses the value of a character that the user entered as the test
string. The value is represented by the variable letter. If letter has the value
of A, the structure will execute the command following A. If letter has a value
of B or C, then the appropriate commands will be executed. The asterisk
indicates any string of characters; and it, therefore, functions as a catchall for
a no-match condition. The lowercase b in the second sample session is an
example of a no match condition.
This structure will assign the value of the first item in the argument list to the
loop index and executes the commands between the do and done
statements. The do and done statements indicate the beginning and end of
the for loop.
After the structure passes control to the done statement, it assigns the value
of the second item in the argument list to the loop index and repeats the
commands. The structure will repeat the commands between the do and
done statements once for each argument in the argument list. When the
argument list has been exhausted, control passes to the statement following
the done.
Sample Session:
$cat find_henry
for x in project1 project2 project3
do
grep henry $x
done
$head project?
==> project1 <==
henry
joe
mike
sue
$find_henry
henry
henry
$
Each file in the argument list was searched for the string, henry. When a
match was found, the string was printed.
As long as the expression returns a true exit status, the structure continues to
execute the commands between the do and the done statement. Before each
loop through the commands, the structure executes the expression. When the
expression returns a false exit status, control passes to the statement
following the done.
The until and while structures are very similar; in both, the test is at the top of
the loop. The difference is that the until structure will continue to loop until the
expression returns a true or nonerror condition, while the while loop will
continue as long as a true or nonerror condition is returned.
Sample Session:
$cat until_ex
secretname='jenny'
name='noname'
echo 'Try to guess the secret name!'
echo
until (test "$name" = "$secretname")
do
echo 'Your guess: \c'
read name
done
echo 'You did it!'
$
The until loop will continue until name is equal to the secret name.
Sample Session:
The break and continue loop control commands correspond exactly to their
counterparts in other programming languages. The break command
terminates the loop (breaks out of it), while continue causes a jump to the next
iteration (repetition) of the loop, skipping all the remaining commands in that
particular loop cycle.
#!/bin/bash
LIMIT=19    # upper limit
echo
echo "Printing Numbers 1 through 20 (but not 3 and 11)."
a=0
while [ $a -le "$LIMIT" ]
do
  a=$(($a+1))
  if [ "$a" -eq 3 ] || [ "$a" -eq 11 ]   # excludes 3 and 11
  then
    continue   # Skip rest of this particular loop iteration.
  fi
  echo -n "$a "   # This will not execute for 3 and 11.
done
# Exercise:
# Why does loop print up to 20?
echo; echo
##################################################################
# Same loop, but substituting 'break' for 'continue'.
a=0
while [ "$a" -le "$LIMIT" ]
do
  a=$(($a+1))
  if [ "$a" -gt 2 ]
  then
    break   # Skip entire rest of loop.
  fi
  echo -n "$a "
done
echo; echo
exit 0
Common Options
You can list a series of files on the command line, and cat will concatenate
them, starting each in turn, immediately after completing the previous one,
e.g.:
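A sketch (the file names and contents are hypothetical):

```shell
echo first  > part1                # hypothetical input files
echo second > part2
joined=$(cat part1 part2)          # concatenates in the order listed
echo "$joined"
rm part1 part2
```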
DATE
Example
$date
Example
$ date "+DATE IS %D TIME IS %T"
Example
$ date "+DAY %d MONTH %m YEAR %y"
The find command recursively searches the directory tree for each specified
path, seeking files that match a Boolean expression written using the terms
given in the text that follows the expression. The output from the find
command depends on the terms specified by the final parameter.
The command following -exec, in this example ls, is executed for each file
name found.
Note use of the escaped ; to terminate the command that find is to execute.
The find command may also be used with a -ls option; that is, $ find . -name
'm*' -ls.
The \; (an escaped semicolon) terminates the command given to find; it is
required with -exec and -ok.
It is a good idea to use the -ok option rather than -exec if there are not a lot of
files that match the search criteria. It is a lot safer if your pattern is not exactly
what you think it is.
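Putting these pieces together, the command below searches a directory tree for names beginning with m and runs ls -l on each match ({} stands for the current file name, and \; ends the -exec command; the directory layout here is just for illustration):

```shell
# build a small tree, then find the files whose names start with 'm'
mkdir -p demo/sub
touch demo/main.c demo/sub/makefile demo/notes.txt
find demo -name 'm*' -exec ls -l {} \;
```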
The search can be for simple text, like a string or a name. grep can also look
for logical constructs, called regular expressions, that use patterns and
wildcards to symbolize something special in the text, for example, only lines
that start with an uppercase T.
The command displays the name of the file containing the matched line, if
more than one file is specified for the search.
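For example, searching two files for lines that start with an uppercase T shows the file-name prefix on each match (the file contents here are illustrative):

```shell
printf 'The quick fox\nlazy dog\n' > chap1
printf 'Ten more lines\nthe end\n' > chap2
# grep prefixes each match with its file name when more than one file
# is searched
grep '^T' chap1 chap2
```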
The UNIX manual, usually called man pages, is available on-line to explain
the usage of the UNIX system and commands. To use a man page, type the
command "man" at the system prompt followed by the command for which
you need information.
Syntax
man [options] command_name
Common Options
Another program used to read and write files associated with an archive is tar.
Some of the available options are
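As a quick illustration of tar, c creates an archive, t lists its contents, and f names the archive file (the names below are illustrative):

```shell
mkdir -p project
echo data > project/file1
tar -cf project.tar project   # create the archive
tar -tf project.tar           # list what it contains
```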
This reduces the size of a file, thus freeing valuable disk space. For example,
type
% ls -l science.txt
and note the size of the file using ls -l . Then to compress science.txt, type
% gzip science.txt
This will compress the file and place it in a file called science.txt.gz
To restore the original file, type
% gunzip science.txt.gz
nslookup
The command nslookup host displays the domain name, IP address, and alias
information for the given host.
e.g., nslookup www.kent.edu gives related data for www.kent.edu
Cut command.
cut command selects a list of columns or fields from one or more files.
Option -c is for columns and -f for fields. It is entered as
cut options [files]
for example if a file named testfile contains
this is firstline
this is secondline
this is thirdline
Examples:
cut -c1,4 testfile will print this to standard output (screen):
ts
ts
ts
(characters 1 and 4 of each line are t and s)
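Selecting fields instead of character positions uses -f; the default field delimiter is TAB, so -d' ' is needed for the space-separated words in testfile:

```shell
printf 'this is firstline\nthis is secondline\nthis is thirdline\n' > testfile
# print the third space-separated field of each line
cut -d' ' -f3 testfile
```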
The sed command launches a stream editor, which you can use from the
command line. You can also place your sed commands in a file and then, using
the -f option, edit your text file with them.
For more information about sed, enter man sed at the command line on your
system.
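A small sketch of the -f form, with illustrative file names:

```shell
printf 's/day/night/\n' > edits.sed   # the sed commands live in a file
printf 'good day\n'     > textfile
sed -f edits.sed textfile             # apply the command file to the text
```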
expr 5 + 7
expr 5 \* 7
#!/bin/sh
# Perform some arithmetic
x=24
y=4
Result=`expr $x \* $y`
echo "$x times $y is $Result"
function function_name {
command...
}
or
function_name () {
command...
}
This second form will cheer the hearts of C programmers (and is more
portable).
A function may also be compacted into a single line:
function_name () { command; command; }
In this case, however, a semicolon must follow the final command in the
function.
#!/bin/bash
JUST_A_SECOND=1

funky ()
{ # This is about as simple as functions get.
  echo "This is a funky function."
}

fun ()
{ # A somewhat more complex function.
  i=0
  REPEATS=30
  echo
  echo "And now the fun really begins."
  echo
  sleep $JUST_A_SECOND    # Hey, wait a second!
  while [ $i -lt $REPEATS ]
  do
    echo "-----FUNCTIONS ARE FUN-----"
    i=$(($i+1))
  done
}

funky    # A function must be defined before it is called.
fun
exit 0
Debugging
The Bash shell contains no debugger, nor even any debugging-specific
commands or constructs. Syntax errors or outright typos in the script generate
cryptic error messages that are often of no help in debugging a non-functional
script.
#!/bin/bash
# ex74.sh
# This is a buggy script -- where is the error?

a=37

if [$a -gt 27 ]   # The missing space after [ produces a cryptic error.
then
  echo $a
fi

exit 0
Anyhow, sed is a marvelous utility. Unfortunately, most people never learn its
real power. The language is very simple, but the documentation is terrible.
The Solaris on-line manual pages for sed are five pages long, and two of
those pages describe the 34 different errors you can get. A program that
spends as much space documenting the errors as it does documenting the
language has a serious learning curve.
Sed has several commands, but most people only learn the substitute
command: s. The substitute command changes all occurrences of the regular
expression into a new value. A simple example is changing "day" in the "old"
file to "night" in the "new" file:
sed s/day/night/ <old >new
I didn't put quotes around the argument because this example didn't need
them. If you read my earlier tutorial, you would understand why it doesn't need
quotes. If you have meta-characters in the command, quotes are necessary.
In any case, quoting is a good habit, and I will henceforth quote future
examples. That is:
sed 's/day/night/' <old >new
This command has four parts:
s Substitute command
/../../ Delimiter
day Regular expression (search pattern)
night Replacement string
If you have many commands and they won't fit neatly on one line, you can
break up the line using a backslash:
sed -e 's/a/A/g' \
-e 's/e/E/g' \
-e 's/i/I/g' \
-e 's/o/O/g' <old >new
Sed is extremely powerful, and you can do things in sed that you can't do in
any standard word processor. And because sed is external to the word
processor and comes with every Unix system in the world, once you learn sed
you'll have a very handy tool in your toolkit, even if (like me) you rarely use
Unix.
How it works: You feed sed a script of editing commands (like, "change every
line that begins with a colon to such-and-such") and sed sends your revised
text to the screen. To save the revisions on disk, use the redirection arrow,
>newfile.txt. Sample syntax:
awk:
Awk is a ``pattern scanning and processing language'' which is useful for
writing quick and dirty programs that don't have to be compiled. The calling
syntax of awk is like sed:
Like sed, awk can work on standard input or on a file. Like the shell, if you
start an awk program with
#!/bin/awk -f
then you can execute the program directly from the shell.
Most systems also have nawk, which stands for ``new awk.'' Nawk has many
more features than awk and is generally more useful. I am just going to cover
awk, but you should check out nawk too in your own time. Nawk has some
nice things like a random number generator, that awk doesn't have.
pattern { action }
What such a statement does is apply the action to all lines that match the
pattern. If there is no pattern, then it applies the action to all lines. If there is
no action, it prints all lines that match the pattern.
So, for example, the program awkgrep, which consists of the single pattern
below, works just like ``grep Jim''.
/Jim/
UNIX> cat input
Which of these lines doesn't belong:
Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX> awkgrep input
Jimmy Carter
UNIX> awkgrep < input
Jimmy Carter
UNIX>
Actions basically look like C programs. There are some big differences, but for
the most part, you can do most basic things that you can do in C.
Awk breaks up each line into fields, which are basically whitespace-separated
words. You can get at word i by specifying $i. The variable NF contains the
number of words on the line. The variable $0 is the line itself.
So, to print out the first and last words on each line, you can do:
UNIX> cat input
Which of these lines doesn't belong:
Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX> awk '{ print $1, $NF }' input
Which belong:
Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX>
/Jim/ { print $0 }
UNIX> awkgrep2 input
Jimmy Carter
UNIX>
Awk has a printf just like C. You don't have to use parentheses when you call
it (although you can if you'd like). Unlike print, printf will not print a newline if
you don't want it to. So, for example, awkrev reverses the order of the words
on each line of a file:
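A minimal sketch of what awkrev might contain (the actual script is assumed, not quoted): a loop over the fields from last to first, using printf so the newline is printed only at the end of each line.

```shell
printf 'Bill Clinton\nGeorge Bush\n' > input
# walk the fields backwards; separate with a space, end with a newline
awk '{ for (i = NF; i > 0; i--) printf "%s%s", $i, (i > 1 ? " " : "\n") }' input
```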
UNIX> awkrev input
Clinton Bill
Bush George
Reagan Ronald
Carter Jimmy
Stallone Sylvester
UNIX>
A few things that you'll notice about awkrev: Actions can be multiline. You
don't need semicolons to separate lines like in C. However, you can specify
multiple commands on a line and separate them with semi-colons as in C.
And you can block commands with curly braces as in C. If you want a
command to span two lines (this often happens with complex printf
statements), you need to end the first line with a backslash.
Also, you'll notice that awkrev didn't declare the variable i. Awk just figured
out that it's an integer.
Type casting
Awk lets you convert variables from one type to another on the fly. For
example, to convert an integer to a string, you simply use it as a string. String
construction can be done with concatenation, which is often very convenient.
These principles are used in awkcast:
There are two special patterns, BEGIN and END, which cause the
corresponding actions to be executed before and after any lines are
processed respectively. Therefore, the following program (awkwc) counts the
number of lines and words in the input file.
BEGIN { nl = 0; nw = 0 }
{ nl++ ; nw += NF }
END { print "Lines:", nl, "words:", nw }
UNIX> awkwc awkwc
Lines: 5 words: 26
UNIX> wc awkwc
5 26 103 awkwc
UNIX>
Awk tries to process each statement on each line. Unlike sed, there is no
``hold space.'' Instead, each statement is processed on the original version of
each line. Two special commands in awk are next and exit. Next specifies to
stop processing the current input line, and to go directly to the next one,
skipping all the rest of the statements. Exit specifies for awk to exit
immediately.
Here are some simple examples. awkpo prints out only the odd numbered
lines (note that this is an awkward way to do this, but it works):
BEGIN { ln=0 }
{ ln++
if (ln%2 == 0) next
print $0
}
awkptR prints out all lines until it reaches a line containing a capital R:
/R/ { exit }
{ print $0 }
UNIX> awkptR input
Bill Clinton
George Bush
UNIX>
Arrays
Arrays in awk are a little odd. First, you don't have to malloc() any storage --
just use it and there it is. Second, arrays can have any indices -- integers,
floating point numbers or strings. This is called ``associative'' indexing, and
can be very convenient. You cannot have multi-dimensional arrays or arrays
of arrays though. To simulate multidimensional arrays, you can just
concatenate the indices.
BEGIN { nt = 0 ; np = 0 }
This only works on lines that are all capital letters. These are the lines that
identify tournaments. On these lines, it does the following:
The next part works on all lines that contain the pattern '--'. These are the
lines with golfers' scores:
/--/ { golfer = $1
for (i = 2; $i != "--" ; i++) golfer = golfer" "$i
if (isgolfer[golfer] != "yes") {
isgolfer[golfer] = "yes"
g[np] = golfer
np++;
}
score[golfer" "this] = $(i+1)
}
The first two lines of this action set the golfer variable to be the golfer's name.
Note that you can do string comparison in awk using standard boolean
operators, unlike in C where you would have to use strcmp().
The next 5 lines use awk's associative arrays: The array isgolfer is checked
to see if it contains the string ``yes'' under the golfer's name. If so, we have
processed this golfer before. If not, we set the golfer's entry in isgolfer to
``yes,'' set the np-th entry of the array g to be the golfer, and increment np.
Finally, we set the golfer's score for the tournament in the score array. Note
that we don't use double-indirection. Instead, we simply concatenate the
golfer's name and the tournament's name, and use that as the index for the
array.
UNIX> awkgolf kemper # Note that the output is only sorted because it is
# sorted in the input file
KEMPER
Justin Leonard -10
Greg Norman -7
Nick Faldo -7
Nick Price -7
Loren Roberts -6
Jay Haas -5
Paul Stankowski -5
Lee Janzen -4
Phil Mickelson -4
Davis Love III -3
Tom Lehman 0
Vijay Singh 0
Kirk Triplett 1
Steve Jones 2
Mark O'Meara 5
Don Pooley missed
Ernie Els missed
Fred Couples missed
Hal Sutton missed
Jesper Parnevik missed
Scott McCarron missed
Steve Stricker missed
UNIX> cat masters usopen kemper memorial | awkgolf
MASTERS USOPEN KEMPER MEMORIAL
Tiger Woods 281 6 5
Tommy Tolles 283 2 -11
Tom Watson 284 16 0
Paul Stankowski 285 6 -5 -3
Fred Couples 286 13 missed
Davis Love III 286 5 -3 -7
Justin Leonard 286 9 -10 0
Steve Elkington 287 7
Tom Lehman 287 -2 0 -3
Ernie Els 288 -4 missed -1
File indirection
You can specify that the output of print and printf go to a file with indirection.
For example, to copy standard input to the file f1 you could do:
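A minimal sketch of that program (one awk action with print redirected to the file f1, as named in the text):

```shell
# every input line is written to the file f1 via awk's output redirection
printf 'Bill Clinton\nGeorge Bush\n' | awk '{ print $0 > "f1" }'
cat f1
```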
Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX>
Sometimes you just want to write a program that doesn't use standard input.
To do this, you just write the whole program as a BEGIN statement, exiting at
the end.
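For instance, the following does all its work in the BEGIN action, and exit stops awk before it tries to read any input at all:

```shell
# print the first three squares; no input lines are ever processed
awk 'BEGIN { for (i = 1; i <= 3; i++) print i * i; exit }' < /dev/null
```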
The Bourne shell lets you define multiline strings simply by putting newlines in
the string (within single or double quotes, of course). This means that you can
embed simple multiline awk scripts in a sh program without having to use
cumbersome backslashes, or intermediate files. For example, shwc works
just like awkwc, but works as a shell script rather than an awk program.
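A sketch of how shwc might look (the exact source is assumed, not quoted): the multiline awk program sits directly inside a single-quoted shell string.

```shell
#!/bin/sh
printf 'one two\nthree\n' > sample
# the awk program spans several lines inside one quoted string
awk '
BEGIN { nl = 0; nw = 0 }
{ nl++ ; nw += NF }
END { print "Lines:", nl, "words:", nw }
' sample
```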
Awk's limitations
Awk is useful for simple data processing. It is not useful when things get more
complex for a few reasons. First, if your data file is huge, you'll do better to
write a C program (using for example the fields library from CS302/360)
because it will be more efficient sometimes by a factor of 60 or more. Second,
once you start writing procedure calls in awk, it seems to me you may as well
be writing C code. Third, you often find awk's lack of double indirection and
string processing cumbersome and inefficient.
Awk is not a good language for string processing. Irritatingly, it doesn't let you
get at string elements with array operations; for example, trying to read
character i of a string s as s[i] will fail.
Actual database access is performed using the command line MySQL client
programme. To ensure that this can be found the search path is modified by
the second and third lines of the script.
PATH=$PATH:/usr/local/mysql/bin
export PATH
The name of the location being queried is then extracted from the
QUERY_STRING environment variable.
On a normal Unix system any user can create files in the directory /tmp. The
symbol $$ in the file name is replaced by the current process identification
number; as this is always unique, it avoids any problems with two instances of
the back end running simultaneously.
use mydatabase;
select latitude,longitude,easting,northing from gazetteer where feature =
'Prague';
The output from the MySQL client is also written to a temporary file. Typical
text is shown below (for a different query).
It will be noted that the output file includes column names and that columns
are separated by TAB characters.
The next step is to determine the number of lines in the output file; this will be
zero if no matches have been found. This is done by arranging for the
standard Unix command wc to read the file and write the number of lines to its
standard output.
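A sketch of that step, with the temporary file name following the /tmp/$$ convention described above (an empty file stands in for the "no matches" case):

```shell
RESFILE=/tmp/$$.res
: > "$RESFILE"                 # empty results file: no matches found
ROWS=`wc -l < "$RESFILE"`      # number of result lines
rm -f "$RESFILE"
```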
The code
if [ $ROWS -eq 0 ]
then
echo "No information for" $PLACE
else
echo "<table border=2><tr>"
tail +2 /tmp/$$.res | sed -e 's/^/<tr><td>/
s/ /<td>/g'
echo "</table>"
fi
Note that the sed edit script, introduced by the sed command line argument -e
spreads over two lines.
The file created has its name stored in the variable $sql0 and as we can see
the block between the EOA flags is the data that goes into the file. The data
block is actually a segment of SQL*Plus statements, as indicated by the
filename variable. As is common with SQL*Plus code, the key words are
picked out in ALL CAPS, with objects (tables, procedures, columns, etc.) all in
lower case. The SELECT line contains a reference to a called, packaged,
PL/SQL function which has a column name as an argument. Here the column
name is held in a variable called $column and this will be substituted at script
run-time by the real value.
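A sketch of the pattern described; the variable names, table, and packaged function below are illustrative, not the author's actual code:

```shell
sql0=/tmp/query$$.sql          # the file name is held in $sql0
column=latitude                # substituted into the SQL at run-time
cat > "$sql0" <<EOA
SELECT my_pkg.my_func($column)
FROM gazetteer;
EOA
```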
So what's the point of all this extra typing? Well for one thing it allows you to
put special bits of code into the block which will only be used at certain times,
by hiding them in complex command groups. This example shows how this is
done below.
This is basically the same block except the WHERE clause has been hidden
inside an if statement. Now, depending on the Database Type in the $db_type
variable, the WHERE clause can take one of two forms. Conveniently, the
additional argument which is not required by SQL*Plus in the first form, is
ignored at execution time, even though it is still available on the last line. This
is common with all scripts, arguments are only used if they are referenced
from within the script.
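A sketch of hiding the WHERE clause inside an if within a command group; the variable and table names are illustrative:

```shell
db_type=local
place=Prague
sqlfile=/tmp/q$$.sql
{
echo "select latitude,longitude from gazetteer"
if [ "$db_type" = "local" ]
then
    echo "where feature = '$place';"   # clause emitted only for this type
else
    echo ";"
fi
} > "$sqlfile"
```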
So there you have the first two ways of creating another file from a script. The
version using cat can only cope with a single output form, the version using
echo can output a multitude of forms depending on the complex command
forms you use. The choice is yours. There are, however, other ways to create
output files. You can use direct generation as in the example List to create a
list of files. Or the indirect method shown in the example Counted List where
lines are built inside a loop construct and then appended to the file to create a
menu file. Or in the example Sorted List where a list of words is sorted into
alphabetic order, duplicates are removed, then the rest stored in a file.
Example list
count=1
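A sketch of the Counted List idea: lines are built inside a loop and appended one at a time to a menu file (the file and variable names are illustrative):

```shell
count=1
: > menu.lst                       # start with an empty menu file
for word in alpha beta gamma
do
    echo "$count) $word" >> menu.lst   # append one numbered line
    count=`expr $count + 1`
done
```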
What is perl?
Perl, sometimes referred to as Practical Extraction and Reporting Language,
is an interpreted programming language with a huge number of uses,
libraries and resources. Arguably one of the most discussed and used
languages on the internet, it is often referred to as the Swiss Army knife, or
duct tape, of the web.
Perl was first brought into being by Larry Wall circa 1987 as a general
purpose Unix scripting language to make his programming work simpler.
Although it has far surpassed his original creation, Larry Wall still oversees
development of the core language, and the newest version, Perl 6.
Running Perl
The simplest way to run a Perl program is to invoke the Perl interpreter with
the name of the Perl program as an argument:
perl sample.pl
The name of the Perl file is sample.pl, and perl is the name of the Perl
interpreter. This example assumes that Perl is in the execution path; if not,
you will need to supply the full path to Perl too:
/usr/local/bin/perl sample.pl
This is the preferred way of invoking Perl because it eliminates the possibility
that you might accidentally invoke a copy of Perl other than the one you
intended. We will use the full path from now on to avoid any confusion.
c:\NTperl\perl sample.pl
UNIX systems have another way to invoke an interpreter on a script file. Place
a line like
#!/usr/local/bin/perl
at the start of the Perl file. This tells UNIX that the rest of this script file is to be
interpreted by /usr/local/bin/perl. Then make the script itself executable:
chmod +x sample.pl
You can then "execute" the script file directly and let the script file tell the
operating system what interpreter to use while running it.
#!/usr/local/bin/perl -w -t
A Perl Script
Perl code can be quite free-flowing. The broad syntactic rules governing
where a statement starts and ends are simple: statements end with a
semicolon, and whitespace between tokens is largely insignificant. Consider
this one-statement program:
print "My name is Sreedhar\n";
No prizes for guessing what happens when Perl runs this code; it prints
My name is Sreedhar
That's right, print is a function. It may not look like it in any of the examples so
far, where there are no parentheses to delimit the function arguments, but it is
a function, and it takes arguments. You can use parentheses in Perl functions
if you like; it sometimes helps to make an argument list clearer. More
accurately, in this example the function takes a single argument consisting of
an arbitrarily long list. We'll have much more to say about lists and arrays
later, in the "Data Types" section. There will be a few more examples of the
more common functions in the remainder of this chapter, but refer to the
"Functions" chapter for a complete run-down on all of Perl's built-in functions.
So what does a complete Perl program look like? Here's a trivial UNIX
example, complete with the invocation line at the top and a few comments:
That's not at all typical of a Perl program though; it's just a linear sequence of
commands with no structural complexity. The "Flow Control" section later in
this overview introduces some of the constructs that make Perl what it is. For
now, we'll stick to simple examples like the preceding for the sake of clarity.
The basic UNIX commands include some of the most commonly used commands for
users, and constructs for building shell scripts.
The following charts offer a summary of some simple UNIX commands. These are
certainly not all of the commands available in this robust operating system, but these
will help you get started.
Once you have mastered the basic UNIX commands, these will be quite valuable in
managing your own account.
These are ten commands that you might find interesting or amusing. They are
actually quite helpful at times, and should not be considered idle entertainment.
These ten commands are very helpful, especially with graphics and word processing
type applications.
These ten commands are useful for monitoring system access, or simplifying your
own environment.