UNIX For Beginners — Second Edition

Brian W. Kernighan
Bell Laboratories
Murray Hill, New Jersey 07974

ABSTRACT

This paper is meant to help new users get started on the UNIX†
operating system. It includes:
• basics needed for day-to-day use of the system — typing commands, correcting typing mistakes, logging in and
out, mail, inter-terminal communication, the file system, printing files, redirecting I/O, pipes, and the shell.
• document preparation — a brief discussion of the major formatting programs and macro packages, hints on
preparing documents, and capsule descriptions of some supporting software.
• UNIX programming — using the editor, programming the shell, programming in C, other languages and tools.
• An annotated UNIX bibliography.

November 2, 1997

†UNIX is a Trademark of Bell Laboratories.



INTRODUCTION

From the user's point of view, the UNIX operating system is easy to learn and use, and presents few of the usual impediments to getting the job done. It is hard, however, for the beginner to know where to start, and how to make the best use of the facilities available. The purpose of this introduction is to help new users get used to the main ideas of the UNIX system and start making effective use of it quickly.

You should have a couple of other documents with you for easy reference as you read this one. The most important is The UNIX Programmer's Manual; it's often easier to tell you to read about something in the manual than to repeat its contents here. The other useful document is A Tutorial Introduction to the UNIX Text Editor, which will tell you how to use the editor to get text (programs, data, documents) into the computer.

A word of warning: the UNIX system has become quite popular, and there are several major variants in widespread use. Of course details also change with time. So although the basic structure of UNIX and how to use it is common to all versions, there will certainly be a few things which are different on your system from what is described here. We have tried to minimize the problem, but be aware of it. In cases of doubt, this paper describes Version 7 UNIX.

This paper has five sections:

1. Getting Started: How to log in, how to type, what to do about mistakes in typing, how to log out. Some of this is dependent on which system you log into (phone numbers, for example) and what terminal you use, so this section must necessarily be supplemented by local information.

2. Day-to-day Use: Things you need every day to use the system effectively: generally useful commands; the file system.

3. Document Preparation: Preparing manuscripts is one of the most common uses for UNIX systems. This section contains advice, but not extensive instructions on any of the formatting tools.

4. Writing Programs: UNIX is an excellent system for developing programs. This section talks about some of the tools, but again is not a tutorial in any of the programming languages provided by the system.

5. A UNIX Reading List. An annotated bibliography of documents that new users should be aware of.

I. GETTING STARTED

Logging In

You must have a UNIX login name, which you can get from whoever administers your system. You also need to know the phone number, unless your system uses permanently connected terminals. The UNIX system is capable of dealing with a wide variety of terminals: Terminet 300's; Execuport, TI and similar portables; video (CRT) terminals like the HP2640, etc.; high-priced graphics terminals like the Tektronix 4014; plotting terminals like those from GSI and DASI; and even the venerable Teletype in its various forms. But note: UNIX is strongly oriented towards devices with lower case. If your terminal produces only upper case (e.g., model 33 Teletype, some video and portable terminals), life will be so difficult that you should look for another terminal.

Be sure to set the switches appropriately on your device. Switches that might need to be adjusted include the speed, upper/lower case mode, full duplex, even parity, and any others that local wisdom advises. Establish a connection using whatever magic is needed for your terminal; this may involve dialing a telephone call or merely flipping a switch. In either case, UNIX should type "login:" at you. If it types garbage, you may be at the wrong speed; check the switches. If that fails, push the "break" or "interrupt" key a few times, slowly. If that fails to produce a login message, consult a guru.

When you get a login: message, type your login name in lower case. Follow it by a RETURN; the system will not do anything until you type a RETURN. If a password is required, you will be asked for it, and (if possible) printing will be turned off while you type it. Don't forget RETURN.

The culmination of your login efforts is a "prompt character," a single character that indicates that the system is ready to accept commands from you. The prompt character is usually a dollar sign $ or a percent sign %. (You may also get a message of the day just before the prompt character, or a notification that you have mail.)
Typing Commands

Once you've seen the prompt character, you can type commands, which are requests that the system do something. Try typing

    date

followed by RETURN. You should get back something like

    Mon Jan 16 14:17:10 EST 1978

Don't forget the RETURN after the command, or nothing will happen. If you think you're being ignored, type a RETURN; something should happen. RETURN won't be mentioned again, but don't forget it: it has to be there at the end of each line.

Another command you might try is who, which tells you everyone who is currently logged in:

    who

gives something like

    mb     tty01   Jan 16 09:11
    ski    tty05   Jan 16 09:33
    gam    tty11   Jan 16 13:07

The time is when the user logged in; "ttyxx" is the system's idea of what terminal the user is on.

If you make a mistake typing the command name, and refer to a non-existent command, you will be told. For example, if you type

    whom

you will be told

    whom: not found

Of course, if you inadvertently type the name of some other command, it will run, with more or less mysterious results.

Strange Terminal Behavior

Sometimes you can get into a state where your terminal acts strangely. For example, each letter may be typed twice, or the RETURN may not cause a line feed or a return to the left margin. You can often fix this by logging out and logging back in. Or you can read the description of the command stty in section I of the manual. To get intelligent treatment of tab characters (which are much used in UNIX) if your terminal doesn't have tabs, type the command

    stty -tabs

and the system will convert each tab into the right number of blanks for you. If your terminal does have computer-settable tabs, the command tabs will set the stops correctly for you.

Mistakes in Typing

If you make a typing mistake, and see it before RETURN has been typed, there are two ways to recover. The sharp-character # erases the last character typed; in fact successive uses of # erase characters back to the beginning of the line (but not beyond). So if you type badly, you can correct as you go:

    dd#atte##e

is the same as date.

The at-sign @ erases all of the characters typed so far on the current input line, so if the line is irretrievably fouled up, type an @ and start the line over.

What if you must enter a sharp or at-sign as part of the text? If you precede either # or @ by a backslash \, it loses its erase meaning. So to enter a sharp or at-sign in something, type \# or \@. The system will always echo a newline at you after your at-sign, even if preceded by a backslash. Don't worry: the at-sign has been recorded.

To erase a backslash, you have to type two sharps or two at-signs, as in \##. The backslash is used extensively in UNIX to indicate that the following character is in some way special.

Read-ahead

UNIX has full read-ahead, which means that you can type as fast as you want, whenever you want, even when some command is typing at you. If you type during output, your input characters will appear intermixed with the output characters, but they will be stored away and interpreted in the correct order. So you can type several commands one after another without waiting for the first to finish or even begin.

Stopping a Program

You can stop most programs by typing the character "DEL" (perhaps called "delete" or "rubout" on your terminal). The "interrupt" or "break" key found on most terminals can also be used. In a few programs, like the text editor, DEL stops whatever the program is doing but leaves you in that program. Hanging up the phone will stop most programs.

Logging Out

The easiest way to log out is to hang up the phone. You can also type

    login

and let someone else use the terminal you were on. It is usually not sufficient just to turn off the terminal. Most UNIX systems do not use a time-out mechanism, so you'll be there forever unless you hang up.

Mail

When you log in, you may sometimes get the message

    You have mail.
UNIX provides a postal system so you can communicate with other users of the system. To read your mail, type the command

    mail

Your mail will be printed, one message at a time, most recent message first. After each message, mail waits for you to say what to do with it. The two basic responses are d, which deletes the message, and RETURN, which does not (so it will still be there the next time you read your mailbox). Other responses are described in the manual. (Earlier versions of mail do not process one message at a time, but are otherwise similar.)

How do you send mail to someone else? Suppose it is to go to "joe" (assuming "joe" is someone's login name). The easiest way is this:

    mail joe
    now type in the text of the letter
    on as many lines as you like ...
    After the last line of the letter
    type the character "control-d",
    that is, hold down "control" and type
    a letter "d".

And that's it. The "control-d" sequence, often called "EOF" for end-of-file, is used throughout the system to mark the end of input from a terminal, so you might as well get used to it.

For practice, send mail to yourself. (This isn't as strange as it might sound; mail to oneself is a handy reminder mechanism.)

There are other ways to send mail: you can send a previously prepared letter, and you can mail to a number of people all at once. For more details see mail(1). (The notation mail(1) means the command mail in section 1 of the UNIX Programmer's Manual.)

Writing to other users

At some point, out of the blue will come a message like

    Message from joe tty07...

accompanied by a startling beep. It means that Joe wants to talk to you, but unless you take explicit action you won't be able to talk back. To respond, type the command

    write joe

This establishes a two-way communication path. Now whatever Joe types on his terminal will appear on yours and vice versa. The path is slow, rather like talking to the moon. (If you are in the middle of something, you have to get to a state where you can type a command. Normally, whatever program you are running has to terminate or be terminated. If you're editing, you can escape temporarily from the editor; read the editor tutorial.)

A protocol is needed to keep what you type from getting garbled up with what Joe types. Typically it's like this:

    Joe types write smith and waits.

    Smith types write joe and waits.

    Joe now types his message (as many lines as he likes). When he's ready for a reply, he signals it by typing (o), which stands for "over".

    Now Smith types a reply, also terminated by (o).

    This cycle repeats until someone gets tired; he then signals his intent to quit with (oo), for "over and out".

    To terminate the conversation, each side must type a "control-d" character alone on a line. ("Delete" also works.) When the other person types his "control-d", you will get the message EOF on your terminal.

If you write to someone who isn't logged in, or who doesn't want to be disturbed, you'll be told. If the target is logged in but doesn't answer after a decent interval, simply type "control-d".

On-line Manual

The UNIX Programmer's Manual is typically kept on-line. If you get stuck on something, and can't find an expert to assist you, you can print on your terminal some manual section that might help. This is also useful for getting the most up-to-date information on a command. To print a manual section, type "man command-name". Thus to read up on the who command, type

    man who

and, of course,

    man man

tells all about the man command.

Computer Aided Instruction

Your UNIX system may have available a program called learn, which provides computer aided instruction on the file system and basic commands, the editor, document preparation, and even C programming. Try typing the command

    learn

If learn exists on your system, it will tell you what to do from there.

II. DAY-TO-DAY USE

Creating Files: The Editor

If you have to type a paper or a letter or a program, how do you get the information stored in the machine? Most of these tasks are done with the UNIX "text editor" ed. Since ed is thoroughly documented in ed(1) and explained in A Tutorial Introduction to the
UNIX Text Editor, we won't spend any time here describing how to use it. All we want it for right now is to make some files. (A file is just a collection of information stored in the machine, a simplistic but adequate definition.)

To create a file called junk with some text in it, do the following:

    ed junk       (invokes the text editor)
    a             (command to "ed", to add text)
    now type in
    whatever text you want ...
    .             (signals the end of adding text)

The "." that signals the end of adding text must be at the beginning of a line by itself. Don't forget it, for until it is typed, no other ed commands will be recognized; everything you type will be treated as text to be added.

At this point you can do various editing operations on the text you typed in, such as correcting spelling mistakes, rearranging paragraphs and the like. Finally, you must write the information you have typed into a file with the editor command w:

    w

ed will respond with the number of characters it wrote into the file junk.

Until the w command, nothing is stored permanently, so if you hang up and go home the information is lost.† But after w the information is there permanently; you can re-access it any time by typing

    ed junk

† This is not strictly true: if you hang up while editing, the data you were working on is saved in a file called ed.hup, which you can continue with at your next session.

Type a q command to quit the editor. (If you try to quit without writing, ed will print a ? to remind you. A second q gets you out regardless.)

Now create a second file called temp in the same manner. You should now have two files, junk and temp.

What files are out there?

The ls (for "list") command lists the names (not contents) of any of the files that UNIX knows about. If you type

    ls

the response will be

    junk
    temp

which are indeed the two files just created. The names are sorted into alphabetical order automatically, but other variations are possible. For example, the command

    ls -t

causes the files to be listed in the order in which they were last changed, most recent first. The -l option gives a "long" listing:

    ls -l

will produce something like

    -rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk
    -rw-rw-rw- 1 bwk 78 Jul 22 2:57 temp

The date and time are of the last change to the file. The 41 and 78 are the number of characters (which should agree with the numbers you got from ed). bwk is the owner of the file, that is, the person who created it. The -rw-rw-rw- tells who has permission to read and write the file, in this case everyone.

Options can be combined: ls -lt gives the same thing as ls -l, but sorted into time order. You can also name the files you're interested in, and ls will list the information about them only. More details can be found in ls(1).

The use of optional arguments that begin with a minus sign, like -t and -lt, is a common convention for UNIX programs. In general, if a program accepts such optional arguments, they precede any filename arguments. It is also vital that you separate the various arguments with spaces: ls-l is not the same as ls -l.

Printing Files

Now that you've got a file of text, how do you print it so people can look at it? There are a host of programs that do that, probably more than are needed.

One simple thing is to use the editor, since printing is often done just before making changes anyway. You can say

    ed junk
    1,$p

ed will reply with the count of the characters in junk and then print all the lines in the file. After you learn how to use the editor, you can be selective about the parts you print.

There are times when it's not feasible to use the editor for printing. For example, there is a limit on how big a file ed can handle (several thousand lines). Secondly, it will only print one file at a time, and sometimes you want to print several, one after another. So here are a couple of alternatives.

First is cat, the simplest of all the printing programs. cat simply prints on the terminal the contents of all the files named in a list. Thus

    cat junk
prints one file, and

    cat junk temp

prints two. The files are simply concatenated (hence the name "cat") onto the terminal.

pr produces formatted printouts of files. As with cat, pr prints all the files named in a list. The difference is that it produces headings with date, time, page number and file name at the top of each page, and extra lines to skip over the fold in the paper. Thus,

    pr junk temp

will print junk neatly, then skip to the top of a new page and print temp neatly.

pr can also produce multi-column output:

    pr -3 junk

prints junk in 3-column format. You can use any reasonable number in place of "3" and pr will do its best. pr has other capabilities as well; see pr(1).

It should be noted that pr is not a formatting program in the sense of shuffling lines around and justifying margins. The true formatters are nroff and troff, which we will get to in the section on document preparation.

There are also programs that print files on a high-speed printer. Look in your manual under opr and lpr. Which to use depends on what equipment is attached to your machine.

Shuffling Files About

Now that you have some files in the file system and some experience in printing them, you can try bigger things. For example, you can move a file from one place to another (which amounts to giving it a new name), like this:

    mv junk precious

This means that what used to be "junk" is now "precious". If you do an ls command now, you will get

    precious
    temp

Beware that if you move a file to another one that already exists, the already existing contents are lost forever.

If you want to make a copy of a file (that is, to have two versions of something), you can use the cp command:

    cp precious temp1

makes a duplicate copy of precious in temp1.

Finally, when you get tired of creating and moving files, there is a command to remove files from the file system, called rm.

    rm temp temp1

will remove both of the files named.

You will get a warning message if one of the named files wasn't there, but otherwise rm, like most UNIX commands, does its work silently. There is no prompting or chatter, and error messages are occasionally curt. This terseness is sometimes disconcerting to newcomers, but experienced users find it desirable.

What's in a Filename

So far we have used filenames without ever saying what's a legal name, so it's time for a couple of rules. First, filenames are limited to 14 characters, which is enough to be descriptive. Second, although you can use almost any character in a filename, common sense says you should stick to ones that are visible, and that you should probably avoid characters that might be used with other meanings. We have already seen, for example, that in the ls command, ls -t means to list in time order. So if you had a file whose name was -t, you would have a tough time listing it by name. Besides the minus sign, there are other characters which have special meaning. To avoid pitfalls, you would do well to use only letters, numbers and the period until you're familiar with the situation.

On to some more positive suggestions. Suppose you're typing a large document like a book. Logically this divides into many small pieces, like chapters and perhaps sections. Physically it must be divided too, for ed will not handle really big files. Thus you should type the document as a number of files. You might have a separate file for each chapter, called

    chap1
    chap2
    etc...

Or, if each chapter were broken into several files, you might have

    chap1.1
    chap1.2
    chap1.3
    ...
    chap2.1
    chap2.2
    ...

You can now tell at a glance where a particular file fits into the whole.

There are advantages to a systematic naming convention which are not obvious to the novice UNIX user. What if you wanted to print the whole book? You could say

    pr chap1.1 chap1.2 chap1.3 ......

but you would get tired pretty fast, and would probably even make mistakes. Fortunately, there is a shortcut. You can say
    pr chap*

The * means "anything at all," so this translates into "print all files whose names begin with chap", listed in alphabetical order.

This shorthand notation is not a property of the pr command, by the way. It is system-wide, a service of the program that interprets commands (the "shell," sh(1)). Using that fact, you can see how to list the names of the files in the book:

    ls chap*

produces

    chap1.1
    chap1.2
    chap1.3
    ...

The * is not limited to the last position in a filename; it can be anywhere and can occur several times. Thus

    rm *junk* *temp*

removes all files that contain junk or temp as any part of their name. As a special case, * by itself matches every filename, so

    pr *

prints all your files (alphabetical order), and

    rm *

removes all files. (You had better be very sure that's what you wanted to say!)

The * is not the only pattern-matching feature available. Suppose you want to print only chapters 1 through 4 and 9. Then you can say

    pr chap[12349]*

The [...] means to match any of the characters inside the brackets. A range of consecutive letters or digits can be abbreviated, so you can also do this with

    pr chap[1-49]*

Letters can also be used within brackets: [a-z] matches any character in the range a through z.

The ? pattern matches any single character, so

    ls ?

lists all files which have single-character names, and

    ls -l chap?.1

lists information about the first file of each chapter (chap1.1, chap2.1, etc.).

Of these niceties, * is certainly the most useful, and you should get used to it. The others are frills, but worth knowing.

If you should ever have to turn off the special meaning of *, ?, etc., enclose the entire argument in single quotes, as in

    ls '?'

We'll see some more examples of this shortly.

What's in a Filename, Continued

When you first made that file called junk, how did the system know that there wasn't another junk somewhere else, especially since the person in the next office is also reading this tutorial? The answer is that generally each user has a private directory, which contains only the files that belong to him. When you log in, you are "in" your directory. Unless you take special action, when you create a new file, it is made in the directory that you are currently in; this is most often your own directory, and thus the file is unrelated to any other file of the same name that might exist in someone else's directory.

The set of all files is organized into a (usually big) tree, with your files located several branches into the tree. It is possible for you to "walk" around this tree, and to find any file in the system, by starting at the root of the tree and walking along the proper set of branches. Conversely, you can start where you are and walk toward the root.

Let's try the latter first. The basic tool is the command pwd ("print working directory"), which prints the name of the directory you are currently in.

Although the details will vary according to the system you are on, if you give the command pwd, it will print something like

    /usr/your-name

This says that you are currently in the directory your-name, which is in turn in the directory /usr, which is in turn in the root directory called by convention just /. (Even if it's not called /usr on your system, you will get something analogous. Make the corresponding changes and read on.)

If you now type

    ls /usr/your-name

you should get exactly the same list of file names as you get from a plain ls: with no arguments, ls lists the contents of the current directory; given the name of a directory, it lists the contents of that directory.

Next, try

    ls /usr

This should print a long series of names, among which is your own login name your-name. On many systems, usr is a directory that contains the directories of all the normal users of the system, like you.

The next step is to try

    ls /
You should get a response something like this (although again the details may be different):

    bin
    dev
    etc
    lib
    tmp
    usr

This is a collection of the basic directories of files that the system knows about; we are at the root of the tree.

Now try

    cat /usr/your-name/junk

(if junk is still around in your directory). The name

    /usr/your-name/junk

is called the pathname of the file that you normally think of as "junk". "Pathname" has an obvious meaning: it represents the full name of the path you have to follow from the root through the tree of directories to get to a particular file. It is a universal rule in the UNIX system that anywhere you can use an ordinary filename, you can use a pathname.

Here is a picture which may make this clearer:

                        (root)
                        / | \
                       /  |  \
                      /   |   \
           bin     etc    usr    dev     tmp
          / | \   / | \  / | \  / | \   / | \
                        /  |  \
                       /   |   \
                   adam   eve   mary
                   /     /   \     \
                        /     \     junk
                     junk     temp

Notice that Mary's junk is unrelated to Eve's.

This isn't too exciting if all the files of interest are in your own directory, but if you work with someone else or on several projects concurrently, it becomes handy indeed. For example, your friends can print your book by saying

    pr /usr/your-name/chap*

Similarly, you can find out what files your neighbor has by saying

    ls /usr/neighbor-name

or make your own copy of one of his files by

    cp /usr/your-neighbor/his-file yourfile

If your neighbor doesn't want you poking around in his files, or vice versa, privacy can be arranged. Each file and directory has read-write-execute permissions for the owner, a group, and everyone else, which can be set to control access. See ls(1) and chmod(1) for details. As a matter of observed fact, most users most of the time find openness of more benefit than privacy.

As a final experiment with pathnames, try

    ls /bin /usr/bin

Do some of the names look familiar? When you run a program, by typing its name after the prompt character, the system simply looks for a file of that name. It normally looks first in your directory (where it typically doesn't find it), then in /bin and finally in /usr/bin. There is nothing magic about commands like cat or ls, except that they have been collected into a couple of places to be easy to find and administer.

What if you work regularly with someone else on common information in his directory? You could just log in as your friend each time you want to, but you can also say "I want to work on his files instead of my own". This is done by changing the directory that you are currently in:

    cd /usr/your-friend

(On some systems, cd is spelled chdir.) Now when you use a filename in something like cat or pr, it refers to the file in your friend's directory. Changing directories doesn't affect any permissions associated with a file; if you couldn't access a file from your own directory, changing to another directory won't alter that fact. Of course, if you forget what directory you're in, type

    pwd

to find out.

It is usually convenient to arrange your own files so that all the files related to one thing are in a directory separate from other projects. For example, when you write your book, you might want to keep all the text in a directory called book. So make one with

    mkdir book

then go to it with

    cd book

then start typing chapters. The book is now found in (presumably)

    /usr/your-name/book

To remove the directory book, type

    rm book/*
    rmdir book

The first command removes all files from the directory; the second removes the empty directory.

You can go up one level in the tree of files by saying

    cd ..
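The directory commands in this section can be tried together without touching your real files. What follows is only a sketch, not part of the original paper. A throwaway directory made by mktemp stands in for /usr/your-name, and echo with > stands in for typing chapters in ed. The mktemp command and the $(...) notation are assumptions about a modern shell; Version 7 had neither.

```shell
#!/bin/sh
# Hypothetical walk-through of mkdir, cd, pwd, pathnames, rm and rmdir.
set -e                          # stop at the first command that fails
scratch=$(mktemp -d)            # assumption: mktemp exists on your system
cd "$scratch"                   # pretend this is your own directory

mkdir book                      # make a directory for the book
cd book                         # go into it
pwd                             # prints a pathname ending in /book
echo "Chapter one" >chap1.1     # stand-in for typing a chapter with ed
echo "Chapter two" >chap2.1

cd ..                           # go up one level, to the parent directory
ls book                         # lists chap1.1 and chap2.1
cat "$scratch/book/chap1.1"     # the same file, named by its full pathname

rm book/*                       # remove all files from the directory
rmdir book                      # then remove the now-empty directory

cd /                            # step out so the scratch directory can go
rmdir "$scratch"
```

Note that rmdir refuses to remove a directory that still has files in it, which is why rm book/* comes first.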
".." is the name of the parent of whatever directory you are currently in. For completeness, "." is an alternate name for the directory you are in.

Using Files instead of the Terminal

Most of the commands we have seen so far produce output on the terminal; some, like the editor, also take their input from the terminal. It is universal in UNIX systems that the terminal can be replaced by a file for either or both of input and output. As one example,

    ls

makes a list of files on your terminal. But if you say

    ls >filelist

a list of your files will be placed in the file filelist (which will be created if it doesn't already exist, or overwritten if it does). The symbol > means "put the output on the following file, rather than on the terminal." Nothing is produced on the terminal. As another example, you could combine several files into one by capturing the output of cat in a file:

    cat f1 f2 f3 >temp

The symbol >> operates very much like > does, except that it means "add to the end of." That is,

    cat f1 f2 f3 >>temp

means to concatenate f1, f2 and f3 to the end of whatever is already in temp, instead of overwriting the existing contents. As with >, if temp doesn't exist, it will be created for you.

In a similar way, the symbol < means to take the input for a program from the following file, instead of from the terminal. Thus, you could make up a script of commonly used editing commands and put them into a file called script. Then you can run the script on a file by saying

    ed file <script

As another example, you can use ed to prepare a letter in file let, then send it to several people with

    mail adam eve mary joe <let

Pipes

One of the novel contributions of the UNIX system is the idea of a pipe. A pipe is simply a way to connect the output of one program to the input of another program, so the two run as a sequence of processes, a pipeline.

For example,

    pr f g h

will print the files f, g, and h, beginning each on a new page. Suppose you want them run together instead. You could say

    cat f g h >temp
    pr <temp
    rm temp

but this is more work than necessary. Clearly what we want is to take the output of cat and connect it to the input of pr. So let us use a pipe:

    cat f g h | pr

The vertical bar | means to take the output from cat, which would normally have gone to the terminal, and put it into pr to be neatly formatted.

There are many other examples of pipes. For example,

    ls | pr -3

prints a list of your files in three columns. The program wc counts the number of lines, words and characters in its input, and as we saw earlier, who prints a list of the people currently logged on, one per line. Thus

    who | wc

tells how many people are logged on. And of course

    ls | wc

counts your files.

Any program that reads from the terminal can read from a pipe instead; any program that writes on the terminal can drive a pipe. You can have as many elements in a pipeline as you wish.

Many UNIX programs are written so that they will take their input from one or more files if file arguments are given; if no arguments are given they will read from the terminal, and thus can be used in pipelines. pr is one example:

    pr -3 a b c

prints files a, b and c in order in three columns. But in

    cat a b c | pr -3

pr prints the information coming down the pipeline, still in three columns.

The Shell

We have already mentioned once or twice the mysterious "shell," which is in fact sh(1). The shell is the program that interprets what you type as commands and arguments. It also looks after translating *, etc., into lists of filenames, and <, >, and | into changes of input and output streams.

The shell has other capabilities too. For example, you can run two programs with one command line by separating the commands with a semicolon; the shell recognizes the semicolon and
breaks the line into two commands. Thus

    date; who

does both commands before returning with a prompt character.

You can also have more than one program running simultaneously if you
wish. For example, if you are doing something time-consuming, like the
editor script of an earlier section, and you don’t want to wait around
for the results before starting something else, you can say

    ed file <script &

The ampersand at the end of a command line says “start this command
running, then take further commands from the terminal immediately,”
that is, don’t wait for it to complete. Thus the script will begin,
but you can do something else at the same time. Of course, to keep the
output from interfering with what you’re doing on the terminal, it
would be better to say

    ed file <script >script.out &

which saves the output lines in a file called script.out.

When you initiate a command with &, the system replies with a number
called the process number, which identifies the command in case you
later want to stop it. If you do, you can say

    kill process-number

If you forget the process number, the command ps will tell you about
everything you have running. (If you are desperate, kill 0 will kill
all your processes.) And if you’re curious about other people, ps a
will tell you about all programs that are currently running.

You can say

    (command-1; command-2; command-3) &

to start three commands in the background, or you can start a
background pipeline with

    command-1 | command-2 &

Just as you can tell the editor or some similar program to take its
input from a file instead of from the terminal, you can tell the shell
to read a file to get commands. (Why not? The shell, after all, is
just a program, albeit a clever one.) For instance, suppose you want
to set tabs on your terminal, and find out the date and who’s on the
system every time you log in. Then you can put the three necessary
commands (tabs, date, who) into a file, let’s call it startup, and
then run it with

    sh startup

This says to run the shell with the file startup as input. The effect
is as if you had typed the contents of startup on the terminal.

If this is to be a regular thing, you can eliminate the need to type
sh: simply type, once only, the command

    chmod +x startup

and thereafter you need only say

    startup

to run the sequence of commands. The chmod(1) command marks the file
executable; the shell recognizes this and runs it as a sequence of
commands.

If you want startup to run automatically every time you log in, create
a file in your login directory called .profile, and place in it the
line startup. When the shell first gains control when you log in, it
looks for the .profile file and does whatever commands it finds in it.
We’ll get back to the shell in the section on programming.

III. DOCUMENT PREPARATION

UNIX systems are used extensively for document preparation. There are
two major formatting programs, that is, programs that produce a text
with justified right margins, automatic page numbering and titling,
automatic hyphenation, and the like. nroff is designed to produce
output on terminals and line-printers. troff (pronounced “tee-roff”)
instead drives a phototypesetter, which produces very high quality
output on photographic paper. This paper was formatted with troff.

Formatting Packages

The basic idea of nroff and troff is that the text to be formatted
contains within it “formatting commands” that indicate in detail how
the formatted text is to look. For example, there might be commands
that specify how long lines are, whether to use single or double
spacing, and what running titles to use on each page.

Because nroff and troff are relatively hard to learn to use
effectively, several “packages” of canned formatting requests are
available to let you specify paragraphs, running titles, footnotes,
multi-column output, and so on, with little effort and without having
to learn nroff and troff. These packages take a modest effort to
learn, but the rewards for using them are so great that it is time
well spent.

In this section, we will provide a hasty look at the “manuscript”
package known as -ms. Formatting requests typically consist of a
period and two upper-case letters, such as .TL, which is used to
introduce a title, or .PP to begin a new paragraph.

A document is typed so it looks something like this:

    .TL
    title of document
    .AU
    author name
    .SH
    section heading
    .PP
    paragraph ...
    .PP
    another paragraph ...
    .SH
    another section heading
    .PP
    etc.

The lines that begin with a period are the formatting requests. For
example, .PP calls for starting a new paragraph. The precise meaning
of .PP depends on what output device is being used (typesetter or
terminal, for instance), and on what publication the document will
appear in. For example, -ms normally assumes that a paragraph is
preceded by a space (one line in nroff, 1/2 line in troff), and the
first word is indented. These rules can be changed if you like, but
they are changed by changing the interpretation of .PP, not by
re-typing the document.

To actually produce a document in standard format using -ms, use the
command

    troff -ms files ...

for the typesetter, and

    nroff -ms files ...

for a terminal. The -ms argument tells troff and nroff to use the
manuscript package of formatting requests.

There are several similar packages; check with a local expert to
determine which ones are in common use on your machine.

Supporting Tools

In addition to the basic formatters, there is a host of supporting
programs that help with document preparation. The list in the next few
paragraphs is far from complete, so browse through the manual and
check with people around you for other possibilities.

eqn and neqn let you integrate mathematics into the text of a
document, in an easy-to-learn language that closely resembles the way
you would speak it aloud. For example, the eqn input

    sum from i=0 to n x sub i ~=~ pi over 2

produces the output

    $$\sum_{i=0}^{n} x_i = \frac{\pi}{2}$$

The program tbl provides an analogous service for preparing tabular
material; it does all the computations necessary to align complicated
columns with elements of varying widths.

refer prepares bibliographic citations from a data base, in whatever
style is defined by the formatting package. It looks after all the
details of numbering references in sequence, filling in page and
volume numbers, getting the author’s initials and the journal name
right, and so on.

spell and typo detect possible spelling mistakes in a document. spell
works by comparing the words in your document to a dictionary,
printing those that are not in the dictionary. It knows enough about
English spelling to detect plurals and the like, so it does a very
good job. typo looks for words which are “unusual”, and prints those.
Spelling mistakes tend to be more unusual, and thus show up early when
the most unusual words are printed first.

grep looks through a set of files for lines that contain a particular
text pattern (rather like the editor’s context search does, but on a
bunch of files). For example,

    grep 'ing$' chap*

will find all lines that end with the letters ing in the files chap*.
(It is almost always a good practice to put single quotes around the
pattern you’re searching for, in case it contains characters like * or
$ that have a special meaning to the shell.) grep is often useful for
finding out in which of a set of files the misspelled words detected
by spell are actually located.

diff prints a list of the differences between two files, so you can
compare two versions of something automatically (which certainly beats
proofreading by hand).

wc counts the words, lines and characters in a set of files. tr
translates characters into other characters; for example it will
convert upper to lower case and vice versa. This translates upper into
lower:

    tr A-Z a-z <input >output

sort sorts files in a variety of ways; cref makes cross-references;
ptx makes a permuted index (keyword-in-context listing). sed provides
many of the editing facilities of ed, but can apply them to
arbitrarily long inputs. awk provides the ability to do both pattern
matching and numeric computations, and to conveniently process fields
within lines. These programs are for more advanced users, and they are
not limited to document preparation. Put them on your list of things
to learn about.

Most of these programs are either independently documented (like eqn
and tbl), or are sufficiently simple that the description in the UNIX
Programmer’s Manual is adequate explanation.
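
As a rough illustration of the supporting tools described in this section, the sketch below runs grep, tr, wc and diff on two throwaway files. The file names draft1 and draft2 and their contents are invented for the example, and the script assumes a present-day POSIX shell with mktemp available, not the shell of the period.

```shell
#!/bin/sh
# Sketch only: draft1 and draft2 are made-up files for illustration.
work=$(mktemp -d) || exit 1      # scratch directory (mktemp is assumed available)
cd "$work" || exit 1

printf 'Testing the shell\nis interesting\nand worth doing\n' >draft1
printf 'Testing the shell\nis rewarding\nand worth doing\n' >draft2

# grep: lines ending in "ing"; the pattern is quoted to protect $ from the shell
ing_lines=$(grep 'ing$' draft1)

# tr: translate upper case into lower case
lower=$(tr A-Z a-z <draft1)

# wc: count lines (tr strips the padding some versions of wc add)
line_count=$(wc -l <draft1 | tr -d ' ')

# diff: list the differences between two versions of a file
changes=$(diff draft1 draft2) || true   # diff exits 1 when the files differ

printf '%s\n' "$ing_lines" "$lower" "$line_count" "$changes"

cd / && rm -rf "$work"           # clean up the scratch directory
```

Capturing each result in a shell variable is only a convenience for the example; at the terminal you would normally run each command directly and read its output.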

Hints for Preparing Documents

Most documents go through several versions (always more than you
expected) before they are finally finished. Accordingly, you should do
whatever possible to make the job of changing them easy.

First, when you do the purely mechanical operations of typing, type so
that subsequent editing will be easy. Start each sentence on a new
line. Make lines short, and break lines at natural places, such as
after commas and semicolons, rather than randomly. Since most people
change documents by rewriting phrases and adding, deleting and
rearranging sentences, these precautions simplify any editing you have
to do later.

Keep the individual files of a document down to modest size, perhaps
ten to fifteen thousand characters. Larger files edit more slowly, and
of course if you make a dumb mistake it’s better to have clobbered a
small file than a big one. Split into files at natural boundaries in
the document, for the same reasons that you start each sentence on a
new line.

The second aspect of making change easy is to not commit yourself to
formatting details too early. One of the advantages of formatting
packages like -ms is that they permit you to delay decisions to the
last possible moment. Indeed, until a document is printed, it is not
even decided whether it will be typeset or put on a line printer.

As a rule of thumb, for all but the most trivial jobs, you should type
a document in terms of a set of requests like .PP, and then define
them appropriately, either by using one of the canned packages (the
better way) or by defining your own nroff and troff commands. As long
as you have entered the text in some systematic way, it can always be
cleaned up and re-formatted by a judicious combination of editing
commands and request definitions.

IV. PROGRAMMING

There will be no attempt made to teach any of the programming
languages available but a few words of advice are in order. One of the
reasons why the UNIX system is a productive programming environment is
that there is already a rich set of tools available, and facilities
like pipes, I/O redirection, and the capabilities of the shell often
make it possible to do a job by pasting together programs that already
exist instead of writing from scratch.

The Shell

The pipe mechanism lets you fabricate quite complicated operations out
of spare parts that already exist. For example, the first draft of the
spell program was (roughly)

    cat ...      collect the files
    | tr ...     put each word on a new line
    | tr ...     delete punctuation, etc.
    | sort       into dictionary order
    | uniq       discard duplicates
    | comm       print words in text but not in dictionary

More pieces have been added subsequently, but this goes a long way for
such a small effort.

The editor can be made to do things that would normally require
special programs on other systems. For example, to list the first and
last lines of each of a set of files, such as a book, you could
laboriously type

    ed
    e chap1.1
    1p
    $p
    e chap1.2
    1p
    $p
    etc.

But you can do the job much more easily. One way is to type

    ls chap* >temp

to get the list of filenames into a file. Then edit this file to make
the necessary series of editing commands (using the global commands of
ed), and write it into script. Now the command

    ed <script

will produce the same output as the laborious hand typing. Alternately
(and more easily), you can use the fact that the shell will perform
loops, repeating a set of commands over and over again for a set of
arguments:

    for i in chap*
    do
        ed $i <script
    done

This sets the shell variable i to each file name in turn, then does
the command. You can type this command at the terminal, or put it in a
file for later execution.

Programming the Shell

An option often overlooked by newcomers is that the shell is itself a
programming language, with variables, control flow (if-else, while,
for, case), subroutines, and interrupt handling. Since there are many
building-block programs, you can sometimes avoid writing a new program
merely by piecing together some of the building blocks with shell
command files.

We will not go into any details here; examples and rules can be found
in An Introduction to the UNIX Shell, by S. R. Bourne.
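
The first-draft spell pipeline and the shell loop from this section can be sketched as follows. This is only an approximation under stated assumptions: the text leaves the tr and comm arguments elided, so the options shown here are guesses at one workable choice, the sample chapter and word list are invented, sed stands in for the ed script when printing first and last lines, and a present-day POSIX shell with mktemp is assumed.

```shell
#!/bin/sh
# Sketch only: the tr and comm options below are assumptions, not the
# options of the original spell draft, which the text leaves elided.
LC_ALL=C; export LC_ALL          # fixed collating order so sort and comm agree
work=$(mktemp -d) || exit 1      # scratch directory (mktemp is assumed available)
cd "$work" || exit 1

# Hypothetical sample chapter and a tiny, already sorted word list.
printf 'The quick brown fox.\nThe quikc brown dog!\n' >chap1
printf 'brown\ndog\nfox\nquick\nthe\n' >dict

misspelled=$(
    cat chap1 |                  # collect the files
    tr -cs 'A-Za-z' '\n' |       # put each word on a new line, dropping punctuation
    tr 'A-Z' 'a-z' |             # fold upper case into lower
    sort |                       # into dictionary order
    uniq |                       # discard duplicates
    comm -23 - dict              # print words in text but not in dictionary
)
echo "$misspelled"

# Shell loop over a set of files: first and last line of each,
# using sed here in place of the ed script from the text.
first_last=$(
    for i in chap*
    do
        sed -n -e 1p -e '$p' "$i"
    done
)
echo "$first_last"

cd / && rm -rf "$work"           # clean up the scratch directory
```

Note that comm needs both of its inputs sorted in the same order, which is why the word list is kept sorted and the collating order is pinned at the top of the script.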

Programming in C

If you are undertaking anything substantial, C is the only reasonable
choice of programming language: everything in the UNIX system is tuned
to it. The system itself is written in C, as are most of the programs
that run on it. It is also an easy language to use once you get
started. C is introduced and fully described in The C Programming
Language by B. W. Kernighan and D. M. Ritchie (Prentice-Hall, 1978).
Several sections of the manual describe the system interfaces, that
is, how you do I/O and similar functions. Read UNIX Programming for
more complicated things.

Most input and output in C is best handled with the standard I/O
library, which provides a set of I/O functions that exist in
compatible form on most machines that have C compilers. In general,
it’s wisest to confine the system interactions in a program to the
facilities provided by this library.

C programs that don’t depend too much on special features of UNIX
(such as pipes) can be moved to other computers that have C compilers.
The list of such machines grows daily; in addition to the original
PDP-11, it currently includes at least Honeywell 6000, IBM 370,
Interdata 8/32, Data General Nova and Eclipse, HP 2100, Harris /7, VAX
11/780, SEL 86, and Zilog Z80. Calls to the standard I/O library will
work on all of these machines.

There are a number of supporting programs that go with C. lint checks
C programs for potential portability problems, and detects errors such
as mismatched argument types and uninitialized variables.

For larger programs (anything whose source is on more than one file)
make allows you to specify the dependencies among the source files and
the processing steps needed to make a new version; it then checks the
times that the pieces were last changed and does the minimal amount of
recompiling to create a consistent updated version.

The debugger adb is useful for digging through the dead bodies of C
programs, but is rather hard to learn to use effectively. The most
effective debugging tool is still careful thought, coupled with
judiciously placed print statements.

The C compiler provides a limited instrumentation service, so you can
find out where programs spend their time and what parts are worth
optimizing. Compile the routines with the -p option; after the test
run, use prof to print an execution profile. The command time will
give you the gross run-time statistics of a program, but they are not
super accurate or reproducible.

Other Languages

If you have to use Fortran, there are two possibilities. You might
consider Ratfor, which gives you the decent control structures and
free-form input that characterize C, yet lets you write code that is
still portable to other environments. Bear in mind that UNIX Fortran
tends to produce large and relatively slow-running programs.
Furthermore, supporting software like adb, prof, etc., are all
virtually useless with Fortran programs. There may also be a Fortran
77 compiler on your system. If so, this is a viable alternative to
Ratfor, and has the non-trivial advantage that it is compatible with C
and related programs. (The Ratfor processor and C tools can be used
with Fortran 77 too.)

If your application requires you to translate a language into a set of
actions or another language, you are in effect building a compiler,
though probably a small one. In that case, you should be using the
yacc compiler-compiler, which helps you develop a compiler quickly.
The lex lexical analyzer generator does the same job for the simpler
languages that can be expressed as regular expressions. It can be used
by itself, or as a front end to recognize inputs for a yacc-based
program. Both yacc and lex require some sophistication to use, but the
initial effort of learning them can be repaid many times over in
programs that are easy to change later on.

Most UNIX systems also make available other languages, such as Algol
68, APL, Basic, Lisp, Pascal, and Snobol. Whether these are useful
depends largely on the local environment: if someone cares about the
language and has worked on it, it may be in good shape. If not, the
odds are strong that it will be more trouble than it’s worth.

V. UNIX READING LIST

General:

K. L. Thompson and D. M. Ritchie, The UNIX Programmer’s Manual, Bell
Laboratories, 1978. Lists commands, system routines and interfaces,
file formats, and some of the maintenance procedures. You can’t live
without this, although you will probably only need to read section 1.

Documents for Use with the UNIX Time-sharing System. Volume 2 of the
Programmer’s Manual. This contains more extensive descriptions of
major commands, and tutorials and reference manuals. All of the papers
listed below are in it, as are descriptions of most of the programs
mentioned above.

D. M. Ritchie and K. L. Thompson, “The UNIX Time-sharing System,”
CACM, July 1974. An overview of the system, for people interested in
operating systems. Worth reading by anyone who programs. Contains a
remarkable number of one-sentence observations on how to do things
right.

The Bell System Technical Journal (BSTJ) Special Issue on UNIX,
July/August, 1978, contains many papers describing recent
developments, and some retrospective material.

The 2nd International Conference on Software Engineering (October,
1976) contains several papers describing the use of the Programmer’s
Workbench (PWB) version of UNIX.

Document Preparation:

B. W. Kernighan, “A Tutorial Introduction to the UNIX Text Editor” and
“Advanced Editing on UNIX,” Bell Laboratories, 1978. Beginners need
the introduction; the advanced material will help you get the most out
of the editor.

M. E. Lesk, “Typing Documents on UNIX,” Bell Laboratories, 1978.
Describes the -ms macro package, which isolates the novice from the
vagaries of nroff and troff, and takes care of most formatting
situations. If this specific package isn’t available on your system,
something similar probably is. The most likely alternative is the
PWB/UNIX macro package -mm; see your local guru if you use PWB/UNIX.

B. W. Kernighan and L. L. Cherry, “A System for Typesetting
Mathematics,” Bell Laboratories Computing Science Tech. Rep. 17.

M. E. Lesk, “Tbl - A Program to Format Tables,” Bell Laboratories CSTR
49, 1976.

J. F. Ossanna, Jr., “NROFF/TROFF User’s Manual,” Bell Laboratories
CSTR 54, 1976. troff is the basic formatter used by -ms, eqn and tbl.
The reference manual is indispensable if you are going to write or
maintain these or similar programs. But start with:

B. W. Kernighan, “A TROFF Tutorial,” Bell Laboratories, 1976. An
attempt to unravel the intricacies of troff.

Programming:

B. W. Kernighan and D. M. Ritchie, The C Programming Language,
Prentice-Hall, 1978. Contains a tutorial introduction, complete
discussions of all language features, and the reference manual.

B. W. Kernighan and D. M. Ritchie, “UNIX Programming,” Bell
Laboratories, 1978. Describes how to interface with the system from C
programs: I/O calls, signals, processes.

S. R. Bourne, “An Introduction to the UNIX Shell,” Bell Laboratories,
1978. An introduction and reference manual for the Version 7 shell.
Mandatory reading if you intend to make effective use of the
programming power of this shell.

S. C. Johnson, “Yacc - Yet Another Compiler-Compiler,” Bell
Laboratories CSTR 32, 1978.

M. E. Lesk, “Lex - A Lexical Analyzer Generator,” Bell Laboratories
CSTR 39, 1975.

S. C. Johnson, “Lint, a C Program Checker,” Bell Laboratories CSTR 65,
1977.

S. I. Feldman, “MAKE - A Program for Maintaining Computer Programs,”
Bell Laboratories CSTR 57, 1977.

J. F. Maranzano and S. R. Bourne, “A Tutorial Introduction to ADB,”
Bell Laboratories CSTR 62, 1977. An introduction to a powerful but
complex debugging tool.

S. I. Feldman and P. J. Weinberger, “A Portable Fortran 77 Compiler,”
Bell Laboratories, 1978. A full Fortran 77 for UNIX systems.
