Ctut
Mark Burgess
Faculty of Engineering, Oslo College
Ron Hale-Evans
Copyright © 2002 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.1 or any later
version published by the Free Software Foundation; there being no Invariant
Section, with the Front-Cover Texts being “A GNU Manual”, and with the
Back-Cover Texts as in (a) below. A copy of the license is included in the
section entitled “GNU Free Documentation License”.
(a) The FSF’s Back-Cover Text is: “You have freedom to copy and modify
this GNU Manual, like GNU software. Copies published by the Free Software
Foundation raise funds for GNU development.”
Preface
This book is a tutorial for the computer programming language C. Unlike BASIC or Pascal,
C was not written as a teaching aid, but as a professional tool. Programmers love C!
Moreover, C is a standard, widely-used language, and a single C program can often be
made to run on many different kinds of computer. As Richard M. Stallman remarks in
GNU Coding Standards, “Using another language is like using a non-standard feature: it
will cause trouble for users.” (See http://www.gnu.org/prep/standards_toc.html.)
Skeptics have said that everything that can go wrong in C, does. True, it can be unforgiving,
and holds some difficulties that are not obvious at first, but that is because it does
not withhold its powerful capabilities from the beginner. If you have come to C seeking a
powerful language for writing everyday computer programs, you will not be disappointed.
To get the most from this book, you should have some basic computer literacy — you
should be able to run a program, edit a text file, and so on. You should also have access to
a computer running a GNU system such as GNU/Linux. (For more information on GNU
and the philosophy of free software, see http://www.gnu.org/philosophy/.)
The tutorial introduces basic ideas in a logical order and progresses steadily. You do
not need to follow the order of the chapters rigorously, but if you are a beginner to C, it is
recommended that you do. Later, you can return to this book and copy C code from it; the
many examples range from tiny programs that illustrate the use of one simple feature, to
complete applications that fill several pages. Along the way, there are also brief discussions
of the philosophy behind C.
Computer languages have been around so long that some jargon has developed. You
should not ignore this jargon entirely, because it is the language that programmers speak.
Jargon is explained wherever necessary, but kept to a minimum. There is also a glossary at
the back of the book.
The authors hope that this book teaches you everything you need to write simple C
programs. Further, it is released under the GNU Free Documentation
License, so as the computers and robots in the fantasies of Douglas Adams say, “Share and
Enjoy!”
The first edition of this book was written in 1987, then updated and rewritten in 1999. It
was originally published by Dabs Press. After it went out of print, David Atherton of Dabs
and the original author, Mark Burgess, agreed to release the manuscript. At the request of
the Free Software Foundation, the book was further revised by Ron Hale-Evans in 2001 and
2002.
The current edition is written in Texinfo, which is a documentation system using a single
source file to produce both online information and printed output. You can read this tutorial
online with either the Emacs Info reader, the stand-alone Info reader, or a World Wide
Web browser, or you can read it as a printed book.
1 Introduction
What is a high-level language? Why is C unusual?
Any sufficiently complex object has levels of detail; the amount of detail we see depends
on how closely we scrutinize the object. A computer has many levels of detail.
The terms low level and high level are often used to describe these layers of complexity
in computers. The low level is buried in the computer’s microchips and microcircuits. The
low level is the level at which the computer seems most primitive and mechanical, whereas
the high level describes the computer in less detail, and makes it easier to use.
You can see high levels and low levels in the workings of a car. In a car, the nuts, bolts,
and pistons of the low level can be grouped together conceptually to form the higher-level
engine. Without knowing anything about the nuts and bolts, you can treat the engine as a
black box: a simple unit that behaves in predictable ways. At an even higher level (the one
most people use when driving), you can see a car as a group of these black boxes, including
the engine, the steering, the brakes, and so on. At a high level, a computer also becomes a
group of black boxes.
C is a high-level language. The aim of any high-level computer language is to provide
an easy, natural way to give a list of instructions (a computer program) to a computer.
The native language of the computer is a stream of numbers called machine language. As
you might expect, the action resulting from a single machine language instruction is very
primitive, and many thousands of them can be required to do something substantial. A
high-level language provides a set of instructions you can recombine creatively and give
to the imaginary black boxes of the computer. The high-level language software will then
translate these high-level instructions into low-level machine language instructions.
• C allows commands that are invalid in other languages. This is no defect, but a powerful
freedom which, when used with caution, makes many things possible. It does mean
that there are concealed difficulties in C, but if you write carefully and thoughtfully,
you can create fast, efficient programs.
• With C, you can use every resource your computer offers. C tries to link closely with
the local environment, providing facilities for gaining access to common peripherals
like disk drives and printers. When new peripherals are invented, the GNU community
quickly provides the ability to program them in C as well. In fact, most of the GNU
project is written in C (as are many other operating systems).
For the reasons outlined above, C is the preeminent high-level language. Clearly, no
language can guarantee good programs, but C can provide a framework in which it is easy
to program well.
2 Using a compiler
How to use a compiler. What can go wrong.
The operating system is the layer of software that drives the hardware of a computer and
provides the user with a comfortable work environment. Operating systems vary, but most
have a shell, or text interface. You use the GNU shell every time you type in a command
that launches an email program or text editor under GNU.
In the following sections of this chapter, we will explore how to create a C program from
the GNU shell, and what might go wrong when you do.
2. Next, the compiler converts the intermediate code (if there is any) or the original
source code into an object code file, which contains machine language but is not yet
executable. The compiler builds a separate object file for each source file. These are
only temporary and are deleted by the compiler after compilation.
3. Finally, the compiler runs a linker. The linker merges the newly-created object code
with some standard, built-in object code to produce an executable file that can stand
alone.
GNU environments use a simple command to invoke the C compiler: gcc, which stands for
“GNU Compiler Collection”. (It used to stand for “GNU C Compiler”, but now GCC can
compile many more languages than just C.) Thus, to compile a small program, you will
usually type something like the following command:
gcc file_name
On GNU systems, this results in the creation of an executable program with the default
name ‘a.out’. To tell the compiler you would like the executable program to be called
something else, use the ‘-o’ option, which sets the name of the output file:
gcc -o program_name file_name
For example, to create a program called ‘myprog’ from a file called ‘myprog.c’, write
gcc -o myprog myprog.c
To launch the resulting program ‘myprog’ from the same directory, type
./myprog
The file name endings, or file extensions, identify the contents of files to the compiler. For
example, the ‘.c’ suffix tells the compiler that the file contains C source code, and the other
letters indicate other kinds of files in a similar way.
2.4 Errors
Errors are mistakes that programmers make in their code. There are two main kinds of
errors.
• Compile-time errors are errors caught by the compiler. They can be syntax errors, such
as typing fro instead of for, or they can be errors caused by the incorrect construction
of your program. For example, you might tell the compiler that a certain variable is
an integer, then attempt to give it a non-integer value such as 5.23. (See Section 2.4.2
[Type errors], page 6.)
The compiler lists all compile-time errors at once, with the line number at which each
error occurred in the source code, and a message that explains what went wrong.
For example, suppose that, in your file ‘eg.c’ you write
y = sin (x];
instead of
y = sin (x);
(By the way, this is an example of assignment. With the equals sign (‘=’), you are
assigning the variable y (causing the variable y to contain) the sine of the variable x.
This is somewhat different from the way equals signs work in mathematics. In math,
an equals sign indicates that the numbers and variables on either side of it are already
equal; in C, an equals sign makes things equal. Sometimes it is useful to think of the
equals sign as an abbreviation for the phrase “becomes the value of”.)
Ignore the syntactic details of the statements above for now, except to note that closing
the (x) with a square bracket instead of a parenthesis is an error in C. Upon
compilation, you will see something like this error message:
error
eg.c: In function ‘main’:
eg.c:8: parse error before ‘]’
(If you compile the program within Emacs, you can jump directly to the error. We will
discuss this feature later. See Chapter 23 [Debugging], page 207, for more information.)
A program with compile-time errors will cause the compiler to halt, and will not produce
an executable. However, the compiler will check the syntax up to the last line of your
source code before stopping, and it is common for a single real error, even something
as simple as a missing parenthesis, to result in a huge and confusing list of nonexistent
“errors” from the compiler. This can be shocking and disheartening to novices, but
you’ll get used to it with experience. (We will provide an example later in the book.
See Chapter 23 [Debugging], page 207.)
As a rule, the best way to approach this kind of problem is to look for the first error,
fix that, and then recompile. You will soon come to recognize when subsequent error
messages are due to independent problems and when they are due to a cascade.
• Run-time errors are errors that occur in a compiled and running program, sometimes
long after it has been compiled.
One kind of run-time error happens when you write a running program that does not
do what you intend. For example, you intend to send a letter to all drivers whose
licenses will expire in June, but instead, you send a letter to all drivers whose licenses
will ever expire.
Another kind of run-time error can cause your program to crash, or quit abruptly. For
example, you may tell the computer to examine a part of its memory that doesn’t exist,
or to divide some variable by zero. Fortunately, the GNU environment is extremely
stable, and very little will occur other than an error message in your terminal window
when you crash a program you are writing under GNU.
/******************************************************/
3.2 Comments
Annotating programs.
Comments are a way of inserting remarks and reminders into code without affecting its
behavior. Since comments are only read by other humans, you can put anything you wish
to in a comment, but it is better to be informative than humorous.
The compiler treats comments as though they were whitespace (blank characters, such
as spaces, tabs, or carriage returns) and ignores them. During compilation, comments
are simply stripped out of the code, so programs can contain any number of comments
without losing speed.
Because a comment is treated as whitespace, it can be placed anywhere whitespace is
valid, even in the middle of a statement. (Such a practice can make your code difficult to
read, however.)
Any text sandwiched between ‘/*’ and ‘*/’ in C code is a comment. Here is an example
of a C comment:
/* ...... comment ......*/
Comments do not necessarily terminate at the end of a line, only with the characters
‘*/’. If you forget to close a comment with the characters ‘*/’, the compiler will display an
‘unterminated comment’ error when you try to compile your code.
3.3 Example 1
#include <stdio.h> /* header file */
do_little();
/**********************************************/
do_little ()
{
4 Functions
{
variable declarations
statements
...
...
...
}
You may notice when reading the examples in this chapter that this format is somewhat
different from the one we have used so far. This format conforms to the ANSI Standard
and is better C. The other way is old-fashioned C, although GCC will still compile it.
Nevertheless, GCC is not guaranteed to do so in the future, and we will use ANSI Standard
C in this text from now on.
As shown above, a function can have a number of parameters, or pieces of information
from outside, and the function’s body consists of a number of declarations and statements,
enclosed by curly brackets: ‘{...}’.
Note that with GCC, you can also use dollar signs (‘$’) in identifiers. This is one of
GCC’s extensions to the C language, and is not part of the ANSI standard. It also may not
be supported under GCC on certain hardware platforms.
c = a + b;
printf ("%d\n", c);
}
The variables a and b are parameters passed in from outside the function. The code defines
a, b, and c to be of type int, or integer.
The function above is not much use standing alone. Here is a main function that calls
the add_two_numbers function:
int main()
{
  int var1, var2;
  var1 = 1;
  var2 = 53;
  add_two_numbers (var1, var2);
  add_two_numbers (1, 2);
  exit(0);
}
When these functions are incorporated into a C program, together they print the number
54, then they print the number 3, and then they stop.
total = a + b + c;
return total;
}
As soon as the return statement is met, calculate_bill stops executing and returns the
value total.
A function that returns a value must have a return statement. Forgetting it can ruin a
program. For instance if calculate_bill had read as follows, then the variable bill would
have had no meaningful value assigned to it, and you might have received a warning from
the compiler as well. (The word void below indicates that the function does not return a
value. In ANSI C, you must place it before the name of any such function.)
void calculate_bill (int a, int b, int c)
{
int total;
total = a + b + c;
}
On the other hand, you do not need to actually use a value when a function returns
one. For example, the C input/output functions printf and scanf return values, but the
values are rarely used. See [files] for more information on
these functions.
If we use the first version of the calculate_bill function (the one that contains the
line return total;), the value of the function can simply be discarded. (Of course, the
resulting program is not very useful, since it never displays a value to the user!)
int main()
{
calculate_bill (1, 2, 3);
exit (0);
}
{
int var_to_print;
int main()
{
print_stuff (23, 5);
exit (0);
}
The above program will print the text ‘var_to_print = 115’ and then quit.
Prototypes may seem to be a nuisance, but they overcome a problem intrinsic to compilers,
which is that they compile functions as they come upon them. Without function
prototypes, you usually cannot write code that calls a function before the function itself is
defined in the program. If you place prototypes for your functions in a header file, however,
you can call the functions from any source code file that includes the header. This is one
reason C is considered to be such a flexible programming language.
Some compilers avoid the use of prototypes by making a first pass just to see what
functions are there, and a second pass to do the work, but this takes about twice as long.
Programmers already hate the time compilers take, and do not want to use compilers
that make unnecessary passes on their source code, making prototypes a necessity. Also,
prototypes enable the C compiler to do more rigorous error checking, and that saves an
enormous amount of time and grief.
[Style], page 203.) A return code other than 0 indicates that some sort of error has occurred.
If your code terminates when it encounters an error, use exit, and specify a non-zero return
code.
system of the computer. On a typical 32-bit GNU system, the sizes of the integer types are
as follows.
Type Bits Possible Values
On some computers, the lowest possible value may be 1 less than shown here; for example,
the smallest possible short may be -32,768 rather than -32,767.
The word unsigned, when placed in front of integer types, means that only positive or
zero values can be used in that variable (i.e. it cannot have a minus sign). The advantage
is that larger numbers can then be stored in the same variable. The ANSI standard also
allows the word signed to be placed before an integer, to indicate the opposite of unsigned.
You may find the figures in the right-hand column confusing. They use a form of shorthand
for large numbers. For example, the number 5e2 means 5 × 10^2, or 500. 5e-2 means
5 × 10^-2 (5/100, or 1/20). You can see, therefore, that the float, double, and long double
types can contain some very large and very small numbers indeed. (When you work with
large and small numbers in C, you will use this notation in your code.)
5.2 Declarations
To declare a variable, write the type followed by a list of variables of that type:
type_name variable_name_1, ..., variable_name_n ;
For example:
int last_year, cur_year;
long double earth_mass, mars_mass, venus_mass;
unsigned int num_pets;
5.3 Initialization
Assigning a variable its first value is called initializing the variable. When you declare a
variable in C, you can also initialize it at the same time. This is no more efficient in terms
of a running program than doing it in two stages, but sometimes creates tidier and more
compact code. Consider the following:
int initial_year;
float percent_complete;
initial_year = 1969;
percent_complete = 89.5;
The code above is equivalent to the code below, but the code below is more compact.
int initial_year = 1969;
float percent_complete = 89.5;
You can always write declarations and initializers this way, but you may not always want
to. (See Chapter 22 [Style], page 203.)
exact_length = 3.37;
rough_length = (int) exact_length;
In the example above, the cast operator discards the fractional part of the number when
converting it from a float to an integer, because an integer cannot represent the digits
after the decimal point. Note that C always truncates a number toward zero when converting
it to an integer; it never rounds to the nearest value. For example, both 3.1 and 3.9 are
truncated to 3, and -3.9 is truncated to -3.
The cast operator works the other way around, too:
float exact_length;
int rough_length;
rough_length = 12;
exact_length = (float) rough_length;
In converting large integers to floating point numbers, you may lose some precision, since
the float type guarantees only 6 significant decimal digits, and the double type guarantees
only 10 (although a double on most current systems holds about 15).
It does not always make sense to convert types. (See Chapter 20 [Data structures],
page 183, for examples of types that do not convert to other types well.)
#include <stdio.h>
my_float = 75.345;
my_int = (int) my_float;
my_ch = (int) my_float;
printf ("Convert from float my_float=%f to my_int=%d and my_ch=%c\n",
my_float, my_int, my_ch);
my_int = 69;
my_float = (float) my_int;
my_ch = my_int;
printf ("Convert from int my_int=%d to my_float=%f and my_ch=%c\n",
my_int, my_float, my_ch);
my_ch = '*';
my_int = my_ch;
my_float = (float) my_ch;
printf ("Convert from int my_ch=%c to my_int=%d and my_float=%f\n",
my_ch, my_int, my_float);
exit(0);
}
Here is the sort of output you should expect (floating point values may differ slightly):
Convert from float my_float=75.345001 to my_int=75 and my_ch=K
Convert from int my_int=69 to my_float=69.000000 and my_ch=E
Convert from int my_ch=* to my_int=42 and my_float=42.000000
main.c:

#include <stdio.h>

int main()
{
  extern int my_var;

  my_var = 500;
  print_value();
  exit (0);
}

secondary.c:

#include <stdio.h>

int my_var;

void print_value()
{
  printf("my_var = %d\n", my_var);
}
In this example, the variable my_var is created in the file ‘secondary.c’, assigned a value
in the file ‘main.c’, and printed out in the function print_value, which is defined in the
file ‘secondary.c’, but called from the file ‘main.c’.
See Section 17.4 [Compiling multiple files], page 157, for information on how to compile
a program whose source code is split among multiple files. For this example, you can
simply type the command gcc -o testprog main.c secondary.c, and run the program
with ./testprog.
thereby making code using that variable run faster. These days, most C compilers
(including GCC) are smart enough to optimize the code (make it faster and more
compact) without the register keyword.
• typedef allows you to define your own variable types. See Chapter 19 [More data
types], page 177, for more information.
6 Scope
int global_integer;
float global_floating_point;
int main ()
{
exit (0);
}
2. You can also declare variables immediately following the opening bracket (‘{’) of any
block of code. This area is called local scope, and variables declared here are called
local variables. A local variable is visible within its own block and the ones that block
contains, but invisible outside its own block.
#include <stdio.h>
int main()
{
int foo;
float bar, bas, quux;
exit (0);
}
int main()
{
int b;
{
int c;
exit (0);
}
Local variables are not visible outside their curly brackets. To use an “existence” rather
than a “visibility” metaphor, local variables are created when the opening brace is met, and
they are destroyed when the closing brace is met. (Do not take this too literally; they are
not created and destroyed in your C source code, but internally to the computer, when you
run the program.)
/* SCOPE */
/* */
/***************************************************************/
#include <stdio.h>
int main ()
{
  int my_var = 3;
  {
    int my_var = 5;
    printf ("my_var=%d\n", my_var);
  }
  printf ("my_var=%d\n", my_var);
  exit(0);
}
When you run this example, it will print out the following text:
my_var=5
my_var=3
gnu_count = 45;
gnat_count = 5678;
5 = 2 + 3;
You will receive an error such as the following:
error
bad_example.c:3: invalid lvalue in assignment
You can’t assign a value to 5; it has its own value already! In other words, 5 is not an
lvalue.
7.3 Expressions
An expression is simply a string of operators, variables, numbers, or some combination,
that can be parsed by the compiler. All of the following are expressions:
19
1 + 2 + 3
my_var
my_var + some_function()
32 * circumference / 3.14
day_of_month % 7
Here is an example of some arithmetic expressions in C:
#include <stdio.h>
int main ()
{
int my_int;
my_int = 6;
printf ("my_int = %d, -my_int = %d\n", my_int, -my_int);
return 0;
}
The program above produces the output below:
Arithmetic Operators:
my_int = 6, -my_int = -6
int 1 + 2 = 3
int 5 - 1 = 4
int 5 * 2 = 10
9 div 4 = 2 remainder 1:
int 9 / 4 = 2
int 9 % 4 = 1
double 9 / 4 = 2.250000
Parentheses are classed as operators by the compiler; they have a value, in the sense
that they assume the value of whatever is inside them. For example, the value of (5 + 5)
is 10.
Notice that the ++ and -- operators can be placed before or after the variable. In the
cases above, the two forms work identically, but there is actually a subtle difference. (See
Section 18.1.2 [Postfix and prefix ++ and --], page 168, for more information.)
int main()
{
int my_int;
return 0;
}
The program above produces the output below:
Assignment Operators:
my_int = 10 : 10
my_int++ : 11
my_int += 5 : 16
my_int-- : 15
my_int -= 2 : 13
my_int *= 5 : 65
my_int /= 2 : 32
my_int %= 3 : 2
The second to last line of output is
my_int /= 2 : 32
In this example, 65 divided by 2 using the /= operator results in 32, not 32.5. This is
because both operands, 65 and 2, are integers, type int, and when /= operates on two
integers, it produces an integer result. This example only uses integer values, since that is
how the numbers are declared. To get the fractional answer, you would have had to declare
the three numbers involved as floats.
The last line of output is
my_int %= 3 : 2
This is because 32 divided by 3 is 10 with a remainder of 2.
when evaluating expressions containing comparison operators, but it is easy to define the
strings ‘TRUE’ and ‘FALSE’ as macros, and they may well already be defined in a library
file you are using. (See Chapter 12 [Preprocessor directives], page 67, for information on
defining macros.)
#define TRUE 1
#define FALSE 0
Note that although any non-zero value in C is treated as true, you do not need to worry
about a comparison evaluating to anything other than 1 or 0. Try the following short
program:
#include <stdio.h>
int main ()
{
  int truth, falsehood;
  truth = (2 + 2 == 4);
  falsehood = (2 + 2 == 5);
  printf ("truth is %d\n", truth);
  printf ("falsehood is %d\n", falsehood);
  exit (0);
}
You should receive the following result:
truth is 1
falsehood is 0
7.9.1 Inclusive OR
Note well! Shakespeare might have been disappointed that, whatever the value of a variable
to_be, the result of
to_be || !to_be
(i.e. “To be, or not to be?”) is always 1, or true. This is because one or the other of to_be
or !to_be must always be true, and as long as one side of an OR || expression is true, the
whole expression is true.
8 Parameters
Ways in and out of functions.
Parameters are the main way in C to transfer, or pass, information from function to
function. Consider a call to our old friend calculate_bill:
total = calculate_bill (20, 35, 28);
We are passing 20, 35, and 28 as parameters to calculate_bill so that it can add them
together and return the sum.
When you pass information to a function with parameters, in some cases the information
can go only one way, and the function returns only a single value (such as total in the
above snippet of code). In other cases, the information in the parameters can go both ways;
that is, the function called can alter the information in the parameters it is passed.
The former technique (passing information only one way) is called passing parameters
by value in computer programming jargon, and the latter technique (passing information
both ways) is referred to as passing parameters by reference.
For our purposes, at the moment, there are two (mutually exclusive) kinds of parameters:
• Value parameters are the kind that pass information one-way. They are so-called
because the function to which they are passed receives only a copy of their values,
and they cannot be altered as variable parameters can. The phrase “passing by value”
mentioned above is another way to talk about passing “value parameters”.
• Variable parameters are the kind that pass information back to the calling function.
They are so called because the function to which they are passed can alter them, just as
it can alter an ordinary variable. The phrase “passing by reference” mentioned above
is another way to talk about passing “variable parameters”.
Consider a slightly-expanded version of calculate_bill:
#include <stdio.h>
int main()
{
int bill;
int fred = 25;
int frank = 32;
int franny = 27;
exit (0);
}
}
Note that all of the parameters in this example are value parameters: the information
flows only one way. The values are passed to the function calculate_bill. The original
values are not changed. In slightly different jargon, we are “passing the parameters by value
only”. We are not passing them “by reference”; they are not “variable parameters”.
All parameters must have their types declared. This is true whether they are value
parameters or variable parameters. In the function calculate_bill above, the value
parameters diner1, diner2, and diner3 are all declared to be of type int.
1
That is, unless you are competing in The International Obfuscated C Code Contest.
fred = 20000;
frank = 50000;
franny = 20000;
exit (0);
}
As far as the function calculate_bill is concerned, fred, frank, and franny are still
25, 32, and 27 respectively. Changing their values to extortionate sums after passing them
to calculate_bill does nothing; calculate_bill has already created local copies of the
parameters, called diner1, diner2, and diner3 containing the earlier values.
Important: Even if we named the parameters in the definition of calculate_bill to
match the parameters of the function call in main (see example below), the result would be
the same: main would print out ‘$84.00’, not ‘$90000.00’. When passing data by value,
the parameters in the function call and the parameters in the function definition (which are
only copies of the parameters in the function call) are completely separate.
Just to remind you, this is the calculate_bill function:
int calculate_bill (int fred, int frank, int franny)
{
int total;
int main()
{
int bill;
int fred = 25;
int frank = 32;
int franny = 27;
exit (0);
}
In the function main in the example above, fred, frank, and franny are all actual
parameters when used to call calculate_bill. On the other hand, the corresponding
variables in calculate_bill (namely diner1, diner2 and diner3, respectively) are all
formal parameters because they appear in a function definition.
Although formal parameters are always variables (which does not mean that they are
always variable parameters), actual parameters do not have to be variables. You can use
numbers, expressions, or even function calls as actual parameters. Here are some examples
of valid actual parameters in the function call to calculate_bill:
bill = calculate_bill (25, 32, 27);
(The last example requires the inclusion of the math routines in ‘math.h’, and compilation
with the ‘-lm’ option. sqrt is the square-root function and returns a double, so it must be
cast into an int to be passed to calculate_bill.)
Unfortunately, the use of ‘stdarg.h’ is beyond the scope of this tutorial. For more
information on variadic functions, see the GNU C Library manual.
9 Pointers
Making maps of data.
In one sense, any variable in C is just a convenient label for a chunk of the computer’s
memory that contains the variable’s data. A pointer, then, is a special kind of variable that
contains the location or address of that chunk of memory. (Pointers are so called because
they point to a chunk of memory.) The address contained by a pointer is a lengthy number
that enables you to pinpoint exactly where in the computer’s memory the variable resides.
Pointers are one of the more versatile features of C. There are many good reasons to use
them. Knowing a variable’s address in memory enables you to pass the variable to a function
by reference. (See Section 9.4 [Variable parameters], page 47.) [1] Also, since functions are
just chunks of code in the computer’s memory, and each of them has its own address, you
can create pointers to functions too. Knowing a function’s address in memory enables you
to pass functions as parameters, giving your functions the ability to switch among calling
numerous functions. (See [Function pointers], page 248.)
Pointers are important when using text strings. In C, a text string is always accessed
with a pointer to a character — the first character of the text string. For example, the
following code will print the text string ‘Boy howdy!’:
char *greeting = "Boy howdy!";
printf ("%s\n\n", greeting);
See Chapter 15 [Strings], page 97.
Pointers are important for more advanced types of data as well. For example, there
is a data structure called a “linked list” that uses pointers to “glue” the items in the list
together. (See Chapter 20 [Data structures], page 183, for information on linked lists.)
Another use for pointers stems from functions like the C input routine scanf. This
function accepts information from the keyboard, just as printf sends output to the console.
However, scanf uses pointers to variables, not variables themselves. For example, the
following code reads an integer from the keyboard:
int my_integer;
scanf ("%d", &my_integer);
(See Section 16.2.9.1 [scanf], page 128, for more information.)
total_cost_ptr = &total_cost;
The ‘*’ symbol in the declaration of total_cost_ptr is the way to declare that variable
to be a pointer in C. (The ‘_ptr’ at the end of the variable name, on the other hand, is just
a way of reminding humans that the variable is a pointer.)
1
This, by the way, is how the phrase “pass by reference” entered the jargon. Like other pointers, a variable
parameter “makes a reference” to the address of a variable.
When you read C code to yourself, it is often useful to be able to pronounce C’s operators
aloud; you will find it can help you make sense of a difficult piece of code. For example,
you can pronounce the above statement float *total_cost_ptr as “Declare a float pointer
called total_cost_ptr”, and you can pronounce the statement
total_cost_ptr = &total_cost; as “Let total_cost_ptr take as its value the address of the
variable total_cost”.
Here are some suggestions for pronouncing the * and & operators, which are always
written in front of a variable:
* “The contents of the address held in variable” or “the contents of the location
pointed to by variable”.
& “The address of variable” or “the address at which the variable variable is
stored”.
For instance:
&fred “The address of fred” or “the address at which the variable fred is stored”.
*fred_ptr
“The contents of the address held in fred_ptr” or “the contents of the location
pointed to by fred_ptr”.
The following examples show some common ways in which you might use the * and &
operators:
int some_var; /* 1 */
“Declare an integer variable called some_var.”
int *ptr_to_some_var; /* 2 */
“Declare an integer pointer called ptr_to_some_var.” (The
* in front of ptr_to_some_var is the way C declares
ptr_to_some_var as a pointer to an integer, rather than just an
integer.)
some_var = 42; /* 3 */
“Let some_var take the value 42.”
ptr_to_some_var = &some_var; /* 4 */
“Let ptr_to_some_var take the address of the variable
some_var as its value.” (Notice that only now does
ptr_to_some_var become a pointer to the particular variable
some_var — before this, it was merely a pointer that could
point to any integer variable.)
char *my_character_ptr;
float *my_float_ptr;
double *my_double_ptr;
However, C is fairly lenient about converting the values that different types of pointer
point to: it does so implicitly, or automatically, without your intervention. For example,
the following code will simply truncate the value of *float_ptr and print out 23. (As a
bonus, pronunciation is given for every significant line of the code in this example.)
#include <stdio.h>
/* Include the standard input/output header in this program */
int main()
/* Declare a function called main that returns an integer
   and takes no parameters */
{
  int *integer_ptr;
  /* Declare an integer pointer called integer_ptr */
  float *float_ptr;
  /* Declare a floating-point pointer called float_ptr */
  int my_int = 17;
  /* Declare an integer variable called my_int and let it take
     the value 17 (an illustrative starting value) */
  float my_float = 23.5;
  /* Declare a floating-point variable called my_float and let it take
     the value 23.5 (illustrative; any value from 23 up to, but not
     including, 24 gives the same output) */
  integer_ptr = &my_int;
  /* Assign the address of the integer variable my_int
     to the integer pointer variable integer_ptr */
  float_ptr = &my_float;
  /* Assign the address of the floating-point variable my_float
     to the floating-point pointer variable float_ptr */
  *integer_ptr = *float_ptr;
  /* Assign the contents of the location pointed to by
     the floating-point pointer variable float_ptr
     to the location pointed to by the integer pointer variable
     integer_ptr (the value assigned will be truncated) */
  printf ("%d\n\n", *integer_ptr);
  /* Print the contents of the location pointed to by
     the integer pointer variable integer_ptr */
  return 0;
  /* Return a value of 0, indicating successful execution,
     to the operating system */
}
There will still be times when you will want to convert one type of pointer into another.
For example, GCC will give a warning if you try to pass float pointers to a function that
accepts integer pointers. Not treating pointer types interchangeably will also help you
understand your own code better.
To convert pointer types, use the cast operator. (See Section 5.4 [The cast operator],
page 20.) As you know, the general form of the cast operator is as follows:
(type ) variable
This copies the value of the pointer my_integer to the pointer my_long_ptr. The cast
operator ensures that the data types match. (See Chapter 20 [Data structures], page 183,
for more details on pointer casting.)
First, the program allocates space for a pointer to an integer. Initially, the space will contain
garbage (random data). It will not contain actual data until the pointer is “pointed at”
such data. To cause the pointer to refer to a real variable, you need another statement,
such as the following:
my_int_ptr = &my_int;
On the other hand, if you use just the single initial assignment, int *my_int_ptr = 2;,
the value 2 is stored in the pointer itself, not in the memory it points to, so my_int_ptr
ends up pointing at memory address 2. That address almost certainly does not belong to
your program, so any later attempt to read or write *my_int_ptr will probably cause the
program to crash. (A pointer that has never been assigned is just as dangerous. Since it is
filled with garbage, it can hold any address, so a value stored through it might land
anywhere, and if it overwrites something important, it may cause the program to crash.)
The compiler will warn you against this. Heed the warning!
int main();
void get_values (int *, int *);
int main()
{
int num1, num2;
get_values (&num1, &num2);
return 0;
}
void get_values (int *num_ptr1, int *num_ptr2)
{
*num_ptr1 = 10;
*num_ptr2 = 20;
}
Think carefully for a moment about what is happening in these fragments of code. The
variables num1 and num2 in main are ordinary integers, so when main prefixes them with
ampersands (&) while passing them to get_values, it is really passing integer pointers.
Remember, &num1 should be read as “the address of the variable num1”.
The code reads like this:
get_values (&num1, &num2);
int main();
void scale_dimensions (int *, int *);
int main()
{
int height,width;
height = 4;
width = 5;
return 0;
}
10 Decisions
Testing and Branching. Making conditions.
Until now, our code examples have been linear: control has flowed in one direction from
start to finish. In this chapter, we will examine ways to enable code to make decisions
and to choose among options. You will learn how to program code that will function in
situations similar to the following:
• If the user hits the jackpot, print a message to say so: ‘You’ve won!’
• If a bank balance is positive, then print ‘C’ for “credit”; otherwise, print ‘D’ for “debit”.
• If the user has typed in one of five choices, then do something that corresponds to the
choice, otherwise display an error message.
In the first case there is a simple “do or don’t” choice. In the second case, there are two
choices. The final case contains several possibilities.
C offers four main ways of coding decisions like the ones above. They are listed below.
if...
if (condition )
{
do something
}
if...else...
if (condition )
{
do something
}
else
{
do something else
}
...?...:...
(condition ) ? do something : do something else ;
switch
switch (condition )
{
case first case : do first thing
case second case : do second thing
case third case : do third thing
}
10.1 if
The first form of the if statement is an all-or-nothing choice: if some condition is satisfied,
do something; otherwise, do nothing. For example:
if (condition ) statement ;
or
if (condition )
{
compound statement
}
if (my_num > 0)
{
printf ("The number is positive.\n");
}
if (my_num < 0)
{
printf ("The number is negative.\n");
}
The same code could be written more compactly in the following way:
if (my_num == 0) printf ("The number is zero.\n");
if (my_num > 0) printf ("The number is positive.\n");
if (my_num < 0) printf ("The number is negative.\n");
It is often a good idea stylistically to use curly brackets in an if statement. It is no
less efficient from the compiler’s viewpoint, and sometimes you will want to include more
statements later. It also makes if statements stand out clearly in the code. However, curly
brackets make no sense for short statements such as the following:
if (my_num == 0) my_num++;
The if command by itself permits only limited decisions. With the addition of else in
the next section, however, if becomes much more flexible.
if (my_num > 0)
{
printf ("The number is positive.");
}
else
{
printf ("The number is zero or negative.");
}
It is not necessary to test my_num in the second block because that block is not executed
unless my_num is not greater than zero.
if (my_num > 2)
{
if (my_num < 4)
{
printf ("my_num is three");
}
}
Both of these code examples have the same result, but they arrive at it in different ways.
The first example, when translated into English, might read, “If my_num is greater than
two and my_num is less than four (and my_num is an integer), then my_num has to be three.”
The second method is more complicated. In English, it can be read, “If my_num is greater
than two, do what is in the first code block. Inside it, my_num is always greater than two;
otherwise the program would never have arrived there. Now, if my_num is also less than
four, then do what is inside the second code block. Inside that block, my_num is always less
than four. We also know it is more than two, since the whole of the second test happens
inside the block where that’s true. So, assuming my_num is an integer, it must be three.”
In short, there are two ways of making compound decisions in C. You make nested tests,
or you can use the comparison operators &&, ||, and so on. In situations where sequences of
comparison operators become too complex, nested tests are often a more attractive option.
Consider the following example:
if (i > 2)
{
/* i is greater than 2 here! */
}
else
{
/* i is less than or equal to 2 here! */
}
The code blocks in this example provide “safe zones” wherein you can rest assured that
certain conditions hold. This enables you to think and code in a structured way.
You can nest if statements in multiple levels, as in the following example:
#include <stdio.h>
int main ()
{
int grade;
int main()
{
int foo = 10;
int bar = 50;
int bas;
return 0;
}
The program will print ‘bas = 50’ as a result.
int main ()
{
int digit;
{
printf ("The Morse code of that digit is ");
morse (digit);
}
return 0;
}
#include <stdio.h>

int yes();

int main ()
{
  printf ("Will you join the Free Software movement? ");
  if (yes())
    {
      printf("Great! The price of freedom is eternal vigilance!\n\n");
    }
  else
    {
      printf("Too bad. Maybe next life...\n\n");
    }
  return 0;
}

int yes()
{
  switch (getchar())
    {
    case 'y' :
    case 'Y' : return 1;
    default : return 0;
    }
}
If the character is ‘y’, then the program falls through and meets the statement return 1. If
there were a break statement after case 'y', then the program would not be able to reach
case 'Y' unless an actual ‘Y’ were typed.
Note: The return statements substitute for break in the above code, but they do more
than break out of switch — they break out of the whole function. This can be a useful
trick.
11 Loops
Controlling repetitive processes. Nesting loops
Loops are a kind of C construct that enables the programmer to execute a sequence of
instructions over and over, with some condition specifying when they will stop. There are
three kinds of loop in C:
• while
• do...while
• for
11.1 while
The simplest of the three is the while loop. It looks like this:
while (condition )
{
do something
}
The condition (for example, (a > b)) is evaluated every time the loop is executed. If
the condition is true, then statements in the curly brackets are executed. If the condition
is false, then those statements are ignored, and the while loop ends. The program then
executes the next statement in the program.
The condition comes at the start of the loop, so it is tested at the start of every pass,
or time through the loop. If the condition is false before the loop has been executed even
once, then the statements inside the curly brackets will never be executed. (See Section 11.2
[do...while], page 60, for an example of a loop construction where this is not true.)
The following example prompts the user to type in a line of text, and then counts all the
spaces in the line. The loop terminates when the user hits the <RET> key, and the program then
prints out the number of spaces. (See Section 16.3.1 [getchar], page 130, for more information on
the standard library getchar function.)
#include <stdio.h>
int main()
{
char ch;
int count = 0;
#include <stdio.h>

int main();
void get_substring();

int main()
{
  int ch;   /* int rather than char, so that EOF can be detected */

  do
    {
      ch = getchar();
      if (ch == '"')
        {
          putchar(ch);
          get_substring();
        }
    }
  while (ch != '\n' && ch != EOF);
  return 0;
}

void get_substring()
{
  int ch;

  do
    {
      ch = getchar();
      if (ch == EOF)
        {
          break;   /* input ended before the string was closed */
        }
      putchar(ch);
      if (ch == '\n')
        {
          printf ("\nString was not closed ");
          printf ("before end of line.\n");
          break;
        }
    }
  while (ch != '"');
  printf ("\n\n");
}
11.3 for
The most complex loop in C is the for loop. The for construct, as it was developed
in earlier computer languages such as BASIC and Pascal, was intended to behave in the
following way:
For all values of variable from value1 to value2, in steps of value3, repeat the
following sequence of commands. . .
The for loop in C is much more versatile than its counterpart in those earlier languages.
The for loop looks like this in C:
for (initialization ; condition ; increment )
{
do something ;
}
In normal usage, these expressions have the following significance.
• initialization
This is an expression that initializes the control variable, or the variable tested in
the condition part of the for statement. (Sometimes this variable is called the loop’s
index.) The initialization part is only carried out once before the start of the loop.
Example: index = 1.
• condition
This is a conditional expression that is tested every time through the loop, just as in a
while loop. It is evaluated at the beginning of every loop, and the loop is only executed
if the expression is true. Example: index <= 20.
• increment
This is an expression that is used to alter the value of the control variable. In earlier
languages, this usually meant adding or subtracting 1 from the variable. In C, it can
be almost anything. Examples: index++, index *= 20, or index /= 2.3.
For example, the following for loop prints out the integers from 1 to 10:
int my_int;
The following example prints out all prime numbers between 1 and the macro value
MAX_INT. (A prime number is a number that cannot be divided by any number except 1
and itself without leaving a remainder.) This program checks whether a number is a prime
by dividing it by all smaller integers up to half its size. (See Chapter 12 [Preprocessor
directives], page 67, for more information on macros.)
#include <stdio.h>
int main ()
{
int poss_prime;
return (TRUE);
}
int main()
{
int up, down;
return 0;
}
#include <stdio.h>
int main()
{
printf ("%d\n\n", returner(5, 10));
printf ("%d\n\n", returner(5, 5000));
return 0;
}
foo++;
}
return foo;
}
The function returner contains a while loop that increments the variable foo and tests it
against a value of 1000. However, if at any point the value of foo exceeds the value of the
variable bar, the function will exit the loop, immediately returning the value of foo to the
calling function. Otherwise, when foo reaches 1000, the function will increment foo one
more time and return it to main.
Because of the values it passes to returner, the main function will first print a value of
11, then 1001. Can you see why?
Here is an example that uses the continue statement to avoid division by zero (which
causes a run-time error):
for (my_int = -10; my_int <= 10; my_int++)
{
if (my_int == 0)
{
continue;
}
#define SIZE 5
int main()
{
int square_y, square_x;
printf ("\n");
printf ("\n");
return 0;
}
The output of the above code looks like this:
*****
*****
*****
*****
*****
12 Preprocessor directives
#include <stdio.h>
This directive tells the preprocessor to include the file ‘stdio.h’; in other words, to treat
it as though it were part of the program text.
A file to be included may itself contain #include directives, thus encompassing other
files. When this happens, the included files are said to be nested.
#else This is part of an #if preprocessor statement and works in the same way with
#if that the regular C else does with the regular if.
#error This forces the compiler to abort. Also intended for debugging.
Below is an example of conditional compilation. The following code displays ‘23’ to the
screen.
#include <stdio.h>
int my_int = 0;
int main ()
{
set_my_int();
printf("%d\n", my_int);
return 0;
}
12.2 Macros
Macros can make long, ungainly pieces of code into short words. The simplest use of macros
is to give constant values meaningful names. For example:
#define MY_PHONE 5551234
This allows the programmer to use the word MY_PHONE to mean the number 5551234.
In this case, the word is longer than the number, but it is more meaningful and makes a
program read more naturally. It can also be centralised in a header file, where it is easily
changed; this eliminates tedious search-and-replace procedures on code if the value appears
frequently in the code. It has been said, with some humorous exaggeration, that the only
values that should appear “naked” in C code instead of as macros or variables are 1 and 0.
The difference between defining a macro for 5551234 called MY_PHONE and declaring a
long integer variable called my_phone with the same value is that the variable my_phone
has the value 5551234 only provisionally; it can be incremented with the statement
my_phone++;, for example. In some sense, however, the macro MY_PHONE is that value, and
only that value — the C preprocessor simply searches through the C code before it is
compiled and replaces every instance of MY_PHONE with 5551234. Issuing the command
MY_PHONE++; is no more or less sensible than issuing the command 5551234++;.
Any piece of C code can be made into a macro. Macros are not merely constants referred
to at compile time, but are strings that are physically replaced with their values by the
preprocessor before compilation. For example:
#define SUM 1 + 2 + 3 + 4
would allow SUM to be used instead of 1 + 2 + 3 + 4. Usually, this would equal 10, so that in
the statement example1 = SUM + 10;, the variable example1 equals 20. Sometimes, though,
this macro will be evaluated differently; for instance, in the statement example2 = SUM * 10;,
the variable example2 equals 46, instead of 100, as you might think. Can you figure
out why? Hint: it has to do with the order of operations.
The quotation marks in the following macro allow the string to be called by the identifier
SONG instead of typing it out over and over. Because the text ‘99 bottles of beer on the
wall...’ is enclosed by double quotation marks, it will never be interpreted as C code.
#define SONG "99 bottles of beer on the wall..."
A macro definition must fit on a single line (unless every line but the last ends with a
backslash), but a macro can be used anywhere except inside strings. (Anything enclosed in
string quotes is left untouched by the preprocessor.)
Some macros are defined already in the file ‘stdio.h’, for example, NULL (the value 0).
There are a few more directives for macro definition besides #define:
#ifdef This is a kind of #if that is followed by a macro name. If that macro is defined
then this directive is true. #ifdef works with #else in the same way that #if
does.
#ifndef This is the opposite of #ifdef. It is also followed by a macro name. If that
name is not defined then this is true. It also works with #else.
Here is a code example using some macro definition directives from this section, and
some conditional compilation directives from the last section as well.
#include <stdio.h>
int my_int = 0;
#undef CHOICE
#ifdef CHOICE
void set_my_int()
{
my_int = 23;
}
#else
void set_my_int()
{
my_int = 17;
}
#endif
int main ()
{
set_my_int();
printf("%d\n", my_int);
return 0;
}
This macro uses the ?...:... command to return a positive number no matter what value
is assigned to my_val — if my_val is defined as a positive number, the macro returns the
same number, and if my_val is defined as a negative number, the macro returns its negative
(which will be positive). (See Chapter 10 [Decisions], page 51, for more information on
the ?...:... structure.) If you write ABS(-4), then the preprocessor will substitute -4 for
my_val; if you write ABS(i), then the preprocessor will substitute i for my_val, and so on.
Macros can take more than one parameter, as in the code example below.
One caveat: macros are substituted whole wherever they are used in a program, which
is potentially a huge amount of code repetition. The advantage of a macro over an actual
function, however, is speed. No time is taken up in passing control to a new function,
because control never leaves the home function; the macro just makes the function a bit
longer.
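A second caveat: take care when using function calls as macro parameters. The preprocessor
pastes the parameter text into the macro body in every place it appears, so the function may
be called more than once. In the following code, cos may be evaluated twice: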
ABS (cos(36))
int main ()
{
printf (STRING1);
printf (STRING2);
printf ("%d\n", EXPRESSION1);
printf ("%d\n", EXPRESSION2);
printf ("%d\n", ABS(-5));
printf ("Biggest of 1, 2, and 3 is %d\n", BIGGEST(1,2,3));
return 0;
}
A macro definition
must be all on one line!
10
20
5
Biggest of 1, 2, and 3 is 3
#ifndef BITS_PER_LONG
#ifdef _LP64
#define BITS_PER_LONG 64
#else
#define BITS_PER_LONG 32
#endif
#endif
In the middle of this set of macros, from ‘config.h’, the Emacs programmer used
the characters ‘/*’ and ‘*/’ to create an ordinary C comment. C comments can be
interspersed with macros freely.
The macro BITS_PER_INT is defined here to be 32 (but only if it is not already defined,
thanks to the #ifndef directive). The Emacs code will then treat integers as having
32 bits. (See Section 5.1 [Integer variables], page 17.)
The second chunk of macro code in this example checks to see whether BITS_PER_LONG
is defined. If it is not, but _LP64 is defined, it defines BITS_PER_LONG to be 64, so
that all long integers will be treated as having 64 bits. (_LP64 is a GCC macro that is
defined on 64-bit systems. It stands for “longs and pointers are 64 bits”.) If _LP64 is
not present, the code assumes it is on a 32-bit system and defines BITS_PER_LONG to
be 32.
• ‘emacs/src/lisp.h’
This set of macros, from ‘lisp.h’, again checks to see whether _LP64 is defined. If it is,
it defines EMACS_INT as long (if it is not already defined), and BITS_PER_EMACS_INT
to be the same as BITS_PER_LONG, which was defined in ‘config.h’, above. It then
defines EMACS_UINT to be an unsigned long, if it is not already defined.
If _LP64 is not defined, it is assumed we are on a 32-bit system. EMACS_INT is defined
to be an int if it is not already defined, and EMACS_UINT is defined to be an unsigned
int if it is not already defined.
Again, note that the programmer has freely interspersed a comment with the
preprocessor code.
• ‘emacs/src/lisp.h’
/* These values are overridden by the m- file on some machines. */
#ifndef VALBITS
#define VALBITS (BITS_PER_EMACS_INT - 4)
#endif
Here is another example from ‘lisp.h’. The macro VALBITS, which defines another
size of integer internal to Emacs, is defined as four less than BITS_PER_EMACS_INT —
that is, 60 on 64-bit systems, and 28 on 32-bit systems.
• ‘emacs/src/lisp.h’
#ifndef XINT /* Some machines need to do this differently. */
#define XINT(a) ((EMACS_INT) (((a) << (BITS_PER_EMACS_INT - VALBITS)) \
>> (BITS_PER_EMACS_INT - VALBITS)))
#endif
The interesting feature of the XINT macro above is not only that it is a function, but
that it is broken across multiple lines with the backslash character (‘\’). The
preprocessor simply deletes the backslash together with the line break that follows it,
joining the next line onto the current one. In this way, it is possible to treat
long, multi-line macros as though they are actually on a single line. (See Chapter 18
[Advanced operators], page 167, for more information on the advanced operators
<< and >>.)
12.4 Questions
1. Define a macro called BIRTHDAY which equals the day of the month upon which your
birthday falls.
2. Write an instruction to the preprocessor to include the math library ‘math.h’.
3. A macro is always a number. True or false?
13 Libraries
Plug-in C expansions. Header files.
The core of the C language is small and simple, but special functionality is provided in
the form of external libraries of ready-made functions. Standardized libraries make C code
extremely portable, or easy to compile on many different computers.
Libraries are files of ready-compiled code that the compiler merges, or links, with a C
program during compilation. For example, there are libraries of mathematical functions,
string handling functions, and input/output functions. Indeed, most of the facilities C offers
are provided as libraries.
Some libraries are provided for you. You can also make your own, but to do so, you
will need to know how GNU builds libraries. We will discuss that later. (See Section 17.6
[Building a library], page 163.)
Most C programs include at least one library. You need to ensure both that the library
is linked to your program and that its header files are included in your program.
The standard C library, or ‘glibc’, is linked automatically with every program, but
header files are never included automatically, so you must always include them yourself.
Thus, you must always include ‘stdio.h’ in your program if you intend to use the standard
input/output features of C, even though ‘glibc’, which contains the input/output routines,
is linked automatically.
Other libraries, however, are not linked automatically. You must link them to your
program yourself. For example, to link the math library ‘libm.so’, type
gcc -o program_name program_name.c -lm
The command-line option to link ‘libm.so’ is simply ‘-lm’, without the ‘lib’ or the
‘.so’, or in the case of static libraries, ‘.a’. (See Section 13.2 [Kinds of library], page 77.)
The ‘-l’ option was created because the average GNU system already has many libraries,
and more can be added at any time. This means that sometimes two libraries provide
alternate definitions of the same function. With judicious use of the ‘-l’ option, however,
you can usually clarify to the compiler which definition of the function should be used.
Libraries specified earlier on the command line take precedence over those defined later,
and code from later libraries is only linked in if it matches a reference (function definition,
macro, global variable, etc.) that is still undefined. (See Section 17.4 [Compiling multiple
files], page 157, for more information.)
In summary, you must always do two things:
• link the library with a ‘-l’ option to gcc (a step that may be skipped in the case of
‘glibc’).
• include the library header files (a step you must always follow, even for ‘glibc’).
the names of functions or macros in the header file to mean anything other than what the
library specifies, in any source code file that includes the header file.
The most commonly used header file is for the standard input/output routines in ‘glibc’
and is called ‘stdio.h’. This and other header files are included with the #include
command at the top of a source code file. For example,
#include "name.h"
includes a header file from the current directory (the directory in which your C source code
file appears), and
#include <name.h>
includes a file from a system directory — a standard GNU directory like ‘/usr/include’.
(The #include command is actually a preprocessor directive, or instruction to a program
used by the C compiler to simplify C code. See Chapter 12 [Preprocessor directives],
page 67, for more information.)
Here is an example that uses the #include directive to include the standard ‘stdio.h’
header in order to print a greeting on the screen with the printf command. (The characters
‘\n’ cause printf to move the cursor to the next line.)
#include <stdio.h>
int main ()
{
printf ("C standard I/O file is included.\n");
printf ("Hello world!\n");
return 0;
}
If you save this code in a file called ‘hello.c’, you can compile this program with the
following command:
gcc -o hello hello.c
As mentioned earlier, you can use some library functions without having to link library
files explicitly, since every program is always linked with the standard C library. This
is called ‘libc’ on older operating systems such as Unix, but ‘glibc’ (“GNU libc”) on
GNU systems. The ‘glibc’ file includes standard functions for input/output, date and
time calculation, string manipulation, memory allocation, mathematics, and other language
features.
Most of the standard ‘glibc’ functions can be incorporated into your program just by
using the #include directive to include the proper header files. For example, since ‘glibc’
includes the standard input/output routines, all you need to do to be able to call printf
is put the line #include <stdio.h> at the beginning of your program, as in the example
that follows.
Note that ‘stdio.h’ is just one of the many header files you will eventually use to access
‘glibc’. The GNU C library is automatically linked with every C program, but you will
eventually need a variety of header files to access it. These header files are not included in
your code automatically — you must include them yourself!
#include <stdio.h>
#include <math.h>
int main ()
{
double x = 0.5, y;   /* x needs a value before sin uses it; 0.5 is arbitrary */
y = sin (x);
printf ("Math library ready\n");
return 0;
}
However, programs that use a special function outside of ‘glibc’ (including
mathematical functions that are nominally part of ‘glibc’, such as the function sin in the
example above!) must use the ‘-l’ option to gcc in order to link the appropriate libraries. If
you saved this code above in a file called ‘math.c’, you could compile it with the following
command:
gcc -o math math.c -lm
The option ‘-lm’ links in the library ‘libm.so’, which is where the mathematics routines
are actually located on a GNU system.
To learn which header files you must include in your program in order to use the features
of ‘glibc’ that interest you, consult section “Table of Contents” in The GNU C Library
Reference Manual. This document lists all the functions, data types, and so on contained
in ‘glibc’, arranged by topic and header file. (See Section 13.3 [Common library functions],
page 78, for a partial list of these header files.)
Note: Strictly speaking, you need not always use a system header file to access the
functions in a library. It is possible to write your own declarations that mimic the ones in
the standard header files. You might want to do this if the standard header files are too
large, for example. In practice, however, this rarely happens, and this technique is better
left to advanced C programmers; using the header files that came with your GNU system
is a more reliable way to access libraries.
that you need into your executable, so that it will run on systems that don’t have those
libraries. (It is also sometimes easier to debug a program that is linked to static libraries
than one linked to shared libraries. See Section 23.5 [Introduction to GDB], page 213, for
more information.)
The file name for a library always starts with ‘lib’ and ends with either ‘.a’ (if it is
static) or ‘.so’ (if it is shared). For example, ‘libm.a’ is the static version of the C math
library, and ‘libm.so’ is the shared version. As explained above, you must use the ‘-l’
option with the name of a library, minus its ‘lib’ prefix and ‘.a’ or ‘.so’ suffix, to link that
library to your program (except the library ‘glibc’, which is always linked). For example,
the following shell command creates an executable program called ‘math’ from the source
code file ‘math.c’ and the library ‘libm.so’.
gcc -o math math.c -lm
The shared version of the library is always linked by default. If you want to link the static
version of the library, you must use the GCC option ‘--static’. The following example
links ‘libm.a’ instead of ‘libm.so’.
gcc -o math math.c -lm --static
Type ‘info gcc’ at your shell prompt for more information about GCC options.
1
The version of ‘ctype.h’ in the ‘/usr/include’ directory proper is the one that comes with ‘glibc’;
the one in ‘/usr/include/linux’ is a special version associated with the Linux kernel. You can
specify the one you want with a full pathname inside double quotes (for example, #include
"/usr/include/linux/ctype.h"), or you can use the ‘-I’ option of gcc to force GCC to search a set of
directories in a specific order. See Section 17.6 [Building a library], page 163, for more information.
isalnum Returns true if and only if the parameter is alphanumeric: that is, an alphabetic
character (see isalpha) or a digit (see isdigit).
isalpha Returns true if and only if the parameter is alphabetic. An alphabetic character
is any character from ‘A’ through ‘Z’ or ‘a’ through ‘z’.
isascii Returns true if and only if the parameter is a valid ASCII character: that is, it
has an integer value in the range 0 through 127. (Remember, the char type in
C is actually a kind of integer!)
iscntrl Returns true if and only if the parameter is a control character. Control
characters vary from system to system, but in ASCII they are the characters in the
range 0 to 31, plus character 127 (delete).
isdigit Returns true if and only if the parameter is a digit in the range 0 through 9.
isgraph Returns true if and only if the parameter is graphic: that is, if the charac-
ter is either alphanumeric (see isalnum) or punctuation (see ispunct). All
graphical characters are valid ASCII characters, but ASCII also includes non-
graphical characters such as control characters (see iscntrl) and whitespace
(see isspace).
islower Returns true if and only if the parameter is a lower-case alphabetic character
(see isalpha).
isprint Returns true if and only if the parameter is a printable character: that is, the
character is either graphical (see isgraph) or a space character.
ispunct Returns true if and only if the parameter is a punctuation character.
isspace Returns true if and only if the parameter is a whitespace character. What
is defined as whitespace can vary with the locale, but in the standard “C” locale it
consists of the space, tab, newline, vertical tab, form feed, and carriage return
characters.
isupper Returns true if and only if the parameter is an upper-case alphabetic character
(see isalpha).
isxdigit Returns true if and only if the parameter is a valid hexadecimal digit: that is,
a decimal digit (see isdigit), or a letter from ‘a’ through ‘f’ or ‘A’ through ‘F’.
toascii Returns the parameter stripped of its eighth bit, so that it has an integer value
from 0 through 127 and is therefore a valid ASCII character. (See isascii.)
tolower Converts a character into its lower-case counterpart. Does not affect characters
which are already in lower case.
toupper Converts a character into its upper-case counterpart. Does not affect characters
which are already in upper case.
/********************************************************/
/* */
/* Demonstration of character utility functions */
/* */
/********************************************************/
#include <stdio.h>
#include <ctype.h>
int main ()
{
char ch;
printf ("\n\n");
return 0;
}
The output of the above code example is as follows:
VALID CHARACTERS FROM isgraph:
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j
k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
ceil Returns the ceiling for the parameter: that is, the smallest integer that is
not less than it. In effect, rounds the parameter up.
cos Returns the cosine of the parameter. (The parameter is assumed to be specified
in radians.)
cosh Returns the hyperbolic cosine of the parameter.
exp Returns the exponential function of the parameter (i.e. e to the power of the
parameter).
fabs Returns the absolute value of the floating-point parameter. (For integers, see
the function abs, which is declared in ‘stdlib.h’.)
floor Returns the floor for the parameter: that is, the largest integer that is not
greater than it. In effect, rounds the parameter down. For negative values this
differs from truncation: the floor of -2.5 is -3, not -2.
log Returns the natural logarithm of the parameter. The parameter must be
greater than zero.
log10 Returns the base 10 logarithm of the parameter. The parameter must be
greater than zero.
pow Returns the first parameter raised to the power of the second.
result = pow (x,y); /*raise x to the power y */
result = pow (x,2); /* square x */
sin Returns the sine of the parameter. (The parameter is assumed to be specified
in radians.)
sinh Returns the hyperbolic sine of the parameter. (Pronounced “shine” or “sinch”.)
sqrt Returns the non-negative square root of the parameter, which must not be negative.
tan Returns the tangent of the parameter. (The parameter is assumed to be specified
in radians.)
tanh Returns the hyperbolic tangent of the parameter.
Here is a code example that uses a few of the math library routines listed above.
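#include <stdio.h>
#include <math.h>
int main()
{
  double my_pi;
  my_pi = 4 * atan(1.0);
  /* Print the value of pi we just calculated, to 32 decimal places. */
  printf ("my_pi = %.32f\n", my_pi);
  /* Print the value of pi from the math library, to 32 decimal places. */
  printf ("M_PI = %.32f\n", M_PI);
  return 0;
}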
If you save the above example as ‘pi.c’, you will have to enter a command such as the
one below to compile it.
gcc pi.c -o pi -lm
When you compile and run the code example, it should print the following results:
my_pi = 3.14159265358979311599796346854419
M_PI = 3.14159265358979311599796346854419
14 Arrays
Rows and tables of storage.
Suppose you have a long list of numbers, but you don’t want to assign them to variables
individually. For example, you are writing a simple program for a restaurant to keep a
list of the amount each diner has on his or her tab, but you don’t want to go through the
tedium of writing a list like the following:
alfies_tab = 88.33;
bettys_tab = 17.23;
charlies_tab = 55.55;
etc.
A list like that could run to hundreds or thousands of entries, and for each diner you’d
have to write “special-case” code referring to every diner’s data individually. No, what you
really want is a single table in which you can find the tab corresponding to a particular
diner. You can then look up the tab of the diner with dining club card number 7712 in row
number 7712 of the table much more easily.
This is why arrays were invented. Arrays are a convenient way to group many variables
under a single variable name. They are like pigeonholes, with each compartment storing a
single value. Arrays can be one-dimensional like a list, two-dimensional like a chessboard,
or three-dimensional like an apartment building — in fact, they can have any arbitrary
dimensionality, including ones humans cannot visualise easily.
An array is defined using square brackets [...]. For example: an array of three integers
called my_list would be declared thus:
int my_list[3];
This statement would cause space for three adjacent integers to be created in memory, as
in the diagram below. Notice that there is no space between the name of the array above
(my_list) and the opening square bracket ‘[’.
------------------------------------
my_list: | | | |
------------------------------------
The number in the square brackets of the declaration is referred to as the subscript of the
array, and it must be a positive integer. (GNU C also accepts a subscript of zero as an
extension.)
The three integer “pigeonholes” in the above array are called its locations, and the values
filling them are called the array’s elements. The position of an element in the array is called
its index (the plural is indices). In the following example, 5, 17, and 23 are the array’s
elements, and 0, 1, and 2 are its corresponding indices.
Notice also that although we are creating space for three integers, arrays in C are zero-
based, so the indices of the array run (0, 1, 2). If arrays in C were one-based, the indices
would run (1, 2, 3).
int my_list[3];
my_list[0] = 5;
my_list[1] = 17;
my_list[2] = 23;
The above example would result in an array that “looks like” the following diagram. (Of
course, an array is merely an arrangement of bytes in the computer’s memory, so it does
not look like much of anything, literally speaking.)
index: 0 1 2
------------------------------------
my_list: | 5 | 17 | 23 |
------------------------------------
Note that every element in an array must be of the same type, for example, integer. It is not
possible in C to have arrays that contain multiple data types. However, if you want an array
with multiple data types, you might instead be able to use multiple arrays of different data
types that contain the same number of elements. For example, to continue our restaurant
tab example above, one array, diner_names, might contain a list of the names of the diners.
If you are looking for a particular diner, say Xavier Nougat, you might find that the index
of his name in diner_names is 7498. If you have programmed an associated floating-point
array called diner_tabs, you might look up element 7498 in that array and find that his
tab is $99.34.
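#include <stdio.h>
#define ARRAY_SIZE 10
int main ()
{
  int index, my_array[ARRAY_SIZE];
  /* Set each element to zero, then print it. */
  for (index = 0; index < ARRAY_SIZE; index++)
    {
      my_array[index] = 0;
      printf ("my_array[%d] = %d\n", index, my_array[index]);
    }
  return 0;
}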
The output from the above example is as follows:
my_array[0] = 0
my_array[1] = 0
my_array[2] = 0
my_array[3] = 0
my_array[4] = 0
my_array[5] = 0
my_array[6] = 0
my_array[7] = 0
my_array[8] = 0
my_array[9] = 0
You can use similar code to fill the array with different values. The following code
example is nearly identical to the one above, but the line my_array[index] = index; fills
each element of the array with its own index:
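#include <stdio.h>
#define ARRAY_SIZE 5
int main ()
{
  int index, my_array[ARRAY_SIZE];
  /* Set each element to its own index, then print it. */
  for (index = 0; index < ARRAY_SIZE; index++)
    {
      my_array[index] = index;
      printf ("my_array[%d] = %d\n", index, my_array[index]);
    }
  return 0;
}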
The output is as follows:
my_array[0] = 0
my_array[1] = 1
my_array[2] = 2
my_array[3] = 3
my_array[4] = 4
Here is a human’s-eye view of the internal representation of the array (how the array "looks"
to the computer):
index 0 1 2 3 4
-------------------
element | 0 | 1 | 2 | 3 | 4 |
-------------------
You can use loops to do more than initialize an array. The next code example demon-
strates the use of for loops with an array to find prime numbers. The example uses a
mathematical device called the Sieve of Eratosthenes. Eratosthenes of Cyrene discovered
that one can find all prime numbers by first writing down a list of integers from 2 (the first
prime number) up to some arbitrary number, then deleting all multiples of 2 (which are
by definition not prime numbers), finding the next undeleted number after 2 (which is 3),
deleting all its multiples, finding the next undeleted number after that (5), deleting all its
multiples, and so on. When you have finished this process, all numbers that remain are
primes.
The following code example creates a Sieve of Eratosthenes for integers up to 4999,
initializes all elements with 1, then deletes all composite (non-prime) numbers by replacing
the elements that have an index equal to the composite with the macro DELETED, which
equals 0.
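#include <stdio.h>
#define ARRAY_SIZE 5000
#define DELETED 0
int sieve[ARRAY_SIZE];
/* Mark every number as a candidate prime. */
void fill_sieve ()
{
  int index;
  for (index = 0; index < ARRAY_SIZE; index++)
    sieve[index] = 1;
}
/* Delete the multiples 2 * prime, 3 * prime, and so on. */
void delete_multiples_of_prime (int prime)
{
  int index, multiplier = 2;
  for (index = prime * multiplier; index < ARRAY_SIZE; index = prime * ++multiplier)
    sieve[index] = DELETED;
}
void delete_nonprimes ()
{
  int index;
  for (index = 2; index < ARRAY_SIZE; index++)
    if (sieve[index] != DELETED)
      delete_multiples_of_prime (index);
}
void print_primes ()
{
  int index;
  for (index = 2; index < ARRAY_SIZE; index++)
    if (sieve[index] != DELETED)
      printf ("%d ", index);
  printf ("\n");
}
int main ()
{
  printf ("Results of Sieve of Eratosthenes:\n\n");
  fill_sieve();
  delete_nonprimes();
  print_primes();
  return 0;
}
The output of the program begins as follows: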
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191
193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283
293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401
409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 ...
of columns in the array), so the column index is changing faster than the row index, as the
one-dimensional representation of the array inside the computer is traversed.
You can represent a three-dimensional array, such as a cube, in a similar way:
variable_type array_name[size1][size2][size3]
Arrays do not have to be shaped like squares and cubes; you can give each dimension of
the array a different size, as follows:
int non_cube[2][6][8];
Three-dimensional arrays (and higher) are stored in the same basic way as two-dimensional
ones. They are kept in computer memory as a linear sequence of variables, and the last
index is always the one that varies fastest (then the next-to-last, and so on).
#define SIZE1 3
#define SIZE2 3
#define SIZE3 3
int main ()
{
int fast, faster, fastest;
int my_array[SIZE1][SIZE2][SIZE3];
my_array[1][0][1] DONE
my_array[1][0][2] DONE
my_array[1][1][0] DONE
my_array[1][1][1] DONE
my_array[1][1][2] DONE
my_array[1][2][0] DONE
my_array[1][2][1] DONE
my_array[1][2][2] DONE
my_array[2][0][0] DONE
my_array[2][0][1] DONE
my_array[2][0][2] DONE
my_array[2][1][0] DONE
my_array[2][1][1] DONE
my_array[2][1][2] DONE
my_array[2][2][0] DONE
my_array[2][2][1] DONE
my_array[2][2][2] DONE
Note: Although in this example we have followed the order in which indices vary inside
the computer, you do not have to do so in your own code. For example, we could have
switched the nesting of the innermost fastest and outermost fast loops, and every element
would still have been initialized. It is better, however, to be systematic about initializing
multidimensional arrays.
my_array[0] = 42;
my_array[1] = 52;
my_array[2] = 23;
my_array[3] = 100;
...
The second method is more efficient and less tedious. It uses a single assignment operator
(=) and a few curly brackets ({...}).
Recall that arrays are stored by row, with the last index varying fastest. A 3 by 3 array
could be initialized in the following way:
int my_array[3][3] =
{
{10, 23, 42},
{1, 654, 0},
{40652, 22, 0}
};
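#include <stdio.h>
int main()
{
  int row, column;
  int my_array[3][3] =
  {
    {10, 23, 42},
    {1, 654, 0},
    {40652, 22, 0}
  };
  /* Print the array one row per line. */
  for (row = 0; row <= 2; row++)
    {
      for (column = 0; column <= 2; column++)
        printf ("%d ", my_array[row][column]);
      printf ("\n");
    }
  printf("\n");
  return 0;
}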
The internal curly brackets are unnecessary, but they help to distinguish the rows of the
array. The following code has the same effect as the first example:
int my_array[3][3] =
{
10, 23, 42,
1, 654, 0,
40652, 22, 0
};
Using any of these three array initializations, the program above will print the following
text:
10 23 42
1 654 0
40652 22 0
Note 1: Be careful to place commas after every array element except the last one before a
closing curly bracket (‘}’). Be sure you also place a semicolon after the final curly bracket
of an array initializer, since here curly brackets are not delimiting a code block.
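Note 2: In the original ANSI C standard, all the expressions in an array initializer must
be constants, not variables; that is, values such as 235 and 'q' are acceptable, depending
on the type of the array, but expressions such as the integer variable my_int are not.
(Later versions of the standard relax this rule for arrays declared inside functions, but it
still applies to global and static arrays.)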
Note 3: If there are not enough expressions in the array initializer to fill the array, the
remaining elements will be set to 0, whether or not the array is static. Only a non-static
array that is declared inside a function with no initializer at all starts out filled with
garbage.
In the following example, notice how the array my_array in main is passed to the function
multiply as an actual parameter with the name my_array, but that the formal parameter
in the multiply function is defined as int *the_array: that is, an integer pointer. This
is the basis for much that you will hear spoken about the “equivalence of pointers and
arrays” — much that is best ignored until you have more C programming experience. The
important thing to understand is that arrays passed as parameters are considered to be
pointers by the functions receiving them. Therefore, they are always variable parameters,
which means that other functions can modify the original copy of the variable, just as
the function multiply does with the array my_array below. (See Chapter 8 [Parameters],
page 37.)
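#include <stdio.h>
void multiply (int *the_array, int multiplier)
{
  int index;
  for (index = 0; index < 5; index++)
    the_array[index] *= multiplier;
}
int main()
{
  int index, my_array[5] = {0, 1, 2, 3, 4};
  multiply (my_array, 2);
  for (index = 0; index < 5; index++)
    printf("%d ", my_array[index]);
  printf("\n\n");
  return 0;
}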
Even though the function multiply is declared void and therefore does not return a result,
it can still modify my_array directly, because it is a variable parameter. Therefore, the
result of the program above is as follows:
0 2 4 6 8
If you find the interchangeability of arrays and pointers as formal parameters in function
declarations to be confusing, you can always avoid the use of pointers, and declare formal
parameters to be arrays, as in the new version of the multiply function below. The result
is the same.
15 Strings
A string can contain any character, including special control characters, such as the tab
character ‘\t’, the newline character ‘\n’, the “bell” character ‘\7’ (which causes the ter-
minal to beep when it is displayed), and so on.
We have been using string values since we introduced the printf command early in the
book. (See Chapter 3 [The form of a C program], page 7.) To cause your terminal to beep
twice, include the following statement in a C program:
printf("This is a string value. Beep! Beep! \7\7");
#include <stdio.h>
#include <string.h>
int main()
{
/* Example 1 */
char string1[] = "A string declared as an array.\n";
/* Example 2 */
char *string2 = "A string declared as a pointer.\n";
/* Example 3 */
char string3[30];
strcpy(string3, "A string constant copied in.\n");
printf (string1);
printf (string2);
printf (string3);
return 0;
}
1. char string1[] = "A string declared as an array.\n";
This is usually the best way to declare and initialize a string. The character array is
declared explicitly. There is no size declaration for the array; just enough memory is
allocated for the string, because the compiler knows how long the string constant is.
The compiler stores the string constant in the character array and adds a null character
(‘\0’) to the end.
2. char *string2 = "A string declared as a pointer.\n";
The second of these initializations is a pointer to an array of characters. Just as in the
last example, the compiler calculates the size of the array from the string constant and
adds a null character. The compiler then assigns a pointer to the first character of the
character array to the variable string2.
Note: Most string functions will accept strings declared in either of these two ways.
Consider the printf statements at the end of the example program above — the
statements to print the variables string1 and string2 are identical.
3. char string3[30];
Declaring a string in this way is useful when you don’t know what the string variable
will contain, but have a general idea of the length of its contents (in this case, the
string can be a maximum of 29 characters long, because one element of the array must
be left for the terminating null character). The drawback is that you will either
have to use some kind of string function to assign the variable a value, as the next line
of code does (strcpy(string3, "A string constant copied in.\n");), or you will
have to assign the elements of the array the hard way, character by character. (See
Section 15.4 [String library functions], page 99, for more information on the function
strcpy.)
#include <stdio.h>
char *menu[] =
{
  " -------------------------------------- ",
  " | ++ MENU ++ |",
  " | ~~~~~~~~~~~~ |",
  " | (0) Edit Preferences |",
  " | (1) Print Charge Sheet |",
  " | (2) Print Log Sheet |",
  " | (3) Calculate Bill |",
  " | (q) Quit |",
  " | |",
  " | |",
  " | Please enter choice below. |",
  " | |",
  " -------------------------------------- "
};
int main()
{
  int line_num;
  /* Print each of the 13 strings in the array on its own line. */
  for (line_num = 0; line_num < 13; line_num++)
    {
      printf ("%s\n", menu[line_num]);
    }
  return 0;
}
Notice that the string array menu is declared char *menu[]. This method of defining
a two-dimensional string array is a combination of methods 1 and 2 for initializing strings
from the last section. (See Section 15.2 [Initializing strings], page 97.) This is the most
convenient method; if you try to define menu with char menu[][], the compiler will return
an “unspecified bounds error”. You can get around this by declaring the second subscript
of menu explicitly (e.g. char menu[][80]), but that requires you to know the maximum
length of the strings you are storing in the array, which is something you may not know
and that may be tedious to find out.
The elements of menu are initialized with string constants in the same way that an integer
array, for example, is initialized with integers, separating each element with a comma. (See
Section 14.5 [Initializing arrays], page 92.)
#include <stdio.h>
#include <stdlib.h>
int main()
{
double my_value;
char my_string[] = "+1776.23";
my_value = atof(my_string);
printf("%f\n", my_value);
return 0;
}
The output from the above code is ‘1776.230000’.
• atoi Converts an ASCII string to its integer equivalent; for example, converts ‘-23.5’
to the value -23.
int my_value;
char my_string[] = "-23.5";
my_value = atoi(my_string);
printf("%d\n", my_value);
• atol Converts an ASCII string to its long integer equivalent; for example, converts
‘+2000000000’ to the value 2000000000.
long my_value;
char my_string[] = "+2000000000";
my_value = atol(my_string);
printf("%ld\n", my_value);
• strcat Concatenates two strings: that is, joins them together into one string. Example:
char string1[50] = "Hello, ";
char string2[] = "world!\n";
strcat (string1, string2);
printf (string1);
The example above attaches the contents of string2 to the current contents of string1.
The array string1 then contains the string ‘Hello, world!\n’.
Notice that string1 was declared to be 50 characters long, more than enough to contain
the initial values of both string1 and string2. You must be careful to allocate enough
space in the string variable that will receive the concatenated data; otherwise, strcat
will write past the end of the array. The result is undefined: your program may crash
with an error message from the operating system, or it may silently corrupt other data
and carry on.
• strcmp Compares two strings and returns a value that indicates which string comes
first alphabetically. Example:
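int comparison;
char string1[] = "alpha";
char string2[] = "beta";
comparison = strcmp (string1, string2);
printf ("%d\n", comparison);
The example above displays a negative number, because ‘alpha’ comes before ‘beta’
alphabetically. (strcmp returns zero if the strings are identical, and a positive number
if the first string comes after the second.)
• strcpy Copies a string into a string variable. Example:
char dest_string[50]; /* any size large enough for the copied strings */
char source_string[] = "Are we not men?";
/* Example 1 */
strcpy (dest_string, source_string);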
printf ("%s\n", dest_string);
/* Example 2 */
strcpy (dest_string, "Are we having fun yet?");
printf ("%s\n", dest_string);
The example above produces this output:
Are we not men?
Are we having fun yet?
Notes:
• The destination string in strcpy comes first, then the source string. This works
in exactly the opposite way from the GNU/Linux shell command, cp.
• You can use strcpy to copy one string variable into another (Example 1), or to
copy a string constant into a string variable (Example 2).
• Note the use of the characters ‘%s’ in the printf statements to display a string,
rather than ‘%d’ to display an integer or ‘%f’ to display a float.
• strlen Returns an integer that gives the length of a string in characters, not including
the null character at the end of the string. The following example displays the number
‘5’.
int string_length;
char my_string[] = "fnord";
string_length = strlen (my_string);
printf ("%d\n", string_length);
• strncat Works like strcat, but concatenates only a specified number of characters.
The example below displays the string ‘Hello, world! Bye’.
char string1[50] = "Hello, world! ";
char string2[] = "Bye now!";
strncat (string1, string2, 3);
printf ("%s\n", string1);
• strncmp Works like strcmp, but compares only a specified number of characters of
both strings. The example below displays ‘0’, because ‘dogberry’ and ‘dogwood’ are
identical for their first three characters.
int comparison;
char string1[] = "dogberry";
char string2[] = "dogwood";
comparison = strncmp (string1, string2, 3);
printf ("%d\n", comparison);
fopen, as opposed to the low-level file-opening function open. Some of them are more
generalized versions of functions with which you may already be familiar; for example, the
function fprintf behaves like the familiar printf, but takes an additional parameter — a
stream — and sends all its output to that stream instead of simply sending its output to
‘stdout’, as printf does.
FILE *my_stream;
char my_filename[] = "foo";
my_stream = fopen (my_filename, "r");
The second parameter is a string containing one of the following sets of characters:
r Open the file for reading only. The file must already exist.
w Open the file for writing only. If the file already exists, its current contents are
deleted. If the file does not already exist, it is created.
r+ Open the file for reading and writing. The file must already exist. The contents
of the file are initially unchanged, but the file position is set to the beginning
of the file.
w+ Open the file for both writing and reading. If the file already exists, its current
contents are deleted. If the file does not already exist, it is created.
a Open the file for appending only. Appending to a file is the same as writing to
it, except that data is only written to the current end of the file. If the file does
not already exist, it is created.
a+ Open the file for both appending and reading. If the file exists, its contents are
unchanged until appended to. If the file does not exist, it is created. The initial
file position for reading is at the beginning of the file, but the file position for
appending is at the end of the file.
See Section 16.1.4 [File position], page 109, for more information on the concept of file
position.
You can also append the character ‘x’ after any of the strings in the table above. This
character causes fopen to fail rather than opening the file if the file already exists. If you
append ‘x’ to any of the arguments above, you are guaranteed not to clobber (that is,
accidentally destroy) any file you attempt to open. (Any other characters in this parameter
are ignored on a GNU system, but may be meaningful on other systems.)
The following example illustrates the proper use of fopen to open a text file for reading
(as well as highlighting the fact that you should clean up after yourself by closing files after
you are done with them). Try running it once, then running it a second time after creating
the text file ‘snazzyjazz.txt’ in the current directory with a GNU command such as touch
snazzyjazz.txt.
#include <stdio.h>
int main()
{
  FILE *my_stream;
  /* Try to open the file for reading; fopen returns NULL on failure. */
  my_stream = fopen ("snazzyjazz.txt", "r");
  if (my_stream == NULL)
    {
      printf ("File could not be opened\n");
    }
  else
    {
      printf ("File opened! Closing it now...\n");
      /* Close stream; skip error-checking for brevity of example */
      fclose (my_stream);
    }
  return 0;
}
See Section 16.1.2 [Closing a file], page 106, for more information on the function fclose.
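#include <stdio.h>
int main()
{
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";
  int close_error;
  /* Open the file for writing, creating it if it does not exist. */
  my_stream = fopen (my_filename, "w");
  if (my_stream == NULL)
    {
      printf ("File could not be opened.\n");
      return 1;
    }
  /* fclose returns 0 on success. */
  close_error = fclose (my_stream);
  if (close_error != 0)
    {
      printf ("File could not be closed.\n");
    }
  else
    {
      printf ("File closed.\n");
    }
  return 0;
}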
Here is an example that creates an array and fills it with multiples of 2, prints it out,
writes the array’s data to a file with fwrite, zeroes the array and prints it out, reads the
data from the file back into the array with fread, then prints the array out again so you
can compare its data with the first set of data.
#include <stdio.h>
int main()
{
int row, column;
FILE *my_stream;
int close_error;
char my_filename[] = "my_numbers.dat";
size_t object_size = sizeof(int);
size_t object_count = 25;
size_t op_return;
int my_array[5][5] =
{
2, 4, 6, 8, 10,
12, 14, 16, 18, 20,
22, 24, 26, 28, 30,
32, 34, 36, 38, 40,
42, 44, 46, 48, 50
};
printf ("Initial values of array:\n");
for (row = 0; row <= 4; row++)
{
for (column = 0; column <=4; column++)
{
printf ("%d ", my_array[row][column]);
}
printf ("\n");
}
return 0;
}
If all goes well, the code example above will produce the following output:
Initial values of array:
2 4 6 8 10
12 14 16 18 20
22 24 26 28 30
32 34 36 38 40
42 44 46 48 50
Successfully wrote data to file.
Zeroing array...
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
Now reading data back in...
Successfully read data from file.
2 4 6 8 10
12 14 16 18 20
22 24 26 28 30
32 34 36 38 40
42 44 46 48 50
If you attempt to view the file ‘my_numbers.dat’ produced by the program above with a
GNU command such as more my_numbers.dat, you will see only garbage, because the
information is stored in binary format, not readable by humans. After attempting to view
this binary file, your terminal may continue to show only garbage and you may have to reset it.
You may be able to do this with a menu option (if you are running gnome-terminal, for
example), or you may have to type reset blindly.
An example of these functions will not be useful until we have introduced single-character
I/O. See Section 16.3.3 [getc and fgetc], page 131, if you want to read a code example that
uses the ftell, fseek, and rewind functions.
2
Strictly speaking, there are multiple levels of buffering on a GNU system. Even after flushing characters
to a file, data from the file may remain in memory, unwritten to disk. On GNU systems, there is
an independently-running system program, or daemon, that periodically commits relevant data still in
memory to disk. Under GNU/Linux, this daemon is called ‘bdflush’.
16.2.1.1 puts
The most convenient function for printing a simple message on standard output is puts. It
is even simpler than printf, since you do not need to include a newline character — puts
does that for you.
Using puts couldn’t be simpler. Here is an example:
puts ("Hello, multiverse.");
This code example will print the string ‘Hello, multiverse.’ to standard output.
The puts function is safe and simple, but not very flexible. See Section 16.2.2 [Formatted
string output], page 114, if you want to print fancier output.
16.2.1.2 fputs
The fputs (“file put string”) function is similar to the puts function in almost every respect,
except that it accepts a second parameter, a stream to which to write the string. It does
not add a newline character, however; it only writes the characters in the string. It returns
EOF if an error occurs; otherwise it returns a non-negative integer value.
Here is a brief code example that creates a text file and uses fputs to write into it the
phrase ‘If it’s not too late... make it a cheeseburger.’, followed by a newline char-
acter. This example also demonstrates the use of the fflush function. (See Section 16.1.5
[Stream buffering], page 111, for more information on this function.)
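#include <stdio.h>
int main()
{
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";
  int flush_status;
  /* Create the file (or empty it) and open it for writing. */
  my_stream = fopen (my_filename, "w");
  if (my_stream == NULL)
    {
      puts ("File could not be opened!");
      return 1;
    }
  fputs ("If it's not too late... make it a cheeseburger.\n", my_stream);
  /*
    Since the stream is fully-buffered by default, not line-buffered,
    it needs to be flushed periodically. We’ll flush it here for
    demonstration purposes, even though we’re about to close it.
  */
  flush_status = fflush (my_stream);
  if (flush_status != 0)
    {
      puts ("Error flushing stream!");
    }
  else
    {
      puts ("Stream flushed.");
    }
  fclose (my_stream);
  return 0;
}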
16.2.2.1 printf
If you have been reading the book closely up to this point, you have seen the use of the
printf function many times. To recap, this function prints a text string to the terminal
(or, to be more precise, the text stream ‘stdout’). For example, the following line of code
prints the string ‘Hello there!’, followed by a newline character, to the console:
printf ("Hello there!\n");
You probably also remember that you can incorporate numeric constants and variables
into your strings. Consider the following code example:
  printf ("I'm free! I'm free! (So what? I'm %d.)\n", 4);
The previous example is equivalent to the following one:
  int age = 4;
  printf ("I'm free! I'm free! (So what? I'm %d.)\n", age);
Both of the code examples above produce the following output:
  I'm free! I'm free! (So what? I'm 4.)
You may recall that besides using ‘%d’ with printf to print integers, we have also used
‘%f’ on occasion to print floating-point numbers, and that on occasion we have used more
than one argument. Consider this example:
  printf ("I'm free! I'm free! (So what? I'm %d.) Well, I'm %f.\n", 4, 4.5);
That example produces the following output:
  I'm free! I'm free! (So what? I'm 4.) Well, I'm 4.500000.
In fact, printf is a very flexible function. The general scheme is that you provide it with
a format string or template string (such as ‘"So what? I’m %d."’), which can contain zero
or more conversion specifications (also called conversion specifiers, or sometimes just
conversions; in this case ‘%d’), followed by zero or more arguments (for example, ‘4’). Each
conversion specification specifies a conversion, that is, how to convert its corresponding
argument into a printable string. After the template string, you supply one argument for
each conversion specifier in the template string. The printf function then prints the
template string, with each argument converted to a printable substring by its conversion
specifier, and returns an integer containing the number of characters printed, or a negative
value if there was an error.
Formatted output conversion specifiers 115
‘Space character’
If the number does not start with a plus or minus sign, prefix it with a
space character instead. This flag is ignored if the ‘+’ flag is specified.
‘#’
For ‘%e’, ‘%E’, and ‘%f’, forces the number to include a decimal point, even
if no digits follow. For ‘%x’ and ‘%X’, prefixes ‘0x’ or ‘0X’, respectively.
‘’’
Separate the digits of the integer part of the number into groups, using
a locale-specific character. In the United States, for example, this will
usually be a comma, so that one million will be rendered ‘1,000,000’.
GNU systems only.
‘0’
Pad the field with zeroes instead of spaces; any sign or indication of base
(such as ‘0x’) will be printed before the zeroes. This flag is ignored if the
‘-’ flag or a precision is specified.
In the example given above, ‘%-17.7ld’, the flag given is ‘-’.
• An optional non-negative decimal integer specifying the minimum field width within
which the conversion will be printed. If the conversion contains fewer characters, it
will be padded with spaces (or zeroes, if the ‘0’ flag was specified). If the conversion
contains more characters, it will not be truncated, and will overflow the field. The
output will be right-justified within the field, unless the ‘-’ flag was specified. In the
example given above, ‘%-17.7ld’, the field width is ‘17’.
• For numeric conversions, an optional precision that specifies the number of digits to be
written. If it is specified, it consists of a dot character (‘.’), followed by a non-negative
decimal integer (which may be omitted, and defaults to zero if it is). In the example
given above, ‘%-17.7ld’, the precision is ‘.7’. Leading zeroes are produced if necessary.
If you don’t specify a precision, the number is printed with as many digits as necessary
(with a default of six digits after the decimal point). If you supply an argument of zero
with an explicit precision of zero, printf will not print any characters. Specifying a
precision for a string conversion (‘%s’) indicates the maximum number of characters to
write.
• An optional type modifier character from the table below. This character specifies
the data type of the argument if it is different from the default. In the example given
above, ‘%-17.7ld’, the type modifier character is ‘l’; normally, the ‘d’ output conversion
character expects a data type of int, but the ‘l’ specifies that a long int is being used
instead.
The numeric conversions usually expect an argument of either type int, unsigned
int, or double. (The ‘%c’ conversion converts its argument to unsigned char.) For
the integer conversions (‘%d’ and ‘%i’), char and short arguments are automatically
converted to type int, and for the unsigned integer conversions (‘%u’, ‘%x’, and ‘%X’),
they are converted to type unsigned int. For the floating-point conversions (‘%e’, ‘%E’,
and ‘%f’), all float arguments are converted to type double. You can use one of the
type modifiers from the table below to specify another type of argument.
‘l’ Specifies that the argument is a long int (for ‘%d’ and ‘%i’), or an
unsigned long int (for ‘%u’, ‘%x’, and ‘%X’).
‘L’ Specifies that the argument is a long double for the floating-point conver-
sions (‘%e’, ‘%E’, and ‘%f’). Same as ‘ll’, for integer conversions (‘%d’ and
‘%i’).
‘ll’ Specifies that the argument is a long long int (for ‘%d’ and ‘%i’). On
systems that do not have extra-long integers, this has the same effect as
‘l’.
‘q’ Same as ‘ll’; comes from calling extra-long integers “quad ints”.
‘Z’ Specifies that the argument is of type size_t. (The size_t type is used to
specify the sizes of blocks of memory, and many functions in this chapter
use it.) ISO C99 spells this modifier ‘z’; ‘Z’ is the older GNU spelling.
Make sure that your conversion specifiers use valid syntax; if they do not, if you do not
supply enough arguments for all conversion specifiers, or if any arguments are of the wrong
type, unpredictable results may follow. Supplying too many arguments is not a problem,
however; the extra arguments are simply ignored.
Here is a code example that shows various uses of printf.
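#include <stdio.h>
#include <errno.h>

int main()
{
  int my_integer = -42;
  unsigned int my_ui = 23;
  float my_float = 3.56;
  double my_double = 424242.171717;
  char my_char = 'w';
  char my_string[] = "Pardon me, may I borrow your nose?";

  printf ("Integer: %d\n", my_integer);
  printf ("Unsigned integer: %u\n", my_ui);
  printf ("The same, as hexadecimal: %#x %#x\n", (unsigned int) my_integer, my_ui);
  printf ("Floating-point: %f\n", my_float);
  printf ("Double, exponential notation: %17.11e\n", my_double);
  printf ("Single character: %c\n", my_char);
  printf ("String: %s\n", my_string);

  errno = EACCES;
  printf ("errno string (EACCES): %m\n");   /* %m is a GNU extension */
  return 0;
}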
The code example above produces the following output on a GNU system:
Integer: -42
Unsigned integer: 23
The same, as hexadecimal: 0xffffffd6 0x17
Floating-point: 3.560000
Double, exponential notation: 4.24242171717e+05
Single character: w
String: Pardon me, may I borrow your nose?
errno string (EACCES): Permission denied
16.2.3 fprintf
The fprintf (“file print formatted”) function is identical to printf, except that its first
parameter is a stream to which to send output. The following code example is the same as
the one for printf, except that it sends its output to the text file ‘snazzyjazz.txt’.
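#include <stdio.h>
#include <errno.h>

int main()
{
  int my_integer = -42;
  unsigned int my_ui = 23;
  float my_float = 3.56;
  double my_double = 424242.171717;
  char my_char = 'w';
  char my_string[] = "Pardon me, may I borrow your nose?";
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";

  my_stream = fopen (my_filename, "w");

  fprintf (my_stream, "Integer: %d\n", my_integer);
  fprintf (my_stream, "Unsigned integer: %u\n", my_ui);
  fprintf (my_stream, "The same, as hexadecimal: %#x %#x\n",
           (unsigned int) my_integer, my_ui);
  fprintf (my_stream, "Floating-point: %f\n", my_float);
  fprintf (my_stream, "Double, exponential notation: %17.11e\n", my_double);
  fprintf (my_stream, "Single character: %c\n", my_char);
  fprintf (my_stream, "String: %s\n", my_string);

  errno = EACCES;
  fprintf (my_stream, "errno string (EACCES): %m\n");

  /* Close stream; skip error-checking for brevity of example. */
  fclose (my_stream);
  return 0;
}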
16.2.4 asprintf
The asprintf (mnemonic: “allocating string print formatted”) function is identical to
printf, except that its first parameter is a string to which to send output. It terminates
the string with a null character. It returns the number of characters stored in the string,
not including the terminating null.
The asprintf function is nearly identical to the simpler sprintf, but is much safer,
because it dynamically allocates the string to which it sends output, so that the string will
never overflow. The first parameter is a pointer to a string variable, that is, it is of type
char **. The return value is the number of characters allocated to the buffer, or a negative
value if an error occurred.
The following code example prints the string ‘Being 4 is cool, but being free is
best of all.’ to the string variable my_string, then prints the string on the screen.
Notice that my_string is not initially allocated any space at all; asprintf allocates the
space itself. (See Section 16.2.1.1 [puts], page 113, for more information on the puts
function.)
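#define _GNU_SOURCE   /* asprintf is a GNU extension */
#include <stdio.h>
#include <stdlib.h>

int main()
{
  char *my_string;

  if (asprintf (&my_string, "Being %d is cool, but being free is best of all.", 4) < 0)
    return 1;   /* Allocation failed; my_string is not usable. */
  puts (my_string);
  free (my_string);   /* The caller owns the string that asprintf allocated. */
  return 0;
}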
16.2.5.1 sprintf
The sprintf (“string print formatted”) function is similar to asprintf, except that it is
much less safe. Its first parameter is a string to which to send output. It terminates the
string with a null character. It returns the number of characters stored in the string, not
including the terminating null.
This function will behave unpredictably if the string to which it is printing overlaps any
of its arguments. It is dangerous because the characters output to the string may overflow
it. This problem cannot be solved with the field width modifier to the conversion specifier,
because only the minimum field width can be specified with it. To avoid this problem,
it is better to use asprintf, but there is a lot of C code that still uses sprintf, so it is
important to know about it. (See Section 16.2.4 [asprintf], page 118.)
The following code example prints the string ‘Being 4 is cool, but being free is
best of all.’ to the string variable my_string then prints the string on the screen.
Notice that my_string has been allocated 100 bytes of space, enough to contain the
characters output to it. (See Section 16.2.1.1 [puts], page 113, for more information on the
puts function.)
#include <stdio.h>

int main()
{
  char my_string[100];

  sprintf (my_string, "Being %d is cool, but being free is best of all.", 4);
  puts (my_string);
  return 0;
}
16.2.6.1 getline
The getline function is the preferred method for reading lines of text from a stream,
including standard input. The other standard functions, including gets, fgets, and scanf,
are too unreliable. (Doubtless, in some programs you will see code that uses these unreliable
functions, and at times you will come across compilers that cannot handle the safer getline
function. As a professional, you should avoid unreliable functions and any compiler that
requires you to be unsafe.)
The getline function reads an entire line from a stream, up to and including the next
newline character. It takes three parameters. The first is a pointer to a block allocated
with malloc or calloc. (These two functions allocate computer memory for the program
when it is run. See Section 20.2 [Memory allocation], page 189, for more information.) This
parameter is of type char **; it will contain the line read by getline when it returns. The
second parameter is a pointer to a variable of type size_t; this parameter specifies the size
in bytes of the block of memory pointed to by the first parameter. The third parameter is
simply the stream from which to read the line.
The pointer to the block of memory allocated for getline is merely a suggestion. The
getline function will automatically enlarge the block of memory as needed, via the realloc
function, so there is never a shortage of space — one reason why getline is so safe. Not
only that, but getline will also tell you the new size of the block by the value returned in
the second parameter.
If an error occurs, such as end of file being reached without reading any bytes, getline
returns -1. Otherwise, the first parameter will contain a pointer to the string containing the
line that was read, and getline returns the number of characters read (up to and including
the newline, but not the final null character). The return value is of type ssize_t.
Although the first parameter is of type pointer to string (char **), you cannot treat
the block it points to as an ordinary string, since it may contain null characters before
the final null character marking the end of the line. The return value enables you to
distinguish null characters that getline read as part of the line, by specifying the size of
the line. Any characters in the block up to the number of bytes specified by the return
value are part of the line; any characters after that number of bytes are not.
Here is a short code example that demonstrates how to use getline to read a line of
text from the keyboard safely. Try typing more than 100 characters. Notice that getline
can safely handle your line of input, no matter how long it is. Also note that the puts
command used to display the line of text read will be inadequate if the line contains any
null characters, since it will stop displaying text at the first null, but that since it is difficult
to enter null characters from the keyboard, this is generally not a consideration.
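#include <stdio.h>
#include <stdlib.h>

int main()
{
  ssize_t bytes_read;
  size_t nbytes = 100;
  char *my_string;

  puts ("Please enter a line of text.");

  /* These two lines are the heart of the program. */
  my_string = (char *) malloc (nbytes + 1);
  bytes_read = getline (&my_string, &nbytes, stdin);

  if (bytes_read == -1)
    {
      puts ("ERROR!");
    }
  else
    {
      puts ("You typed:");
      puts (my_string);
    }

  free (my_string);
  return 0;
}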
16.2.6.2 getdelim
The getdelim function is a more general form of the getline function; whereas getline
stops reading input at the first newline character it encounters, the getdelim function
enables you to specify other delimiter characters than newline. In fact, getline simply
calls getdelim and specifies that the delimiter character is a newline.
The syntax for getdelim is nearly the same as that of getline, except that the third
parameter specifies the delimiter character, and the fourth parameter is the stream from
which to read. You can exactly duplicate the getline example in the last section with
getdelim, by replacing the line
  bytes_read = getline (&my_string, &nbytes, stdin);
with the line
  bytes_read = getdelim (&my_string, &nbytes, '\n', stdin);
16.2.7.1 gets
If you want to read a string from standard input, you can use the gets function, the name
of which stands for “get string”. However, this function is deprecated — that means it is
obsolete and it is strongly suggested you do not use it — because it is dangerous. It is
dangerous because it provides no protection against overflowing the string into which it is
saving data. Programs that use gets can actually be a security problem on your computer.
Since it is sometimes used in older code (which is why the GNU C Library still provides
it), we will examine it briefly; nevertheless, you should always use the function getline
instead. (See Section 16.2.6.1 [getline], page 120.)
The gets function takes one parameter, the string in which to store the data read. It
reads characters from standard input up to the next newline character (that is, when the
user presses <RETURN>), discards the newline character, and copies the rest into the string
passed to it. If there was no error, it returns the same string (as a return value, which may
be discarded); otherwise, if there was an error, it returns a null pointer.
Here is a short code example that uses gets:
#include <stdio.h>
int main()
{
char my_string[500];
printf("Type something.\n");
gets(my_string);
printf ("You typed: %s\n", my_string);
return 0;
}
If you attempt to compile the example above, it will compile and will run properly, but
GCC will warn you against the use of a deprecated function, as follows:
/tmp/ccPW3krf.o: In function ‘main’:
/tmp/ccPW3krf.o(.text+0x24): the ‘gets’ function
is dangerous and should not be used.
Remember! Never use this function in your own code. Always use getline instead.
16.2.7.2 fgets
The fgets (“file get string”) function is similar to the gets function. This tutorial
treats it as deprecated (that is, it strongly suggests you do not use it), although the C
standard itself does not, because if the input data contains a null character,
you can’t tell. Don’t use fgets unless you know the data cannot contain a null. Don’t use
it to read files edited by the user because, if the user inserts a null character, you should
either handle it properly or print a clear error message. Always use getline or getdelim
instead of fgets if you can.
Rather than reading a string from standard input, as gets does, fgets reads it from
a specified stream, up to and including a newline character. It stores the string in the
string variable passed to it, adding a null character to terminate the string. This function
takes three parameters. The first is the string into which to read data. The second is the
maximum number of characters to store, including the terminating null character. (You
must supply at least this many characters of space in the string, or your program will
probably crash, but at least the fgets function protects against overflowing the string and
creating a security hazard, unlike gets.) The third parameter is the stream from which to
read. The number of characters that fgets reads is actually one less than the number
specified; it stores the null character in the extra character space.
If there is no error, fgets returns the string read as a return value, which may be
discarded. Otherwise, for example if the stream is already at end of file, it returns a null
pointer.
Unfortunately, like the gets function, fgets is discouraged in this tutorial, in this case
because you cannot tell whether a null character is included in the string it reads. If a null
character is read by fgets, it will be stored in the string along with the rest of the characters
read. Since a null character terminates a string in C, C will then consider your string to
end prematurely, right before the first null character. Only use fgets if you are certain the
data read cannot contain a null; otherwise, use getline.
Here is a code example that uses fgets. It will create a text file containing the string
‘Hidee ho!’ plus a newline, read it back with fgets, and print it on standard output. Notice
that although 100 characters are allocated for the string my_string, and requested to be
read in the fgets call, there are not that many characters in the file. The fgets function
only reads the string up to the newline character; the important thing is to allocate enough
space in the string variable to contain the string to be read.
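#include <stdio.h>

int main()
{
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";
  char my_string[100];
  /* Create the file; skip error-checking for brevity of example. */
  my_stream = fopen (my_filename, "w");
  fprintf (my_stream, "Hidee ho!\n");
  fclose (my_stream);
  /* Read the line back with fgets, then print it. */
  my_stream = fopen (my_filename, "r");
  fgets (my_string, 100, my_stream);
  fclose (my_stream);
  printf ("%s", my_string);
  return 0;
}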
such as integers, floating point numbers, and character sequences, and store the values read
in variables.
16.2.8.1 sscanf
The sscanf function accepts a string from which to read input, then, in a manner similar to
printf and related functions, it accepts a template string and a series of related arguments.
It tries to match the template string to the string from which it is reading input, using
conversion specifiers like those of printf.
The sscanf function is just like its parent function scanf (which this tutorial discourages),
except that the first argument of sscanf specifies a string from which to read, whereas scanf
can only read from standard input. Reaching the end of the string is treated as an
end-of-file condition.
Here is an example of sscanf in action:
sscanf (input_string, "%as %as %as", &str_arg1, &str_arg2, &str_arg3);
If the string sscanf is scanning overlaps with any of the arguments, unexpected results
will follow, as in the following example. Don’t do this!
sscanf (input_string, "%as", &input_string);
Here is a good code example that parses input from the user with sscanf. It prompts
the user to enter three integers separated by whitespace, then reads an arbitrarily long line
of text from the user with getline. It then checks whether exactly three arguments were
assigned by sscanf. If the line read does not contain the data requested (for example, if it
contains a floating-point number or any alphabetic characters), the program prints an error
message and prompts the user for three integers again. When the program finally receives
exactly the data it was looking for from the user, it prints out a message acknowledging the
input, and then prints the three integers.
It is this flexibility of input and great ease of recovery from errors that makes the
getline/sscanf combination so vastly superior to scanf alone. Simply put, you should
never use scanf where you can use this combination instead.
#include <stdio.h>
#include <stdlib.h>

int main()
{
  size_t nbytes = 100;
  char *my_string;
  int int1, int2, int3;
  int args_assigned = 0;

  /* Allocate the block once; getline enlarges it if necessary. */
  my_string = (char *) malloc (nbytes + 1);

  while (args_assigned != 3)
    {
      puts ("Please enter three integers separated by whitespace.");
      if (getline (&my_string, &nbytes, stdin) == -1)
        break;   /* End of input or read error. */
      args_assigned = sscanf (my_string, "%d %d %d", &int1, &int2, &int3);
      if (args_assigned != 3)
        puts ("\nInput invalid!");
    }

  if (args_assigned == 3)
    printf ("\nThanks!\n%d\n%d\n%d\n", int1, int2, int3);
  free (my_string);
  return 0;
}
Template strings for sscanf and related functions are somewhat more free-form than
those for printf. For example, most conversion specifiers ignore any preceding whitespace.
Further, you cannot specify a precision for sscanf conversion specifiers, as you can for those
of printf.
Another important difference between sscanf and printf is that the arguments to
sscanf must be pointers; this allows sscanf to return values in the variables they point
to. If you forget to pass pointers to sscanf, you may receive some strange errors, and it
is easy to forget to do so; therefore, this is one of the first things you should check if code
containing a call to sscanf begins to go awry.
A sscanf template string can contain any number of whitespace characters, any number
of ordinary, non-whitespace characters, and any number of conversion specifiers starting
with ‘%’. A whitespace character in the template string matches zero or more whitespace
characters in the input string. Ordinary, non-whitespace characters must correspond exactly
in the template string and the input string; otherwise, a matching error occurs. Thus, the
template string ‘" foo "’ matches ‘"foo"’ and ‘" foo "’, but not ‘" food "’.
If you create an input conversion specifier with invalid syntax, or if you don’t supply
enough arguments for all the conversion specifiers in the template string, your code may do
unexpected things, so be careful. Extra arguments, however, are simply ignored.
Conversion specifiers start with a percent sign (‘%’) and terminate with a character from
the following table:
‘z’ Specifies that the argument to which the value read should be assigned is
of type size_t. (The size_t type is used to specify the sizes of blocks of
memory, and many functions in this chapter use it.) Valid for the ‘%d’ and
‘%i’ conversions.
16.2.9.1 scanf
The first of the functions we will examine is scanf (“scan formatted”). The scanf function
is considered dangerous for a number of reasons. First, if used improperly, it can cause
your program to crash by reading character strings that overflow the string variables meant
to contain them, just like gets. (See Section 16.2.7.1 [gets], page 121.) Second, scanf
can hang if it encounters unexpected non-numeric input while reading a line from standard
input. Finally, it is difficult to recover from errors when the scanf template string does not
match the input exactly.
If you are going to read input from the keyboard, it is far better to read it with getline
and parse the resulting string with sscanf (“string scan formatted”) than to use scanf
directly. However, since scanf uses nearly the same syntax as sscanf, as does the related
fscanf, and since scanf is a standard C function, it is important to learn about it.
If scanf cannot match the template string to the input string, it will return immediately
— and it will leave the first non-matching character as the next character to read from the
stream. This is called a matching error, and is the main reason scanf tends to hang
when reading input from the keyboard; a second call to scanf will almost certainly choke,
since the file position indicator of the stream is not pointing where scanf will expect it to.
Normally, scanf returns the number of assignments made to the arguments it was passed,
so check the return value to see if scanf found all the items you expected.
type char *, and scanf will allocate however large a buffer the string requires, and return
the result in your argument. This is a GNU-only extension to scanf functionality.
Here is a code example that shows first how to safely read a string of fixed maximum
length by allocating a buffer and specifying a field width, then how to safely read a string
of any length by using the ‘a’ flag.
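#include <stdio.h>
#include <stdlib.h>

int main()
{
  char *string1, *string2;
  string1 = (char *) malloc (25);
  puts ("Please enter a string of 20 characters or fewer.");
  scanf ("%20s", string1);
  printf ("\nYou typed the following string:\n%s\n\n", string1);
  puts ("Now enter a string of any length.");
  /* 'm' is the POSIX spelling of the GNU 'a' flag described in the text. */
  scanf ("%ms", &string2);
  printf ("\nYou typed the following string:\n%s\n", string2);
  return 0;
}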
There are a couple of things to notice about this example program. First, notice that
the second argument passed to the first scanf call is string1, not &string1. The scanf
function requires pointers as the arguments corresponding to its conversions, but a string
variable is already a pointer (of type char *), so you do not need the extra layer of indirection
here. However, you do need it for the second call to scanf. We passed it an argument of
&string2 rather than string2, because we are using the ‘a’ flag, which allocates a string
variable big enough to contain the characters it read, then returns a pointer to it.
The second thing to notice is what happens if you type a string of more than 20 characters
at the first prompt. The first scanf call will only read the first 20 characters, then the
second scanf call will gobble up all the remaining characters without even waiting for a
response to the second prompt. This is because scanf does not read a line at a time,
the way the getline function does. Instead, it immediately attempts to match
its template string to whatever characters are in the stdin stream. The second scanf call
matches all remaining characters from the overly-long string, stopping at the first whitespace
character. Thus, if you type ‘12345678901234567890xxxxx’ in response to the first prompt,
the program will immediately print the following text without pausing:
You typed the following string:
12345678901234567890
(See Section 16.2.8.1 [sscanf], page 124, for a better example of how to parse input from
the user.)
16.2.10 fscanf
The fscanf function is just like the scanf function, except that the first argument of fscanf
specifies a stream from which to read, whereas scanf can only read from standard input.
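Here is a code example that generates a text file containing four numbers with fprintf,
then reads them back in with fscanf. Note that the integers are read back with the ‘%i’
input conversion, which accepts plain decimal numbers as well as the ‘0x’ prefix that the
‘#’ flag adds to ‘%x’ output; this makes it a convenient way to read data generated by
fprintf.
#include <stdio.h>

int main()
{
  float f1, f2;
  int i1, i2;
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";

  my_stream = fopen (my_filename, "w");
  fprintf (my_stream, "%f %f %d %d", 23.5, -12e6, 100, 5);
  fclose (my_stream);   /* Skip error-checking for brevity of example. */

  my_stream = fopen (my_filename, "r");
  fscanf (my_stream, "%f %f %i %i", &f1, &f2, &i1, &i2);
  fclose (my_stream);
  printf ("Float 1 = %f\nFloat 2 = %f\n", f1, f2);
  printf ("Integer 1 = %d\nInteger 2 = %d\n", i1, i2);
  return 0;
}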
This code example prints the following output on the screen:
Float 1 = 23.500000
Float 2 = -12000000.000000
Integer 1 = 100
Integer 2 = 5
If you examine the text file ‘snazzyjazz.txt’, you will see it contains the following text:
23.500000 -12000000.000000 100 5
16.3.1 getchar
If you want to read a single character from standard input, you can use the getchar
function. This function takes no parameters, but reads the next character from ‘stdin’ as
an unsigned char, and returns its value, converted to an integer. Here is a short program
that uses getchar:
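#include <stdio.h>

int main()
{
  int input_character;

  printf ("Hit any key, then hit RETURN.\n");
  input_character = getchar ();
  printf ("The key you hit was '%c'.\n", input_character);
  printf ("Bye!\n");
  return 0;
}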
Note that because stdin is line-buffered, getchar will not return a value until you hit
the <RETURN> key. However, getchar still only reads one character from stdin, so if you
type ‘hellohellohello’ at the prompt, the program above will still only get one character.
It will print the following lines, and then terminate:
The key you hit was ’h’.
Bye!
16.3.2 putchar
If you want to print a single character on standard output, you can use the putchar function.
It takes a single integer parameter containing a character (the argument can be a single-
quoted text character, as in the example below), and sends the character to stdout. If a
write error occurs, putchar returns EOF; otherwise, it returns the integer it was passed.
This can simply be disregarded, as in the example below.
Here is a short code example that makes use of putchar. It prints an ‘X’, a space, and
then a line of ten exclamation marks (‘!!!!!!!!!!’) on the screen, then outputs a newline
so that the next shell prompt will not occur on the same line. Notice the use of the for
loop; by this means, putchar can be used not just for one character, but multiple times.
#include <stdio.h>

int main()
{
  int i;

  putchar ('X');
  putchar (' ');
  for (i=1; i<=10; i++)
    {
      putchar ('!');
    }
  putchar ('\n');
  return 0;
}
the stream from which to read. It reads the next character from the specified stream as an
unsigned char, and returns its value, converted to an integer. If a read error occurs or the
end of the file is reached, getc returns EOF instead.
Here is a code example that makes use of getc. This code example creates a text
file called ‘snazzyjazz.txt’ with fopen, writes the alphabet in upper-case letters plus a
newline to it with fprintf, reads the file position with ftell, and gets the character there
with getc. It then seeks position 25 with fseek and repeats the process, attempts to read
past the end of the file and reports end-of-file status with feof, and generates an error by
attempting to write to a read-only stream. It then reports the error status with ferror,
returns to the start of the file with rewind and prints the first character, and finally attempts
to close the file and prints a status message indicating whether it could do so.
See Section 16.1.4 [File position], page 109, for information on ftell, fseek, and rewind.
See Section 16.1.6 [End-of-file and error functions], page 112, for more information on feof
and ferror.
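#include <stdio.h>

int main()
{
  int input_char;
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";
  long position;
  int close_error;

  /* Create the file, then reopen it for reading.
     (Error-checking is skipped for brevity of example.) */
  my_stream = fopen (my_filename, "w");
  fprintf (my_stream, "ABCDEFGHIJKLMNOPQRSTUVWXYZ\n");
  fclose (my_stream);
  my_stream = fopen (my_filename, "r");

  printf ("Rewinding...\n");
  rewind (my_stream);
  position = ftell (my_stream);
  input_char = getc (my_stream);
  printf ("Character at position %ld = '%c'.\n", position, input_char);

  close_error = fclose (my_stream);
  if (close_error != 0)
    printf ("File could not be closed.\n");
  else
    printf ("File closed.\n");
  return 0;
}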
There is another function in the GNU C Library called fgetc. It is identical to getc in
most respects, except that getc is usually implemented as a macro function and is highly
optimised, so is preferable in most situations. (In situations where you are reading from
standard input, fgetc is about as fast as getc, since humans type slowly compared to how
fast computers can read their input, but when you are reading from a stream that is not
interactively produced by a human, getc is probably better.)
The following code example creates a text file called ‘snazzyjazz.txt’. It then writes
an ‘X’, a space, a line of ten exclamation marks (‘!!!!!!!!!!’), and a newline character
to the file using the putc function. Notice the use of the for loop; by this
means, putc can be used not just for one character, but multiple times.
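#include <stdio.h>

int main()
{
  int i;
  FILE *my_stream;
  char my_filename[] = "snazzyjazz.txt";

  my_stream = fopen (my_filename, "w");
  putc ('X', my_stream);
  putc (' ', my_stream);
  for (i=1; i<=10; i++)
    putc ('!', my_stream);
  putc ('\n', my_stream);
  fclose (my_stream);   /* Skip error-checking for brevity of example. */
  return 0;
}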
There is another function in the GNU C Library called fputc. It is identical to putc in
most respects, except that putc is usually implemented as a macro function and is highly
optimised, so is preferable in most situations.
16.3.5 ungetc()
Every time a character is read from a stream by a function like getc, the file position
indicator advances by 1. It is possible to reverse the motion of the file position indicator
with the function ungetc, which steps the file position indicator back by one byte within
the file and reverses the effect of the last character read operation. (This is called unreading
the character or pushing it back onto the stream.)
The intended purpose is to leave the indicator in the correct file position when other
functions have moved too far ahead in the stream. Programs can therefore peek ahead, or
get a glimpse of the input they will read next, then reset the file position with ungetc.
On GNU systems, you cannot call ungetc twice in a row without reading at least one
character in between; in other words, GNU only supports one character of pushback.
Pushing back characters does not change the file being accessed at all; ungetc only
affects the stream buffer, not the file. If fseek, rewind, or some other file positioning
function is called, any character due to be pushed back by ungetc is discarded.
Unreading a character on a stream that is at end-of-file resets the end-of-file indicator
for the stream, because there is once again a character available to be read. However, if the
character pushed back onto the stream is EOF, ungetc does nothing and just returns EOF.
Here is a code example that reads all the whitespace at the beginning of a file with
getc, then backs up one byte to the first non-whitespace character, and reads all following
characters up to a newline character with the getline function. (See Section 16.2.6.1
[getline], page 120, for more information on that function.)
#include <stdio.h>
int main()
{
int in_char;
FILE *my_stream;
char *my_string = NULL;
size_t nchars = 0;
Programming with pipes 135
return 0;
}
The code example will skip all initial whitespace in the file ‘snazzyjazz.txt’, and display
the following text on standard output:
String read:
Here’s some non-whitespace.
all lines that were passed to it that contain the string ‘init’. The output of this whole
process will probably look something like this on your system:
1 ? 00:00:11 init
4884 tty6 00:00:00 xinit
The pipe symbol ‘|’ is very handy for command-line pipes and pipes within shell scripts,
but it is also possible to set up and use pipes within C programs. The two main C functions
to remember in this regard are popen and pclose.
The popen function accepts as its first argument a string containing a shell command,
such as lpr. Its second argument is a string containing either the mode argument ‘r’ or
‘w’. If you specify ‘r’, the pipe will be open for reading; if you specify ‘w’, it will be open
for writing. The return value is a stream open for reading or writing, as the case may be;
if there is an error, popen returns a null pointer.
The pclose function closes a pipe opened by popen. It accepts a single argument, the
stream to close. It waits for the program that was called by popen to finish, and returns
the termination status of that program.
Between the popen and pclose calls, you can read from or write to the pipe in the same
way that you might read from or write to any other stream, with high-level input/output
calls such as getdelim, fprintf and so on.
The following program example shows how to pipe the output of the ps -A command to
the grep init command, exactly as in the GNU/Linux command line example above. The
output of this program should be almost exactly the same as the sample output shown above.
#include <stdio.h>
#include <stdlib.h>
int
main ()
{
FILE *ps_pipe;
FILE *grep_pipe;
int bytes_read;
int nbytes = 100;
char *my_string;
/* Exit! */
return 0;
}
The word component below refers to part of a full file name. For example, in the file
name ‘/home/fred/snozzberry.txt’, ‘fred’ is a component that designates a subdirectory
of the directory ‘/home’, and ‘snozzberry.txt’ is the name of the file proper.
Most functions that accept file name arguments can detect the following error conditions.
These are known as the usual file name errors. The names of the errors, such as EACCES,
are compounded of ‘E’ for “error” and a term indicating the type of error, such as ‘ACCES’
for “access”.
EACCES The program is not permitted to search within one of the directories in the file
name.
ENAMETOOLONG
Either the full file name is too long, or some component is too long. GNU does
not limit the overall length of file names, but depending on which file system you
are using, the length of component names may be limited. (For example, you
may be running GNU/Linux but accessing a Macintosh HFS disk; the names
of Macintosh files cannot be longer than 31 characters.)
ENOENT Either some component of the file name does not exist, or some component is
a symbolic link whose target file does not exist.
ENOTDIR One of the file name components that is supposed to be a directory is not a
directory.
ELOOP Too many symbolic links had to be followed to find the file. (GNU has a limit
on how many symbolic links can be followed at once, as a basic way to detect
recursive (looping) links.)
You can display English text for each of these errors with the ‘m’ conversion specifier of
the printf function, as in the following short example.
errno = EACCES;
printf ("errno string (EACCES): %m\n");
This example prints the following string:
errno string (EACCES): Permission denied
See Section 16.2.2.2 [Formatted output conversion specifiers], page 115, for more information
on the ‘m’ conversion specifier.
The following flags are the more important ones for a beginning C programmer to know.
(There are a number of file status flags which are relevant only to more advanced
programmers; for more details, see section “File Status Flags” in The GNU C Library
Reference Manual.)
Note that these flags are defined in macros in the GNU C Library header file ‘fcntl.h’,
so remember to insert the line #include <fcntl.h> at the beginning of any source code
file that uses them.
O_RDONLY Open the file for read access.
O_WRONLY Open the file for write access.
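O_RDWR Open the file for both read and write access. (On GNU/Hurd this is the same as
O_RDONLY | O_WRONLY; on GNU/Linux the three access modes are distinct values
and cannot be combined in this way.)
O_READ Same as O_RDONLY. GNU systems only.
O_WRITE Same as O_WRONLY. GNU systems only.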
O_EXEC Open the file for executing. GNU systems only.
O_CREAT The file will be created if it doesn’t already exist.
O_EXCL If O_CREAT is set as well, then open fails if the specified file exists already. Set
this flag if you want to ensure you will not clobber an existing file.
O_TRUNC Truncate the file to a length of zero bytes. This option is not useful for direc-
tories or other such special files. You must have write permission for the file,
but you do not need to open it for write access to truncate it (under GNU).
O_APPEND Open the file for appending. All write operations then write the data at the
end of the file. This is the only way to ensure that the data you write will always
go to the end of the file, even if there are other write operations happening at
the same time.
The open function normally returns a non-negative integer file descriptor connected to
the specified file. If there is an error, open will return -1 instead. In that case, you can
check the errno variable to see which error occurred. In addition to the usual file name
errors, open can set errno to the following values. (It can also specify a few other errors of
interest only to advanced C programmers. See section “Opening and Closing Files” in The
GNU C Library Reference Manual, for a full list of error values. See Section 16.5.1 [Usual
file name errors], page 137, for a list of the usual file name errors.)
EACCES The file exists but does not have read or write access (as requested),
or the file does not exist but cannot be created because the directory does not
have write access.
EEXIST Both O_CREAT and O_EXCL are set, and the named file already exists. To open
it would clobber it, so it will not be opened.
EISDIR Write access to the file was requested, but the file is actually a directory.
EMFILE Your program has too many files open.
ENOENT The file named does not exist, and O_CREAT was not specified, so the file will
not be created.
140 Chapter 16: Input and output
ENOSPC The file cannot be created, because the disk is out of space.
EROFS The file is on a read-only file system, but either one of O_WRONLY, O_RDWR, or
O_TRUNC was specified, or O_CREAT was set and the file does not exist.
See Section 16.5.3 [Closing files at a low level], page 140, for a code example using both
the low-level file functions open and close.
Remember, close a stream by using fclose instead. This allows the necessary system
bookkeeping to take place before the file is closed.
Here is a code example using both the low-level file functions open and close.
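#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main()
{
  char my_filename[] = "snazzyjazz17.txt";
  int my_file_descriptor, close_err;
  /*
  Open my_filename for writing. Create it if it does not exist.
  Do not clobber it if it does.
  */
  my_file_descriptor = open (my_filename, O_WRONLY | O_CREAT | O_EXCL, 0644);
  if (my_file_descriptor == -1)
    printf ("Open failed.\n");
  close_err = close (my_file_descriptor);
  if (close_err == -1)
    printf ("Close failed.\n");
  return 0;
}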
Running the above code example for the first time should produce no errors, and should
create an empty text file called ‘snazzyjazz17.txt’. Running it a second time should
display the following errors on your monitor, since the file ‘snazzyjazz17.txt’ already
exists, and should not be clobbered according to the flags passed to open.
Open failed.
Close failed.
EBADF The file descriptor passed to read is not valid, or is not open for reading.
EIO There was a hardware error. (This error code also applies to more abstruse
conditions detailed in the GNU C Library manual.)
See Section 16.5.5 [Writing files at a low level], page 142, for a code example that uses
the read function.
int main()
{
char my_write_str[] = "1234567890";
char my_read_str[100];
char my_filename[] = "snazzyjazz.txt";
int my_file_descriptor, close_err;
Finding file positions at a low level 143
close (my_file_descriptor);
return 0;
}
The lseek function is called by many high-level file position functions, including fseek,
rewind, and ftell.
and the old name are directories, and write permission is refused for at least
one of them.
EBUSY One of the directories used by the old name or the new name is being used by
the system and cannot be changed.
ENOTEMPTY
The directory was not empty, so cannot be deleted. This code is synonymous
with EEXIST, but GNU always returns ENOTEMPTY.
EINVAL The old name is a directory that contains the new name.
EISDIR The new name is a directory, but the old name is not.
EMLINK The parent directory of the new name would contain too many entries if the
new name were created.
ENOENT The old name does not exist.
ENOSPC The directory that would contain the new name has no room for another entry,
and cannot be expanded.
EROFS The rename operation would involve writing on a read-only file system.
EXDEV The new name and the old name are on different file systems.
16.6 Questions
1. What are the following?
1. File name
2. File descriptor
3. Stream
2. What is a pseudo-device name?
3. Where does ‘stdin’ usually get its input?
4. Where does ‘stdout’ usually send its output?
5. Write a program that simply prints out the following string to the screen: ‘6.23e+00’.
6. Investigate what happens when you type the wrong conversion specifier in a program.
e.g. try printing an integer with ‘%f’ or a floating point number with ‘%c’. This is
bound to go wrong – but how will it go wrong?
7. What is wrong with the following statements?
1. printf (x);
2. printf ("%d");
3. printf ();
4. printf ("Number = %d");
Hint: if you don’t know, try them in a program!
8. What is a whitespace character?
9. Write a program that accepts two integers from the user, multiplies them together, and
prints the answer on your printer. Try to make the input as safe as possible.
10. Write a program that simply echoes all the input to the output.
11. Write a program that strips all space characters out of the input and replaces each
string of them with a single newline character.
12. The scanf function always takes pointer arguments. True or false?
13. What is the basic difference between high-level and low-level file routines?
14. Write a statement that opens a high level file for reading.
15. Write a statement that opens a low level file for writing.
16. Write a program that checks for illegal characters in text files. The only valid characters
are ASCII codes 10, 13, and 32..126.
17. What statement performs formatted writing to text files?
18. Poke around in the header files on your system so you can see what is defined where.
argc and argv 147
int main()
{
return 0;
}
From now on, our examples may look a bit more like this:
#include <stdio.h>
int main (int argc, char *argv[])
{
return 0;
}
As you can see, main now has arguments. The name of the variable argc stands for
“argument count”; argc contains the number of arguments passed to the program. The
name of the variable argv stands for “argument vector”. A vector is a one-dimensional
array, and argv is a one-dimensional array of strings. Each string is one of the arguments
that was passed to the program.
For example, the command line
148 Chapter 17: Putting a program together
if (argc > 1)
{
for (count = 1; count < argc; count++)
{
printf("argv[%d] = %s\n", count, argv[count]);
}
}
else
{
printf("The command had no other arguments.\n");
}
return 0;
}
If you name your executable ‘fubar’, and call it with the command ‘./fubar a b c’, it
will print out the following text:
This program was called with "./fubar".
argv[1] = a
argv[2] = b
argv[3] = c
provides for these. For the modest price of setting up your command line arguments in a
structured way, and with surprisingly few lines of code, you can obtain all the perks of a
“real” GNU program, such as “automagically”-generated output to the ‘--help’, ‘--usage’,
and ‘--version’ options, as defined by the GNU coding standards. Using argp results in
a more consistent look-and-feel for programs that use it, and makes it less likely that the
built-in documentation for a program will be wrong or out of date.
POSIX, the Portable Operating System Interface standard, recommends the following
conventions for command-line arguments. The argp interface makes implementing them
easy.
• Command-line arguments are options if they begin with a hyphen (‘-’).
• Multiple options may follow a hyphen in a cluster if they do not take arguments. Thus,
‘-abc’ and ‘-a -b -c’ are the same.
• Option names are single alphanumeric characters.
• Options may require an argument. For example, the ‘-o’ option of the ld command
requires an output file name.
• The whitespace separating an option and its argument is optional. Thus, ‘-o foo’ and
‘-ofoo’ are the same.
• Options usually precede non-option arguments. (In fact, argp is more flexible than this;
if you want to suppress this flexibility, define the _POSIX_OPTION_ORDER environment
variable.)
• The argument ‘--’ terminates all options; all following command-line arguments are
considered non-option arguments, even if they begin with a hyphen.
• A single hyphen as an argument is considered a non-option argument; by convention,
it is used to specify input from standard input or output to standard output.
• Options may appear in any order, even multiple times. The meaning of this is left to
the application.
In addition, GNU adds long options, like the ‘--help’, ‘--usage’, and ‘--version’
options mentioned above. A long option starts with ‘--’, which is then followed by a string
of alphanumeric characters and hyphens. Option names are usually one to three words
long, with hyphens to separate words. Users can abbreviate the option names as long as
the abbreviations are unique. A long option (such as ‘--verbose’) often has a short-option
synonym (such as ‘-v’).
Long options can accept optional (that is, non-necessary) arguments. You can specify
an argument for a long option as follows:
‘--’option-name ‘=’value
You may not type whitespace between the option name and the equals sign, or between the
equals sign and the option value.
complicated and flexible facility of the GNU C Library, consult section “Parsing Program
Options with Argp” in The GNU C Library Reference Manual. Nevertheless, what you
learn in this chapter may be all you need to develop a program that is compliant with GNU
coding standards, with respect to command-line options.
The main interface to argp is the argp_parse function. Usually, the only argument-
parsing code you will need in main is a call to this function. The first parameter it takes
is of type const struct argp *argp, and specifies an ARGP structure (see below). (A value
of zero is the same as a structure containing all zeros.) The second parameter is simply
argc, the third simply argv. The fourth parameter is a set of flags that modify the parsing
behaviour; setting this to zero usually doesn’t hurt unless you’re doing something fancy, and
the same goes for the fifth parameter. The sixth parameter can be useful; in the example
below, we use it to pass information from main to our function parse_opt, which does
most of the work of initializing internal variables (fields in the arguments structure) based
on command-line options and arguments.
The argp_parse function returns a value of type error_t: usually either 0 for success,
ENOMEM if a memory allocation error occurred, or EINVAL if an unknown option or argument
was met with.
For this example, we are using only the first four fields in ARGP, which are usually all
that is needed. The rest of the fields will default to zero. The four fields are, in order:
1. OPTIONS: A pointer to a vector the elements of which are of type struct argp_option,
a structure described below. The vector elements specify which options this parser
understands. If you assign your option structure by initializing the array as we do in
this section’s main example, unspecified fields will default to zero, and need not be
specified. The pointer itself may be zero if there are no options at all. Otherwise, the
vector should be terminated by an entry with a zero in all fields (as we do by specifying
the last item in the options vector to be {0} in the main example below).
The five main argp_option structure fields are as follows. (We will ignore the sixth
one, which is relatively unimportant and will simply default to zero if you do not specify
it.)
1. NAME: The name of this option’s long option (may be zero). To specify multiple
names for an option, follow it with additional entries, with the OPTION_ALIAS flag
set.
2. KEY: The integer key to pass to the PARSER function when parsing the current
option; this is the same as the name of the current option’s short option, if it is a
printable ASCII character.
3. ARG: The name of this option’s argument, if any.
4. FLAGS: Flags describing this option. You can specify multiple flags with bitwise
OR (for example, OPTION_ARG_OPTIONAL | OPTION_ALIAS).
Some of the available flags are:
• OPTION_ARG_OPTIONAL: The argument to the current option is optional.
• OPTION_ALIAS: The current option is an alias for the previous option.
• OPTION_HIDDEN: Don’t show the current option in --help output.
5. DOC: A documentation string for the current option; will be shown in --help
output.
argp example 151
2. PARSER: A pointer to a function to be called by argp for each option parsed. It should
return one of the following values:
• 0: Success.
• ARGP_ERR_UNKNOWN: The given key was not recognized.
• An errno value indicating some other error. (See Section 16.5.1 [Usual file name
errors], page 137.)
3. ARGS_DOC: A string describing the non-option command-line arguments that the program
accepts (‘ARG1 ARG2’ in the example below).
4. DOC: A string containing documentation for the program as a whole.
There are also some utility functions associated with argp, such as argp_usage, which
prints out the standard usage message. We use this function in the parse_opt function in
the following example. See section “Functions For Use in Argp Parsers” in The GNU C
Library Reference Manual, for more of these utility functions.
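#include <stdio.h>
#include <argp.h>
/*
This structure is used by main to communicate with parse_opt.
*/
struct arguments
{
char *args[2]; /* ARG1 and ARG2 */
int verbose; /* The -v flag */
char *outfile; /* Argument for -o */
char *string1, *string2; /* Arguments for -a and -b */
};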
/*
OPTIONS. Field 1 in ARGP.
Order of fields: {NAME, KEY, ARG, FLAGS, DOC}.
*/
static struct argp_option options[] =
{
{"verbose", 'v', 0, 0, "Produce verbose output"},
{"alpha", 'a', "STRING1", 0,
"Do something with STRING1 related to the letter A"},
{"bravo", 'b', "STRING2", 0,
"Do something with STRING2 related to the letter B"},
"Output to OUTFILE instead of to standard output"},
{0}
};
/*
PARSER. Field 2 in ARGP.
Order of parameters: KEY, ARG, STATE.
*/
static error_t
parse_opt (int key, char *arg, struct argp_state *state)
{
struct arguments *arguments = state->input;
switch (key)
{
case 'v':
arguments->verbose = 1;
break;
case 'a':
arguments->string1 = arg;
break;
case 'b':
arguments->string2 = arg;
break;
case 'o':
arguments->outfile = arg;
break;
case ARGP_KEY_ARG:
if (state->arg_num >= 2)
{
argp_usage(state);
}
arguments->args[state->arg_num] = arg;
break;
case ARGP_KEY_END:
if (state->arg_num < 2)
{
argp_usage (state);
}
break;
default:
return ARGP_ERR_UNKNOWN;
}
return 0;
}
/*
ARGS_DOC. Field 3 in ARGP.
A description of the non-option command-line arguments
that we accept.
*/
static char args_doc[] = "ARG1 ARG2";
/*
DOC. Field 4 in ARGP.
Program documentation.
*/
static char doc[] =
"argex -- A program to demonstrate how to code command-line options "
"and arguments.\vFrom the GNU C Tutorial.";
/*
The ARGP structure itself.
*/
static struct argp argp = {options, parse_opt, args_doc, doc};
/*
The main function.
Notice how now the only function call needed to process
all command-line options and arguments nicely
is argp_parse.
*/
int main (int argc, char **argv)
{
struct arguments arguments;
FILE *outstream;
char waters[] =
"a place to stay\n"
"enough to eat\n"
"somewhere old heroes shuffle safely down the street\n"
"where you can speak out loud\n"
"about your doubts and fears\n"
"and what's more no-one ever disappears\n"
"you never hear their standard issue kicking in your door\n"
"you can relax on both sides of the tracks\n"
"and maniacs don't blow holes in bandsmen by remote control\n"
"and everyone has recourse to the law\n"
"and no-one kills the children anymore\n"
"and no-one kills the children anymore\n"
"--\"the gunners dream\", Roger Waters, 1983\n";
return 0;
}
Compile the code, then experiment! For example, here is the program output if you
simply type argex:
Usage: argex [-v?V] [-a STRING1] [-b STRING2] [-o OUTFILE] [--alpha=STRING1]
[--bravo=STRING2] [--output=OUTFILE] [--verbose] [--help] [--usage]
[--version] ARG1 ARG2
ARG1 = Foo
ARG2 = Bar
And finally, here is the output from argex --verbose -a 123 --bravo=456 Foo Bar:
alpha = 123
bravo = 456
ARG1 = Foo
ARG2 = Bar
a place to stay
enough to eat
somewhere old heroes shuffle safely down the street
where you can speak out loud
about your doubts and fears
and what’s more no-one ever disappears
you never hear their standard issue kicking in your door
you can relax on both sides of the tracks
and maniacs don’t blow holes in bandsmen by remote control
and everyone has recourse to the law
and no-one kills the children anymore
and no-one kills the children anymore
--"the gunners dream", Roger Waters, 1983
You can of course also send the output to a text file with the ‘-o’ or ‘--output’ option.
return 0;
}
Notice that envp is an array of strings, just as argv is. It consists of a list of the
environment variables of your shell, in the following format:
NAME=value
Just as you can manually process command-line options from argv, so can you man-
ually process environment variables from envp. However, the simplest way to access the
value of an environment variable is with the getenv function, defined in the system header
‘stdlib.h’. It takes a single argument, a string containing the name of the variable whose
value you wish to discover. It returns that value, or a null pointer if the variable is not
defined.
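#include <stdio.h>
#include <stdlib.h>
int main()
{
  char *home, *host;
  home = getenv("HOME");
  host = getenv("HOSTNAME");
  if (home == NULL) home = "(unset)";
  if (host == NULL) host = "(unset)";
  printf ("Your home directory is %s on %s.\n", home, host);
  return 0;
}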
When you run this code, it will print out a line like the following one.
Your home directory is /home/rwhe on linnaeus.
Writing a makefile 157
Note: Do not modify strings returned from getenv; they are pointers to data that
belongs to the system. If you want to process a value returned from getenv, copy it to
another string first with strcpy. (See Chapter 15 [Strings], page 97.) If you want to change
an environment variable from within your program (not usually advisable), use the putenv,
setenv, and unsetenv functions. See section “Environment Access” in The GNU C Library
Reference Manual, for more information on these functions.
In this section, we will discuss a simple makefile that describes how to compile and link
a text editor which consists of eight C source files and three header files. The makefile can
also tell make how to run miscellaneous commands when explicitly asked (for example, to
remove certain files as a clean-up operation).
Although the examples in this section show C programs, you can use make with any
programming language whose compiler can be run with a shell command. Indeed, make
is not limited to programs. You can use it to describe any task where some files must be
updated automatically from others whenever the others change.
Your makefile describes the relationships among files in your program and provides
commands for updating each file. In a program, typically, the executable file is updated
from object files, which are in turn made by compiling source files.
Once a suitable makefile exists, each time you change some source files, this simple shell
command:
make
suffices to perform all necessary recompilations. The make program uses the makefile data-
base and the last-modification times of the files to decide which of the files need to be
updated. For each of those files, it issues the commands recorded in the database.
You can provide command line arguments to make to control which files should be
recompiled, or how.
When make recompiles the editor, each changed C source file must be recompiled. If a
header file has changed, each C source file that includes the header file must be recompiled to
be safe. Each compilation produces an object file corresponding to the source file. Finally,
if any source file has been recompiled, all the object files, whether newly made or saved
from previous compilations, must be linked together to produce the new executable editor.
A rule, then, explains how and when to remake certain files which are the targets of the
particular rule. make carries out the commands on the prerequisites to create or update the
target. A rule can also explain how and when to carry out an action.
A makefile may contain other text besides rules, but a simple makefile need only contain
rules. Rules may look somewhat more complicated than shown in this template, but all fit
the pattern more or less.
make will recompile the object files ‘kbd.o’, ‘command.o’ and ‘files.o’, and then link the
file ‘edit’.
objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o
edit : $(objects)
	cc -o edit $(objects)
main.o : main.c defs.h
	cc -c main.c
kbd.o : kbd.c defs.h command.h
	cc -c kbd.c
command.o : command.c defs.h command.h
	cc -c command.c
display.o : display.c defs.h buffer.h
	cc -c display.c
insert.o : insert.c defs.h buffer.h
	cc -c insert.c
search.o : search.c defs.h buffer.h
	cc -c search.c
files.o : files.c defs.h buffer.h command.h
	cc -c files.c
utils.o : utils.c defs.h
	cc -c utils.c
clean :
	rm edit $(objects)
command gcc -c main.c -o main.o to compile ‘main.c’ into ‘main.o’. We can therefore
omit the commands from the rules for the object files.
When a ‘.c’ file is used automatically in this way, it is also automatically added to the
list of prerequisites. We can therefore omit the ‘.c’ files from the prerequisites, provided
we omit the commands.
Here is the entire example, with both of these changes, and the variable objects as
suggested above:
objects = main.o kbd.o command.o display.o \
insert.o search.o files.o utils.o
edit : $(objects)
	cc -o edit $(objects)
main.o : defs.h
kbd.o : defs.h command.h
command.o : defs.h command.h
display.o : defs.h buffer.h
insert.o : defs.h buffer.h
search.o : defs.h buffer.h
files.o : defs.h buffer.h command.h
utils.o : defs.h
.PHONY : clean
clean :
	-rm edit $(objects)
This is how we would write the makefile in actual practice. (See Section 17.5.7 [Rules for
cleaning the directory], page 163, for the complications associated with clean.)
Because implicit rules are so convenient, they are important. You will see them used
frequently.
edit : $(objects)
	cc -o edit $(objects)
$(objects) : defs.h
kbd.o command.o files.o : command.h
display.o insert.o search.o files.o : buffer.h
Here ‘defs.h’ is given as a prerequisite of all the object files, and ‘command.h’ and
‘buffer.h’ are prerequisites of the specific object files listed for them.
Whether this is better is a matter of taste: it is more compact, but some people dislike
it because they find it clearer to put all the information about each target in one place.
Building a library 163
2. Now we will create a library.
• To create a static library called ‘liblprprint.a’ containing this function, just
type the following two command lines in your GNU shell:
gcc -c lpr_print.c
ar rs liblprprint.a lpr_print.o
The ‘-c’ option to gcc produces only a ‘.o’ object code file, without linking it,
while the ar command (with its ‘rs’ options) creates an archive file, which can
contain a bundle of other files that can be extracted again later (for example,
when the linker pulls library code into a program). In this case, we are only
archiving one object code file, but in some cases, you might want to archive
multiple ones. (See the man page for ar for more information.)
• To create a shared library called ‘liblprprint.so’ instead, enter the following
sequence of commands:
gcc -c -fpic lpr_print.c
gcc -shared -o liblprprint.so lpr_print.o
(For the record, ‘pic’ stands for “position-independent code”, an object-code for-
mat required for shared libraries. You might need to use the option ‘-fPIC’ instead
of ‘-fpic’ if your library is very large.)
3. Now create a header file that will allow users access to the functions in your library.
You should provide one function prototype for each function in your library. Here is a
header file for the library we have created, called ‘liblprprint.h’.
/*
liblprprint.h:
routines in liblprprint.a
and liblprprint.so
*/
your ‘.bashrc’ or ‘.bash_profile’ file. If you don’t execute this command before you
attempt to run a program using your shared library, you will probably receive an error.
6. Now you can write programs that use your library. Consider the following short
program, called ‘printer.c’:
#include <liblprprint.h>
• To compile this program using your static library, type something like the following
command line:
gcc --static -I../include -L../lib -o printer printer.c -llprprint
The ‘--static’ option forces your static library to be linked; the default is your
shared version. The ‘-llprprint’ option makes GCC link in the ‘liblprprint’
library, just as you would need to type ‘-lm’ to link in the ‘libm’ math library.
The ‘-I../include’ and ‘-L../lib’ options specify that the compiler should look
in the ‘../include’ directory for include files and in the ‘../lib’ directory for
library files. This assumes that you have created the ‘include’ and ‘lib’ directories
in your home directory as outlined above, and that you are compiling your code
in a subdirectory of your home directory. If you are working two directories down,
you would specify ‘-I../../include’, and so on.
The above command line assumes you are using only one ‘.c’ source code file; if
you are using more than one, simply include them on the command line as well.
(See Section 17.4 [Compiling multiple files], page 157.)
Note: Using the ‘--static’ option will force the compiler to link all libraries you
are using statically. If you want to use the static version of your library, but some
shared versions of other libraries, you can omit the ‘--static’ option from the
command line and specify the static version of your library explicitly, as follows:
gcc -I../include -L../lib -o printer printer.c ../lib/liblprprint.a
• To compile this program using your shared library, type something like the follow-
ing command line.
gcc -I../include -L../lib -o printer printer.c -llprprint
7. The executable produced is called ‘printer’. Try it!
17.7 Questions
1. What is the name of the preferred method for handling command-line options?
2. What does the ‘-c’ option of the gcc command do?
3. What information does the argc variable contain?
4. What information does the argv variable contain?
5. What information does the envp variable contain?
18 Advanced operators
Concise expressions
In this chapter, we will examine some advanced mathematical and logical operators in
C.
b = 0;
c = 0;
Note: Don’t confuse this technique with a logical test for equality. In the above example,
both b and c are set to 0. Consider the following, superficially similar, test for equality,
however:
b = (c == 0);
In this case, b will be assigned a zero value (FALSE) only if c does not equal 0. If c does
equal 0, then b will be assigned the value 1 (TRUE), because a comparison in C always
evaluates to either 0 or 1. (See Section 7.8 [Comparisons and logic], page 34, for more
information.)
Any number of these assignments can be strung together:
a = (b = (c = (d = (e = 5))));
or simply:
a = b = c = d = e = 5;
This elegant syntax compresses five lines of code into a single line.
There are other uses for treating assignment expressions as values. Thanks to C’s flexible
syntax, they can be used anywhere a value can be used. Consider how an assignment
expression might be used as a parameter to a function. The following statement gets a
character from standard input and passes it to a function called process_character.
process_character (input_char = getchar());
This is a perfectly valid statement in C, because the hidden assignment expression passes
the value it assigns on to process_character. The assignment is carried out first and then
the process_character function is called, so this is merely a more compact way of writing
the following statements.
input_char = getchar();
process_character (input_char);
The same remarks apply to the specialized assignment operators +=, *=, /=, and
so on.
The following example makes use of a hidden assignment in a while loop to print out
all values from 0.2 to 20.0 in steps of 0.2.
#include <stdio.h>
printf ("\n");
return 0;
}
more complicated than assignments because they exist in two forms, postfix (for example,
my_var++) and prefix (for example, ++my_var).
Postfix and prefix forms have subtly different meanings. Take the following example:
int my_int = 3;
printf ("%d\n", my_int++);
The increment operator is hidden in the parameter list of the printf call. The variable
my_int has a value before the ++ operator acts on it (3) and afterwards (4).
Which value is passed to printf? Is my_int incremented before or after the printf
call? This is where the two forms of the operator (postfix and prefix) come into play.
In the example above, then, the value passed to printf is 3, the value my_int held before
the increment; by the time the printf function returns, the value of my_int is 4. The
alternative is to write
int my_int = 3;
printf ("%d\n", ++my_int);
The same remarks apply to the decrement operator as to the increment operator.
#define ARRAY_SIZE 20
return 0;
}
This is a convenient way to initialize an array to zero. Notice that the body of the loop is
completely empty!
Strings can benefit from hidden operators as well. If the standard library function
strlen, which finds the length of a string, were not available, it would be easy to write it
with hidden operators:
#include <stdio.h>
return (count);
}
return 0;
}
The my_strlen function increments count while the end of string marker ‘\0’ is not found.
Again, notice that the body of the loop in this function is completely empty.
#include <stdio.h>
int main ()
{
  int a, b, c, d;
  a = (b = 2, c = 3, d = 4);
  printf ("a=%d\nb=%d\nc=%d\nd=%d\n",
          a, b, c, d);
  return 0;
}
The value of (b = 2, c = 3, d = 4) is 4 because the value of its rightmost sub-expression,
d = 4, is 4. The value of a is thus also 4. When run, this example prints out the following
text:
a=4
b=2
c=3
d=4
The comma operator is very useful in for loops. (See Section 11.4 [The flexibility of
for], page 62, for an example.)
1 >> 1 == 0
2 >> 1 == 1
2 >> 2 == 0
n >> n == 0
One common use of bit-shifting is to scan through the bits of a bit-string one by one in
a loop. This is done with bit masks, as described in the next section.
18.3.3.5 Masks
Bit strings and bitwise operators are often used to make masks. A mask is a bit string that
“fits over” another bit string and produces a desired result, such as singling out particular
bits from the second bit string, when the two bit strings are operated upon. This is par-
ticularly useful for handling flags; programmers often wish to know whether one particular
flag is set in a bit string, but may not care about the others. For example, you might create
a mask that only allows the flag of interest to have a non-zero value, then AND that mask
with the bit string containing the flag.
Consider the following mask, and two bit strings from which we want to extract the final
bit:
mask = 00000001
value1 = 10011011
value2 = 10011100
mask == 00000111
See Section 16.5.2 [Opening files at a low level], page 138, for a code example that
actually uses bitwise OR to join together several flags.
It should be emphasized that the flag and mask examples are written in pseudo-code, that
is, a means of expressing information that resembles source code, but cannot be compiled.
The C standards in use when this was written have no notation for binary numbers; in real
code, masks are usually written in hexadecimal or octal instead.
The following code example shows how bit masks and bit-shifts can be combined. It
accepts a decimal number from the user between 0 and 128, and prints out a binary number
in response.
#include <stdio.h>
#define NUM_OF_BITS 8
args_assigned = 0;
input_int = -1;
while ((args_assigned != 1) ||
(input_int < 0) || (input_int > 128))
{
puts ("Please enter an integer from 0 to 128.");
my_string = (char *) malloc (nbytes + 1);
getline (&my_string, &nbytes, stdin);
args_assigned = sscanf (my_string, "%d", &input_int);
if ((args_assigned != 1) ||
(input_int < 0) || (input_int > 128))
puts ("\nInput invalid!");
}
/*
Convert decimal numbers into binary
Keep shifting my_short by one to the left
and test the highest bit. This does
NOT preserve the value of my_short!
*/
printf ("\n");
return 0;
}
18.4 Questions 18
1. Hidden operators can be used in return statements, for example,
return (++x);
Would there be any point in writing the following?
return (x++);
2. What distinguishes a bit string from an ordinary variable? Can any variable be a bit
string?
3. What is the difference between an inclusive OR operation and an exclusive OR opera-
tion?
4. Find out what the decimal values of the following operations are.
1. 7 & 2
2. 1 & 1
3. 15 & 3
4. 15 & 7
5. 15 & 7 & 3
Try to explain the results. (Hint: sketch out the numbers as bit strings.)
5. Find out what the decimal values of the following operations are.
1. 1 | 2
2. 1 | 2 | 3
6. Find out the decimal values of the following operations.
1. 1 & (~1)
2. 23 & (~23)
3. 2012 & (~2012)
(Hint: write a short program to work them out.)
19.1 enum
The enum type specifier is short for “enumerated data”. The user can define a fixed set of
words that a variable of type enum can take as its value. The words are assigned integer
values by the compiler so that code can compare enum variables. Consider the following
code example:
#include <stdio.h>
return 0;
}
This example defines an enumerated variable type called compass_direction, which can
be assigned one of four enumerated values: north, east, south, or west. It then declares
a variable called my_direction of the enumerated compass_direction type, and assigns
my_direction the value west.
Why go to all this trouble? Because enumerated data types allow the programmer to
forget about any numbers that the computer might need in order to process a list of words,
and simply concentrate on using the words themselves. It’s a higher-level way of doing
things; in fact, at a lower level, the computer assigns each possible value in an enumerated
data type an integer constant, one that you do not need to worry about.
Enumerated variables have a natural partner in the switch statement, as in the following
code example.
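#include <stdio.h>

enum compass_direction
{
  north,
  east,
  south,
  west
};

/* "Dummy" function: a real program would ask the user. */
enum compass_direction get_direction ()
{
  return south;
}

int main ()
{
  enum compass_direction my_direction;

  puts ("Which way are you going?");
  my_direction = get_direction ();

  switch (my_direction)
    {
    case north:
      puts ("North? Say hello to the polar bears!");
      break;
    case south:
      puts ("South? Say hello to Tux the penguin!");
      break;
    case east:
      puts ("If you go far enough east, you'll be west!");
      break;
    case west:
      puts ("If you go far enough west, you'll be east!");
      break;
    }
  return 0;
}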
In this example, the compass_direction type has been made global, so that the get_
direction function can return that type. The main function prompts the user, ‘Which way
are you going?’, then calls the “dummy” function get_direction. In a “real” program,
such a function would accept input from the user and return an enumerated value to main,
but in this case it merely returns the value south. The output from this code example is
therefore as follows:
Which way are you going?
South? Say hello to Tux the penguin!
As mentioned above, enumerated values are converted into integer values internally by
the compiler. It is practically never necessary to know what integer values the compiler
assigns to the enumerated words in the list, but it may be useful to know the order of the
enumerated items with respect to one another. The following code example demonstrates
this.
#include <stdio.h>
planet1 = Mars;
planet2 = Earth;
return 0;
}
The output from this example reads as follows:
Mars is farther from the Sun than Earth is.
19.2 void
The void data type was introduced to make C syntactically consistent. The main reason
for void is to declare functions that have no return value. The word “void” is therefore
used in the sense of “empty” rather than that of “invalid”.
C functions are considered by the compiler to return type int unless otherwise specified.
Although the data returned by a function can legally be ignored by the function calling it,
the void data type was introduced by the ANSI standard so that C compilers can issue
warnings when an integer value is not returned by a function that is supposed to return
one. If you want to write a function that does not return a value, simply declare it void. A
function declared void has no return value and simply returns with the command return;.
The word void can also appear in variable declarations:
void my_variable;
void *my_pointer;
A variable that is itself declared void (such as my_variable above) is worse than useless:
void is an incomplete type with no values, so GCC rejects the declaration with an error.
Void pointers (type void *) are a different case, however. A void pointer is a generic
pointer; any pointer can be cast to a void pointer and back without any loss of information.
Any type of pointer can be assigned to (or compared with) a void pointer, without casting
the pointer explicitly.
Finally, a function call can be cast to void in order to explicitly discard a return value.
For example, printf returns a value, but it is seldom used. Nevertheless, the two lines of
code that follow are equivalent:
printf ("Hullo!\n");
(void) printf ("Hullo!\n");
There is no good reason to prefer the second line to the first, however, so using the more
concise form is preferred.
19.3 volatile
The volatile type qualifier was introduced by the ANSI Standard to permit the use of
memory-mapped variables, that is, variables whose value changes autonomously based on
input from hardware. One might declare a volatile variable volatile float temperature;
whose value fluctuated according to readings from a digital thermometer connected to the
computer.
There is another use for the volatile qualifier that has to do with multiprocessing
operating systems. Independent processes that share common memory might each change
the value of a variable independently. The volatile keyword serves as a warning to the
compiler that it should not optimize the code containing the variable (that is, compile it so
that it will run in the most efficient way possible) by storing the value of the variable and
referring to it repeatedly, but should reread the value of the variable every time. (Volatile
variables are also flagged by the compiler as not to be stored in read-only memory.)
19.4 Constants
Constants in C usually refer to two things: either a type of variable whose value cannot
change declared with the const qualifier (in this case, “variable” is something of a mis-
nomer), or a string or numeric value incorporated directly into C code, such as ‘1000’. We
will examine both kinds of constant in the next two sections.
19.4.1 const
Sometimes a variable must be assigned a value once and once only; for example, it might be
in read-only memory. The reserved word const is, like static and volatile, a data type
qualifier that can be applied to many different data types. It declares a variable to be a
constant, whose value cannot be reassigned. A const must be assigned a value when it is
declared.
const double avogadro = 6.02e23;
const int moon_landing = 1969;
You can also declare constant arrays:
const int my_array[] =
{0, 1, 2, 3, 4, 5, 6, 7, 8};
Any attempt to assign a new value to a const variable will be caught at compile time, with
a message such as the following (recent versions of GCC report an error rather than a
warning):
const.c: In function ‘main’:
const.c:11: warning: assignment of read-only variable ‘avogadro’
Similarly, you can declare a value to be a float by appending the letter ‘F’ to it; the
value must be written with a decimal point or an exponent for the suffix to be accepted.
Numeric constants containing a decimal point but no suffix are automatically considered
floating-point values of type double. The following constants are both floating-point
numbers:
#define MY_FLOAT1 23.0F
#define MY_FLOAT2 23.5001
You can declare a hexadecimal (base-16) number by prefixing it with ‘0x’; you can declare
an octal (base-8) number by prefixing it with ‘0’. For example:
int my_hex_integer = 0xFF; /* hex FF */
int my_octal_integer = 077; /* octal 77 */
You can use this sort of notation with strings and character constants too. ASCII
character values range from 0 to 127, and an 8-bit character code can range from 0 to
255. You can write any character in this range by prefixing a
hexadecimal value with ‘\x’ or an octal value with ‘\’. Consider the following code example,
which demonstrates how to print the letter ‘A’, using either a hexadecimal character code
(‘\x41’) or an octal one (‘\101’).
#include <stdio.h>
return 0;
}
Of course, you can assign a variable declared with the const qualifier (the first kind
of “constant” we examined) a constant expression declared with one of the typographical
expressions above. For example:
const int my_hex_integer = 0xFF; /* hex FF */
const int my_octal_integer = 077; /* octal 77 */
19.6 typedef
You can define your own data types in C with the typedef command, which may be written
inside functions or in global scope. This statement is used as follows:
typedef existing_type new_type ;
You can then use the new type to declare variables, as in the following code example,
which declares a new type called my_type and declares three variables to be of that type.
#include <stdio.h>
var1 = 10;
var2 = 20;
var3 = 30;
return 0;
}
The new type called my_type behaves just like an integer. Why, then, would we use it
instead of int?
Actually, you will seldom wish to rename an existing data type. The most important
use for typedef is in renaming structures and unions, whose names can become long and
tedious to declare otherwise. We’ll investigate structures and unions in the next chapter.
(See Chapter 20 [Data structures], page 183.)
19.7 Questions 19
1. Enumerated names are given integer values by the compiler so that it can do multipli-
cation and division with them. True or false?
2. Does void do anything which C cannot already do without this type?
3. What type qualifier might a variable accessed directly by a timer be given?
4. Write a statement which declares a new type "real" to be like the usual type "double".
5. Variables declared with the qualifier const can be of any type. True or false?
20 Data structures
Grouping data. Tidying up programs.
It would be hard for a program to manipulate data if it were scattered around with no
particular structure. C therefore has several facilities to group data together in convenient
packages, or data structures. One type of data structure in C is the struct (or structure)
data type, which is a group of variables clustered together with a common name. A related
data type is the union, which can contain any type of variable, but only one at a time.
Finally, structures and unions can be linked together into complex data structures such as
lists and trees. This chapter explores all of these kinds of data structure.
It is important to distinguish the terms structure and data structure. “Data structure”
is a generic term that refers to any pattern of data in a computer program. An array is a
data structure, as is a string. A structure is a particular data type in C, the struct; all
struct variables (structures) are data structures, but not all data structures are structures.
20.1 struct
A structure is a group of one or more variables under a single name. Unlike arrays, structures
can contain a combination of different types of data; they can even contain arrays. A
structure can be arbitrarily complex.
Every type of structure that is defined is given a name, and the variables it contains
(called members) are also given names. Finally, every variable declared to be of a particular
structure type has its own name as well, just as any other variable does.
personal_data person001;
personal_data person002;
personal_data person003;
Note that this use of the typedef command parallels the usage we have already seen:
typedef existing_type new_type
In the example above of using typedef to declare a new type of structure, the
metasyntactic variable new_type corresponds to the identifier personal_data, and the
metasyntactic variable existing_type corresponds to the following code:
struct
{
char name[100];
char address[200];
int year_of_birth;
int month_of_birth;
int day_of_birth;
}
Structure type and variable declarations can be either local or global, depending on their
placement in the code, just as any other declaration can be.
You can get and set the values of the members of a structure with the ‘.’ dot character.
This is called the member operator. The general form of a member reference is:
structure_name.member_name
In the following example, the year 1852 is assigned to the year_of_birth member of the
structure variable person1, of type struct personal_data. Similarly, month 5 is assigned
to the month_of_birth member, and day 4 is assigned to the day_of_birth member.
struct personal_data person1;
person1.year_of_birth = 1852;
person1.month_of_birth = 5;
person1.day_of_birth = 4;
Besides the dot operator, C also provides a special -> member operator for use in con-
junction with pointers, because pointers and structures are used together so often. (See
Section 20.1.5 [Pointers to structures], page 187.)
Structures are easy to use. For example, you can assign one structure to another structure
of the same type (unlike strings, for example, which must use the string library routine
strcpy). Here is an example of assigning one structure to another:
struct personal_data person1, person2;
person2 = person1;
The members of the person2 variable now contain all the data of the members of the
person1 variable.
Structures are passed as parameters in the usual way:
my_structure_fn (person2);
You would declare such a function thus:
void my_structure_fn (struct personal_data some_struct)
{
}
Note that in order to declare this function, the struct personal_data type must be de-
clared globally.
Finally, a function that returns a structure variable would be declared thusly:
struct personal_data structure_returning_fn ()
{
struct personal_data random_person;
return random_person;
}
Of course, random_person is a good name for the variable returned by this bare-bones
function, because unless one writes code to initialize it, it can only be filled with
garbage values.
The value of a member of a structure in an array can be assigned to another variable, or the
value of a variable can be assigned to a member. For example, the following code assigns the
number 1965 to the year_of_birth member of the fourth element of my_struct_array:
my_struct_array[3].year_of_birth = 1965;
(Like all other arrays in C, struct arrays start their numbering at zero.)
The following code assigns the value of the year_of_birth member of the fourth element
of my_struct_array to the variable yob:
yob = my_struct_array[3].year_of_birth;
Finally, the following example assigns the values of all the members of the second element
of my_struct_array, namely my_struct_array[1], to the third element, so my_struct_
array[2] takes the overall value of my_struct_array[1].
my_struct_array[2] = my_struct_array[1];
struct second_structure_type
{
double double_member;
struct first_structure_type struct_member;
};
The first structure type is incorporated as a member of the second structure type. You can
initialize a variable of the second type as follows:
struct second_structure_type demo;
demo.double_member = 12345.6789;
demo.struct_member.integer_member = 5;
demo.struct_member.float_member = 1023.17;
The member operator ‘.’ is used to access members of structures that are themselves mem-
bers of a larger structure. No parentheses are needed to force a special order of evaluation;
a member operator expression is simply evaluated from left to right.
In principle, structures can be nested indefinitely. Statements such as the following are
syntactically acceptable, but bad style. (See Chapter 22 [Style], page 203.)
my_structure.member1.member2.member3.member4 = 5;
What happens if a structure contains an instance of its own type, however? For example:
struct regression
{
int int_member;
struct regression self_member;
};
In order to compile a statement of this type, your computer would theoretically need an
infinite amount of memory. In practice, however, you will simply receive an error message
along the following lines:
The compiler is telling you that self_member has been declared before its data type,
regression, has been fully declared. That is only natural, since you’re declaring
self_member in the middle of declaring its own data type!
my_struct_ptr = &person1;
(*my_struct_ptr).day_of_birth = 23;
This code example says, in effect, “Let the member day_of_birth of the structure pointed
to by my_struct_ptr take the value 23.” Notice the use of parentheses to avoid confusion
about the precedence of the ‘*’ and ‘.’ operators.
There is a better way to write the above code, however, using a new operator: ‘->’. This
is an arrow made out of a minus sign and a greater than symbol, and it is used as follows:
my_struct_ptr->day_of_birth = 23;
The ‘->’ enables you to access the members of a structure directly via its pointer. This
statement means the same as the last line of the previous code example, but is consider-
ably clearer. The ‘->’ operator will come in very handy when manipulating complex data
structures. (See Section 20.4 [Complex data structures], page 192.)
#include <stdio.h>
return 0;
}
Any trailing items not initialized by data you specify are set to zero.
Finally, to free up the memory allocated to a block and return it to the common pool of
memory available to your program, use the free function, which takes only one argument,
the pointer to the block you wish to free. It does not return a value.
free (my_string);
It is also possible to allocate the memory for a structure when it is needed and use the
‘->’ operator to access the members of the structure, since we must access the structure
via a pointer. (See the code sample following the next paragraph for an example of how to
do this.) If you are creating complex data structures that require hundreds or thousands of
structure variables (or more), the ability to create and destroy them dynamically can mean
quite a savings in memory.
It’s easy enough to allocate a block of memory when you know you want 1000 bytes for a
string, but how do you know how much memory to allocate for a structure? For this task, C
provides the sizeof operator, which calculates the size of an object or type. For example,
sizeof (int) yields the number of bytes occupied by an integer variable. Similarly, sizeof
(struct personal_data) yields the number of bytes occupied by our personal_data
structure. To allocate memory for one of these structures, then set the year_of_birth
member to 1852, you would write something like the following:
struct personal_data* my_struct_ptr;
20.3 union
A union is like a structure in which all of the members are stored at the same address. Only
one member can be in a union at one time. The union data type was invented to prevent
the computer from breaking its memory up into many inefficiently sized chunks, a condition
that is called memory fragmentation.
The union data type prevents fragmentation by creating a standard size for certain data.
When the computer allocates memory for a program, it usually does so in one large block
of bytes. Every variable allocated when the program runs occupies a segment of that block.
When a variable is freed, it leaves a “hole” in the block allocated for the program. If this
hole is of an unusual size, the computer may have difficulty allocating another variable to
“fill” the hole, thus leading to inefficient memory usage. Since unions have a standard data
size, however, any “hole” left in memory by freeing a union can be filled by another instance
of the same type of union. A union works because the space allocated for it is the space
taken by its largest member; thus, the small-scale memory inefficiency of allocating space
for the worst case leads to memory efficiency on a larger scale.
Just like structures, the members of unions can be accessed with the ‘.’ and ‘->’ oper-
ators. However, unlike structures, the variables my_union1 and my_union2 above can be
treated as either integers or floating-point variables at different times during the program.
For example, if you write my_union1.int_member = 5;, then the program sees my_union1
as being an integer. (This is only a manner of speaking, however: my_union1 by itself
does not have a value; only its members do.) On the other hand, if you then
type my_union1.float_member = 7.7;, the my_union1 variable loses its integer value. It is
crucial to remember that a union variable can only have one type at a time.
Notice that we used all-uppercase letters for the enumerated values. We would have received
a syntax error if we had actually used the C keywords int and float.
Associated union and enumerated variables can now be declared in pairs:
union int_or_float my_union1;
enum which_member my_union_status1;
These variables could even be grouped into a structure for ease of use:
struct multitype
{
union int_or_float number;
enum which_member status;
};
You would then make assignments to the members of this structure in pairs:
my_multi.number.int_member = 5;
my_multi.status = INT;
[Structure diagram of towns linked by roads; the only surviving label is South Haven.]
Once you have a structure diagram that represents your information, you can create a
data structure that translates the structure diagram into the computer’s memory. In this
case, we can create a “town structure” that contains pointers to the towns that lie at the
end of roads in the various compass directions. The town structure might look something
like this:
struct town
{
struct town *north;
struct town *south;
struct town *east;
struct town *west;
char name[50];
};
If the user of this hypothetical application wishes to know what is to the north of a
particular town, the program only has to check that town’s north pointer.
inconvenient to enter new data at run time because you would have to know the name of
the variable in which to store the data when you wrote the program. For another thing,
variables with names are permanent — they cannot be freed and their memory reallocated,
so you might have to allocate an impractically large block of memory for your program at
compile time, even though you might need to store much of the data you entered at run
time temporarily.
Fortunately, complex data structures are built out of dynamically allocated memory,
which does not have these limitations. All your program needs to do is keep track of a
pointer to a dynamically allocated block, and it will always be able to find the block.
A complex data structure is usually built out of the following components:
nodes Dynamically-allocated blocks of data, usually structures.
links Pointers from nodes to their related nodes.
root The node where a data structure starts, also known as the root node. The
address of the root of a data structure must be stored explicitly in a C variable,
or else you will lose track of it.
There are some advantages to the use of dynamic storage for data structures:
• As mentioned above, since memory is allocated as needed, we don’t need to declare
how much we shall use in advance.
• Complex data structures can be made up of lots of “lesser” data structures in a modular
way, making them easier to program.
• Using pointers to connect structures means that they can be re-connected in different
ways as the need arises. (Data structures can be sorted, for example.)
20.6 Questions 20
1. What is the difference between a structure and a union?
2. What is a member?
3. If foo is a structure variable, how would you find out the value of its member bar?
4. If foo is a pointer to a structure variable, how would you find out the value of its
member bar?
5. How are data usually linked to make a complex data structure?
6. Every structure variable in a complex data structure must have its own variable name.
True or false?
7. How are the members of structures accessed in a data structure?
8. Write a small program to make a linked list three nodes long and set all of the nodes’
values to zero. Can you automate this program with a loop? Can you make it
work for any number of nodes?
21 Recursion
The program that swallowed its tail.
This chapter is about functions that call themselves. Consider the program below:
#include <stdio.h>
void black_hole()
{
black_hole();
}
works, in other words, like the stack of dinner plates you keep in your kitchen cabinet. As
you wash plates, you pile them one by one on top of the stack, and when you want a plate,
you take one from the top of the stack. The stack of plates in your cabinet is therefore also
a last in, first out structure, like the computer’s stack.
When one C function calls a second function, the computer leaves itself an address at
the top of the stack of where it should return when it has finished executing the second
function. If the second function calls a third function, the computer will push another
address onto the stack. When the third function has finished executing, the computer pops
the top address off the stack, which tells it where in the second function it should return.
When the second function has finished, the computer again pops the top address off the
stack — which tells it where in the first function it should return. Perhaps the first function
then calls another function, and the whole process starts again.
What happens when black_hole calls itself? The computer makes a note of the address
it must return to and pushes that address onto the top of the stack. It begins executing
black_hole again, and encounters another call to black_hole. The computer pushes
another address onto the top of the stack, and begins executing black_hole again. Since
the program has no chance of popping addresses off the stack, as the process continues, the
stack gets filled up with addresses. Eventually, the stack fills up and the program crashes.
factorial(3) == 1 * 2 * 3 == 6
factorial(4) == 1 * 2 * 3 * 4 == 24
factorial(5) == 1 * 2 * 3 * 4 * 5 == 120
Formally, the factorial function is defined by two equations. (Again, these are in
pseudocode.)
factorial(n) = n * factorial(n-1)
factorial(0) = 1
The first of these statements is recursive, because it defines the value of factorial(n)
in terms of factorial(n-1). The second statement allows the function to “bottom out”.
Here is a short code example that incorporates a factorial function.
#include <stdio.h>
Note: Make sure that the test for whether to bottom out your recursive function does
not depend on a global variable.
Suppose you have a global variable called countdown, which your recursive function
decrements by 1 every time it is called. When countdown equals zero, your recursive
function bottoms out. However, since functions other than the recursive function also have
access to global variables, it is possible that another function might independently change
countdown in such a way that your recursive function would never bottom out — perhaps
by continually incrementing it, or perhaps even by setting it to a negative number.
The following code example makes use of recursion to print the value contained in the
last node in a linked list.
#include <stdio.h>
#include <stdlib.h>
struct list_node
{
int data;
struct list_node *next;
};
/* Initialize list. */
root = (struct list_node *) malloc (sizeof (struct list_node));
root->data = 1;
old = root;
return 0;
}
This example program prints out the following line:
Data in last node is 3.
The last_node function, when passed a pointer to a node (such as the root), follows the
linked list to its end from that point, and returns a pointer to that node. It does so through
recursion. When it is passed a pointer to a node, it checks whether that node’s next link is
a null pointer. If the pointer is null, last_node has found the last node, and bottoms out,
returning a pointer to the current node; otherwise, it calls itself with a pointer to the next
node as a parameter.
21.5 Questions 21
1. What is a recursive function?
2. What is a program stack, and what is it for?
3. State the major disadvantage of recursion.
22 Style
C has no rules about when to start new lines, where to place whitespace, and so on. Users
are free to choose a style which best suits them, but unless a strict style is adopted, sloppy
programs tend to result.
In older compilers, memory restrictions sometimes necessitated bizarre, cryptic styles
in the interest of efficiency. However, contemporary compilers such as GCC have no such
restrictions, and have optimizers that can produce faster code than most programmers
could write themselves by hand, so there are no excuses not to write programs as clearly as
possible.
No simple set of rules will ever provide a complete methodology for writing good pro-
grams. In the end, experience and good judgment are the factors which determine whether
you will write good programs. Nevertheless, a few guidelines to good style can be stated.
Many of the guidelines in this chapter are the distilled wisdom of countless C program-
mers over the decades that C has existed, and some come directly from the GNU Coding
Standards. That document contains more good advice than can be crammed into this short
chapter, so if you plan to write programs for the Free Software Foundation, you are urged
to consult the GNU Coding Standards themselves.
When you split an expression into multiple lines, split it before an operator, not after
one. Here is the right way:
if (foo_this_is_long && bar > win (x, y, z)
&& remaining_condition)
Don’t declare multiple variables in one declaration that spans lines. Start a new decla-
ration on each line instead. For example, instead of this:
int foo,
bar;
write either this:
int foo, bar;
or this:
int foo;
int bar;
• Local variables may be impractical, however, if they mean passing the same dozen
parameters to multiple functions; in such cases, global variables will often streamline
your code.
• Data structures that are important to the whole program should be defined globally.
In “real programs” such as GNU Emacs, there are far more global variables than there
are local variables visible in any one function.
Finally, don’t use local variables or parameters that have the same names as global
identifiers. This can make debugging very difficult.
22.8 Questions 22
1. Where should the name of a program and the opening bracket of a function definition
begin?
2. In what human language should comments be written for the GNU Project? Why?
3. Which is better as the name of a variable: plotArea, PlotArea, or plot_area? Why?
4. Why is it important to initialize a variable near where it is used in a long function?
5. Give an example of a case where using local variables is impractical.
23 Debugging
True artificial intelligence has not yet been achieved. C compilers are not intelligent, but
unconscious: mechanical in the derogatory sense of the word. Therefore, debugging your
programs can be a difficult process. A single typographical error can cause a compiler to
completely misunderstand your code and generate a misleading error message. Sometimes
a single error in your code generates a long string of compiler error messages.
To minimize the time you spend debugging, it is useful to become familiar with the most
common compiler messages and their probable causes.
The first section in this chapter lists some of these common compile-time errors and
what to do about them. The next two sections discuss run-time errors in general, and
mathematical errors in particular. The final section introduces GDB, the GNU Debugger,
and explains some simple steps you can take to debug your programs with it.
Adding a semicolon (‘;’) at the end of the line printf ("Hello, world!\n") will get rid
of this error.
Notice that the error refers to line 6, but the error is actually on the previous line. This
is quite common. Since C compilers are lenient about where you place whitespace, the
compiler treats line 5 and line 6 as a single line that reads as follows:
printf ("Hello, world!\n") return 0;
Of course this code makes no sense, and that is why the compiler complains.
Often a missing curly bracket will cause one of these errors. For example, the following
code:
#include <stdio.h>
return 0;
}
Because there is no closing curly bracket for the if statement, the compiler thinks the
curly bracket that terminates the main function actually terminates the if statement. When
it does not find a curly bracket on line 11 of the program to terminate the main function,
it complains. One way to avoid this problem is to type both members of a matching pair
of brackets before you fill them in.
void print_hello()
{
printf ("Hello!\n");
}
The answer, however, is very simple. C is case-sensitive. The main function calls the
function Print_hello (with a capital ‘P’), but the correct name of the function is
print_hello (with a lower-case ‘p’). The linker could not find a function with the name
Print_hello.
#include <stdio.h>
The compiler never found a close quote (‘"’) for the string ‘Hello!\n’. It read all
the text from the open quote in the line printf("Hello!\n); up to the first quote in the line
printf("Hello again!\n"); as a single string. Notice that GCC helpfully suggests that it
is line 5 that actually contains the unterminated string. GCC is pretty smart as C compilers
go.
void set_value()
{
int my_int = 5;
}
The variable my_int is local to the function set_value, so referring to it from within main
results in the following error:
undec.c: In function ‘main’:
undec.c:10: ‘my_int’ undeclared (first use in this function)
undec.c:10: (Each undeclared identifier is reported only once
undec.c:10: for each function it appears in.)
#include <stdio.h>
return 0;
}
The tweedledee function takes three parameters, but main passes it two, whereas the
tweedledum function takes two parameters, but main passes it three. The result is a pair
of straightforward error messages:
params.c: In function ‘main’:
params.c:14: too few arguments to function ‘tweedledee’
params.c:15: too many arguments to function ‘tweedledum’
This is one reason for the existence of function prototypes. Before the ANSI Standard,
compilers did not complain about this kind of error. If you were working with a library
of functions with which you were not familiar, and you passed one the wrong number of
parameters, the error was sometimes difficult to track. Contemporary C compilers such as
GCC that follow the standard make finding parameter mismatch errors simple.
if (my_int = 1)
{
printf ("Hello!\n");
}
return 0;
}
What will this program do? If you guessed that it will print ‘Hello!’, you are correct.
The assignment operator (=) was used by mistake instead of the equality operator (==).
What is being tested in the above if statement is not whether my_int has a value of 1
(which would be written if (my_int == 1)), but instead the value of the assignment
expression my_int = 1. Since the value of an assignment expression is always the value
that was assigned, and my_int is here being assigned a value of 1, the result is 1, which C
considers to be true. Thus, the program prints out its greeting.
Even the best C programmers make this mistake from time to time, and tracking down
an error like this can be maddening. Using the ‘-Wall’ option of GCC can help at least a
little by giving you a warning like the following:
equals.c: In function ‘main’:
equals.c:7: warning: suggest parentheses around assignment used as truth value
my_array[0] = 0;
my_array[1] = 0;
my_array[2] = 0;
*my_array = 0;
*(my_array + (1 * sizeof(int))) = 0;
*(my_array + (2 * sizeof(int))) = 0;
While it is true that the variable my_array used without its square brackets is a pointer
to the first element of the array, you must not assume that you can simply calculate a
pointer to the third element with code like the following:
my_array + 2 * sizeof(int);
23.6 Questions 23
Spot the errors in the following:
Blah blah blah.
24 Example programs
The aim of this section is to provide a substantial example of C programming, using in-
put from and output to disk, GNU-style long options, and the linked list data structure
(including insertion, deletion, and sorting of nodes).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <argp.h>
struct personal_data
{
char name[NAME_LEN];
char address[ADDR_LEN];
struct personal_data *next;
};
/*
OPTIONS. Field 1 in ARGP.
Order of fields: {NAME, KEY, ARG, FLAGS, DOC}.
*/
static struct argp_option options[] =
{
{"verbose", 'v', 0, 0, "Produce verbose output"},
{0}
};
/*
PARSER. Field 2 in ARGP.
switch (key)
{
case 'v':
arguments->verbose = 1;
break;
case 'i':
arguments->infile = arg;
break;
case 'o':
arguments->outfile = arg;
break;
case ARGP_KEY_ARG:
if (state->arg_num >= 1)
{
argp_usage(state);
}
arguments->args[state->arg_num] = arg;
break;
case ARGP_KEY_END:
if (state->arg_num < 1)
{
argp_usage (state);
}
break;
default:
return ARGP_ERR_UNKNOWN;
}
return 0;
}
/*
ARGS_DOC. Field 3 in ARGP.
A description of the non-option command-line arguments
that we accept.
*/
static char args_doc[] = "ARG";
/*
DOC. Field 4 in ARGP.
Program documentation.
*/
static char doc[] =
"bigex -- Add ARG new names to an address book file.\vThe largest code example in the GNU C Tutorial.";
/*
The ARGP structure itself.
*/
static struct argp argp = {options, parse_opt, args_doc, doc};
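struct personal_data *
new_empty_node()
{
struct personal_data *new_node;
/* Allocate the node and start it off with empty fields. */
new_node = (struct personal_data *) malloc (sizeof (struct personal_data));
new_node->name[0] = '\0';
new_node->address[0] = '\0';
new_node->next = NULL;
return new_node;
}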
struct personal_data *
create_node()
{
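char *name;
char *address;
struct personal_data *current_node;
ssize_t bytes_read; /* getline returns a ssize_t: -1 on failure. */
size_t nbytes; /* getline expects a pointer to a size_t. */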
current_node = new_empty_node();
puts ("Name?");
nbytes = NAME_LEN;
name = (char *) malloc (nbytes + 1);
bytes_read = getline (&name, &nbytes, stdin);
if (bytes_read == -1)
{
puts ("ERROR!");
}
else
{
strncpy (current_node->name, name, NAME_LEN);
free (name);
}
puts ("Address?");
nbytes = ADDR_LEN;
address = (char *) malloc (nbytes + 1);
bytes_read = getline (&address, &nbytes, stdin);
if (bytes_read == -1)
{
puts ("ERROR!");
}
else
{
strncpy (current_node->address, address, ADDR_LEN);
free (address);
}
printf("\n");
return current_node;
}
struct personal_data *
find_end_node (struct personal_data *current_node)
{
if (current_node->next == NULL)
{
return current_node;
}
else
{
return find_end_node (current_node->next);
}
}
int
list_length (struct personal_data *root)
{
struct personal_data *current_node;
int count = 0;
current_node = root;
struct personal_data *
find_node (struct personal_data *root,
int node_wanted)
{
struct personal_data *current_node;
int index = 0;
current_node = root;
{
struct personal_data *previous_node;
struct personal_data *current_node;
previous_node->next = new_node;
new_node->next = temp_ptr;
}
if (a > b)
{
temp = a;
a = b;
b = temp;
}
j = i;
while (strcmp ( (find_node(root, j))->name,
(find_node(root, j-1))->name) < 0)
{
swap_nodes (root, j, j-1);
j--;
}
}
}
if (current_node->next != NULL)
{
print_list (current_node->next, save_stream);
}
}
struct personal_data *
read_node (FILE *instream)
{
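char *name;
char *address;
struct personal_data *current_node;
int read_err = 0;
ssize_t bytes_read; /* getline returns a ssize_t: -1 on failure. */
size_t nbytes; /* getline expects a pointer to a size_t. */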
current_node = new_empty_node();
nbytes = NAME_LEN;
name = (char *) malloc (nbytes + 1);
bytes_read = getline (&name, &nbytes, instream);
if (bytes_read == -1)
{
read_err = 1;
}
else
{
puts (name);
nbytes = ADDR_LEN;
address = (char *) malloc (nbytes + 1);
bytes_read = getline (&address, &nbytes, instream);
if (bytes_read == -1)
{
read_err = 1;
}
else
{
puts (address);
strncpy (current_node->address, address, ADDR_LEN);
free (address);
}
if (read_err)
{
return NULL;
}
else
{
return current_node;
}
}
struct personal_data *
read_file (char *infile)
{
FILE *input_stream = NULL;
struct personal_data *root;
struct personal_data *end_node;
struct personal_data *current_node;
root = new_empty_node();
end_node = root;
/*
The main function.
Notice how now the only function call needed to process
if (arguments.infile)
{
root = read_file (arguments.infile);
end_node = find_end_node (root);
}
else
{
root = new_empty_node();
end_node = root;
}
sort_list (root);
print_list (root->next, save_stream);
return 0;
}
enum d
void d
const d
signed d
volatile d
228 Appendix B: Reserved words in C
#include <stdio.h>
int main ()
{
printf ("Beep! \7 \n");
printf ("ch = \'a\' \n");
printf (" <- Start of this line!! \r");
return 0;
}
The output of this program is:
Beep! (and the BELL sound)
ch = 'a'
<- Start of this line!!
and the text cursor is left where the arrow points.
Character conversion table
Decimal Octal Hexadecimal Character
0 0 0 CTRL-@
1 1 1 CTRL-A
2 2 2 CTRL-B
3 3 3 CTRL-C
4 4 4 CTRL-D
5 5 5 CTRL-E
6 6 6 CTRL-F
7 7 7 CTRL-G
8 10 8 CTRL-H
9 11 9 CTRL-I
10 12 A CTRL-J
11 13 B CTRL-K
12 14 C CTRL-L
13 15 D CTRL-M
14 16 E CTRL-N
15 17 F CTRL-O
16 20 10 CTRL-P
17 21 11 CTRL-Q
18 22 12 CTRL-R
19 23 13 CTRL-S
20 24 14 CTRL-T
21 25 15 CTRL-U
22 26 16 CTRL-V
23 27 17 CTRL-W
24 30 18 CTRL-X
25 31 19 CTRL-Y
26 32 1A CTRL-Z
27 33 1B CTRL-[
28 34 1C CTRL-\
29 35 1D CTRL-]
30 36 1E CTRL-^
31 37 1F CTRL-_
32 40 20 (space)
33 41 21 !
34 42 22 "
35 43 23 #
36 44 24 $
37 45 25 %
38 46 26 &
39 47 27 '
40 50 28 (
41 51 29 )
42 52 2A *
43 53 2B +
44 54 2C ,
45 55 2D -
46 56 2E .
47 57 2F /
48 60 30 0
49 61 31 1
50 62 32 2
51 63 33 3
52 64 34 4
53 65 35 5
54 66 36 6
55 67 37 7
56 70 38 8
57 71 39 9
58 72 3A :
59 73 3B ;
60 74 3C <
61 75 3D =
62 76 3E >
63 77 3F ?
64 100 40 @
65 101 41 A
66 102 42 B
67 103 43 C
68 104 44 D
69 105 45 E
70 106 46 F
71 107 47 G
Bibliography
Blah blah blah.
Glossary
Blah blah blah.
Code index
(Index is nonexistent)
Concept index
(Index is nonexistent)
Characters
In C, single characters are written enclosed by single quotes. This is in contrast to strings
of characters, which use double quotes (‘"..."’).
int ch;
ch = 'a';
would give ch the value of the character ‘a’. The same effect can also be achieved by writing:
char ch = 'a';
It is also possible to have the type:
unsigned char
On systems with 8-bit characters, this admits values from 0 to 255, rather than the -128 to 127 of a signed char.
i = ch;
Function pointers
You can create pointers to functions as well as to variables. Function pointers can be tricky,
however, and caution is advised in using them.
Function pointers allow you to pass functions as parameters to another function. This
enables you to give the latter function a choice of functions to call. That is, you can plug
in a new function in place of an old one simply by passing a different parameter. This
technique is sometimes called indirection or vectoring.
To pass a pointer for one function to a second function, simply use the name of the
first function, as long as there is no variable with the same name. Do not include the first
function’s parentheses or parameters when you pass its name.
For example, the following code passes a pointer for the function named fred
to the function barbara:
void fred();
barbara (fred);
Notice that fred is declared with a regular function prototype before barbara calls it. You
must also declare barbara, of course:
void barbara (void (*function_ptr)() );
Notice the parentheses around function_ptr and the parentheses after it. As far as barbara
is concerned, any function passed to it is named (*function_ptr)(), and this is how fred
is called in the example below:
#include <stdio.h>
void fred();
void barbara ( void (*function_ptr)() );
int main();
int main()
{
barbara (fred);
return 0;
}
void fred()
{
printf("fred here!\n");
}
For example, in the program below, the function do_math calls the functions add and
subtract with the following line:
result = (*math_fn_ptr) (num1, num2);
int main()
{
int result;
return 0;
}
int do_math (int (*math_fn_ptr) (int, int), int num1, int num2)
{
int result;
do_math here.
Subtraction = 35.
You can also initialize a function pointer by setting it to the name of a function, then
treating the function pointer as an ordinary function, as in the next example:
#include <stdio.h>
int main();
void print_it();
void (*fn_ptr)();
int main()
{
void (*fn_ptr)() = print_it;
(*fn_ptr)();
return 0;
}
void print_it()
{
printf("We are here! We are here!\n\n");
}
Remember to initialize any function pointers you use this way! If you do not, your
program will probably crash, because the uninitialized function pointer will contain garbage.
Table of Contents
Preface
1 Introduction
1.1 The advantages of C
1.2 Questions for Chapter 1
2 Using a compiler
2.1 Basic ideas about C
2.2 The compiler
2.3 File names
2.4 Errors
2.4.1 Typographical errors
2.4.2 Type errors
2.5 Questions for Chapter 2
4 Functions
4.1 Function names
4.2 Function examples
4.3 Functions with values
4.4 Function prototyping
4.5 The exit function
4.6 Questions for Chapter 4
6 Scope
6.1 Global Variables
6.2 Local Variables
6.3 Communication via parameters
6.4 Scope example
6.5 Questions for Chapter 6
8 Parameters
8.1 Parameters in function prototypes
8.2 Value Parameters
8.3 Actual parameters and formal parameters
8.4 Variadic functions
8.5 Questions for Chapter 8
9 Pointers
9.1 Pointer operators
9.2 Pointer types
9.3 Pointers and initialization
9.4 Variable parameters
9.4.1 Passing pointers correctly
9.4.2 Another variable parameter example
9.5 Questions for Chapter 9
10 Decisions
10.1 if
10.2 if... else...
10.3 Nested if statements
10.4 The ?. . . :. . . operator
10.5 The switch statement
10.6 Example Listing
10.7 Questions for Chapter 10
11 Loops
11.1 while
11.2 do. . . while
11.3 for
11.4 The flexibility of for
11.5 Terminating and speeding loops
11.5.1 Terminating loops with break
11.5.2 Terminating loops with return
11.5.3 Speeding loops with continue
11.6 Nested loops
11.7 Questions for Chapter 11
12 Preprocessor directives
12.1 A few directives
12.2 Macros
12.2.1 Macro functions
12.3 Extended macro example
12.4 Questions
13 Libraries
13.1 Header files
13.2 Kinds of library
13.3 Common library functions
13.3.1 Character handling
13.4 Mathematical functions
13.5 Questions for Chapter 13
14 Arrays
14.1 Array bounds
14.2 Arrays and for loops
14.3 Multidimensional arrays
14.4 Arrays and nested loops
14.5 Initializing arrays
14.6 Arrays as Parameters
14.7 Questions for Chapter 14
15 Strings
15.1 Conventions and declarations
15.2 Initializing strings
15.3 String arrays
15.4 String library functions
15.5 Questions for Chapter 15
21 Recursion
21.1 The stack
21.1.1 The stack in detail
21.2 Controlled recursion
21.3 Controlled recursion with data structures
21.4 Recursion summary
21.5 Questions 21
22 Style
22.1 Formatting code
22.2 Comments and style
22.3 Variable and function names
22.4 Declarations and initialization
22.5 Global variables and style
22.6 Hidden operators and style
22.7 Final words on style
22.8 Questions 22
23 Debugging
23.1 Compile-time errors
23.1.1 parse error at. . . , parse error before. . .
23.1.2 undefined reference to. . .
23.1.3 unterminated string or character constant
23.2 . . . undeclared (first use in this function)
23.2.1 different type arg
23.2.2 too few parameters. . . , too many parameters. . .
23.3 Run-time errors
23.3.1 Confusion of = and ==
23.3.2 Confusing foo++ and ++foo
23.3.3 Unwarranted assumptions about storage
23.3.4 Array out of bounds
23.3.5 Uncoordinated output
23.3.6 Global variables and recursion
23.4 Mathematical errors
23.5 Introduction to GDB
23.6 Questions 23
Bibliography
Glossary