C Programming FAQs - Frequently Asked Questions - Summit, Steve
Steve Summit
Addison-Wesley Publishing Company
Summit, Steve.
C programming FAQs / Steve Summit.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-84519-9
1. C (Computer program language) I. Title.
QA76.73.C15S86 1996
005.13'3-dc20 95-39682
CIP
The programs and applications presented in this book have been included for their instructional
value. They have been tested with care, but are not guaranteed for any particular purpose. The
publisher and author do not offer any warranties or representations, nor do they accept any
liabilities with respect to the programs or applications.
Access the latest information about Addison-Wesley books from our Internet gopher site or from
our World Wide Web page:
gopher aw.com
http://www.aw.com/cseng/
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or
otherwise, without prior written permission of the author. Printed in the United States of
America.
123456789 10—CRW—99 98 97 96 95
This book is dedicated to C programmers everywhere.
Contents
Questions
Preface
Introduction
Basic Types
Pointer Declarations
Declaration Style
Storage Classes
Typedefs
The const Qualifier
Complex Declarations
Array Sizes
Declaration Problems
Namespace
Initialization
Structure Declarations
Structure Operations
Structure Padding
Accessing Members
Miscellaneous Structure Questions
Unions
Enumerations
Bitfields
3. Expressions
Evaluation Order
Other Expression Questions
Preserving Rules
4. Pointers
5. Null Pointers
Retrospective
Pointers to Arrays
Dynamic Array Allocation
Functions and Multidimensional Arrays
Sizes of Arrays
Tools
lint
Resources
Glossary
Bibliography
Index
Questions
Pointer Declarations
1.5 What’s wrong with the declaration “char* p1, p2;”? I get errors when I try
to use p2.
1.6 What’s wrong with “char *p; *p = malloc(10);”?
Declaration Style
1.7 What’s the best way to declare and define global variables?
1.8 What’s the best way of implementing opaque (abstract) data types in C?
1.9 How can I make a sort of “semiglobal” variable, that is, one that’s private to a
few functions spread across a few source files?
Storage Classes
1.10 Do all declarations for the same static function or variable have to include the
storage class static?
1.11 What does extern mean in a function declaration?
1.12 What’s the auto keyword good for?
Typedefs
1.13 What’s the difference between using a typedef or a preprocessor macro for a
user-defined type?
1.14 I can’t seem to define a linked list node that contains a pointer to itself.
1.15 How can I define a pair of mutually referential structures?
1.16 What’s the difference between struct x1 { ... };
and typedef struct { ... } x2; ?
1.17 What does “typedef int (*funcptr)();” mean?
1.18 I have a typedef for char *, and it’s not interacting with const the way I
expect. Why not?
1.19 Why can’t I use const values in initializers and array dimensions?
1.20 What’s the difference between const char *p and char * const p?
Complex Declarations
Array Sizes
1.23 Can I declare a local array (or parameter array) of a size matching a passed-in
array, or set by another parameter?
1.24 Why doesn’t sizeof work on arrays declared extern?
Declaration Problems
Namespace
1.29 How can I determine which identifiers are safe for me to use and which are
reserved?
Initialization
1.30 What can I safely assume about the initial values of variables that are not
explicitly initialized?
1.31 Why can’t I initialize a local array with a string?
1.32 What is the difference between char a[] = "string"; and
char *p = "string"; ?
1.33 Is char a[3] = "abc"; legal?
1.34 How do I initialize a pointer to a function?
1.35 Can I initialize unions?
Structure Operations
2.7 Can structures be assigned to variables and passed to and from functions?
2.8 Why can’t structures be compared using the built-in == and != operators?
2.9 How are structure passing and returning implemented?
2.10 How can I create nameless, immediate, constant structure values?
2.11 How can I read/write structures from/to data files?
Structure Padding
2.12 How can I turn off structure padding?
2.13 Why does sizeof report a larger size than I expect for a structure type?
Accessing Members
2.14 How can I determine the byte offset of a field within a structure?
2.15 How can I access structure fields by name at run time?
2.16 Does C have an equivalent to Pascal’s with statement?
Unions
2.19 What’s the difference between a structure and a union?
2.20 Is there a way to initialize unions?
2.21 Is there an automatic way to keep track of which field of a union is in use?
Enumerations
2.22 What is the difference between an enumeration and a set of preprocessor
#defines?
2.23 Are enumerations portable?
2.24 Is there an easy way to print enumeration values symbolically?
Bitfields
2.25 What do these colons and numbers mean in some structure declarations?
2.26 Why do people use explicit masks and bit-twiddling code so much instead of
declaring bitfields?
3. Expressions
Evaluation Order
3.12 If I’m not using the value of the expression, should I use i++ or ++i to
increment a variable?
3.13 Why doesn’t “if (a < b < c)” work?
3.14 Why doesn’t the code “int a = 1000, b = 1000; long int c = a * b;”
work?
3.15 Why does the code “degC = 5 / 9 * (degF - 32);” keep giving me 0?
3.16 Can I use ? : on the left-hand side of an assignment expression?
3.17 I have an expression “a ? b = c : d” that some compilers aren’t accepting.
Why?
Preserving Rules
3.18 What does the warning “semantics of ‘>’ change in ANSI C” mean?
3.19 What’s the difference between the “unsigned preserving” and “value preserving”
rules?
4. Pointers
Basic Pointer Use
Pointer Manipulations
4.4 I’m having problems using pointers to manipulate an array of ints.
4.5 I want to use a char * pointer to step over some ints. Why doesn’t
“((int *)p)++;” work?
4.6 Why can’t I perform arithmetic on a void * pointer?
4.7 I’ve got some code that’s trying to unpack external structures, but it’s crashing
with a message about an “unaligned access.” What does this mean?
5. Null Pointers
Null Pointers and Null Pointer Constants
5.1 What is this infamous null pointer, anyway?
5.2 How do I get a null pointer in my programs?
5.3 Is the abbreviated pointer comparison “if (p)” to test for non-null pointers
valid?
5.10 But wouldn’t it be better to use NULL, in case the value of NULL changes?
5.11 I once used a compiler that wouldn’t work unless NULL was used.
5.12 I use the preprocessor macro “#define Nullptr(type) (type *)0” to
help me build null pointers of the correct type.
Retrospective
5.13 This is strange: NULL is guaranteed to be 0, but the null pointer is not?
5.14 Why is there so much confusion surrounding null pointers?
5.15 Is there an easier way to understand all this null pointer stuff?
5.16 Given all the confusion surrounding null pointers, wouldn’t it be easier simply to
require them to be represented internally by zeroes?
5.17 Seriously, have any machines really used nonzero null pointers?
6.1 I had the definition char a[6] in one source file, and in another I declared
extern char *a. Why didn’t it work?
6.2 But I heard that char a[] is identical to char *a. Is that true?
6.3 So what is meant by the “equivalence of pointers and arrays” in C?
6.4 Why are array and pointer declarations interchangeable as function formal
parameters?
Retrospective
Pointers to Arrays
6.14 How can I set an array’s size at run time? How can I avoid fixed-sized
arrays?
6.15 How can I declare local arrays of a size matching a passed-in array?
6.16 How can I dynamically allocate a multidimensional array?
6.17 Can I simulate a non-0-based array with a pointer?
Sizes of Arrays
6.21 Why doesn’t sizeof properly report the size of an array that is a parameter to
a function?
6.22 How can code in a file where an array is declared as extern determine the size
of the array?
6.23 How can I determine how many elements are in an array, when sizeof yields
the size in bytes?
7. Memory Allocation
Basic Allocation Problems
7.1 Why doesn’t the code “char *answer; gets(answer);” work?
7.2 I can’t get strcat to work. I tried “char *s3 = strcat(s1, s2);” but I
got strange results.
7.3 But the documentation for strcat says that it takes two char *’s as
arguments. How am I supposed to know to allocate things?
7.4 I’m using fgets to read lines from a file into an array of pointers. Why do all
the lines end up containing copies of the last line?
7.5 I have a function that is supposed to return a string, but when the function
returns to its caller, the returned string is garbage. Why?
Calling malloc
7.6 Why am I getting “warning: assignment of pointer from integer lacks a cast” for
calls to malloc?
7.7 Why does some code carefully cast the values returned by malloc to the pointer
type being allocated?
7.8 Why does so much code leave out the multiplication by sizeof(char) when
allocating strings?
7.9 Why doesn’t the little wrapper I wrote around malloc work?
7.10 What’s wrong with “char *p; *p = malloc(10);”?
7.15 Why is malloc returning crazy pointer values? I’m sure I declared it correctly.
7.16 I’m allocating a large array for some numeric work, but malloc is acting
strangely. Why?
7.17 I’ve got 8 MB of memory in my PC. Why does malloc seem to allocate only
640K or so?
7.18 My application depends heavily on dynamic allocation of nodes for data
structures, and malloc/free overhead is becoming a bottleneck. What can I do?
7.19 My program is crashing, apparently somewhere down inside malloc. Why?
Freeing Memory
7.20 Dynamically allocated memory can’t be used after freeing it, can it?
7.21 Why isn’t a pointer null after calling free?
7.22 When I call malloc to allocate memory for a local pointer, do I have to
explicitly free it?
7.23 When I free a dynamically allocated structure containing pointers, do I have to
free each subsidiary pointer first?
7.24 Must I free allocated memory before the program exits?
7.25 Why doesn’t my program’s memory usage go down when I free memory?
Header Files
10.6 What are .h files and what should I put in them?
10.7 Is it acceptable for one header file to include another?
10.8 Where are header (“#include”) files searched for?
10.9 Why am I getting strange syntax errors on the very first declaration in a file?
10.10 I’m using header files that supply conflicting definitions for common macros
such as TRUE and FALSE. What can I do?
10.11 Where can I get a copy of a missing header file?
Conditional Compilation
10.12 How can I construct preprocessor #if expressions that compare strings?
10.13 Does the sizeof operator work in preprocessor #if directives?
10.14 Can I use #ifdef in a #define line to define something two different ways?
10.15 Is there anything like #ifdef for typedefs?
10.16 How can I use a preprocessor #if expression to detect endianness?
10.17 Why am I getting strange syntax errors inside lines I’ve used #ifdef to
disable?
10.18 How can I preprocess some code to remove selected conditional compilations,
without preprocessing everything?
10.19 How can I list all of the predefined identifiers?
Fancier Processing
10.20 I have some old code that tries to construct identifiers with a macro like
“#define Paste(a, b) a/**/b”, but it’s not working any more. Why?
10.21 I have an old macro, “#define CTRL(c) ('c' & 037)”, that doesn’t seem
to work any more. Why?
10.22 What does the message “warning: macro replacement within a string literal”
mean?
10.23 How can I use a macro argument inside a string literal in the macro expansion?
10.24 How do I get the ANSI “stringizing” preprocessing operator ‘#’ to stringize the
macro’s value instead of its name?
10.25 How can I do this really tricky preprocessing?
10.26 How can I write a macro which takes a variable number of arguments?
10.27 How can I include expansions of the __FILE__ and __LINE__ macros in a
general-purpose debugging macro?
Function Prototypes
11.3 Why does my ANSI compiler complain about prototype mismatches for
parameters declared float?
11.4 Is it possible to mix old-style and new-style function syntax?
11.5 Why does the declaration “extern f(struct x *p);” give me a warning
message?
11.6 Why don’t function prototypes guard against mismatches in printf’s
arguments?
11.7 I heard that you have to include <stdio.h> before calling printf. Why?
11.8 Why can’t I use const values in initializers and array dimensions?
11.9 What’s the difference between const char *p and char * const p?
11.10 Why can’t I pass a char ** to a function that expects a const char **?
11.11 I have a typedef for char *, and it’s not interacting with const the way I
expect. Why?
Using main()
11.12 Can I declare main as void, to shut off these warnings about main not
returning a value?
11.13 What about main’s third argument, envp?
11.14 I believe that declaring void main() can’t fail, since I’m calling exit instead
of returning.
11.15 But why do all my books declare main as void?
11.16 Is exit(status) truly equivalent to returning the same status from main?
Preprocessor Features
11.17 How do I get the ANSI “stringizing” preprocessing operator ‘#’ to stringize the
macro’s value instead of its name?
11.18 What does the message “warning: macro replacement within a string literal”
mean?
11.19 Why am I getting strange syntax errors inside lines I’ve used #ifdef to
disable?
11.20 What is the #pragma directive?
11.21 What does “#pragma once” mean?
Compliance
11.33 What’s the difference between implementation-defined, unspecified, and
undefined behavior?
11.34 I’m appalled that the ANSI standard leaves so many issues undefined. Isn’t a
standard supposed to standardize things?
11.35 I just tried some allegedly-undefined code on an ANSI-conforming compiler,
and got the results I expected.
printf Formats
scanf Formats
scanf Problems
12.17 When I read numbers from the keyboard with scanf "%d\n", it seems to
hang until I type one extra line of input. Why?
12.18 I’m reading a number with scanf %d and then a string with gets(), but the
compiler seems to be skipping the call to gets()! Why?
12.19 I’m reprompting the user if scanf fails, but sometimes it seems to go into an
infinite loop. Why?
12.20 Why does everyone say not to use scanf? What should I use instead?
12.21 How can I tell how much destination buffer space I’ll need for an arbitrary
sprintf call? How can I avoid overflowing the destination buffer with
sprintf?
12.22 What’s the deal on sprintf’s return value?
12.23 Why does everyone say not to use gets()?
12.24 Why does errno contain an error code after a call to printf?
12.25 What’s the difference between fgetpos/fsetpos and ftell/fseek?
12.26 Will fflush(stdin) flush unread characters from the standard input
stream?
“Binary” I/O
12.37 How can I read and write numbers between files and memory a byte at a time?
12.38 How can I read a binary data file properly?
12.39 How can I change the mode of stdin or stdout to binary?
12.40 What’s the difference between text and binary I/O?
12.41 How can I read/write structures from/to data files?
12.42 How can I write code to conform to these old, binary data file formats?
String Functions
13.1 How can I convert numbers to strings?
Sorting
13.8 I’m trying to sort an array of strings with qsort, using strcmp as the
comparison function, but it’s not working. Why not?
13.9 Now I’m trying to sort an array of structures, but the compiler is complaining
that the function is of the wrong type for qsort. What should I do?
13.10 How can I sort a linked list?
13.11 How can I sort more data than will fit in memory?
Random Numbers
13.22 Is exit(status) truly equivalent to returning the same status from main?
13.23 What’s the difference between memcpy and memmove?
13.24 I’m trying to port this old program. Why do I get “undefined external” errors
for some library functions?
13.25 I keep getting errors due to library functions being undefined, even though I’m
including all the right header files. Why?
13.26 I’m still getting errors due to library functions being undefined, even though
I’m requesting the right libraries.
13.27 Why is my simple program compiling to such a huge executable?
13.28 What does it mean when the linker says that _end is undefined?
14. Floating Point
14.1 When I set a float variable to 3.1, why is printf printing it as 3.0999999?
14.2 Why is sqrt giving me crazy numbers?
14.3 Why is the linker complaining that functions such as sin and cos are
undefined?
14.4 My floating-point calculations are acting strangely and giving me different
answers on different machines. Why?
14.5 What’s a good way to check for “close enough” floating-point equality?
14.6 How do I round numbers?
14.7 Where is C’s exponentiation operator?
14.8 The predefined constant M_PI seems to be missing from <math.h>. Shouldn’t
it be there?
14.9 How do I set variables to or test for IEEE NaN and other special values?
14.10 How can I handle floating-point exceptions gracefully?
14.11 What’s a good way to implement complex numbers in C?
14.12 Where can I find some mathematical library code?
14.13 I’m having trouble with a Turbo C program which crashes and says something
like “floating point formats not linked.” What am I missing?
15.1 I heard that you have to include <stdio.h> before calling printf. Why?
15.2 How can %f be used for both float and double arguments in printf?
15.3 Why don’t function prototypes guard against mismatches in printf’s
arguments?
15.4 How can I write a function that takes a variable number of arguments?
15.5 How can I write a function that, like printf, takes a format string and a
variable number of arguments and passes them to printf to do most of the work?
15.6 How can I write a function analogous to scanf, that calls scanf to do most
of the work?
15.7 I have a pre-ANSI compiler, without <stdarg.h>. What can I do?
15.8 How can I discover how many arguments a function was actually called with?
15.9 My compiler isn’t letting me declare a function that accepts only variable
arguments. Why not?
15.10 Why isn’t va_arg(argp, float) working?
15.11 I can’t get va_arg to pull in an argument of type pointer to function. Why
not?
Harder Problems
15.12 How can I write a function that takes a variable number of arguments and
passes them to another function?
15.13 How can I call a function with an argument list built up at run time?
17. Style
17.1 What’s the best style for code layout in C?
17.2 How should functions be apportioned among source files?
17.3 Is the code “if(!strcmp(s1, s2))” good style?
17.4 Why do some people write if (0 == x) instead of if (x == 0)?
17.5 I came across some code that puts a (void) cast before each call to printf.
Why?
17.6 If NULL and 0 are equivalent as null pointer constants, which should I use?
17.7 Should I use symbolic names, such as TRUE and FALSE for Boolean constants,
or plain 1 and 0?
17.8 What is “Hungarian notation”? Is it worthwhile?
17.9 Where can I get the “Indian Hill Style Guide” and other coding standards?
17.10 Some people say that gotos are evil and that I should never use them. Isn’t that
a bit extreme?
17.11 Don’t efficiency concerns necessitate some style concessions?
18.1 I’m looking for C development tools (cross-reference generators, code
beautifiers, etc.). Where can I find some?
lint
18.4 I just typed in this program, and it’s acting strangely. What can be wrong with it?
18.5 How can I shut off the “warning: possible pointer alignment problem” message
that lint gives me for each call to malloc?
18.6 Can I declare main as void to shut off these annoying “main returns no
value” messages?
18.7 Where can I get an ANSI-compatible lint?
18.8 Don’t ANSI function prototypes render lint obsolete?
Resources
Other I/O
19.12 How can I find out the size of a file prior to reading it in?
19.13 How can a file be shortened in place without completely clearing or rewriting it?
19.14 How can I insert or delete a line in the middle of a file?
19.15 How can I recover the file name given an open stream or file descriptor?
19.16 How can I delete a file?
19.17 What’s wrong with the call fopen("c:\newdir\file.dat", "r")?
19.18 How can I increase the allowable number of simultaneously open files?
19.19 How can I find out how much free space is available on disk?
19.20 How can I read a directory in a C program?
19.21 How do I create a directory? How do I remove a directory (and its contents)?
“System” Commands
19.27 How can I invoke another program or command from within a C program?
19.28 How can I call system when parameters of the executed command aren’t
known until run time?
19.29 How do I get an accurate error status return from system on MS-DOS?
19.30 How can I invoke another program and trap its output?
Process Environment
19.31 How can my program discover the complete pathname to the executable from
which it was invoked?
19.32 How can I automatically locate a program’s configuration files in the same
directory as the executable?
19.33 How can a process change an environment variable in its caller?
19.34 How can I open files mentioned on the command line and parse option flags?
19.35 Is exit (status) truly equivalent to returning the same status from main?
19.36 How can I read in an object file and jump to routines in it?
19.37 How can I implement a delay or time a user’s response with subsecond
resolution?
19.38 How can I trap or ignore keyboard interrupts like control-C?
19.39 How can I handle floating-point exceptions gracefully?
19.40 How do I use sockets? Do networking? Write client/server applications?
Retrospective
19.41 But I can’t use all these nonstandard, system-dependent functions, because my
program has to be ANSI compatible!
19.42 Why isn’t any of this standardized in C?
20. Miscellaneous
Miscellaneous Techniques
Efficiency
20.12 What is the most efficient way to count the number of bits that are set in a
value?
20.13 How can I make my code more efficient?
20.14 Are pointers really faster than arrays? How much do function calls slow things
down?
20.15 Isn’t shifting more efficient than multiplication and division?
switch Statements
Other Languages
20.25 How can I call FORTRAN (C++, BASIC, Pascal, Ada, LISP) functions from C?
20.26 Are there programs for converting Pascal or FORTRAN to C?
20.27 What are the differences between C and C++? Can I use a C++ compiler to
compile C code?
Algorithms
20.28 What’s a good way to compare two strings for close, but not necessarily exact,
equality?
20.29 What is hashing?
20.30 How can I generate random numbers with a normal, or Gaussian, distribution?
20.31 How can I find the day of the week given the date?
20.32 Will 2000 be a leap year?
20.33 Why can tm_sec in the tm structure range from 0 to 61?
Trivia
20.34 How do you write a program that produces its own source code as its output?
20.35 What is “Duff’s Device”?
20.36 When will the next Obfuscated C Code Contest be held? How can I get a copy
of previous winning entries?
20.37 What was the entry keyword mentioned in K&R1?
20.38 Where does the name “C” come from, anyway?
20.39 How do you pronounce “char”?
20.40 Where are the on-line versions of this book?
Preface
At some point in 1979, I heard a lot of people talking about this relatively
new language, C, and the book that had just come out about it. I bought a
copy of K&R, otherwise known as The C Programming Language, by Brian
Kernighan and Dennis Ritchie, but it sat on my shelf for a while because I
didn’t have an immediate need for it (besides which I was busy being a college
freshman at the time). It proved in the end to be an auspicious purchase,
though, because when I finally did take it up, I never put it down: I’ve been
programming in C ever since.
In 1983, I came across the Usenet newsgroup net.lang.c, which was (and
its successor comp.lang.c still is) an excellent place to learn a lot more about
C, to find out what questions everyone else is having about C, and to discover
that you may not know all there is to know about C after all. It seems that
C, despite its apparent simplicity, has a number of decidedly nonobvious
aspects, and certain questions come up over and over again. This book is a
collection of some of those questions, with answers, based on the Frequently
Asked Questions (“FAQ”) list I began posting to comp.lang.c in May 1990.
I hasten to add, however, that this book is not intended as a critique or
“hatchet job” on the C language. It is all too easy to blame a language (or
any tool) for the difficulties its users encounter with it or to claim that a properly
designed tool “ought” to prevent its users from misusing it. It would
therefore be easy to regard a book like this, with its long lists of misuses, as
a litany of woes attempting to show that the language is hopelessly deficient.
Nothing could be farther from the case.
I would never have learned enough about C to be able to write this book,
and I would not be attempting to make C more pleasant for others to use by
writing this book now, if I did not think that C is a great language or if I did
not enjoy programming in it. I do like C, and one of the reasons I teach
classes in it and spend time participating in discussion about it on the Internet
is that I would like to discover which aspects of C (or of programming in
general) are difficult to learn or keep people from being able to program efficiently
and effectively. This book represents some of what I’ve learned: These
questions are certainly some of the ones people have the most trouble with,
and the answers have been refined over several years in an attempt to ensure
that people don’t have too much trouble with them.
A reader will certainly have trouble if there are any errors in these answers,
and although the reviewers and I have worked hard to eliminate them, it can
be as difficult to eradicate the last error from a large manuscript as it is to
stamp out the last bug in a program. I will appreciate any corrections or suggestions
sent to me in care of the publisher or at the e-mail address given, and
I would like to offer the customary $1.00 reward to the first finder of any
error. If you have access to the Internet, you can check for an errata list (and
a scorecard of the finders) at the ftp and http addresses mentioned in question
20.40.
As I hope I’ve made clear, this book is not a critique of the C programming
language, nor is it a critique of the book from which I first learned C or of
that book’s authors. I didn’t just learn C from K&R; I also learned a lot
about programming. As I contemplate my own contribution to the C programming
literature, my only regret is that the present book does not live up
to a nice observation made in the second edition of K&R, namely, that “C is
not a big language, and it is not well served by a big book.” I hope that those
who most deeply appreciate C’s brevity and precision (and that of K&R) will
not be too offended by the fact that this book says some things over and over
and over or in three slightly different ways.
Although my name is on the cover, there are a lot of people behind this
book, and it’s difficult to know where to start handing out acknowledgments.
In a sense, every one of comp.lang.c’s readers (today estimated at 320,000) is
a contributor: The FAQ list behind this book was written for comp.lang.c
first, and this book retains the flavor of a good comp.lang.c discussion.
This book also retains, I hope, the philosophy of correct C programming
that I began learning when I started reading net.lang.c. Therefore, I shall first
acknowledge the posters who stand out in my mind as having most clearly
and consistently articulated that philosophy: Doug Gwyn, Guy Harris, Karl
Heuer, Henry Spencer, and Chris Torek. These gentlemen have displayed
remarkable patience over the years, answering endless questions with generosity
and wisdom. I was the one who stuck his neck out and started writing
the frequent questions down, but I would hate to give the impression that
the answers are somehow mine. I was once the student (I believe it was Guy
who answered my post asking essentially the present volume’s question 5.10),
and I owe a real debt to the masters who went before me. This book is theirs
as much as mine, though I retain title to any inadequacies or mistakes I’ve
made in the presentation.
The former on-line FAQ list grew by a factor of three in the process of
becoming this book, and its growth was a bit rapid and awkward at times.
Mark Brader, Vinit Carpenter, Stephen Clamage, Jutta Degener, Doug Gwyn,
Karl Heuer, Joseph Kent, and George Leach read proposals or complete
drafts and helped to exert some control over the process; I thank them for
their many careful suggestions and corrections. Their efforts grew out of a
shared wish to improve the overall understanding of C in the programming
community. I appreciate their dedication.
Three of those reviewers have also been long-time contributors to the on-line
FAQ list. I thank Jutta Degener and Karl Heuer for their help over the
years, and I especially thank Mark Brader, who has been my most persistent
critic ever since I first began posting the comp.lang.c FAQ list five years ago.
I don’t know how he has had the stamina to make as many suggestions and
corrections as he has and to overcome my continuing stubborn refusal to
agree with some of them, even though (as I eventually understood) they really
were improvements. You can thank Mark for the form of many of this book’s
explanations and blame me for mangling any of them.
Additional assorted thanks: to Susan Cyr for the cover art; to Bob Dinse
and Eskimo North for providing the network access that is particularly vital
to a project like this; to Bob Holland for providing the computer on which
I’ve done most of the writing; to Pete Keleher for the Alpha text editor; to the
University of Washington Mathematics Research and Engineering libraries for
access to their collections; and to the University of Washington Oceanography
department for letting me borrow their tape drives to access my dusty old
archives of Usenet postings.
Thanks to Tanmoy Bhattacharya for the example in question 11.10, to
Arjan Kenter for the code in question 13.7, to Tomohiko Sakamoto for the
code in question 20.31, and to Roger Miller for the line in question 11.35.
Thanks to all these people, all over the world, who have contributed to the
FAQ list in various ways by offering suggestions, corrections, constructive
criticism, or other support: Jamshid Afshar, David Anderson, Tanner
Andrews, Sudheer Apte, Joseph Arceneaux, Randall Atkinson, Rick Beem,
Peter Bennett, Wayne Berke, Dan Bernstein, John Bickers, Gary Blaine, Yuan
Bo, Dave Boutcher, Michael Bresnahan, Vincent Broman, Stan Brown, Joe
Buehler, Kimberley Burchett, Gordon Burditt, Burkhard Burow, Conor P.
Cahill, D’Arcy J.M. Cain, Christopher Calabrese, Ian Cargill, Paul Carter,
Mike Chambers, Billy Chambless, Franklin Chen, Jonathan Chen, Raymond
Chen, Richard Cheung, Ken Corbin, Ian Cottam, Russ Cox, Jonathan Coxhead,
Lee Crawford, Steve Dahmer, Andrew Daviel, James Davies, John E.
Davis, Ken Delong, Norm Diamond, Jeff Dunlop, Ray Dunn, Stephen M.
Dunn, Michael J. Eager, Scott Ehrlich, Arno Eigenwillig, Dave Eisen, Bjorn
Engsig, David Evans, Clive D.W. Feather, Dominic Feeley, Simao Ferraz,
Chris Flatters, Rod Flores, Alexander Forst, Steve Fosdick, Jeff Francis, Tom
Gambill, Dave Gillespie, Samuel Goldstein, Tim Goodwin, Alasdair Grant,
Ron Guilmette, Michael Hafner, Tony Hansen, Elliotte Rusty Harold, Joe
Harrington, Des Herriott, John Hascall, Ger Hobbelt, Dexter Holland &
Co., Jos Horsmeier, Blair Houghton, James C. Hu, Chin Huang, David Hurt,
Einar Indridason, Vladimir Ivanovic, Jon Jagger, Ke Jin, Kirk Johnson, Larry
Jones, James Kew, Lawrence Kirby, Kin-ichi Kitano, the kittycat, Peter
Klausler, Andrew Koenig, Tom Koenig, Adam Kolawa, Jukka Korpela, Ajoy
Krishnan T, Markus Kuhn, Deepak Kulkarni, Oliver Laumann, John Lauro,
Felix Lee, Mike Lee, Timothy J. Lee, Tony Lee, Marty Leisner, Don Libes,
Brian Liedtke, Philip Lijnzaad, Keith Lindsay, Yen-Wei Liu, Paul Long,
Christopher Lott, Tim Love, Tim McDaniel, Kevin McMahon, Stuart MacMartin,
John R. MacMillan, Andrew Main, Bob Makowski, Evan Manning,
Barry Margolin, George Matas, Brad Mears, Bill Mitchell, Mark Moraes,
Darren Morby, Bernhard Muenzer, David Murphy, Walter Murray, Ralf
Muschall, Ken Nakata, Todd Nathan, Landon Curt Noll, Tim Norman, Paul
Nulsen, David O’Brien, Richard A. O’Keefe, Adam Kolawa, James Ojaste,
Hans Olsson, Bob Peck, Andrew Phillips, Christopher Phillips, Francois
Pinard, Nick Pitfield, Wayne Pollock, Dan Pop, Lutz Prechelt, Lynn Pye,
Kevin D. Quitt, Pat Rankin, Arjun Ray, Eric S. Raymond, Peter W. Richards,
Dennis Ritchie, Eric Roode, Manfred Rosenboom, J. M. Rosenstock, Rick
Rowe, Erkki Ruohtula, John Rushford, Rutabaga, Kadda Sahnine, Matthew
Saltzman, Rich Salz, Chip Salzenberg, Matthew Sams, Paul Sand, David W.
Sanderson, Frank Sandy, Christopher Sawtell, Jonas Schlein, Paul Schlyter,
Doug Schmidt, Rene Schmit, Russell Schulz, Dean Schulze, Chris Sears, Patricia
Shanahan, Raymond Shwake, Peter da Silva, Joshua Simons, Ross Smith,
Henri Socha, Leslie J. Somos, David Spuler, James Stern, Bob Stout, Steve Sullivan,
my sweetie Melanie Summit, Erik Talvola, Dave Taylor, Clarke
Thatcher, Wayne Throop, Steve Traugott, Ilya Tsindlekht, Andrew Tucker,
Goran Uddeborg, Rodrigo Vanegas, Jim Van Zandt, Wietse Venema, Tom
Verhoeff, Ed Vielmetti, Larry Virden, Chris Volpe, Mark Warren, Alan Watson,
Kurt Watzka, Larry Weiss, Martin Weitzel, Howard West, Tom White,
Freek Wiedijk, Dik T. Winter, Lars Wirzenius, Dave Wolverton, Mitch
Wright, Conway Yee, Ozan S. Yigit, and Zhuo Zang. I have tried to keep
track of everyone whose suggestions I have used, but I fear I’ve probably
overlooked a few; my apologies to anyone whose name should be on this list
but isn’t.
Finally, I’d like to thank my editor at Addison-Wesley, Debbie Lafferty, for
tapping me on the electronic shoulder one day and asking if I might be interested
in writing this book. I was, and you now hold it, and I hope that it may
help to make C programming as pleasant for you as it is for me.
Introduction
you (and, at some level, it can’t, because the author presumably knows the
material already while you presumably don’t), you may be left with deep,
unanswered questions.
This book, however, is organized around some 400 of those questions, all
based on real ones asked by real people attempting to learn or program in C.
This book is not targeted at just those topics that the author thinks are
important; it is targeted at the topics that real readers think are important,
based on the questions they ask. The chances are good that if you are learning
or using C and you have a question about C that isn’t answered in any of
the other books you’ve checked, you’ll find it answered here.
This book can’t promise to answer every question you’ll have when you’re
programming in C, since many of the questions that will come up in your
programming will have to do with your problem domain, and this book covers
only the C language itself. Just as it can’t cover every aspect of every problem
anyone might try to solve in C, this book cannot cover every aspect of
every operating system that anyone might try to write a C program for or
every algorithm that anyone might try to implement in C. Specific problems,
specific operating systems, and general-purpose algorithms are properly discussed
in books or other materials devoted to those topics. Nevertheless, certain
questions involving operating systems and algorithms are quite frequent,
so Chapters 19 and 20 provide brief, introductory answers to a few of them,
but please don’t expect the treatment to be complete.
The questions in this book are those that people typically have after reading
an introductory C textbook or taking a C programming class. Therefore,
this book will not teach you C, nor does it discuss fundamental issues that
any C textbook should cover. Furthermore, this book’s answers are intended
for the most part to be definitively correct and to avoid propagating any
misconceptions. Therefore, a few answers are more elaborate than might at first
seem necessary: They give you the complete picture rather than oversimplifying
or leaving out important details. (It is, after all, oversimplifications or
omitted details that are behind many of the misconceptions this book’s questions
and answers address.) Within the elaborate answers, you will find
shortcuts and simplifications where necessary, and in the glossary you will
find definitions of the precise terms that accurate explanations often demand.
The shortcuts and simplifications are, of course, safe ones: They should not
lead to later misunderstandings, and you can always come back to the more
complete explanations or pursue some of the references, if you later desire the
full story.
As we’ll see particularly in Chapters 3 and 11, the standard definitions of
C do not specify the behavior of every C program that can be written. Some
programs fall into various gray areas: They may happen to work on some systems,
and they may not be strictly illegal, but they are not guaranteed to work
everywhere. This book is about portable C programming, so its answers
advise against using nonportable constructs if at all possible.
The on-line FAQ list underlying this book was written as a dialog: When
people didn’t understand it, they said so. That feedback has been invaluable
in refining the form of the answers. Although a printed book is obviously
more static, such a dialog is still appropriate: Your comments, criticisms, and
suggestions are welcome. If you have access to the Internet, you may send
comments to [email protected], or you may send them on paper in care of the
publisher. A list of any errors that are discovered in this book will be maintained
and available on the Internet; see question 20.40 for information.
Question Format
The bulk of this book consists of a series of question/answer pairs. Many
answers also contain a list of references; a few also refer to footnotes, which
you can skip if you find that they’re too picky. Several respected references
are cited repeatedly, under these abbreviations:
Other references are cited by their full titles; full citations for all references
appear in the bibliography.
This constant width typeface is used to indicate C syntax (function
and variable names, keywords, etc.) and also to indicate a few operating system
commands (cc, etc.). An occasional notation of the form tty(4) indicates
the section “tty” in chapter 4 of the UNIX Programmer’s Manual.
Code Samples
This is a book about C, so many small pieces of it are necessarily written
in C. The examples are written primarily for clarity of exposition. They are
not always written in the most efficient way possible; making them “faster”
would often make them less clear. (See question 20.13 for more information
about code efficiency.) They are primarily written using modern, ANSI-style
syntax; see question 11.29 for conversion tips if you’re still using a “classic”
compiler.
The author and publisher invite you to use and modify these code fragments
in your own programs, but of course we would appreciate acknowledgment
if you do so. (Some fragments are from other sources and are so
attributed; please acknowledge those contributors if you use those codes.)
The source code for the larger examples is available on the Internet via
anonymous ftp from aw.com in directory cseng/authors/summit/cfaq (see also
question 18.12).
To underscore certain points, it has unfortunately been necessary to
include a few code fragments that are examples of things not to do. In the
answers, such code fragments are marked with an explicit comment like
/* WRONG */ to remind you not to emulate them. (Code fragments in questions
are not usually so marked; it should be obvious that the code fragments
in questions are suspect, as the question is usually “Why doesn’t this work?”)
Organization
As mentioned, this book’s questions are based on real questions asked by real
people, and real-world questions do not always fall into neat hierarchies.
Many questions touch on several topics: What seems to be a memory allocation
problem may in fact reflect an improper declaration. (Several questions
that straddle chapter boundaries appear in both chapters, to make them easier
to find.) In any case, this is not a book you have to read through sequentially:
Use the table of contents, the list of questions that follows it, the index,
and the cross-references between questions to find the topics that are of interest
to you. (And, if you have some free time, you may find yourself reading
through sequentially anyway; perhaps you’ll encounter the answer to a question
you hadn’t thought to ask yet.)
Usually, you have to declare your data structures before you can start writing
code, so Chapter 1 starts out by talking about declaration and initialization.
C’s structure, union, and enumeration types are complicated enough
that they deserve a chapter of their own; Chapter 2 discusses how they are
declared and used.
Most of the work of a program is carried out by expression statements,
which are the subject of Chapter 3.
Chapters 4 through 7 discuss the bane of many a beginning C programmer:
pointers. Chapter 4 covers pointers in general. Chapter 5 focuses on the
special case of null pointers, Chapter 6 describes the relationship between
pointers and arrays, and Chapter 7 explores what is often the real problem
when pointers are misbehaving: the underlying memory allocation.
Almost all C programs manipulate characters and strings, but these types
are implemented at a low level by the language. The programmer is often
responsible for managing these types correctly; some questions that come up
while doing so are collected in Chapter 8. Similarly, C does not have a formal
Boolean type; Chapter 9 briefly discusses C’s Boolean expressions and the
appropriate ways of implementing a user-defined Boolean type, if desired.
The C preprocessor (the part of the compiler responsible for handling
#include and #define directives, and in fact all lines beginning with #)
is distinct enough that it almost represents a separate language and is covered
in its own chapter, Chapter 10.
The ANSI C Standardization committee (X3J11), in the process of clarifying
C’s definition and making it palatable to the world, introduced a number
of new features and made a few significant changes. Questions specific to
ANSI/ISO Standard C are collected in Chapter 11. If you had experience with
pre-ANSI C (also called “K&R” or “classic” C), you will find Chapter 11 to
be a useful introduction to the differences. If you are comfortably using ANSI
C, on the other hand, the distinction between pre-ANSI and ANSI features
may not be interesting. In any case, all of the questions in Chapter 11 that
also relate to other topics (declarations, the C preprocessor, library functions,
etc.) also appear in or are otherwise cross-referenced from those other chapters.
C’s definition is relatively spartan in part because many features are not
built in to the language but are accessed via library functions. The most
important of these are the “Standard I/O,” or “stdio” functions, which are
discussed in Chapter 12. Other library functions are covered in Chapter 13.
Chapters 14 and 15 discuss two more advanced topics: floating point and
variable-length argument lists. Floating-point computations tend to be tricky
no matter what system or language you’re using; Chapter 14 outlines a few
general floating-point issues and a few that are specific to C. The possibility
that a function can accept a varying number of arguments, though perhaps
arguably unnecessary or dangerous, is occasionally convenient and is central
to C’s printf function; techniques for dealing with variable-length argument
lists are discussed in Chapter 15.
Hiding in Chapter 16 are some questions you may want to jump to first if
you’re already comfortable with most of the preceding material: They concern
the occasional strange problems and mysterious bugs that crop up in a
program and can be agonizingly frustrating to track down.
When there are two or more equally “correct” ways of writing a program
(and there usually are), one may be preferable based on subjective criteria
having to do with more than whether the code simply compiles and runs correctly.
Chapter 17 discusses a few of these ephemeral issues of programming
style.
You can’t build C programs in isolation: You need a compiler, and you
may need some additional documentation, source code, or tools. Chapter 18
discusses some available tools and resources, including lint, a nearly forgotten
but once indispensable tool for checking certain aspects of program
correctness and portability.
As mentioned, the C language does not specify everything you necessarily
need to get a real program working. Questions such as “How do I read one
character without waiting for the Return key?” and “How do I find the size
of a file?” are extremely common, but C does not define the answers; these
operations depend on the facilities provided by the underlying operating system.
Chapter 19 presents a number of these questions, along with brief
answers for popular operating systems.
Finally, Chapter 20 collects the miscellaneous questions that don’t fit anywhere
else: bit manipulation, efficiency, algorithms, C’s relationship to other
languages, and a few trivia questions. (The introduction to Chapter 20 contains
a slightly more detailed breakdown of its disparate contents.)
To close this introduction, here are two preliminary questions not so much
about C but more about this book:
Question: Why should I buy this book, if it’s available for free on the
Internet?
Answer: This book contains over three times as much material as does the
version that’s posted to comp.lang.c, and in spite of the advantages of electronic
documentation, it really can be easier to deal with this amount of
information in a printed form. (You’d spend a lot of time downloading this
much information from the net and printing it, and the typography wouldn’t
be as pretty, either.)
Question: How do you pronounce “FAQ”?
Answer: I pronounce it “eff ay kyoo,” and this was, I believe, the original
pronunciation when FAQ lists were first “invented.” Many people now pronounce
it “fack,” which is nicely evocative of the word “fact.” I’d pronounce
the plural, as in the title of this book, “eff ay kyooze,” but many people pronounce
it like “fax.” None of these pronunciations are strictly right or wrong;
“FAQ” is a new term, and popular usage plays a certain role in shaping any
term’s evolution.
(It’s equally imponderable, by the way, whether “FAQ” refers to the question
alone, to the question plus its answer, or to the whole list of questions
and answers.)
But now, on with the real questions!
1. Declarations and Initializations
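A declaration consists of several parts (though not all are required): a storage
class, a base type, type qualifiers, and finally the declarators (which may also
contain initializers). Each declarator not only declares a new identifier but
also indicates whether the identifier is to be an array, pointer, function, or
some combination of these. The underlying idea is that the
declaration mimics the eventual use of the identifier. (Question 1.21 discusses
this relationship in more detail.)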
Basic Types
Some programmers are surprised to discover that even though C is a fairly
low-level language, its type system is nevertheless mildly abstract; the sizes
and representations of the basic types are not precisely defined by the language.
Answer: If you might need large values (above 32,767 or below -32,767),
use long. Otherwise, if space is very important (i.e., if there are large arrays
or many structures), use short. Otherwise, use int. If well-defined overflow
characteristics are important and negative values are not, or if you want to
steer clear of sign-extension problems when manipulating bits or bytes, use
one of the corresponding unsigned types. (Beware when mixing signed and
unsigned values in expressions, though; see question 3.19.)
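As a small sketch of the last two points: unsigned arithmetic is defined to wrap around, whereas a comparison that mixes signed and unsigned operands can give a surprising result. The function names are invented for this illustration, and some compilers will warn about the mixed comparison, which is rather the point.

```c
#include <limits.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1,
   so passing UINT_MAX yields 0 rather than overflowing. */
unsigned int wrapped_increment(unsigned int u)
{
    return u + 1u;
}

/* In -1 < 1u the int operand is converted to unsigned int,
   so -1 becomes UINT_MAX and the comparison yields 0. */
int minus_one_less_than_one(void)
{
    return -1 < 1u;
}
```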
Although character types (especially unsigned char) can be used as
“tiny” integers, doing so is sometimes more trouble than it’s worth. The compiler
will have to emit extra code to convert between char and int (making
the executable larger), and unexpected sign extension can be troublesome.
(Using unsigned char can help; see question 12.1 for a related problem.)
A similar space/time tradeoff applies when deciding between float and
double. (Many compilers still convert all float values to double during
expression evaluation.) None of the above rules apply if the address of a variable
is taken and must have a particular type.
It’s often incorrectly assumed that C’s types are defined to have certain,
exact sizes. In fact, what’s guaranteed is that:
These values imply that char is at least 8 bits, short int and int are
at least 16 bits, and long int is at least 32 bits. (The signed and unsigned
versions of each type are guaranteed to have the same size.) Under ANSI C,
the maximum and minimum values for a particular machine can be found in
the header file <limits.h> and are summarized in the following table:
The values in the table are the minimums guaranteed by the standard. Many
implementations allow larger values, but portable programs shouldn’t depend
on it.
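One way to see the guarantees on a particular machine is to compare the macros in <limits.h> against the minimum magnitudes (8 bits for char, 32,767 for short and int, 2,147,483,647 for long). This is only a sketch; the function name is made up for the example.

```c
#include <limits.h>

/* Returns 1 if this implementation's limits are at least the
   minimums the standard requires; on a conforming compiler
   every one of these comparisons holds. */
int limits_meet_minimums(void)
{
    return CHAR_BIT >= 8
        && SHRT_MAX >= 32767
        && INT_MAX  >= 32767
        && LONG_MAX >= 2147483647L;
}
```

A program that needs more than these minimums can test the same macros with #if and refuse to compile where they are not met.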
If for some reason you need to declare something with an exact size, be
sure to encapsulate the choice behind an appropriate typedef. Usually the
only good reason for needing an exact size is when attempting to conform to
some externally imposed storage layout. See also questions 1.3 and 20.5.
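A minimal sketch of hiding such a choice behind typedefs, assuming the names int16 and int32 and selecting the underlying types from <limits.h>. Testing the value range is only an approximation of "exactly 16 bits" (it ignores the possibility of padding bits), so treat this as an illustration rather than a complete solution.

```c
#include <limits.h>

/* Hypothetical exact-width names; only these lines change per machine. */
#if SHRT_MAX == 32767
typedef short int16;
#elif INT_MAX == 32767
typedef int int16;
#else
#error "no 16-bit integer type found"
#endif

#if INT_MAX == 2147483647
typedef int int32;
#elif LONG_MAX == 2147483647
typedef long int32;
#else
#error "no 32-bit integer type found"
#endif
```

On a machine with no matching type, the #error stops the compilation instead of silently choosing a type of the wrong size.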
Question: Why aren’t the sizes of the standard types precisely defined?
Question: Since C doesn’t define sizes exactly, I’ve been using typedefs like
int16 and int32. I can then define these typedefs to be int, short, long,
etc., depending on what machine I’m using. That should solve everything,
right?
Answer: If you truly need control over exact type sizes, this is the right
approach. However, you should remain aware of several things:
• An exact match might not be possible (on the occasional 36-bit machine,
for example).
Question: What should the 64-bit type on new, 64-bit machines be?
Answer: This is a sticky question. You can look at it in at least three ways:
1. On existing 16- and 32-bit systems, two of the three integer types (short
and plain int, or plain int and long) are typically the same size. A 64-
bit machine provides an opportunity to make all three types different sizes.
Therefore, some vendors support 64-bit long ints.
2. Sadly, a lot of existing code is written to assume that ints and longs are
the same size or that one or the other of them is exactly 32 bits. Rather
than risk breaking such code, some vendors introduce a new, nonstandard,
64-bit long long (or __longlong or __very long) type instead.
3. Finally, it can be argued that plain int should be 64 bits on a 64-bit
machine, since int traditionally reflects “the machine’s natural word size.”
Pointer Declarations
Although most questions about pointers come in Chapters 4 through 7, here
Since the * is part of the declarator, it’s best to use whitespace as shown;
writing char* invites mistakes and confusion.
See also question 1.13.
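To make the point concrete, here is a sketch contrasting the two spacings. The variable and function names are invented for this illustration.

```c
char *p1, *p2;   /* both p1 and p2 are pointers to char */
char* q1, q2;    /* q1 is a pointer to char, but q2 is a plain char */

/* Returns 1: the * belongs to the declarator q1 only,
   so q2 has the size of a char, not of a pointer. */
int q2_is_plain_char(void)
{
    return sizeof q2 == sizeof(char) && sizeof q1 == sizeof(char *);
}
```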
Question: I’m trying to declare a pointer and allocate some space for it, but
it’s not working. What’s wrong with this code?
char *p;
*p = malloc(10);
Answer: The pointer you declared is p, not *p. See question 4.2.
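A corrected version might look like the following sketch; the wrapper function make_buffer is hypothetical and exists only so that the fragment can be compiled and called.

```c
#include <stdlib.h>

/* Allocates n bytes and returns the pointer, or NULL on failure.
   The result of malloc is assigned to p itself, not to *p. */
char *make_buffer(size_t n)
{
    char *p;
    p = malloc(n);
    return p;
}
```

The caller should check the result against NULL before using it and pass it to free when finished.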
Declaration Style
Declaring functions and variables before using them is not done just to keep
the compiler happy; it also injects useful order into a programming project.
and other difficulties can be more easily avoided, and the compiler can more
Question: What’s the best way to declare and define global variables?
extern int i;
int i = 0;
int f()
{
    return 1;
}
When you need to share variables or functions across several source files, you
will of course want to ensure that all definitions and declarations are consistent.
The best arrangement is to place each definition in a relevant .c file.
Then, put an external declaration in a header (“.h”) file and use #include to
bring the header and the declaration in wherever needed. The .c file containing
the definition should also include that same header file, so that the compiler
can check that the definition matches the declarations.
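The arrangement can be sketched in a single listing, with comments marking where the file boundaries would fall. The file names counter.h and counter.c, and the identifiers counter and next_count, are assumptions made for the example.

```c
/* ---- counter.h (hypothetical header): declarations only ---- */
extern int counter;
extern int next_count(void);

/* ---- counter.c: would begin with  #include "counter.h"  ---- */
int counter = 0;            /* the one definition of counter */

int next_count(void)        /* the one definition of next_count */
{
    return ++counter;
}
```

Because the definitions are compiled with the declarations in view, a mismatch (say, changing counter to long in only one place) draws a diagnostic instead of going unnoticed.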
This rule promotes a high degree of portability: It is consistent with the
requirements of the ANSI/ISO C Standard and is also consistent with most
pre-ANSI compilers and linkers. (UNIX compilers and linkers typically use a
“common model,” which allows multiple definitions as long as at most one
is initialized; this behavior is mentioned as a “common extension” by the
standard, no pun intended. A few old, nonstandard systems may require an
explicit initializer to distinguish a definition from an external declaration.)
It is possible to use preprocessor tricks to arrange that a line such as
DEFINE(int, i);
need be entered only once in one header file and turned into a definition or a
declaration, depending on the setting of some macro. It’s not clear, though,
whether this is worth the trouble, especially since it’s usually a better idea to
keep global variables to a minimum.
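One possible form of the trick is sketched below. The controlling macro DEFINE_GLOBALS and the header name globals.h are assumptions; the idea is that exactly one source file defines the macro before including the header.

```c
/* Set in exactly one source file, before the header is included. */
#define DEFINE_GLOBALS

/* ---- globals.h (hypothetical) ---- */
#ifdef DEFINE_GLOBALS
#define DEFINE(type, name) type name          /* becomes a definition */
#else
#define DEFINE(type, name) extern type name   /* becomes a declaration */
#endif

DEFINE(int, i);
```

This simple form has no way to attach an initializer, which is one more reason to doubt that the trick is worth the trouble.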
It’s more than a good idea to put global declarations in header files: If you
want the compiler to be able to catch inconsistent declarations for you, you
must place them in header files. In particular, never place a prototype for an
external function in a .c file; if the definition of the function ever changes, it
would be too easy to forget to change the prototype, and an incompatible
prototype is worse than useless.
See also questions 1.24, 10.6, 17.2, and 18.8.
Question: How can I make a sort of “semiglobal” variable, that is, one
that’s private to a few functions spread across a few source files?
1. Pick a unique prefix for the names of all functions and global variables in
a library or package of related functions, and warn users of the package
not to define or use any symbols with names matching that prefix other
than those documented as being for public consumption. (In other words,
an undocumented but otherwise global symbol with a name matching that
prefix is, by convention, “private.”)
2. Use a name beginning with an underscore, since such names shouldn’t be
used by ordinary code. (See question 1.29 for more information and for a
description of the “no man’s land” between the user and implementation
namespaces.)
It may also be possible to use special linker invocations to adjust the visibility
of names, but any such techniques are outside of the scope of the C language.
Storage Classes
We’ve covered two parts of declarations: base types and declarators. The next
few questions discuss the storage class, which determines visibility and lifetime
1.10
Question: Do all declarations for the same static function or variable
have to include the storage class static?
Answer: The language in the standard does not quite require this (what’s
most important is that the first declaration contain static), but the rules are
rather intricate and differ slightly for functions and for data objects.
(Furthermore, existing practice has varied widely in this area.) Therefore, it’s
safest if static appears consistently in the definition and all declarations.
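A sketch of the consistent style, using invented names: the forward declaration and the definition of the private function both say static.

```c
static int helper(int);      /* forward declaration carries static */
static int calls = 0;        /* file-scope object, private to this file */

int twice(int x)
{
    return helper(x) + helper(x);
}

static int helper(int x)     /* the definition repeats static */
{
    calls++;
    return x;
}
```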
Answer: The storage class extern is significant only with data declarations.
In function declarations, it can be used as a stylistic hint to indicate that
the function’s definition is probably in another source file, but there is no
formal difference between
extern int f();
and
int f();
1.12
Question: What’s the auto keyword good for?
Answer: Nothing; it’s archaic. (It’s a holdover from C’s typeless predecessor
language B, which lacked keywords such as int and which required that
every declaration have a storage class.) See also question 20.37.
Typedefs
Although it is syntactically a storage class, the typedef keyword is, as its
name suggests, involved in defining new type names, not new functions or
variables.
1.13
Question: What’s the difference between using a typedef or a preprocessor
macro for a user-defined type?
Answer: In general, typedefs are preferred, in part because they can correctly
encode pointer types. For example, consider these declarations:
In these declarations, s1, s2, and s3 are all declared as char *, but s4
is declared as a char, which is probably not the intention. (See also question
1.5.)
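A set of declarations matching this description might look like the following sketch. The names String_t and String_d are assumptions, as is the checking function.

```c
typedef char *String_t;      /* typedef: a genuine type name */
#define String_d char *      /* macro: plain text substitution */

String_t s1, s2;             /* both are char * */
String_d s3, s4;             /* expands to: char * s3, s4; */

/* Returns 1: after macro expansion the * attaches to s3 only. */
int s4_is_plain_char(void)
{
    return sizeof s4 == sizeof(char) && sizeof s3 == sizeof(char *);
}
```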
Preprocessor macros do have the advantage that #ifdef works on them
(see also question 10.15). On the other hand, typedefs have the advantage
that they obey scope rules (that is, they can be declared local to a function or
a block).
See also questions 1.17, 2.22, 11.11, and 15.11.
1.14
Question: I can’t seem to declare a linked list successfully. I tried this
declaration, but the compiler gave me error messages.
typedef struct {
char *item;
NODEPTR next;
} *NODEPTR;
typedef struct node {
    char *item;
    struct node *next;
} *NODEPTR;
You could also precede the structure declaration with the typedef, in which case
you could use the NODEPTR typedef when declaring the next field, after all:
typedef struct node *NODEPTR;
struct node {
    char *item;
    NODEPTR next;
};
In this case, you declare a new typedef name involving struct node even
though struct node has not been completely defined yet; this you’re
allowed to do.
*In the simple example typedef struct { int i; } simplestruct; both the structure and its typedef
name (“simplestruct”) are defined at the same time; note that there is no structure tag.
struct node {
    char *item;
    struct node *next;
};
typedef struct node *NODEPTR;
1.15
Question: How can I define a pair of mutually referential structures? I tried
typedef struct {
int afield;
BPTR bpointer;
} *APTR;
typedef struct {
int bfield;
APTR apointer;
} *BPTR;
but the compiler doesn’t know about BPTR when it is used in the first structure
declaration.
Answer: As in question 1.14, the problem lies not in the structures or the
pointers but in the typedefs. First, give the two structures tags, and define the
link pointers without using typedefs:
struct a {
int afield;
struct b *bpointer;
};
struct b {
int bfield;
struct a *apointer;
};
The compiler can accept the field declaration struct b *bpointer within
struct a, even though it has not yet heard of struct b (which is “incomplete”
at that point). Occasionally, it is necessary to precede this couplet with
the line
struct b;
This empty declaration masks the pair of structure declarations (if in an inner
scope) from a different struct b in an outer scope.
After declaring the two structures with structure tags, you can then declare
the typedefs separately:
typedef struct a *APTR;
typedef struct b *BPTR;
Alternatively, you can define the typedefs before the structure definitions, in
which case you can use them when declaring the link pointer fields:
typedef struct a *APTR;
typedef struct b *BPTR;
struct a {
int afield;
BPTR bpointer;
};
struct b {
int bfield;
APTR apointer;
};
1.16
Question: What’s the difference between these two declarations?
struct x1 { ... };
1.17
Question: What does “typedef int (*funcptr)();” mean?
1.18
Question: I’ve got the declarations
const charp p;
1.19
Question: Why can’t I use const values in initializers and array dimensions,
like this?
const int n = 5;
int a[n];
1.20
Question: How do const char *p, char const *p, and char *
const p differ?
Complex Declarations
C’s declarations can be arbitrarily complex. Once you’re used to deciphering
them, you can make sense of even the most complicated ones, although truly
don’t feel like cluttering your program with arcane declarators such as
*(*(*a[N])())(), you can always use a few typedefs to clarify things, as
Answer: The first part of this question can be answered in at least three
ways:
1. char *(*(*a[N])())();
3. Use cdecl, which turns English into C and vice versa. You provide a
longhand description of the type you want, and cdecl responds with the
equivalent C declaration:
The cdecl program can also explain complicated declarations (you give
it a complicated declaration and it responds with an English description),
help with casts, and indicate which set of parentheses the arguments go in
(for complicated function definitions, like the one here). Versions of
cdecl are in volume 14 of comp.sources.unix (see question 18.16) and
K&R2.
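The same type as in option 1 can also be built up step by step with a chain of typedefs, roughly as in the sketch below. The typedef names, the value of N, and the helper functions are invented for the illustration, and the parameter lists are written as (void) rather than left empty.

```c
#define N 3

typedef char *pc;            /* pointer to char */
typedef pc fpc(void);        /* function returning pointer to char */
typedef fpc *pfpc;           /* pointer to the above */
typedef pfpc fpfpc(void);    /* function returning that pointer */
typedef fpfpc *pfpfpc;       /* pointer to the above */

pfpfpc a[N];                 /* array of N such pointers */

static char text[] = "ok";
static char *get_text(void)   { return text; }
static pfpc  get_getter(void) { return get_text; }

/* Stores a function in a[0], then calls through both levels. */
char *call_through_first(void)
{
    a[0] = get_getter;
    return (*(*a[0])())();
}
```

Each typedef adds exactly one layer, so the chain can be read from top to bottom as an English description of the final type.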
char *pc;
the base type is char, the identifier is pc, and the declarator is *pc; this tells
us that *pc is a char (this is what “declaration mimics use” means).
One way to make sense of complicated C declarations is by reading them
“inside out,” remembering that [] and () bind more tightly than *. For
example, given
char *(*pfpc)();
we can see that pfpc is a pointer (the inner *) to a function (the ()) to a
pointer (the outer *) to char. When we later use pfpc, the expression
*(*pfpc)() (the value pointed to by the return value of a function pointed
to by pfpc) will be a char.
Another way of analyzing these declarations is to decompose the declarator
while composing the description, maintaining the “declaration mimics
use” relationship:
*(*pfpc)() is a char
(*pfpc)() is a pointer to char
(*pfpc) is a function returning pointer to char
pfpc is a pointer to function returning pointer to char
If you’d like to make things clearer when declaring complicated types like
these, you can make the analysis explicit by using a chain of typedefs, as in
the preceding option 2.
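The “declaration mimics use” relationship for pfpc can be checked with a short sketch. The string and the function get_word are assumptions added so that pfpc has something to point to, and the parameter list is written as (void).

```c
static char word[] = "hello";

static char *get_word(void)
{
    return word;
}

/* pointer to function returning pointer to char */
char *(*pfpc)(void) = get_word;

/* The expression *(*pfpc)() has the same shape as the declarator
   and yields a char: here, the first character of word. */
char first_char(void)
{
    return *(*pfpc)();
}
```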
The pointer-to-function declarations in these examples have not included
information about parameter type. When the parameters have complicated
types, declarations can really get messy. (Modern versions of cdecl can help
here, too.)
*Furthermore, a storage class (static, register, etc.) may appear along with the base type, and type
qualifiers (const, volatile) may be interspersed with both the type and the declarator. See also
question 11.9.
Answer: You can’t quite do it directly. One way is to have the function
return a generic function pointer (see question 4.13), with some judicious
casts to adjust the types as the pointers are passed around:
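typedef int (*funcptr)();         /* generic function pointer */
typedef funcptr (*ptrfuncptr)();  /* pointer to function returning funcptr */
funcptr start(), stop(), state1();

int statemachine()
{
    ptrfuncptr state = start;
    while(state != stop)
        state = (ptrfuncptr)(*state)();
    return 0;
}

funcptr start()
{
    return (funcptr)state1;
}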
Another way (suggested by Paul Eggert, Eugene Ressler, Chris Volpe, and
perhaps others) is to have each function return a structure containing only a
pointer to a function returning that structure:
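struct functhunk {
    struct functhunk (*func)();
};
struct functhunk start(), stop(), state1();

int statemachine()
{
    struct functhunk state = {start};
    while(state.func != stop)
        state = (*state.func)();
    return 0;
}

struct functhunk start()
{
    struct functhunk ret;
    ret.func = state1;
    return ret;
}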
Note that these examples use the older, explicit style of calling via function
pointers; see question 4.12. See also question 1.17.
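For experimenting, the structure-based approach can be fleshed out into a complete fragment. In this sketch the step counter, the body of state1, and the name run_machine are assumptions added only so that the machine has something to do and can be called.

```c
struct functhunk {
    struct functhunk (*func)(void);
};

static int steps;   /* counts the state functions actually called */

static struct functhunk stop(void);
static struct functhunk state1(void);

static struct functhunk start(void)
{
    struct functhunk ret;
    steps++;
    ret.func = state1;
    return ret;
}

static struct functhunk state1(void)
{
    struct functhunk ret;
    steps++;
    ret.func = stop;
    return ret;
}

/* stop is never called; its address only marks the end of the run. */
static struct functhunk stop(void)
{
    struct functhunk ret;
    ret.func = stop;
    return ret;
}

/* Runs the machine from start and returns the number of states visited. */
int run_machine(void)
{
    struct functhunk state;
    state.func = start;
    steps = 0;
    while (state.func != stop)
        state = (*state.func)();
    return steps;
}
```

No casts are needed here, because every state function has exactly the type that the structure member expects.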
Array Sizes
1.23
Question: Can I declare a local array (or parameter array) of a size match¬
ing a passed-in array or set by another parameter?
1.24
Question: I have an extern array defined in one file and used in another:
file1.c: file2.c:
1. Declare a companion variable, containing the size of the array, defined and
initialized (with sizeof) in the same source file where the array is
defined:
file1.c: file2.c:
file1.h:
#define ARRAYSZ 3
file1.c: file2.c:
3. Use a sentinel value (typically, 0, -1, or NULL) in the array’s last element,
so that code can determine the end without an explicit size indication:
file1.c: file2.c:
Obviously, the choice will depend to some extent on whether the array was
already being initialized; if it was, option 2 is poor. See also question 6.21.
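The three approaches can be sketched side by side. This is a single-file illustration, not the original two-file listings: the array names, the element-count convention for array1sz, and the helper functions are all assumptions. In real use the definitions would sit in one source file and the matching extern declarations in another.

```c
#include <stddef.h>

/* Option 1: a companion variable defined next to the array. */
int array1[] = {1, 2, 3};
size_t array1sz = sizeof(array1) / sizeof(array1[0]);   /* element count */

/* Option 2: a shared manifest constant (it would live in a header). */
#define ARRAYSZ 3
int array2[ARRAYSZ] = {1, 2, 3};

/* Option 3: a sentinel (-1 here) in the last element. */
int array3[] = {1, 2, 3, -1};

/* What code in another source file could do once it knows the size. */
long sum_with_size(const int *a, size_t n)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* What it could do with only the sentinel to go on. */
long sum_until_sentinel(const int *a)
{
    long sum = 0;

    while (*a != -1)
        sum += *a++;
    return sum;
}
```

The other file would declare `extern int array1[]; extern size_t array1sz;` for option 1, include the shared header for option 2, or need only `extern int array3[];` for option 3.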
Declaration Problems
Sometimes the compiler insists on complaining about your declarations no
matter how carefully constructed you thought they were. These questions
1.25
Question: Why is my compiler complaining about an invalid redeclaration
of a function that I define and call only once?
1.26
Question: My compiler is complaining about mismatched function prototypes
that look fine to me. Why?
1.27
Question: I’m getting strange syntax errors on the very first declaration in a
file, but it looks fine. Why?
1.28
Question: Why isn’t my compiler letting me declare a large array, such as:
double array[256][256];
Namespace
Naming never seems as though it should be that much of a problem, but of
course it can be. Coming up with names for functions and variables isn’t
you don’t have to worry whether the public will like the names in your programs—but
you do have to be more sure that the names aren’t already taken.
1.29
Question: How can I determine which identifiers are safe for me to use and
which are reserved?
“These concerns apply not only to public symbols but also to the names of the implementation’s various inter¬
The rules, paraphrased from ANSI §4.1.2.1 (ISO §7.1.3), are as follows:
Rules 3 and 4 are further complicated by the fact that several sets of macro
names and standard library identifiers are reserved for “future directions”;
that is, later revisions of the standard may define new names matching certain
patterns. This table lists the patterns reserved for “future directions”
whenever a given standard header is included:
The notation [A-Z] means “any uppercase letter”; similarly, [a-z] and
[0-9] indicate lowercase letters and digits. The notation * means “anything.”
For example, if you include <stdlib.h>, all external identifiers
beginning with the letters str followed by a lowercase letter are reserved.
What do these five rules really mean? If you want to be on the safe side:
Initialization
A declaration of a variable can, of course, also contain an initial value for
Question: What can I safely assume about the initial values of variables that
are not explicitly initialized? If global variables start out as “zero,” is that
good enough for null pointers and floating-point zeroes?
This requirement means that compilers and linkers on machines that use nonzero internal representations for
null pointers or floating-point zeroes cannot necessarily make use of uninitialized, 0-filled memory but must
emit explicit initializers for these values (rather as if the programmer had).
Initializers are not effective if you jump into the middle of a block with either a goto or a switch.
Initializers are therefore never effective on variables declared in the main block of a switch statement.
Early printings of K&R2 incorrectly stated that partially initialized automatic aggregates were filled out with
garbage.
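The guarantees discussed here can be checked with a short sketch. The variable and function names are hypothetical. The point is that variables with static duration start out as properly typed zeroes (0, 0.0, or a null pointer, whatever the machine's internal representation), and that the uninitialized tail of a partially initialized automatic aggregate is zero as well.

```c
#include <stddef.h>

static int    counter;     /* starts at 0 */
static double ratio;       /* starts at 0.0 */
static char  *name;        /* starts as a null pointer */
static struct { int n; double d; void *p; } rec;   /* every member zeroed */

/* Return 1 if all the uninitialized statics have their guaranteed values. */
int statics_are_zero(void)
{
    return counter == 0 && ratio == 0.0 && name == NULL
        && rec.n == 0 && rec.d == 0.0 && rec.p == NULL;
}

/* A partially initialized automatic array: the rest is zero, not garbage. */
int partial_tail_is_zero(void)
{
    int a[5] = {1, 2};

    return a[2] == 0 && a[3] == 0 && a[4] == 0;
}
```

No such guarantee covers automatic variables with no initializer at all, or memory obtained from malloc.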
f()
{
    char a[] = "Hello, world!";
}
Answer: Perhaps you have a pre-ANSI compiler, which doesn’t allow initialization
of “automatic aggregates” (non-static local arrays, structures,
and unions). You have four possible workarounds:
1. If the array won’t be written to or if you won’t need a fresh copy during
any subsequent calls, you can declare it static (or perhaps make it
global).
2. If the array won’t be written to, you could replace it with a pointer:
f()
{
char *a = "Hello, world!";
}
You can always initialize local char * variables to point to string literals
(but see question 1.32).
f()
{
    char a[14];
    strcpy(a, "Hello, world!");
}
1.32
Question: What is the difference between these initializations?
Question: I finally figured out the syntax for declaring pointers to functions,
but now how do I initialize one?
Structures, unions, and enumerations are all similar in that they let you define
new data types. First, you define the new type by declaring the members or
At the same time, you may optionally give the new type a tag by which it can
be referred to later. Having defined a new type, you can declare instances of
it, either at the same time the type is defined or later (using the tag).
To complicate matters, you can also use typedef to define new type
names for user-defined types, just as you can for all types. If you do, though,
you’ll need to realize that the typedef name has nothing to do with the tag
through 2.18 cover structures, 2.19 and 2.20 cover unions, 2.22 through 2.24
STRUCTURES, UNIONS, AND ENUMERATIONS 31
Structure Declarations
struct x1 { ... };
typedef struct { ... } x2;
Answer: The first form declares a structure tag; the second declares a typedef.
The main difference is that the second declaration is of a slightly more
abstract type—its users don’t necessarily know that it is a structure, and the
keyword struct is not used when declaring instances of it:
x2 b;
Structures declared with tags, on the other hand, must be defined with the
struct x1 a;
form.*
(It’s also possible to play it both ways:
It’s legal, if potentially obscure, to use the same name for both the tag and the
typedef, since they live in separate namespaces. See question 1.29.)
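A compact sketch of the three styles follows. The single member n, the name x3, and the function name are placeholders chosen for illustration.

```c
struct x1 { int n; };               /* declares the tag x1 */
typedef struct { int n; } x2;       /* declares the typedef name x2 */
typedef struct x3 { int n; } x3;    /* both: tag x3 and typedef name x3 */

int tag_and_typedef_demo(void)
{
    struct x1 a = { 1 };    /* a tag needs the struct keyword */
    x2 b = { 2 };           /* a typedef name does not */
    struct x3 c = { 3 };    /* either spelling works for x3 */
    x3 d = { 4 };

    return a.n + b.n + c.n + d.n;
}
```

Writing `x1 a;` or `struct x2 b;` here would not declare what was intended: x1 is not a type name in C, and `struct x2` would name a different, incomplete structure type.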
struct x { ... };
x thestruct;
Answer: C is not C++. Typedef names are not automatically generated for
structure tags. Actual structures are declared in C with the struct keyword:
struct x thestruct;
*It may be worth mentioning that this entire distinction is absent in C++ and perhaps in some C++ compilers
masquerading as C compilers. In C++, structure tags are essentially declared as typedefs automatically.
If you wish, you can declare a typedef when you declare a structure and use
the typedef name to declare actual structures:
tx thestruct;
Answer: Most certainly. A problem can arise if you try to use typedefs; see
questions 1.14 and 1.15.
Answer: One good way is for clients to use structure pointers (perhaps additionally
hidden behind typedefs) that point to structure types that are not
publicly defined. In other words, a client uses structure pointers (and calls
functions accepting and returning structure pointers) without knowing anything
about what the fields of the structure are. As long as the details of the
structure aren’t needed—i.e., as long as the -> and sizeof operators are not
used—C can in fact handle pointers to structures of incomplete type. Only
within the source files implementing the abstract data type are complete declarations
for the structures actually in scope.
See also question 11.5.
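A minimal sketch of such an abstract data type, collapsed into one listing for brevity. The Counter type, its functions, and the suggested header name are all hypothetical. In practice the first part would go in a public header and the second part in a single implementation file.

```c
#include <stdlib.h>

/* ---- what a public header (say, counter.h) would contain ---- */
struct counter;                     /* incomplete type: no members visible */
typedef struct counter *Counter;    /* optional: hide the pointer, too */

Counter counter_new(int start);
void    counter_bump(Counter c);
int     counter_value(Counter c);
void    counter_free(Counter c);

/* ---- what only the implementation file would contain ---- */
struct counter {
    int value;
};

Counter counter_new(int start)
{
    Counter c = malloc(sizeof *c);

    if (c != NULL)
        c->value = start;
    return c;
}

void counter_bump(Counter c)  { c->value++; }
int  counter_value(Counter c) { return c->value; }
void counter_free(Counter c)  { free(c); }
```

A client that sees only the header part can hold and pass Counter values, but any attempt to write `c->value` or `sizeof *c` there fails to compile, which is the encapsulation being sought.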
Question: I came across some code that declared a structure like this:
struct name {
int namelen;
char namestr[1];
};
and then did some tricky allocation to make the namestr array act as if it
had several elements, with the number recorded by namelen. How does this
work? Is it legal or portable?
Answer: It’s not clear if it’s legal or portable, but it is rather popular. An
implementation of the technique might look something like this:
#include <stdlib.h>
#include <string.h>
struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-1 + strlen(newname)+1);
            /* -1 for initial [1]; +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}
This function allocates an instance of the name structure with the size
adjusted so that the namestr field can hold the requested name (not just one
character, as the structure declaration would suggest).
#include <stdlib.h>
#include <string.h>
#define MAX 100     /* illustrative value; see below */
struct name {
    int namelen;
    char namestr[MAX];
};
struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-MAX+strlen(newname)+1);
            /* +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}
Here MAX should of course be larger than any name that will be stored. However,
it looks as though this technique is disallowed by a strict interpretation
of the standard as well.
Of course, the right thing to do to be truly safe is use a character pointer
instead of an array:
#include <stdlib.h>
#include <string.h>
struct name {
    int namelen;
    char *namep;
};
struct name *makename(char *newname)
{
    struct name *ret = malloc(sizeof(struct name));
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        ret->namep = malloc(ret->namelen + 1);
        if(ret->namep == NULL) {
            free(ret);
            return NULL;
        }
        strcpy(ret->namep, newname);
    }
    return ret;
}
Obviously, the “convenience” of having the length and the string stored in the
same block of memory has now been lost, and freeing instances of this structure
will require two calls to free; see question 7.23.
When the data type being stored is characters, as in the preceding examples,
it is straightforward to coalesce the two calls to malloc into one, to
preserve contiguity (and therefore rescue the ability to use a single call to
free):
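struct name *makename(char *newname)
{
    char *buf = malloc(sizeof(struct name) +
                strlen(newname) + 1);
    struct name *ret = (struct name *)buf;
    if(ret == NULL)
        return NULL;
    ret->namelen = strlen(newname);
    ret->namep = buf + sizeof(struct name);
    strcpy(ret->namep, newname);
    return ret;
}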
However, piggybacking a second region onto a single malloc call like this
is portable only if the second region is to be treated as an array of char. For
any larger type, alignment (see questions 2.12 and 16.7) becomes significant
and would have to be preserved.
Structure Operations
Answer: K&R1 also said that the restrictions on structure operations would
be lifted in a forthcoming version of the compiler, and in fact the operations
of assigning structures, passing structures as function arguments, and returning
structures from functions were fully functional in Ritchie’s compiler even
as K&R1 was being published. Although a few early C compilers lacked these
operations, all modern compilers support them, and they are part of the standard,
so there should be no reluctance to use them.*
(Note that when a structure is assigned, passed, or returned, the copying is
done monolithically. This means that the copies of any pointer fields will
point to the same place as the original. In other words, anything pointed to is
not copied.)
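The following sketch shows all three operations and the shallow nature of the copy. The struct label type and both function names are made up for the example.

```c
#include <string.h>

struct label {
    int   id;
    char *text;     /* points at storage the structure does not own */
};

/* Structures are passed and returned by value. */
struct label renumber(struct label in, int newid)
{
    in.id = newid;      /* modifies only the local copy */
    return in;
}

/* Returns 1 if assignment copied the pointer but not what it points to. */
int copy_is_shallow(void)
{
    char buf[] = "old";
    struct label a, b;

    a.id = 1;
    a.text = buf;
    b = renumber(a, 2);     /* pass, return, and assign in one statement */

    strcpy(buf, "new");     /* change the shared text through buf */
    return a.id == 1 && b.id == 2
        && a.text == b.text && strcmp(b.text, "new") == 0;
}
```

If b needed its own copy of the text, the code would have to allocate and copy it explicitly; structure assignment will not do so.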
See the code fragments in question 14.11 for an example of structure operations
in action.
*However, passing large structures to and from functions can be expensive (see question 2.9), so you may want
to consider using pointers instead (as long as you don’t need pass-by-value semantics, of course).
2.10
Question: How can I pass constant values to functions that accept structure
arguments? How can I create nameless, immediate, constant structure values?
2.11
Question: How can I read/write structures from/to data files?
and a corresponding fread invocation can read it back in. What happens here
is that fwrite receives a pointer to the structure and writes (or fread
correspondingly reads) the memory image of the structure as a stream of bytes. The
sizeof operator determines how many bytes the structure occupies.
This call to fwrite is correct under an ANSI compiler as long as a prototype
for fwrite is in scope, usually because <stdio.h> is included.
Under pre-ANSI C, a cast on the first argument is required:
important concern if the data files you’re writing will ever be interchanged
between machines. See also questions 2.12 and 20.5.
Note also that if the structure contains any pointers (char * strings or
pointers to other data structures), only the pointer values will be written, and
they are most unlikely to be valid when read back in. Finally, note that for
widespread portability, you must use the "b" flag when opening the files; see
question 12.38.
A more portable solution, although it’s a bit more work initially, is to write
a pair of functions for writing and reading a structure, field by field, in a
portable (perhaps even human-readable) way.
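One possible shape for such a pair of functions, using a human-readable text format. The struct point type, the function names, and the one-record-per-line layout are assumptions made for this sketch, not a prescribed format.

```c
#include <stdio.h>

struct point {      /* hypothetical structure used for illustration */
    int x, y;
    double weight;
};

/* Write one record as a line of text; returns 0 on success, -1 on error. */
int write_point(FILE *fp, const struct point *p)
{
    return fprintf(fp, "%d %d %.17g\n", p->x, p->y, p->weight) < 0 ? -1 : 0;
}

/* Read one record written by write_point; returns 0 on success, -1 otherwise. */
int read_point(FILE *fp, struct point *p)
{
    return fscanf(fp, "%d %d %lf", &p->x, &p->y, &p->weight) == 3 ? 0 : -1;
}
```

Because each field is converted separately, the file no longer depends on the structure's padding, byte order, or field sizes, at the cost of some extra code and conversion time.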
Structure Padding
Answer: Many machines access values in memory most efficiently when the
values are appropriately aligned. For example, on a byte-addressed machine,
short ints of size 2 might best be placed at even addresses, and long
ints of size 4 at addresses that are multiples of 4. Some machines cannot
perform unaligned accesses at all and require that all data be appropriately
aligned.
Suppose that you declare this structure:
struct {
char c;
int i;
};
The compiler will usually leave an unnamed, unused hole between the char
and int fields to ensure that the int field is properly aligned. (This incremental
alignment of the second field based on the first relies on the fact that
the structure itself is always properly aligned, with the most conservative
alignment requirement. The compiler guarantees this alignment for structures
it allocates, as does malloc.)
Your compiler may provide an extension to give you control over the packing
of structures (i.e., whether they are padded), perhaps with a #pragma
(see question 11.20), but there is no standard method.
If you’re worried about wasted space, you can minimize the effects of
padding by ordering the members of a structure from largest to smallest. You
can sometimes get more control over size and alignment by using bitfields,
although they have their own drawbacks (see question 2.26).
See also questions 16.7 and 20.5.
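The effect of member ordering can be observed with offsetof and sizeof. The two structure types below are invented for the example, and the actual sizes depend on the machine, so the sketch deliberately makes no claim about particular numbers.

```c
#include <stddef.h>

struct loose {      /* small and large members interleaved */
    char   c1;
    double d;
    char   c2;
};

struct tight {      /* same members, largest first */
    double d;
    char   c1;
    char   c2;
};

/* Bytes of padding the compiler placed between c1 and d in struct loose. */
size_t hole_after_c1(void)
{
    return offsetof(struct loose, d)
         - (offsetof(struct loose, c1) + sizeof(char));
}
```

On machines that align double more strictly than char, hole_after_c1() is nonzero and sizeof(struct tight) comes out smaller than sizeof(struct loose); on a machine with no alignment requirements the two sizes would be equal.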
2.13
Question: Why does sizeof report a larger size than I expect for a structure
type, as if there were padding at the end?
Answer: Structures may have this padding (as well as internal padding), if
necessary, to ensure that alignment properties will be preserved when an
array of contiguous structures is allocated. Even when the structure is not
part of an array, the end padding remains, so that sizeof can always return
a consistent size. See question 2.12.
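A small sketch makes the array argument concrete. The struct rec type and the function names are hypothetical.

```c
#include <stddef.h>

struct rec {
    int  i;
    char c;     /* any padding after c is counted by sizeof */
};

/* Bytes of padding at the end of struct rec (possibly 0). */
size_t tail_padding(void)
{
    return sizeof(struct rec) - (offsetof(struct rec, c) + sizeof(char));
}

/* Distance in bytes between consecutive array elements. */
size_t element_stride(void)
{
    struct rec a[2];

    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```

The stride between elements always equals sizeof(struct rec), which is why the end padding has to be part of the reported size: without it, the int member of a[1] could land on a misaligned address.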
Accessing Members
2.14
Question: How can I determine the byte offset of a field within a structure?
2.15
Question: How can I access structure fields by name at run time?
Answer: Build a table of names and offsets, using the offsetof() macro.
The offset of field b in struct a is:
offsetb = offsetof(struct a, b)
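A table-driven lookup along these lines might look as follows. This is a sketch under simplifying assumptions: struct a is given two int fields, every field in the table is assumed to have type int, and field_by_name is a made-up helper.

```c
#include <stddef.h>
#include <string.h>

struct a {      /* hypothetical structure; all fields are int */
    int b;
    int c;
};

static const struct {
    const char *name;
    size_t offset;
} fields[] = {
    { "b", offsetof(struct a, b) },
    { "c", offsetof(struct a, c) },
};

/* Return a pointer to the named int field of *sp, or NULL if unknown. */
int *field_by_name(struct a *sp, const char *name)
{
    size_t i;

    for (i = 0; i < sizeof fields / sizeof fields[0]; i++)
        if (strcmp(fields[i].name, name) == 0)
            return (int *)((char *)sp + fields[i].offset);
    return NULL;
}
```

With this in place, `*field_by_name(&s, "c") = 42;` sets s.c using a name chosen at run time. A structure with fields of several types would also need a type code in each table entry.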
2.16
Question: Does C have an equivalent to Pascal’s with statement?
2.17
Question: If an array name acts like a pointer to the base of an array, why
isn’t the same thing true of a structure?
Answer: The rule (see question 6.3) that causes array references to “decay”
into pointers is a special case that applies only to arrays and that reflects their
“second-class” status in C. (An analogous rule applies to functions.) Structures,
however, are first-class objects: When you mention a structure, you get
the entire structure.
2.18
Question: This program works correctly, but it dumps core after it finishes.
Why?
struct list {
    char *item;
    struct list *next;
}
main(argc, argv)
{ ... }
Unions
2.19
Question: What’s the difference between a structure and a union?
2.20
Question: Is there a way to initialize unions?
Answer: The ANSI/ISO C Standard allows an initializer for the first member
of a union. There is no standard way of initializing any other member.
(Under a pre-ANSI compiler, there is generally no way of initializing a union
at all.)
Many proposals have been advanced to allow more flexible union initialization,
but none has been adopted yet. (The GNU C compiler provides initialization
of any union member as an extension, and this feature is likely to
make it into a future revision of the C standard.) If you’re really desperate,
you can sometimes define several variant copies of a union, with the members
in different orders, so that you can declare and initialize the one having the
appropriate first member. (These variants are guaranteed to be implemented
compatibly, so it’s okay to “pun” them by initializing one and then using the
other.)
2.21
Question: Is there an automatic way to keep track of which field of a union
is in use?
struct taggedunion {
    enum {UNKNOWN, INT, LONG, DOUBLE, POINTER} code;
    union {
        int i;
        long l;
        double d;
        void *p;
    } u;
};
You will have to make sure that the code field is always set appropriately
when the union is written to; the compiler won’t do any of this for you automatically.
(C unions are not like Pascal variant records.)
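One way to keep the tag and the union in step is to route every store through small helper functions. The sketch below assumes enumerators named UNKNOWN, INT, LONG, DOUBLE, and POINTER for the code field; the helpers set_int, set_double, and as_double are hypothetical.

```c
enum tag { UNKNOWN, INT, LONG, DOUBLE, POINTER };

struct taggedunion {
    enum tag code;
    union {
        int i;
        long l;
        double d;
        void *p;
    } u;
};

/* Setters keep code and u in step, since the compiler will not. */
void set_int(struct taggedunion *t, int i)       { t->code = INT;    t->u.i = i; }
void set_double(struct taggedunion *t, double d) { t->code = DOUBLE; t->u.d = d; }

/* Convert whatever is stored to double; returns 0 on success, -1 otherwise. */
int as_double(const struct taggedunion *t, double *out)
{
    switch (t->code) {
    case INT:    *out = t->u.i; return 0;
    case LONG:   *out = t->u.l; return 0;
    case DOUBLE: *out = t->u.d; return 0;
    default:     return -1;     /* UNKNOWN or POINTER: no numeric value */
    }
}
```

Readers of the union consult code first and touch only the member it names; nothing stops other code from bypassing the helpers, so this remains a convention rather than a guarantee.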
Enumerations
2.22
Question: What is the difference between an enumeration and a set of preprocessor
#defines?
Answer: At the present time, there is little difference. The C standard says
that enumerations have integral type and that enumeration constants are of
type int, so both may be freely intermixed with other integral types, without
errors. (If such intermixing were disallowed without explicit casts, judicious
use of enumerations could catch certain programming errors.)
Some advantages of enumerations are that the numeric values are automatically
assigned, that a debugger may be able to display the symbolic values
when enumeration variables are examined, and that they obey block
scope. (A compiler may also generate nonfatal warnings when enumerations
and integers are indiscriminately mixed, since doing so can still be considered
bad style even though it is not strictly illegal.) A disadvantage is that the programmer
has little control over those nonfatal warnings; some programmers
also resent not having control over the sizes of enumeration variables.
2.23
Question: Are enumerations portable?
2.24
Question: Is there an easy way to print enumeration values symbolically?
Answer: No. You can write a small function (one per enumeration) to map
an enumeration constant to a string, either by using a switch statement or
by searching an array. (If you’re worried only about debugging, a good
debugger should automatically print enumeration constants symbolically.)
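Both approaches are sketched below for a hypothetical enum color; the function names and the "(unknown)" fallback string are likewise assumptions.

```c
#include <stddef.h>

enum color { RED, GREEN, BLUE };    /* hypothetical enumeration */

/* Map a color to its name with a switch statement. */
const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "RED";
    case GREEN: return "GREEN";
    case BLUE:  return "BLUE";
    }
    return "(unknown)";
}

/* The same mapping, done by searching an array of value/name pairs. */
const char *color_name2(enum color c)
{
    static const struct { enum color val; const char *name; } table[] = {
        { RED, "RED" }, { GREEN, "GREEN" }, { BLUE, "BLUE" }
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].val == c)
            return table[i].name;
    return "(unknown)";
}
```

Either version has to be updated by hand whenever an enumerator is added, which is the price of the language not doing this automatically.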
Bitfields
Question: What do these colons and numbers mean in some structure declarations?
struct record {
    char *name;
    int refcount : 4;
    unsigned dirty : 1;
};
Answer: Those are bitfields; the number gives the exact size of the field, in
bits. (See any complete book on C for the details.) Bitfields can be used to
save space in structures having several binary flags or other small fields. They
can also be used in an attempt to conform to externally imposed storage layouts.
(Their success at the latter task is mitigated by the fact that bitfields are
assigned left to right on some machines and right to left on others.)
Note that the colon notation for specifying the size of a field in bits is valid
only in structures (and in unions); you cannot use this mechanism to specify
the size of arbitrary variables. (See questions 1.2 and 1.3.)
2.26
Question: Why do people use explicit masks and bit-twiddling code so
much instead of declaring bitfields?
memory, but that’s equally true of the bytes of all types and matters only if
you’re trying to conform to externally imposed storage layouts. (Doing so is
always nonportable; see also questions 2.12 and 20.5.)
Bitfields are inconvenient when you also want to be able to manipulate
some collection of bits as a whole (perhaps to copy a set of flags). You can’t
have arrays of bitfields; see also question 20.8. Many programmers suspect
that the compiler won’t generate good code for bitfields; historically, this was
sometimes true.
Straightforward code using bitfields is certainly clearer than the equivalent
explicit masking instructions; it’s too bad that bitfields can’t be used more
often.
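For comparison, here is the same pair of small fields expressed both ways. The field names echo the struct record example; the mask and shift values and the helper functions are assumptions chosen for the sketch.

```c
/* Bitfield version: the compiler does the masking and shifting. */
struct flags_bf {
    unsigned dirty    : 1;
    unsigned refcount : 4;
};

/* Explicit-mask version: the layout is spelled out by the programmer. */
#define DIRTY_MASK     0x01u
#define REFCOUNT_SHIFT 1
#define REFCOUNT_MASK  (0x0Fu << REFCOUNT_SHIFT)

/* Return word with its refcount field replaced by n (low 4 bits of n). */
unsigned set_refcount(unsigned word, unsigned n)
{
    return (word & ~REFCOUNT_MASK) | ((n << REFCOUNT_SHIFT) & REFCOUNT_MASK);
}

/* Extract the refcount field from word. */
unsigned get_refcount(unsigned word)
{
    return (word & REFCOUNT_MASK) >> REFCOUNT_SHIFT;
}
```

The bitfield form reads better (`f.refcount = 9;`), while the masked word can be copied, compared, or stored in an array as a single unsigned value and has a bit layout the programmer controls.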
Expressions
reasonably small and easy to write and for good code to be reasonably easy
to generate. This dual goal has significant impacts on the language specification,
although the implications are not always appreciated by users, particularly
if they are used to languages that are more tightly specified or that try
to do more for them (such as protecting them from their own mistakes).
Evaluation Order
A compiler is given relatively free rein in choosing the evaluation order of the
order chosen by the compiler is immaterial unless there are multiple visible
side effects or if several parallel side effects involve a single variable, in which
a[i] = i++;
Answer: The subexpression i++ causes a side effect—it modifies i’s value—
which leads to undefined behavior, since i is also referenced elsewhere in the
same expression. There is no way of knowing whether the reference will happen
before or after the side effect—in fact, neither obvious interpretation
might hold; see question 3.9. (Note that although the language in K&R suggests
that the behavior of this expression is unspecified, the C standard makes
the stronger statement that it is undefined—see question 11.33.)
int i = 7;
printf("%d\n", i++ * i++);
The behavior of code that contains multiple, ambiguous side effects has
always been undefined. A single expression should not cause the same object
to be modified twice or to be modified and then inspected. Don’t even try to
find out how your compiler implements such things, let alone write code that
depends on them (contrary to the ill-advised exercises in many C textbooks);
as Kernighan and Ritchie wisely point out, “if you don’t know how they are
done on various machines, that innocence may help to protect you.” See also
questions 3.8, 3.11, and 11.33.
int i = 3;
i = i++;
Some gave i the value 3, some gave 4, but one gave 7. I know that the behavior
is undefined, but how could it give 7?
Answer: Undefined behavior means that anything can happen. See questions
3.9 and 11.33. (Also, note that neither i++ nor ++i is the same as i+1. If
you want to increment i, use i = i + 1 or i++ or ++i, not some combination.
See also question 3.12.)
before the addition, but we don’t know which of the three functions will be
called first. In other words, precedence specifies order of evaluation only partially,
where “partially” emphatically does not cover evaluation of operands.
Parentheses tell the compiler which operands go with which operators but
do not force the compiler to evaluate everything within the parentheses first.
Adding explicit parentheses to the preceding expression to make it
f() + (g() * h())
would make no difference in the order of the function calls. Similarly, adding
explicit parentheses to the expression from question 3.2 accomplishes nothing
(since ++ already has higher precedence than *):
(i++) * (i++)
The expression remains undefined with or without the parentheses.
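The point about function calls can be explored with a small experiment. The three functions below and the order buffer are invented for the sketch; each function records when it was called.

```c
static char order[4];   /* records the order of the calls, e.g. "fgh" */
static int  ncalls;

static int f(void) { order[ncalls++] = 'f'; return 1; }
static int g(void) { order[ncalls++] = 'g'; return 2; }
static int h(void) { order[ncalls++] = 'h'; return 3; }

/* Evaluate f() + (g() * h()); the value is fixed, the call order is not. */
int evaluate(void)
{
    ncalls = 0;
    order[0] = order[1] = order[2] = order[3] = '\0';
    return f() + (g() * h());
}
```

The result is always 7, because precedence and the parentheses fix which operands belong to which operator. The contents of order afterward may legitimately differ from one compiler to another, so code that needs a particular call order should use separate statements and temporary variables.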
Question: But what about the && and || operators? I see code like
“while((c = getchar()) != EOF && c != '\n')”.
*The comma operator also guarantees left-to-right evaluation and an intermediate sequence point; see also
question 3.7.
Question: Is it safe to assume that the right-hand side of the && and ||
operators won’t be evaluated if the left-hand side determines the outcome?
and
{ /* no string */ }
call f2 first? I thought that the comma operator guaranteed left-to-right
evaluation.
*If the commas separating the arguments in a function call were comma operators, no function could receive
more than one argument!
Question: How can I understand complex expressions like the ones in this
chapter and avoid writing undefined ones? What’s a “sequence point”?
Answer: A sequence point is a point at which the dust has settled and all
side effects that have been seen so far are guaranteed to be complete. The
sequence points listed in the C standard are:
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior
value shall be accessed only to determine the value to be stored.
These two rather opaque sentences say several things. First, they talk about
operations bounded by the “previous and next sequence points”; such operations
usually correspond to full expressions. (In an expression statement, the
“next sequence point” is usually at the terminating semicolon, and the “previous
sequence point” is at the end of the previous statement. An expression
may also contain intermediate sequence points, as listed previously.)
The first sentence rules out both the examples i++ * i++ and i = i++ from
questions 3.2 and 3.3—in both cases, i has its value modified twice within the
expression, i.e., between sequence points. (If we were to write a similar expression
that did have an internal sequence point, such as i++ && i++, it would be
well defined, if questionably useful.)
The second sentence can be quite difficult to understand. It turns out that
it disallows code like a[i] = i++ from question 3.1. (In fact, the other
expressions we’ve been discussing also violate the second sentence.) To see
why, let’s first look more carefully at what the standard is trying to allow and
disallow.
Clearly, expressions like a = b and c = d + e, which read some values and
use them to write others, are well defined and legal. Clearly,* expressions like
i = i++ that modify the same value twice are abominations that needn’t be
allowed (or in any case, needn’t be well defined, i.e., we don’t have to figure
out a way to say what they do, and compilers don’t have to support them).
Expressions like these are disallowed by the first sentence.
It’s also clear* that we’d like to disallow expressions like a[i] = i++ that
modify i and use it along the way, but not disallow expressions like i = i + 1
that use and modify i but only modify it later when it’s reasonably easy to
ensure that the final store of the final value (into i, in this case) doesn’t interfere
with the earlier accesses.
And that’s what the second sentence says: If an object is written to within
a full expression, any and all accesses to it within the same expression must
be for the purposes of computing the value to be written. This rule effectively
constrains legal expressions to those in which the accesses demonstrably precede
the modification. The old standby i = i + 1 is allowed because the
access of i is used to determine i’s final value. The example a[i] = i++ is
disallowed because one of the accesses of i (the one in a[i]) has nothing to
do with the value that ends up being stored in i (which happens over in i++),
and so there’s no good way to define—for either our understanding or the
compiler’s—whether the access should take place before or after the incremented
value is stored. Since there’s no good way to define it, the standard
declares that it is undefined and that portable programs simply must not use
such constructs.
See also questions 3.9 and 3.11.
*Well, you may disagree, but it was clear to the people who wrote the standard.
Question: So if I write
a[i] = i++;
and I don’t care which cell of a[] gets written to, the code is fine, and i gets
incremented by 1, right?
Answer: No. For one thing, if you don’t care which cell of a[] gets written
to, why write code that seems to write to a[] at all? More significantly, once
an expression or a program becomes undefined, all aspects of it become undefined.
When an undefined expression has (apparently) two plausible interpretations,
do not mislead yourself by imagining that the compiler will choose
one or the other. The standard does not require that a compiler make an obvious
choice, and some compilers don’t. In this case, not only do we not know
whether a[i] or a[i + 1] is written to, it is possible that a completely unrelated
cell of the array (or any random part of memory) may be written to,
and it is also not possible to predict what final value i will receive. See questions
3.2, 3.3, 11.33, and 11.35.
3.10
Question: People keep saying that the behavior of i = i++ is undefined,
but I just tried it on an ANSI-conforming compiler and got the results I
expected.
3.11
Question: How can I avoid these undefined evaluation-order difficulties if I
don’t feel like learning the complicated rules?
Answer: The easiest answer is that if you steer clear of expressions that
don’t have reasonably obvious interpretations, you’ll generally steer clear of
the undefined ones, too. (Of course, “reasonably obvious” means different
things to different people. This answer works as long as you agree that
a[i] = i++ and i = i++ are not “reasonably obvious.”)
To be a bit more precise, here are some simpler rules that are slightly more
conservative than the ones in the standard but that will help to make sure
that your code is “reasonably obvious” and equally understandable to both
the compiler and your fellow programmers:
1. Make sure that each expression modifies at most one object: a simple variable,
a cell of an array, or the location pointed to by a pointer (e.g., *p).
A “modification” is a simple assignment with the = operator; a compound
assignment with an operator like +=, -=, or *=; or an increment or decrement
with ++ or -- (in either pre or post forms).
2. If an object (as just defined) appears more than once in an expression and
is the object modified in the expression, make sure that all appearances of
the object that fetch its value participate in the computation of the new
value which is stored. This rule allows the expression i = i + 1 because
although the object i appears twice and is modified, the appearance (on
the right-hand side) that fetches i’s old value is used to compute i’s new
value.
3. If you want to break rule 1, make sure that the several objects being modified
are distinctly different. Also, try to limit yourself to two or at most
three modifications and of a style matching those of the following examples.
(Make sure that you continue to follow rule 2 for each object modified.)
The expression c = *p++ is allowed under this rule, because the two
objects modified (c and p) are distinct. The expression *p++ = c is also
allowed, because p and *p (i.e., p itself and what it points to) are both
modified but are almost certainly distinct. Similarly, both c = a[i++]
and a[i++] = c are allowed, because c, i, and a[i] are presumably
all distinct. Finally, expressions in which three or more things are modified—e.g.,
p, q, and *p in *p++ = *q++, and i, j, and a[i] in
a[i++] = b[j++]—are allowed if all three objects are distinct, i.e.,
only if two different pointers p and q or two different array indices i and
j are used.
4. You may also break the first two rules if you interpose a defined sequence-point
operator between the two modifications or between the modification
and the access. This expression (commonly seen in a while loop while
reading a line) is legal because the second access of the variable c occurs
after the sequence point implied by &&:
(c = getchar()) != EOF && c != '\n'
Without the sequence point, the expression would be illegal because the
access of c while comparing it to '\n' on the right does not “determine
the value to be stored” on the left.
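Two short functions show what code that stays within rules 3 and 4 can look like. Both are illustrative sketches; the function names are made up.

```c
#include <stddef.h>

/* Copy a string: *p++ = *q++ modifies p, q, and *p, all distinct (rule 3). */
char *copy_string(char *dst, const char *src)
{
    char *p = dst;
    const char *q = src;

    while ((*p++ = *q++) != '\0')
        ;
    return dst;
}

/* Count leading characters equal to ch: c is assigned on the left of &&
   and read again only after the sequence point that && supplies (rule 4). */
size_t count_leading(const char *s, int ch)
{
    size_t n = 0;
    int c;

    while ((c = (unsigned char)*s++) != '\0' && c == ch)
        n++;
    return n;
}
```

In the first loop, p and q must be different pointers for the three modified objects to be distinct; in the second, replacing && with & would remove the sequence point and make the second access of c undefined.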
types that appear in the same expression. Usually, these rules are just simple
enough, but questions 3.14 and 3.15 describe two situations in which they
this section concern the autoincrement operator and the conditional (or
“ternary”) ?: operator.
3.12
Question: If I’m not using the value of the expression, should I use i++ or
++i to increment a variable?
Answer: It doesn’t matter. The only difference between i++ and ++i is the
value that is passed on to the containing expression. When there is no containing
expression (that is, when they stand alone as full expressions), both
forms are equivalent in that they simply increment i. (It doesn’t matter
whether they give up the previous or incremented value, since the value is not
used.)
It may be worth noting that as full expressions, the forms i += 1 and
i = i + 1 are also equivalent, both to each other and to i++ and ++i.
See also question 3.3.
3.13
Question: I need to check whether one number lies between two others.
Why doesn’t if (a < b < c) work?
Answer: The relational operators, such as <, are all binary; they compare
two operands and return a true or false (1 or 0) result. Therefore, the
expression a < b < c compares a to b and then checks whether the resulting 1
or 0 is less than c. (To see it more clearly, imagine that it had been written as
(a < b) < c, because that’s how the compiler interprets it.) To check
whether one number lies between two others, use code like this:
if(a < b && b < c)
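To see the difference concretely, here is a small sketch (the function names are hypothetical). The first function spells out what the compiler does with a < b < c; the second is the intended test.

```c
/* What a < b < c actually computes: (a < b) is 0 or 1, and that
 * result is then compared with c. */
int chained(int a, int b, int c)
{
    return (a < b) < c;
}

/* What was intended: b lies strictly between a and c. */
int between(int a, int b, int c)
{
    return a < b && b < c;
}
```

With a = 5, b = 1, c = 3, the chained form is true (0 is less than 3) even though 1 does not lie between 5 and 3.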
Note that the expression (long int) (a * b) would not have the
desired effect. An explicit cast of this form (i.e., applied to the result of the
multiplication) is equivalent to the implicit conversion that would occur
anyway when the value is assigned to the long int left-hand side; like the
implicit conversion, it happens too late, after the damage has been done.
See also question 3.15.
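A sketch of the repair this answer implies, assuming (as the surviving text suggests) that a and b are plain ints and the result is wanted as a long int. The function name is hypothetical. The cast is applied to one operand, so the multiplication itself is carried out in long int.

```c
/* Converts one operand first, so that the usual arithmetic
 * conversions promote the other and the product is computed
 * as a long int rather than as an int. */
long product(int a, int b)
{
    return (long)a * b;
}
```

This helps only where long int is actually wider than int; on a machine where the two types have the same size, the product can still overflow.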
3.15
or
Note that the cast must be on one of the operands; casting the result (as in
(double) (5 / 9) * (degF - 32)) would not help.
See also question 3.14.
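The following sketch contrasts the late cast with a working form. It assumes the formula under discussion is the usual Fahrenheit-to-Celsius conversion; the function names are hypothetical.

```c
/* The cast comes too late: 5 / 9 is an integer division and has
 * already produced 0 by the time it is converted to double. */
double to_celsius_broken(double degF)
{
    return (double)(5 / 9) * (degF - 32);
}

/* Making one operand of the division a double forces a
 * floating-point division.  (double)5 / 9 would work equally well. */
double to_celsius(double degF)
{
    return 5.0 / 9.0 * (degF - 32);
}
```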
3.16
Question: I need to assign a complicated expression to one of two variables,
depending on a condition. Can I use code like this?
((condition) ? a : b) = complicated_expression;
Answer: No. The ?: operator, like most operators, yields a value, and you
can’t assign to a value. (In other words, ?: does not yield an lvalue.) If you
really want to, you can try something like this:
*((condition) ? &a : &b) = complicated_expression;
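The workaround selects between two addresses rather than two variables, and then stores through the chosen pointer. A minimal sketch, with a hypothetical helper function and int variables assumed for illustration:

```c
/* Stores value in *a when condition is true, otherwise in *b.
 * The ?: operator chooses between two pointer values, and the
 * unary * applied to the result does yield an lvalue. */
void assign_one(int condition, int *a, int *b, int value)
{
    *(condition ? a : b) = value;
}
```

A plain if/else with two assignments does the same job and is usually easier to read.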
a ? b = c : d
(a ? b) = (c : d)
Since it has no other sensible meaning, however, later compilers have allowed
the expression and interpret it as if an inner set of parentheses were implied:
a ? (b = c) : d
Preserving Rules
The “reasonably simple set of rules for promoting operands of different
types” changed slightly between classic and ANSI/ISO C; these questions dis¬
3.18
Question: What does the warning “semantics of ‘>’ change in ANSI C”
mean?
Answer: These rules concern the behavior when an unsigned type must be
promoted to a “larger” type. Should it be promoted to a larger signed or
unsigned type? (To foreshadow the answer, it may depend on whether the
larger type is truly larger.)
Under the unsigned preserving (also called “sign preserving”) rules, the
promoted type is always unsigned. This rule has the virtue of simplicity, but
it can lead to surprises (see the first example that follows).
Under the value preserving rules, the conversion depends on the actual
sizes of the original and promoted types. If the promoted type is truly
larger—which means that it can represent all the values of the original,
unsigned type as signed values—the promoted type is signed. If the two types
are actually the same size, the promoted type is unsigned (as for the unsigned
preserving rules).
Since the actual sizes of the types are used in making the determination,
the results will vary from machine to machine. On some machines, short
int is smaller than int, but on other machines, they’re the same size. On
some machines, int is smaller than long int, but on others, they’re the
same size.
In practice, the difference between the unsigned and value preserving rules
matters most often when one operand of a binary operator is (or promotes
to) int and the other one might, depending on the promotion rules, be either
int or unsigned int. If one operand is unsigned int, the other will be
converted to that type—almost certainly causing an undesired result if its
value was negative (again, see the first example that follows). When the ANSI
C standard was established, the value preserving rules were chosen, to reduce
the number of cases where these surprising results occur. (On the other hand,
the value preserving rules also reduce the number of predictable cases,
because portable programs cannot depend on a machine’s type sizes and
hence cannot know which way the value preserving rules will fall.)
Here is a contrived example showing the sort of surprise that can occur
under the unsigned preserving rules:
The important issue is how the expression i > us is evaluated. Under the
unsigned preserving rules (and under the value preserving rules on a machine
for which short integers and plain integers are the same size), us is pro¬
moted to unsigned int. The usual integral conversions say that when
types unsigned int and int meet across a binary operator, both operands
are converted to unsigned, so i is converted to unsigned int, as well. The
old value of i, -5, is converted to some large unsigned value (65,531 on a 16-
bit machine). This converted value is greater than 10, so the code prints
“whoops!”
Under the value preserving rules, on a machine for which plain integers are
larger than short integers, us is converted to a plain int (and retains its
value, 10), and i remains a plain int. The expression is not true, and the
code prints nothing. (To see why the values can be preserved only when the
signed type is larger, remember that a value like 40,000 can be represented as
an unsigned 16-bit integer but not as a signed one.)
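The comparison discussed above can be reproduced with a sketch like the following. The variable names us and i and the values 10 and -5 come from the text; the function names are hypothetical. The second function forces the unsigned case so that the surprise is visible on any machine.

```c
#include <limits.h>

/* Under the ANSI value preserving rules, us is promoted to int when
 * int can hold every unsigned short value, and the comparison is
 * false.  When short and int are the same size, us is promoted to
 * unsigned int instead, and the comparison is true. */
int short_case(void)
{
    unsigned short us = 10;
    int i = -5;
    return i > us;
}

/* Here one operand is unsigned int outright, so i is converted to
 * unsigned int and -5 becomes a very large positive value. */
int unsigned_case(void)
{
    unsigned int ui = 10;
    int i = -5;
    return i > ui;
}
```

Whether short_case returns 0 or 1 can be predicted from <limits.h>: it returns 1 exactly when USHRT_MAX is greater than INT_MAX.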
Unfortunately, the value preserving rules do not prevent all surprises. The
example just presented still prints “whoops” on a machine for which short
and plain integers are the same size. The value preserving rules may also
inject a few surprises of their own—consider the code:
Pointers, though certainly one of the most powerful and popular features of
don’t point where they should, the possibilities for mayhem are endless.
(Actually, many of the apparent problems with pointers have more to do with
Question: I’m trying to declare a pointer and allocate some space for it, but
it’s not working. What’s wrong with this code?
char *p;
*p = malloc(10);
Answer: The pointer you declared is p, not *p. To make a pointer point
somewhere, you just use the name of the pointer:
p = malloc(10);
It’s when you’re manipulating the pointed-to memory that you use * as an
indirection operator:
*p = 'H';
It’s easy to make the mistake shown in the question, though, because if you
had used the malloc call as an initializer in the declaration of a local
variable, it would have looked like this:
char *p = malloc(10);
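Putting the pieces together, a minimal sketch might look like this. The function name and the stored string are illustrative assumptions; the check of malloc's return value is added because the call can fail.

```c
#include <stdlib.h>

/* Allocates 10 bytes and stores a short string in them.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *make_greeting(void)
{
    char *p = malloc(10);   /* initializes the pointer p itself, not *p */

    if (p == NULL)
        return NULL;
    *p = 'H';               /* here * is the indirection operator */
    p[1] = 'i';
    p[2] = '\0';
    return p;
}
```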
Pointer Manipulations
ip = array;
printf("%d\n", *(ip + 3 * sizeof(int)));
Answer: You’re doing a bit more work than you have to or should. Pointer
arithmetic in C is always automatically scaled by the size of the objects
pointed to. What you want to say is simply:
printf("%d\n", *(ip + 3));
This will print the element at index 3 of the array. In code like this, you don’t need
to worry about scaling by the size of the pointed-to elements; by attempting
to do so explicitly, you inadvertently tried to access a nonexistent element.
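A short sketch of the automatic scaling, using hypothetical helper functions. The second function measures the distance between adjacent elements in bytes, which shows where the factor of sizeof(int) went.

```c
#include <stddef.h>

/* Returns the element n positions after the one base points to.
 * The compiler scales n by sizeof(int) on its own. */
int element_at(const int *base, size_t n)
{
    return *(base + n);     /* identical to base[n] */
}

/* Distance in bytes between base and base + 1. */
size_t byte_stride(const int *base)
{
    return (size_t)((const char *)(base + 1) - (const char *)base);
}
```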
((int *)p)++;
Answer: In C, a cast operator does not mean “pretend that these bits have
a different type and treat them accordingly”; it is a conversion operator, and
by definition it yields an rvalue, which cannot be assigned to or incremented
with ++. (It is an anomaly in older compilers and an extension in gcc that
such expressions are ever accepted.) Say what you mean; use
p += sizeof(int);
Question: I’ve got some code that’s trying to unpack external structures, but
it’s crashing with a message about an “unaligned access.” What does this
mean?
void f(ip)
int *ip;
{
static int dummy = 5;
ip = &dummy;
}
int *ip;
f(ip);
Answer: Are you sure that the function initialized what you thought it did?
Remember, arguments in C are passed by value. In the preceding code the
called function alters only the passed copy of the pointer. To make it work as
you expect, you can pass the address of the pointer; the function ends up
accepting a pointer to a pointer:
void f(ipp)
int **ipp;
{
static int dummy = 5;
*ipp = &dummy;
}
int *ip;
f(&ip);
int * f()
{
static int dummy = 5;
return &dummy;
}
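The two repairs shown above can also be written with ANSI prototypes. This is a sketch only; the function names set_pointer and get_pointer are hypothetical, chosen so that both variants can sit in one file.

```c
/* Variant 1: the caller passes the address of its pointer, and the
 * function stores through it. */
void set_pointer(int **ipp)
{
    static int dummy = 5;
    *ipp = &dummy;
}

/* Variant 2: the function simply returns the pointer. */
int *get_pointer(void)
{
    static int dummy = 5;
    return &dummy;
}
```

In both variants dummy is static, so the pointer remains valid after the function returns.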
Answer: Not portably. Code like this may work and is sometimes
recommended, but it relies on all pointer types having the same internal
representation (which is common but not universal; see question 5.17).
C has no generic pointer-to-pointer type. Values of type void * act as
generic pointers only because conversions are applied automatically when
other pointer types are assigned to and from void *; such conversions
cannot be performed if an attempt is made to indirect on a void ** value that
points at something other than a void *. When you use a void ** pointer
value (for instance, when you use the * operator to access the void * value
to which the void ** points), the compiler has no way of knowing whether
that void * value was once converted from another pointer type. Rather, the
compiler must assume that it is nothing more than a void *; it cannot
perform any implicit conversions.
In other words, any void ** value you play with must be the address of
an actual void * value somewhere. Casts like (void **)&dp, although
they may shut the compiler up, are nonportable and may not even do what
you want; see also question 13.9. If the pointer that the void ** points to is
not a void * and if it has a different size or representation than a void *,
the compiler isn’t going to be able to access it correctly.
To make the previous code fragment work, you’d have to use an
intermediate void * variable:
double *dp;
void *vp = dp;
f(&vp);
dp = vp;
The assignments to and from vp give the compiler the opportunity to
perform any conversions, if necessary.
Again, the discussion so far assumes that different pointer types might
have different sizes or representations, which is rare today but not unheard
of. To appreciate the problem with void ** more clearly, compare the
situation to an analogous one involving, say, types int and double, which
probably have different sizes and certainly have different representations.
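Suppose that we have a function:
void incme(double *p)
{
*p += 1;
}
Then we can do something like
int i = 1;
double d = i;
incme(&d);
i = d;
and i will be incremented by 1. Suppose, on the other hand, that we were to
attempt something like
int i = 1;
incme((double *)&i); /* WRONG */
This code is analogous to the fragment in the question and would almost
certainly not work.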
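The portable pattern with an intermediate void * can be sketched as follows. The function get_object is a hypothetical stand-in for the function in the question (which is assumed to hand back a generic pointer through a void ** parameter); fetch shows the calling side.

```c
/* Hypothetical callee: returns a generic pointer via a void **. */
void get_object(void **vpp)
{
    static double value = 3.5;
    *vpp = &value;          /* double * converted to void * here */
}

/* Caller: the void ** passed in really is the address of a void *. */
double fetch(void)
{
    double *dp;
    void *vp;

    get_object(&vp);
    dp = vp;                /* void * converted back to double * here */
    return *dp;
}
```

Both conversions happen in ordinary assignments, where the compiler knows the types on each side and can adjust the representation if the machine requires it.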
4.10
Question: I have a function
f(&5);
Answer: You can’t do this directly. You will have to declare a temporary
variable and then pass its address to the function:
int five = 5;
f(&five);
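A complete version of this fragment might look like the sketch below. The body of f is an assumption made purely so that the example does something observable; the original only says that f accepts a pointer to an int.

```c
/* Hypothetical f: doubles the int that its argument points to. */
void f(int *ip)
{
    *ip *= 2;
}

int call_with_five(void)
{
    int five = 5;           /* a named object whose address can be taken */
    f(&five);
    return five;            /* now 10, because f modified it */
}
```

The example also shows why a temporary is the honest solution: if f modifies what it is given, the modification has to land somewhere.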
Answer: Not really. Strictly speaking, C always uses pass by value. C seems
to have something like pass by reference in two situations: You can simulate
pass by reference yourself by defining functions that accept pointers and then
using the & operator when calling, and the compiler will essentially simulate
it for you when you pass an array to a function (by passing a pointer instead,
see question 6.4 and others). Formally, though, C has nothing truly
equivalent to pass by reference or C++ reference parameters. (However,
function-like preprocessor macros do provide a form of “call by name.”) See also
questions 4.8, 7.9, 12.27, and 20.1.
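Both simulations mentioned in this answer can be sketched briefly; the function names are hypothetical. In each case the function still receives a copy of its argument, but the copy is a pointer, so the caller's objects can be reached through it.

```c
/* Simulated pass by reference: the caller writes swap_ints(&x, &y). */
void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Passing an array: the function receives a pointer to the first
 * element, so its stores are visible in the caller's array. */
void zero_fill(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = 0;
}
```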
4.12
Question: I’ve seen different methods used for calling functions via pointers.
What’s the story?
4.13
Question: What’s the total generic pointer type? My compiler complained
when I tried to stuff function pointers into a void *.
4.14
Question: How are integers converted to and from pointers? Can I
temporarily stuff an integer into a pointer, or vice versa?
Answer: Once upon a time, it was guaranteed that a pointer could be
converted to an integer (although one never knew whether an int or a long
might be required), that an integer could be converted to a pointer, that a
pointer remained unchanged when converted to a (large enough) integer and
back again, and that the conversions (and any mapping) were intended to be
“unsurprising to those who know the addressing structure of the machine.”
In other words, there is some precedent and support for integer/pointer
conversions, but they have always been machine dependent and hence
nonportable. Explicit casts have always been required (although early compilers
rarely complained if you left them out).
For each pointer type, C defines a special pointer value, the null pointer, that
is guaranteed not to point to any object or function of that type. (The null
pointer is analogous to the nil pointer in Pascal and LISP.) C programmers are
often confused about the proper use of null pointers and about their internal
most programmers). The null pointer constant used for representing null
pointers in source code involves the integer 0, and many machines represent
null pointers internally as a word with all bits zero, but the second fact is not
confusion itself.) If you are fortunate enough not to share the many
misunderstandings covered or find the discussion too exhausting, you can skip to
in the language.
Answer: The language definition states that for each pointer type, there is a
special value—the “null pointer”—that is distinguishable from all other
pointer values and that is “guaranteed to compare unequal to a pointer to
any object or function.” That is, a null pointer points definitively nowhere; it
is not the address of any object or function. The address-of operator & will
never yield a null pointer, nor will a successful call to malloc.* (A null
pointer is returned when malloc fails, and this is a typical use of null
pointers: as a “special” pointer value with another meaning, usually “not
allocated” or “not pointing anywhere yet.”)
A null pointer is conceptually different from an uninitialized pointer. A
null pointer is known not to point to any object or function; an uninitialized
pointer might point anywhere. See also questions 1.30, 7.1, and 7.31.
Each pointer type has a null pointer, and the internal values of null
pointers for different types may differ. Although programmers need not know the
internal values, the compiler must always be informed which type of null
pointer is required, so that it can make the distinction if necessary (see
question 5.2).
*A “successful” call to malloc(0) can yield a null pointer; see question 11.26.
char *p = 0;
if(p != 0)
If the (char *) cast on the last argument were omitted, the compiler would
not know to pass a null pointer and would pass an integer 0 instead. (Note
that many UNIX manuals get this example wrong; see also question 5.11.)
When function prototypes are in scope, argument passing becomes an
“assignment context,” and most casts may safely be omitted, since the
prototype tells the compiler that a pointer is required and of which type, enabling it
to correctly convert an unadorned 0. Function prototypes cannot provide the
types for variable arguments in variable-length argument lists, however, so
explicit casts are still required for those arguments. (See also question 15.3.) It
can be considered safest to properly cast all null pointer constants in function
calls: to guard against varargs functions or those without prototypes, to allow
interim use of non-ANSI compilers, and to demonstrate that you know what
you are doing. (Incidentally, it’s also a simpler rule to remember.)
Here is a summary of the rules for when null pointer constants may be
used by themselves and when they require explicit casts:
Unadorned 0 okay                      Explicit cast required
Initialization                        Function call, no prototype in scope
Assignment                            Variable argument in varargs
Comparison                              function call
Function call, prototype in scope,
  fixed argument
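The one case in the summary that still needs a cast under ANSI C, the variable argument, can be sketched with a small varargs function. The function count_strings is hypothetical; it follows the same convention as the calls discussed above, in which a null char pointer terminates the argument list.

```c
#include <stdarg.h>
#include <stddef.h>

/* Counts its string arguments up to, but not including, the
 * terminating null pointer.  The terminator must be passed as
 * (char *)0, because the prototype says nothing about the types of
 * the variable arguments and an unadorned 0 would be passed as an int. */
int count_strings(char *first, ...)
{
    va_list ap;
    char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as count_strings("sh", "-c", "date", (char *)0) returns 3. Note that the first argument is a fixed argument covered by the prototype, so only the later ones depend on the cast.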
Question: Is the abbreviated pointer comparison “if(p)” to test for
non-null pointers valid? What if the internal representation for null pointers is
nonzero?
if(expr)
where “expr” is any expression at all, the compiler essentially acts as if it had
been written as
if((expr) != 0)
if(p)
is equivalent to
if(p != 0)
This is a comparison context, so the compiler can tell that the (implicit) 0 is
a null pointer constant and use the correct null pointer value. No trickery is
involved here; compilers do work this way and generate identical code for
both constructs. The internal representation of a null pointer does not matter.
The Boolean negation operator, !, can be described as follows:
!expr
is essentially equivalent to
(expr)?0:1
or to
((expr) == 0)
if(!p)
is equivalent to
if(p == 0)
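The equivalences described above can be checked with a few one-line functions (hypothetical names). Each pair must agree for every pointer value, whatever the machine's internal representation of a null pointer.

```c
#include <stddef.h>

int is_set_short(const char *p)  { return p ? 1 : 0; }       /* if(p)      */
int is_set_long(const char *p)   { return p != 0 ? 1 : 0; }  /* if(p != 0) */

int is_null_short(const char *p) { return !p; }              /* if(!p)     */
int is_null_long(const char *p)  { return p == 0; }          /* if(p == 0) */
```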
wouldn’t that make function calls that pass an uncast NULL work?
Answer: Not in general. The problem is that some machines use different
internal representations for pointers to different types of data. The
suggested definition would make uncast NULL arguments to functions expecting
pointers to characters work correctly, but pointer arguments of other types
would still require explicit casts. Furthermore, such legal constructions as
FILE *fp = NULL; could fail.
Nevertheless, ANSI C allows this alternative definition for NULL:*
#define NULL ((void *)0)
*Because of the special assignment properties of void * pointers, the initialization FILE *fp = NULL; is
valid with this definition.
Question: My vendor provides header files that define NULL as 0L. Why?
Answer: Many programmers believe that NULL should be used in all pointer
contexts, as a reminder that the value is to be thought of as a pointer. Others
feel that the confusion surrounding NULL and 0 is only compounded by
hiding 0 behind a macro and prefer to use unadorned 0 instead. There is no one
right answer. (See also questions 9.4 and 17.10.) C programmers must
understand that NULL and 0 are interchangeable in pointer contexts and that an
uncast 0 is perfectly acceptable. Any usage of NULL (as opposed to 0) should
5.10
Question: But wouldn’t it be better to use NULL rather than 0 in case the
value of NULL changes, perhaps on a machine with nonzero internal null
pointers?
Answer: No. (Using NULL may be preferable but not for this reason.)
Although symbolic constants are often used in place of numbers because the
numbers might change, this is not the reason that NULL is used in place of 0.
Once again, the language guarantees that source-code 0s (in pointer contexts)
generate null pointers. NULL is used only as a stylistic convention. See
questions 5.5 and 9.4.
Question: I once used a compiler that wouldn’t work unless NULL was used.
Answer: Unless the code being compiled was nonportable, that compiler
was probably broken. Perhaps the code used something like this nonportable
version of an example from question 5.2:
With the cast, the code works correctly no matter what the machine’s integer
and pointer representations are and no matter which form of null pointer
constant the compiler has chosen as the definition of NULL. (The code
fragment in question 5.2, which used 0 instead of NULL, is equally correct; see
also question 5.9.)
5.12
Question: I use the preprocessor macro
Answer: This trick, though popular and superficially attractive, does not
buy much. It is not needed in assignments and comparisons; see question 5.2.
It does not even save keystrokes. Its use may suggest to the reader that the
program’s author is shaky on the subject of null pointers, requiring that the
definition of the macro, its invocations, and all other pointer usages be
checked. See also questions 9.1 and 10.2.
Retrospective
In some circles, misunderstandings about null pointers run rampant. These
Using (void *) 0, in the guise of NULL, instead of (char *) 0 happens to work only because of a special
guarantee about the representations of void * and char * pointers.
Answer: When the term “null” or “NULL” is casually used, one of several
things may be meant:
1. The conceptual null pointer, the abstract language concept defined in
question 5.1. It is implemented with ...
2. The internal, or run-time, representation of a null pointer, which may or
may not be all bits 0 and which may be different for different pointer
types. The actual values should be of concern only to compiler writers.
Authors of C programs never see them, since they use ...
3. The null pointer constant, which is a constant integer 0* (see question
5.2). It is often hidden behind ...
4. The NULL macro, which is defined to be 0 or ((void *)0) (see question
5.4). Finally, as red herrings, we have ...
5. The ASCII null character (NUL), which does have all bits zero but has no
necessary relation to the null pointer except in name and ...
6. The “null string,” which is another name for the empty string (""). Using
the term “null string” can be confusing in C, because an empty string
involves a null ('\0') character but not a null pointer, which brings us
full circle.
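The difference between a null pointer, an empty string, and the null character can be made concrete with a small sketch. The function and its return codes are hypothetical.

```c
#include <stddef.h>

/* Returns 0 for a null pointer, 1 for an empty string,
 * and 2 for a string with at least one character. */
int classify(const char *s)
{
    if (s == NULL)
        return 0;       /* no string at all: s points nowhere */
    if (s[0] == '\0')
        return 1;       /* an empty string: its first char is NUL */
    return 2;
}
```

An empty string is a perfectly valid object that happens to begin with the null character; only the first case involves a null pointer.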
*More precisely, a null pointer constant is an integer constant expression with the value 0, possibly cast to
void *.
†To be very, very precise, the word “null” as a noun means only sense 5, and “NULL” means only sense 4; the
other usages all use “null” as an adjective, as does the (unrelated) term “null statement.” These are admittedly
fine points.
5.15
Question: Is there an easier way to understand all this null pointer stuff?
1. When you want a null pointer constant in source code, use “0” or
“NULL”.
5.16
Question: Given all the confusion surrounding null pointers, wouldn’t it be
easier simply to require them to be represented internally by zeroes?
Question: Seriously, have any actual machines really used nonzero null
pointers or different representations for pointers to different types?
Answer: The Prime 50 series used segment 07777, offset 0 for the null
pointer, at least for PL/I. Later models used segment 0, offset 0 for null
pointers in C, necessitating new instructions, such as TCNP (Test C Null Pointer),
evidently as a sop to all the extant poorly written C code that made incorrect
assumptions. Older, word-addressed Prime machines were also notorious for
requiring larger byte pointers (char *’s) than word pointers (int *’s).
The Eclipse MV series from Data General has three architecturally
supported pointer formats (word, byte, and bit pointers), two of which are used
by C compilers: byte pointers for char * and void *, and word pointers
for everything else.
Some Honeywell-Bull mainframes use the bit pattern 06000 for (internal)
null pointers.
The CDC Cyber 180 Series has 48-bit pointers consisting of a ring,
segment, and offset. Most users (in ring 11) have null pointers of
0xB00000000000. It was common on old CDC ones-complement machines
to use an all-one-bits word as a special flag for all kinds of data, including
invalid addresses.
The old HP 3000 series uses different addressing schemes for byte
addresses and for word addresses; like several of the previous machines, it
therefore uses different representations for char * and void * pointers
than for other pointers.
The Symbolics Lisp Machine, a tagged architecture, does not even have
conventional numeric pointers; it uses the pair <NIL, 0> (basically a
nonexistent <object, offset> handle) as a C null pointer.
Depending on the “memory model” in use, 8086-family processors (PC
compatibles) may use 16-bit data pointers and 32-bit function pointers, or
vice versa.
Some 64-bit Cray machines represent int * in the lower 48 bits of a
word; char * additionally uses the upper 16 bits to indicate a byte address
within a word.
5.18
Question: Is a run-time integral value of 0, cast to a pointer, guaranteed to
be a null pointer?
Answer: No. Only constant integral expressions with value 0 are guaranteed
to indicate null pointers. See also questions 4.14, 5.2, and 5.19.
5.19
Question: How can I access an interrupt vector located at the machine’s
location 0? If I set a pointer to 0, the compiler might translate it to a nonzero
internal null pointer value.
• Simply set a pointer to 0. (This is the way that doesn’t have to work, but
if it’s meaningful, it probably will.)
• Assign the integer 0 to an int variable and convert that int to a pointer.
(This is also not guaranteed to work, but it probably will.)
• Use a union to set the bits of a pointer variable to 0:
union {
int *u_p;
int u_i; /* assumes sizeof(int) >= sizeof(int *) */
} p;
p.u_i = 0;
5.20
Question: What does a run-time “null pointer assignment” error mean?
How do I track it down?
Answer: This message, which typically occurs with MS-DOS compilers (see,
therefore, Chapter 19) means that you’ve written to location 0 via a null
(perhaps because uninitialized) pointer. (See also question 16.8.)
A debugger may let you set a data breakpoint or watchpoint or something
on location 0. Alternatively, you could write a bit of code to stash away a
copy of 20 or so bytes from location 0 and periodically check that the
memory at location 0 hasn’t changed.
Arrays and Pointers
between the two, imagining either that they are identical or that various non¬
fact that most array references decay into pointers to the array’s first element,
C: You can never manipulate an array in its entirety (i.e., to copy it or pass it
to a function), because whenever you mention its name, you’re left with a
pointer rather than the entire array. Because arrays decay to pointers, the
and pointers, and this chapter tries to clear them up as best it can. If you find
yourself bored by the repetition, stop reading and move on. But if you’re
confused or if things don’t make sense, keep reading until they fall into place.
Question: I had the definition char a[6] in one source file, and in another
I declared extern char *a. Why didn’t it work?
Answer: The declaration extern char *a does not declare an array and
therefore does not match the actual definition. The type pointer to type T is
not the same as array of type T. Use extern char a[].
Question: But I heard that char a[] is identical to char *a. Is that true?
Answer: Not at all. (What you heard has to do with formal parameters to
functions; see question 6.4.) Arrays are not pointers, although they are closely
related (see question 6.3) and can be used similarly (see questions 4.1, 6.8,
6.10, and 6.14).
The array declaration char a[6] requests that space for six characters be
set aside, to be known by the name a. That is, there is a location named a at
which six characters can sit. The pointer declaration char *p, on the other
hand, requests a place that holds a pointer, to be known by the name p. This
pointer can point almost anywhere: to any char, to any contiguous array of
chars, or nowhere* (see also questions 1.30 and 5.1).
As usual, a picture is worth a thousand words. The declarations
char a[] = "hello";
char *p = "world";
would initialize data structures that could be represented like this:
a: [h][e][l][l][o][\0]
p: [*]---> [w][o][r][l][d][\0]
*Don’t interpret “anywhere” and “nowhere” too broadly. To be valid, a pointer must point to properly
allocated memory (see questions 7.1, 7.2, and 7.3); to point definitively nowhere, a pointer must be a null pointer.
expression decays (with three exceptions) into a pointer to its first element;
the type of the resultant pointer is pointer to T. (The exceptions are when the
array is the operand of a sizeof or an & operator or is a string literal
initializer for a character array.* See questions 6.23, 6.12, and 1.32,
respectively.)
As a consequence of this definition, and in spite of the fact that the
underlying arrays and pointers are quite different, the compiler doesn’t apply the
array subscripting operator [] that differently to arrays and pointers, after
all.† Given an array a and pointer p, an expression of the form a[i] causes
the array to decay into a pointer, following the preceding rule, and then to be
subscripted just as would be a pointer variable in the expression p[i]
(although the eventual memory accesses will be different, as explained in
question 6.2). If you were to assign the array’s address to the pointer:
p = a;
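After such an assignment, subscripting the pointer and subscripting the array reach the same elements, even though the array and the pointer remain different kinds of object. A minimal sketch, with hypothetical function names:

```c
#include <stddef.h>

/* Returns 1 if a[i] and p[i] name the same object for every i. */
int same_elements(void)
{
    char a[] = "hello";
    char *p = a;            /* a decays to a pointer to a[0] */
    size_t i;

    for (i = 0; i < sizeof a; i++)
        if (&a[i] != &p[i] || a[i] != p[i])
            return 0;
    return 1;
}

/* sizeof is one of the exceptions to the decay rule: applied to the
 * array it gives the size of the whole array, terminator included. */
size_t array_size(void)
{
    char a[] = "hello";
    return sizeof a;
}
```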
*By “string literal initializer for a character array,” we include also wide string literals for arrays of wchar_t.
†Strictly speaking, the [] operator is always applied to a pointer; see question 6.10 item 2.
f(char a[])
{ ... }
Interpreted literally, this declaration would have no use, so the compiler turns
around and pretends that you’d written a pointer declaration, since that’s
what the function will in fact receive:
f(char *a)
{ ... }
There’s nothing particularly wrong with talking about a function as if it
“receives” an array if the function is traditionally used to operate on arrays
or if the parameter is naturally treated within the function as an array.
This rewriting of array declarators into pointers holds only within
function formal parameter declarations, nowhere else. If rewritten array
parameter declarations bother you, you’re under no compulsion to use them;
many people have concluded that the confusion they cause outweighs the
small advantage of having the declaration “look like” the call or the uses
within the function. (Note that the conversion happens only once;
something like char a2[][] won’t work. See questions 6.18 and 6.19.)
See also question 6.21.
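One visible consequence of the rewriting is that an “array” parameter can be incremented and reassigned, which a true array cannot. A sketch, using a hypothetical function:

```c
/* The parameter is declared with array syntax, but it is really a
 * pointer, so a++ is legal here.  With a true array it would not be. */
const char *skip_spaces(const char a[])
{
    while (*a == ' ')
        a++;
    return a;
}
```

Called with a pointer to "  hi", the function returns a pointer to the 'h', two characters further on.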
it decays to is copied, not the entire array. Furthermore, an array may not
appear on the left-hand side of an assignment (in part because, by the previ¬
When you want to pass arrays around without copying them, you can use
pointers and simple assignment. See also questions 4.1 and 8.2.
Question: If I can’t assign to arrays, then how can this code work?
int f(char str[])
{
if(str[0] == '\0')
str = "none";
...
}
Answer: The term lvalue doesn’t quite mean “something you can assign to”;
a better definition is “something that has a location (in memory).”* The
ANSI/ISO C Standard goes on to define a “modifiable lvalue”; an array is not
a modifiable lvalue. See also question 6.5.
Retrospective
generates so much confusion, here are a few questions about that
confusion.
*The original definition of “lvalue” did have to do with the left-hand side of assignment statements.
referenced using [ ]) exactly as if it were a true array. See questions 6.14 and
6.16. (Be careful with sizeof; see question 7.28.)
See also questions 1.32, 6.10, and 20.14.
6.10
Question: I’m still mystified. Is a pointer a kind of array, or is an array a
kind of pointer?
Answer: An array is not a pointer, and vice versa. An array reference (that
is, any mention of an array in a value context) turns into a pointer (see
questions 6.2 and 6.3).
There are perhaps three ways to think about the situation:
1. Pointers can simulate arrays (though that’s not all; see question 4.1).
2. There’s hardly such a thing as an array (it is, after all, a “second-class
citizen”); the subscripting operator [] is in fact a pointer operator.
3. At a higher level of abstraction, a pointer to a block of memory is
effectively the same as an array (although this says nothing about other uses of
pointers).
But to reiterate, here are two ways not to think about it:
6.11
Question: I came across some “joke” code containing the expression
5["abcdef"]. How can this be legal C?
a[e]
*((a) + (e)) (by definition)
*((e) + (a)) (by commutativity of addition)
e[a] (by definition)
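The derivation can be checked directly. In this sketch (hypothetical function names) both spellings select the character at index 5 of the string literal.

```c
/* The ordinary spelling. */
char normal_form(void)
{
    return "abcdef"[5];
}

/* The "joke" spelling: 5["abcdef"] is *((5) + ("abcdef")). */
char joke_form(void)
{
    return 5["abcdef"];
}
```

Both functions return 'f'. The second form is legal but has no use outside obfuscated code.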
Pointers to Arrays
Since arrays usually decay into pointers, it’s particularly easy to get confused
when dealing with the occasional pointer to an entire array (as opposed to a
*The commutativity is of the array-subscripting operator [] itself; obviously, a[i][j] is in general different
from a[j][i].
6.12
Question: Since array references decay into pointers, what’s the difference—
if array is an array—between array and &array?
int a[10];
int array[NROWS][NCOLUMNS];
6.13
Question: How do I declare a pointer to an array?
Answer: Usually, you don’t want to. When people speak casually of a
pointer to an array, they usually mean a pointer to its first element.
Instead of a pointer to an array, consider using a pointer to one of the
array’s elements. Arrays of type T decay into pointers to type T (see question
6.3), which is convenient; subscripting or incrementing the resultant pointer
will access the individual members of the array. True pointers to arrays, when
subscripted or incremented, step over entire arrays and are generally useful
only when operating on arrays of arrays,* if at all. (See also question 6.18.)
If you really need to declare a pointer to an entire array, use something like
“int (*ap)[N];” where N is the size of the array. (See also question 1.21.)
If the size of the array is unknown, N can in principle be omitted, but the
resulting type, “pointer to array of unknown size,” is useless.
Here is an example showing the difference between simple pointers and
pointers to arrays. Given the declarations
you could use the simple pointer to int, ip, to access the one-dimensional
array a1:
ip = a1;
printf("%d ", *ip);
ip++;
printf("%d\n", *ip);
0 1
ap = &a1;
printf("%d\n", **ap);
ap++; /* WRONG */
would print 0 on the first line and something undefined on the second (and
might crash). The pointer to array would be useful only in accessing an array
of arrays, such as a2:
ap = a2;
3 4
6 7
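The declarations this example depends on are not reproduced above. The sketch below assumes a three-element array a1 and a two-by-three array a2 whose contents match the printed output (0 1, 3 4, 6 7); those initializers and the function names are assumptions for illustration.

```c
#include <stddef.h>

static int a1[3] = {0, 1, 2};                       /* assumed contents */
static int a2[2][3] = {{3, 4, 5}, {6, 7, 8}};       /* assumed contents */

/* Sum the elements of a1 by stepping a simple pointer to int. */
int sum_with_int_pointer(void)
{
    int *ip = a1;            /* a1 decays to a pointer to its first element */
    int sum = 0;
    size_t i;
    for (i = 0; i < 3; i++)
        sum += *ip++;        /* each ++ advances by one int */
    return sum;
}

/* Return the first element of the given row of a2 via a pointer to array. */
int first_of_row(int row)
{
    int (*ap)[3] = a2;       /* a2 decays to a pointer to its first row */
    ap += row;               /* each step skips one whole row of 3 ints */
    return (*ap)[0];
}
```

Stepping ap across a2 lands on the start of each row, which is why the pointer to array is only sensible for an array of arrays.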
6.14
Question: How can I set an array’s size at run time? How can I avoid fixed-sized
arrays?
Answer: The equivalence between arrays and pointers (see question 6.3)
allows a pointer to memory obtained from malloc to simulate an array quite
effectively. After executing
#include <stdlib.h>
(and if the call to malloc succeeds), you can reference dynarray[i] (for i
from 0 to 9) just as if dynarray were a conventional, statically allocated
array (int a[10]). See also questions 6.16, 7.28, and 7.29.
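A minimal sketch of the technique follows. The helper name make_dynarray and its fill pattern are assumptions for illustration; only the name dynarray is taken from the text.

```c
#include <stdlib.h>

/* Allocate an array of n ints and fill it with 0..n-1.
   Returns NULL if malloc fails; the caller must free the result. */
int *make_dynarray(size_t n)
{
    int *dynarray = malloc(n * sizeof *dynarray);
    size_t i;
    if (dynarray == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        dynarray[i] = (int)i;    /* ordinary subscripting works on the pointer */
    return dynarray;
}
```

A caller would write something like int *dynarray = make_dynarray(10); and then use dynarray[0] through dynarray[9], remembering to check for a null return and to call free when finished.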
6.15
Question: How can I declare local arrays of a size matching a passed-in
array?
have to use malloc, and remember to call free before the function returns.
See also questions 6.14, 6.16, 6.19, 7.22, and maybe 7.32.
6.16
Question: How can I dynamically allocate a multidimensional array?
#include <stdlib.h>
In either case (i.e., for array1 or array2), the elements of the dynamic
array can be accessed with normal-looking array subscripts, arrayx[i][j]
(for 0 <= i < NROWS and 0 <= j < NCOLUMNS). The schematic illustration
on page 106 shows the layout of array1 and array2.
*Strictly speaking, these aren’t arrays but rather objects to be used like arrays; see also question 6.14.
If the double indirection implied by the above schemes is for some reason
unacceptable, you can simulate a two-dimensional array with a single,
dynamically-allocated one-dimensional array:
*Note, however, that double indirection is not necessarily any less efficient than multiplicative indexing.
†A macro such as #define Arrayaccess(a, i, j) ((a)[(i) * ncolumns + (j)]) could hide
the explicit calculation. Invoking that macro, however, would require parentheses and commas that wouldn’t
look exactly like conventional C multidimensional array syntax, and the macro would need access to at least
one of the dimensions, as well.
int (*array4)[NCOLUMNS] =
or even
int (*array5)[NROWS][NCOLUMNS] =
(int (*)[NROWS][NCOLUMNS])malloc(sizeof(*array5));
Here, however, the syntax starts getting horrific (accesses to array5 look
like (*array5)[i][j]), and at most one dimension may be specified at
run time.
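The allocation code for the first two schemes is not reproduced above. The following is a sketch of how they might look, with the matching deallocation; the function names are invented for illustration, and both allocators treat a zero dimension as an error.

```c
#include <stdlib.h>

/* First scheme: a vector of row pointers, each row allocated separately. */
int **alloc_rows(size_t nrows, size_t ncolumns)
{
    int **a;
    size_t i;
    if (nrows == 0 || ncolumns == 0)
        return NULL;
    a = malloc(nrows * sizeof *a);
    if (a == NULL)
        return NULL;
    for (i = 0; i < nrows; i++) {
        a[i] = malloc(ncolumns * sizeof *a[i]);
        if (a[i] == NULL) {          /* undo the partial allocation */
            while (i > 0)
                free(a[--i]);
            free(a);
            return NULL;
        }
    }
    return a;
}

void free_rows(int **a, size_t nrows)
{
    size_t i;
    if (a == NULL)
        return;
    for (i = 0; i < nrows; i++)
        free(a[i]);                  /* one free per row ... */
    free(a);                         /* ... and one for the pointer vector */
}

/* Second scheme: one contiguous data block plus a vector of row pointers. */
int **alloc_contiguous(size_t nrows, size_t ncolumns)
{
    int **a;
    size_t i;
    if (nrows == 0 || ncolumns == 0)
        return NULL;
    a = malloc(nrows * sizeof *a);
    if (a == NULL)
        return NULL;
    a[0] = malloc(nrows * ncolumns * sizeof *a[0]);
    if (a[0] == NULL) {
        free(a);
        return NULL;
    }
    for (i = 1; i < nrows; i++)
        a[i] = a[0] + i * ncolumns;  /* row i starts i * ncolumns ints in */
    return a;
}

void free_contiguous(int **a)
{
    if (a == NULL)
        return;
    free(a[0]);                      /* the single data block */
    free(a);                         /* the vector of row pointers */
}
```

With either allocator the result is subscripted as a[i][j]. The second form costs only two calls to malloc and two to free, at the price of rows that cannot be resized individually.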
With all of these techniques, you may of course need to remember to free
the arrays when they are no longer needed; this takes several steps in the case
of array1 and array2 (see also question 7.23):
Also, you cannot necessarily intermix dynamically allocated arrays with conventional,
statically allocated ones (see question 6.20, and also question
6.18).
All of these techniques can also be extended to three or more dimensions.
Here is a three-dimensional version of the first technique:
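The three-dimensional code itself is not reproduced above. One way it could be written is sketched below; the names alloc3d and free3d are assumptions for illustration. Pointer slots are set to NULL explicitly before use so that a partly built array can be released safely.

```c
#include <stdlib.h>

/* Release an array built by alloc3d, including a partly built one. */
void free3d(int ***a, size_t xdim, size_t ydim)
{
    size_t i, j;
    if (a == NULL)
        return;
    for (i = 0; i < xdim; i++) {
        if (a[i] != NULL) {
            for (j = 0; j < ydim; j++)
                free(a[i][j]);       /* free(NULL) is a harmless no-op */
            free(a[i]);
        }
    }
    free(a);
}

/* Allocate an xdim-by-ydim-by-zdim array of int, one malloc per vector. */
int ***alloc3d(size_t xdim, size_t ydim, size_t zdim)
{
    int ***a = malloc(xdim * sizeof *a);
    size_t i, j;
    if (a == NULL)
        return NULL;
    for (i = 0; i < xdim; i++)
        a[i] = NULL;                 /* so free3d can run on a partial build */
    for (i = 0; i < xdim; i++) {
        a[i] = malloc(ydim * sizeof *a[i]);
        if (a[i] == NULL)
            goto fail;
        for (j = 0; j < ydim; j++)
            a[i][j] = NULL;
        for (j = 0; j < ydim; j++) {
            a[i][j] = malloc(zdim * sizeof *a[i][j]);
            if (a[i][j] == NULL)
                goto fail;
        }
    }
    return a;
fail:
    free3d(a, xdim, ydim);
    return NULL;
}
```

Elements are then reached as a[i][j][k], and the whole structure is released with free3d(a, xdim, ydim).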
6.17
Question: Here’s a neat trick: If I write
int realarray[10];
int *array = &realarray[-1];
Answer: Although this technique is attractive (and was used in old editions
of the book Numerical Recipes in C), it does not conform to the C standards.
Pointer arithmetic is defined only as long as the pointer points within the
same allocated block of memory or to the imaginary “terminating” element
one past it; otherwise, the behavior is undefined, even if the pointer is not
dereferenced. The code in the question computes a pointer to memory before
the beginning of realarray and could fail if, while subtracting the offset,
an illegal address were generated (perhaps because the address tried to “wrap
around” past the beginning of some memory segment).
References: K&R2 §5.3 p. 100, §5.4 pp. 102-3, §A7.7 pp. 205-6
ANSI §3.3.6
ISO §6.3.6
Rationale §3.2.2.3
means that functions that accept simple arrays seem to accept arrays of arbi¬
only on the “outermost” array, so the “width” and higher dimensions of mul¬
tion.
6.18
Question: My compiler complained when I passed a two-dimensional array
to a function expecting a pointer to a pointer. Why?
Answer: The rule (see question 6.3) by which arrays decay into pointers is
not applied recursively. An array of arrays (i.e., a two-dimensional array in C)
decays into a pointer to an array, not a pointer to a pointer. Pointers to arrays
can be confusing and must be treated carefully; see also question 6.13. (The
confusion is heightened by the existence of incorrect compilers, including
some old versions of pcc and pcc-derived lints, which improperly accept
assignments of multidimensional arrays to multilevel pointers.)
If you are passing a two-dimensional array to a function:
int array[NROWS][NCOLUMNS];
f(array);
f(int a[][NCOLUMNS])
{ ... }
or
f(int (*ap)[NCOLUMNS]) /* ap is a pointer to an array */
{ ... }
In the first declaration, the compiler performs the usual implicit parameter
rewriting of “array of array” to “pointer to array” (see questions 6.3 and
6.4); in the second form, the pointer declaration is explicit. Since the called
function does not allocate space for the array, it does not need to know the
overall size, so the number of rows, NROWS, can be omitted. The shape of
the array is still important, so the column dimension NCOLUMNS (and, for
three- or more dimensional arrays, the intervening ones) must be retained.
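The two parameter forms can be compared side by side. In this sketch the dimensions 2 and 3, the function names, and the extra nrows parameter are assumptions chosen for illustration.

```c
#define NROWS 2
#define NCOLUMNS 3

/* Array-style parameter: the compiler rewrites it as int (*a)[NCOLUMNS]. */
int sum_a(int a[][NCOLUMNS], int nrows)
{
    int i, j, sum = 0;
    for (i = 0; i < nrows; i++)
        for (j = 0; j < NCOLUMNS; j++)
            sum += a[i][j];
    return sum;
}

/* Explicit pointer-to-array parameter: exactly the same type as above. */
int sum_ap(int (*ap)[NCOLUMNS], int nrows)
{
    int i, j, sum = 0;
    for (i = 0; i < nrows; i++)
        for (j = 0; j < NCOLUMNS; j++)
            sum += ap[i][j];
    return sum;
}
```

Either function accepts int array[NROWS][NCOLUMNS] directly. The row count travels as an ordinary argument, because the parameter type records only the column dimension.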
If a function is already declared as accepting a pointer to a pointer, it is
probably meaningless to pass a two-dimensional array directly to it. An intermediate
pointer would have to be used when attempting to call it with a two-dimensional
array:
This usage, however, is misleading and almost certainly incorrect, since the
array has been “flattened” (its shape has been lost).
See also questions 6.12 and 6.15.
6.19
Question: How do I write functions that accept two-dimensional arrays
when the “width” is not known at compile time?
Answer: It’s not easy. One way is to pass in a pointer to the [0][0] element
along with the two dimensions and to simulate array subscripting “by
hand”:
Note that the correct expression for manual subscripting involves ncolumns
(the “width” of each row), not nrows (the number of rows); it’s easy to get
this backward.
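The function body is not reproduced above. A sketch of manual subscripting follows; the names elem and sum2d are invented here, and the test data lives in a one-dimensional array so that the example itself stays strictly conforming.

```c
/* Fetch what would be array[i][j] from a pointer to the [0][0] element. */
int elem(const int *aryp, int ncolumns, int i, int j)
{
    return aryp[i * ncolumns + j];   /* row offset uses ncolumns, not nrows */
}

/* Sum an nrows-by-ncolumns array laid out row by row. */
long sum2d(const int *aryp, int nrows, int ncolumns)
{
    long sum = 0;
    int i, j;
    for (i = 0; i < nrows; i++)
        for (j = 0; j < ncolumns; j++)
            sum += elem(aryp, ncolumns, i, j);
    return sum;
}
```

Multiplying i by ncolumns skips i complete rows, which is why the width of a row, rather than the number of rows, appears in the expression.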
This function could be called with the array from question 6.18 as
int array[NROWS][NCOLUMNS];
int **array1; /* ragged */
int **array2; /* contiguous */
int *array3; /* "flattened" */
int (*array4)[NCOLUMNS];
int (*array5)[NROWS][NCOLUMNS];
with the pointers initialized as in the code fragments in question 6.16 and
functions declared as
The following two calls would probably work on most systems but involve
questionable casts and work only if the dynamic ncolumns matches the static
NCOLUMNS:
Only f2, as shown, can conveniently be made to work with both statically
and dynamically allocated arrays, although it will not work with the traditional
“ragged” array implementation, array1. However, note that passing
&array[0][0] (or, equivalently, *array) to f2 is not strictly conforming;
see question 6.19.
If you can understand why all of the preceding calls work and are written
as they are and if you understand why the combinations that are not listed
would not work, you have a very good understanding of arrays and pointers
in C.
Rather than worrying about all of this, one approach to using multidimensional
arrays of various sizes is to make them all dynamic, as in question 6.16.
If there are no static multidimensional arrays (if all arrays are allocated like
array1 or array2 in question 6.16), all functions can be written like f3.
Sizes of Arrays
The sizeof operator will tell you the size of an array if it can, but it’s not
able to if the size is not known or if the array has already decayed to a
pointer.
6.21
Question: Why doesn’t sizeof properly report the size of an array when
the array is a parameter to a function? I have this test function and it prints
4, not 10:
f(char a[10])
{
    int i = sizeof(a);
    printf("%d\n", i);
}
Answer: The compiler pretends that the array parameter was declared as a
pointer (that is, in the example, as char *a; see question 6.4), and sizeof
reports the size of the pointer. See also questions 1.24 and 7.28.
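The effect can be observed without relying on any particular pointer size. In this sketch the parameter is written as char *a, which is what a parameter declared char a[10] is adjusted to; the function names are invented for illustration.

```c
#include <stddef.h>

/* Inside the callee only a pointer is visible, whatever the caller passed. */
size_t size_seen_by_callee(char *a)
{
    return sizeof a;             /* size of a pointer to char */
}

/* Where the array itself is in scope, sizeof still sees all of it. */
size_t size_seen_by_caller(void)
{
    char buf[10];
    return sizeof buf;           /* 10 bytes: buf has not decayed here */
}
```

The value the callee sees depends on the machine (4 in the question, often 8 on current systems), so a function that needs the element count should receive it as a separate argument.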
6.22
Question: How can code in a file where an array is declared as extern (i.e.
it is defined and its size determined in another file) determine the size of the
array? The sizeof operator doesn’t seem to work.
6.23
Question: How can I determine how many elements are in an array, when
sizeof yields the size in bytes?
Answer: Simply divide the size of the entire array by the size of one element:
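The division is commonly wrapped in a macro. The name NELEMS and the sample array below are assumptions for illustration; the macro is only meaningful when applied to a true array, not to a pointer.

```c
#include <stddef.h>

/* Number of elements in a true array (gives a wrong answer for a pointer). */
#define NELEMS(array) (sizeof(array) / sizeof((array)[0]))

static const int primes[] = {2, 3, 5, 7, 11};

/* Report how many initializers the compiler counted for primes. */
size_t count_primes(void)
{
    return NELEMS(primes);
}
```

Because the count is derived from the declaration, adding or removing an initializer changes it automatically.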
Many people assume that pointers are the most difficult aspect of C to learn.
Often, however, the problem is not so much managing the pointers as managing
the memory to which they point. In keeping with C’s low-level flavor,
bugs.
allocated.
MEMORY ALLOCATION 115
char *answer;
printf("Type something:\n");
gets(answer);
printf("You typed \"%s\"\n", answer);
int i;
printf("i = %d\n", i);
That is, in the first piece of code, we cannot say where the pointer answer
points, just as we cannot say what value i will have in the second. (Since local
variables are not initialized and typically contain garbage, it is not even guaranteed
that answer starts out as a null pointer. See questions 1.30 and 5.1.)
The simplest way to correct the question-asking program is to use a local
array instead of a pointer and to let the compiler worry about allocation:
#include <stdio.h>
#include <string.h>
This example also uses fgets() instead of gets(), so that the end of the
array cannot be overwritten. (See question 12.23. Unfortunately for this
example, fgets() does not automatically delete the trailing \n, as gets()
would.) It would also be possible to use malloc() to allocate the answer
buffer, and to parameterize the buffer size (with something like
Answer: As in question 7.1, the main problem here is that space for the concatenated
result is not properly allocated. C does not provide an automatically
managed string type. C compilers allocate memory only for objects
explicitly mentioned in the source code (in the case of “strings,” this includes
character arrays and string literals). The programmer must arrange for sufficient
space for the results of run-time operations, such as string concatenation,
typically by declaring arrays or by calling malloc.
The strcat function performs no allocation; the second string is
appended to the first one, in place. The first (destination) string must be
writable and have enough room for the concatenated result. Therefore, one
fix would be to declare the first string as an array:
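The question's code is not reproduced above. As a sketch of the array-based fix, assume the goal is to join "Hello, " and "world!"; the function name build_greeting and the size check are additions for illustration.

```c
#include <string.h>

/* Build "Hello, world!" in a writable buffer supplied by the caller.
   Returns buf, or NULL if the buffer is too small for the result. */
char *build_greeting(char *buf, size_t bufsize)
{
    static const char s1[] = "Hello, ";
    static const char s2[] = "world!";
    if (bufsize < sizeof s1 + sizeof s2 - 1)   /* both strings, one '\0' */
        return NULL;
    strcpy(buf, s1);         /* destination is an array with room to grow */
    strcat(buf, s2);         /* appended in place; strcat allocates nothing */
    return buf;
}
```

A caller declares the destination as an array, for example char str[14];, and passes it with sizeof str. Fourteen bytes is the minimum here: thirteen characters plus the terminating \0.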
Question: But the documentation for strcat says that it takes two char *
pointers as arguments. How am I supposed to know to allocate things?
Question: I’m reading lines from a file into an array, with this code:
char linebuf[80];
char *lines[100];
int i;
Why do all the lines end up containing copies of the last line?
Answer: You have allocated memory for only one line: linebuf. Each time
you call fgets, the previous line is overwritten. No memory is allocated by
fgets: unless it reaches EOF or encounters an error, the pointer it returns is
the same pointer you handed it as its first argument (in this case, a pointer to
the single linebuf array).
To make code like this work, you’ll need to allocate memory for each line.
See question 20.2 for an example.
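A minimal sketch of per-line allocation is shown here. It is not the example from question 20.2; the function name read_lines and its interface are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read up to maxlines lines from fp, storing a separately allocated copy
   of each in lines[].  Returns the number of lines stored.  A line longer
   than the buffer is split across entries, as with any fixed fgets buffer. */
int read_lines(FILE *fp, char *lines[], int maxlines)
{
    char linebuf[80];
    int n = 0;
    while (n < maxlines && fgets(linebuf, sizeof linebuf, fp) != NULL) {
        char *copy = malloc(strlen(linebuf) + 1);   /* +1 for '\0' */
        if (copy == NULL)
            break;                   /* out of memory: stop early */
        strcpy(copy, linebuf);       /* each entry gets its own storage */
        lines[n++] = copy;
    }
    return n;
}
```

Each lines[i] now points at distinct memory, so later reads cannot overwrite earlier lines. The caller is responsible for calling free on every stored pointer.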
Question: I have a function that is supposed to return a string, but when the
function returns to its caller, the returned string is garbage. Why?
Answer: Whenever a function returns a pointer, make sure that the pointed-to
memory is properly allocated. The returned pointer should be to a statically
allocated buffer, to a buffer passed in by the caller, or to memory
obtained with malloc, but not to a local (automatic) array. In other words,
never do something like this:
#include <stdio.h>
char *itoa(int n)
{
    char retbuf[20]; /* WRONG */
    sprintf(retbuf, "%d", n);
    return retbuf; /* WRONG */
}
When a function returns, its automatic, local variables are discarded, so the
returned pointer in this case is invalid (it points to an array that no longer
exists).
One fix would be to declare the return buffer as
    static char retbuf[20];
This fix is imperfect, since a function using static data is not reentrant. Furthermore,
successive calls to this version of itoa keep overwriting the same
return buffer: The caller won’t be able to call it several times and keep all the
return values around simultaneously.
Another fix is to have the caller pass space for the result:
char *itoa(int n, char retbuf[])
{
    sprintf(retbuf, "%d", n);
    return retbuf;
}
...
char str[20];
itoa(123, str);
#include <stdlib.h>
char *itoa(int n)
{
    char *retbuf = malloc(20);
    if(retbuf != NULL)
        sprintf(retbuf, "%d", n);
    return retbuf;
}
In this case, the caller must remember to free the returned pointer when it is
no longer needed.
See also questions 12.21 and 20.1.
Calling malloc
When you need more flexibly allocated data than you can achieve with static
allocation, it’s time to take the plunge and begin allocating memory dynamically,
usually using malloc. The questions in this section cover the basics of
modes.
an int (see question 1.25), which is not correct. (The same problem could
arise for calloc or realloc.) See also question 7.15.
Question: Why does some code carefully cast the values returned by malloc
to the pointer type being allocated?
Question: I wrote this little wrapper around malloc. Why doesn’t it work?
#include <stdio.h>
#include <stdlib.h>
}
}
Answer: See question 4.8. (In this case, you’ll want to have mymalloc
return the allocated pointer.)
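The wrapper from the question is not reproduced above. Assuming it tried to hand the new pointer back through one of its parameters, a version that returns the pointer instead might look like the following sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate size bytes or exit.  The pointer comes back as the return
   value, because assigning to a parameter would change only the callee's
   copy.  size is assumed to be nonzero. */
void *mymalloc(size_t size)
{
    void *retp = malloc(size);
    if (retp == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return retp;
}
```

A caller then writes p = mymalloc(10); and can use p without a further null check.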
Question: I’m trying to declare a pointer and allocate some space for it, but
it’s not working. What’s wrong with this code?
char *p;
*p = malloc(10);
7.11
Question: How can I dynamically allocate arrays?
7.12
Question: How can I find out how much memory is available?
7.13
Question: What should malloc(0) do: return a null pointer or a pointer
to 0 bytes?
Question: I’ve heard that some operating systems don’t allocate memory
obtained via malloc until the program tries to use it. Is this legal?
Answer: It’s hard to say. The standard doesn’t say that systems can act this
way, but it doesn’t explicitly say that they can’t, either. (Such a “deferred failure”
implementation would not seem to conform to the implied requirements
of the standard.)
The conspicuous problem is that by the time the program gets around to
trying to use the memory, there might not be any. The program in this case
must typically be killed by the operating system, since the semantics of C provide
no recourse. (Obviously, malloc is supposed to return a null pointer if
there’s no memory, so that the program, as long as it checks malloc’s return
value at all, never tries to use more memory than is available.)
Systems that do this “lazy allocation” usually provide extra signals indicating
that memory is dangerously low, but portable or naive programs won’t
catch them. Some systems that do lazy allocation also provide a way to turn
it off (reverting to traditional malloc semantics) on a per-process or per-user
basis, but the details vary from system to system.
Question: Why is malloc returning crazy pointer values? I did read question
7.6, and I have included the declaration extern void *malloc(); before
I call it.
7.16
Question: I’m allocating a large array for some numeric work, using the line
Answer: Notice that 256 x 256 is 65,536, which will not fit in a 16-bit int,
even before you multiply it by sizeof(double). If you need to allocate this
much memory, you’ll have to be careful. If size_t (the type accepted by
malloc) is a 32-bit type on your machine but int is 16 bits, you might be
able to get away with writing 256 * (256 * sizeof(double)) (see
question 3.14). Otherwise, you’ll have to break up your data structure into
smaller chunks, use a 32-bit machine, or use some nonstandard memory allocation
routines. See also questions 7.15 and 19.23.
7.17
Question: I’ve got 8 MB of memory in my PC. Why does malloc seem to
allocate only 640K or so?
7.18
Question: My application depends heavily on dynamic allocation of nodes
for data structures, and malloc/free overhead is becoming a bottleneck.
What can I do?
Answer: One improvement, particularly attractive if all nodes are the same
size, is to place unused nodes on your own free list rather than actually calling
free on them. (This approach works well when one kind of data structure
dominates a program’s memory use, but it can cause as many problems
as it solves if so much memory is tied up in the list of unused nodes that it
isn’t available for other purposes.)
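A private free list of this kind can be sketched in a few lines. The node layout and the names node_alloc and node_release are assumptions for illustration.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

static struct node *freelist = NULL;   /* unused nodes, linked through next */

/* Hand out a recycled node if one is available, else fall back to malloc. */
struct node *node_alloc(void)
{
    struct node *np = freelist;
    if (np != NULL)
        freelist = np->next;           /* unlink the first recycled node */
    else
        np = malloc(sizeof *np);
    return np;                         /* NULL only if malloc failed */
}

/* Instead of calling free, push the node onto the private list. */
void node_release(struct node *np)
{
    if (np == NULL)
        return;
    np->next = freelist;
    freelist = np;
}
```

Nodes on the list are never returned to malloc in this sketch, which is exactly the tradeoff described above: allocation becomes cheap, but the memory stays reserved for nodes.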
Freeing Memory
Memory allocated with malloc can persist as long as you need it. It is never
7.20
Question: Dynamically allocated memory can’t be used after freeing it,
can it?
Answer: No. Some early documentation for malloc stated that the contents
of freed memory were “left undisturbed,” but this ill-advised guarantee
was never universal and is not required by the C standard.
Few programmers would use the contents of freed memory deliberately,
but it is easy to do so accidentally. Consider the following (correct) code for
freeing a singly linked list:
Notice that if the code used the more obvious loop iteration expression
listp = listp->next (without the temporary nextp pointer), it would
be trying to fetch listp->next from freed memory.
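The loop being described is not reproduced above. A sketch consistent with the description follows; listp and nextp are the names used in the text, while the structure layout and the returned count are assumptions for illustration.

```c
#include <stdlib.h>

struct list {
    int item;
    struct list *next;
};

/* Free every node of a singly linked list; returns how many were freed. */
size_t free_list(struct list *base)
{
    struct list *listp, *nextp;
    size_t n = 0;
    for (listp = base; listp != NULL; listp = nextp) {
        nextp = listp->next;   /* fetch the link while the node still exists */
        free(listp);
        n++;
    }
    return n;
}
```

The temporary is what keeps the loop from reading a member of a node that has already been handed back to the allocator.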
7.21
Question: Why isn’t a pointer null after calling free? How unsafe is it to
use (assign, compare) a pointer value after it’s been freed?
Answer: When you call free, the memory pointed to by the passed pointer
is freed, but the value of the pointer in the caller remains unchanged, because
C’s pass-by-value semantics mean that called functions never permanently
change the values of their arguments. (See also question 4.8.)
A pointer value that has been freed is, strictly speaking, invalid, and any
use of it, even if it is not dereferenced (i.e., even if the use of it is a seemingly
innocuous assignment or comparison) can theoretically lead to trouble. (We
7.22
Answer: Yes. Remember that a pointer is different from what it points to.
Local variables* are deallocated when the function returns, but in the case of
a pointer variable, this means that the pointer is deallocated, not what it
points to. Memory allocated with malloc always persists until you explicitly
free it. (If the only pointer to a block of memory is a local pointer and if that
pointer disappears, that block cannot be freed.) In general, for every call to
malloc, there should be a corresponding call to free.
7.23
Answer: Yes. The malloc and free functions know nothing about structure
declarations or about the contents of allocated memory; they especially
do not know whether allocated memory contains pointers to other allocated
memory. In general, you must arrange that each pointer returned from
malloc be individually passed to free, exactly once (if it is freed at all).
*Strictly speaking, it is automatic variables that are deallocated when a function returns.
128 CHAPTER 7
A good rule of thumb is that for each call to malloc in a program, you
should be able to point at the call to free that frees the memory allocated
by that malloc call.
See also question 7.24.
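The rule of thumb can be illustrated with a structure that owns a second allocation. The structure and function names in this sketch are assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct person {
    char *name;              /* separately allocated */
    int age;
};

/* Two calls to malloc: one for the structure, one for the name. */
struct person *person_new(const char *name, int age)
{
    struct person *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->name = malloc(strlen(name) + 1);
    if (p->name == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->name, name);
    p->age = age;
    return p;
}

/* Two matching calls to free, member first. */
void person_free(struct person *p)
{
    if (p == NULL)
        return;
    free(p->name);           /* must happen while p is still valid */
    free(p);
}
```

Freeing the structure first would leave no valid way to reach p->name, so the order in person_free matters.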
7.24
Question: Must I free allocated memory before the program exits?
Answer: You shouldn’t have to. It’s the operating system’s job to reclaim all
memory when a program exits; the system cannot afford to have memory
integrity depend on the whims of random programs. (Strictly speaking, it is
not even free’s job to return memory to the operating system; see question
7.25.) Nevertheless, some personal computers are said not to reliably recover
memory unless it was freed before exiting, and all that can be inferred from
the ANSI/ISO C Standard is that this issue is one of “quality of implementation.”
In any case, it can be considered good practice to explicitly free all memory,
for instance in case the program is ever rewritten to perform its main
task more than once (perhaps under a graphical user interface).* On the other
hand, some programs, such as interpreters, don’t know what memory they’re
done with (i.e., what memory could be freed) until it’s time to exit, and since
all memory should be released at exit, it would be a needless, potentially
expensive, and error-prone exercise for the program to explicitly free all of it.
7.25
Question: I have a program that allocates and later frees a lot of memory,
but memory usage doesn’t seem to go back down. Why?
*Also, unless a program frees all memory it knows about, it’s much more difficult to see real memory leaks
detected by a “leak checking” tool (see question 18.2).
size, but once it’s allocated, you can't ask the malloc package what that size
is. (For one thing, if you could ask, should it tell you the size you asked for
7.26
Question: How does free know how many bytes to free?
7.27
Question: So can I query the malloc package to find out how big an allocated
block is?
Answer: Not portably. If you need to know, you’ll have to keep track of it
yourself.
7.28
Question: Why doesn’t sizeof tell me the size of the block of memory
pointed to by a pointer?
Answer: The sizeof operator does not know that malloc has been used
to allocate a pointer; sizeof tells you the size of the pointer itself. There is
no portable way to find out the size of a block allocated by malloc.
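Keeping track of the size yourself usually means storing it next to the pointer. The following sketch shows one way to do that; the structure and function names are assumptions for illustration.

```c
#include <stdlib.h>

/* A pointer and the size that was requested for it, kept together. */
struct buffer {
    char *data;
    size_t size;
};

/* Returns nonzero on success; on failure the buffer is left empty. */
int buffer_init(struct buffer *b, size_t size)
{
    b->data = malloc(size);
    b->size = (b->data != NULL) ? size : 0;
    return b->data != NULL;
}

void buffer_free(struct buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

After buffer_init(&b, 100) succeeds, b.size is 100, whereas sizeof b.data is still just the size of a pointer to char.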
7.29
Question: Having dynamically allocated an array (as in question 6.14), can
I change its size?
Answer: Yes. This is exactly what realloc is for. To change the size of a
dynamically allocated array (e.g., dynarray from question 6.14), use code
like this:
Note that realloc may not always be able to enlarge* memory regions in
place. When it is able to, it simply gives you back the same pointer you
handed it, but if it must go to some other part of memory to find enough contiguous
space, it will return a different pointer, and the previous pointer value
will become unusable.
If realloc cannot find enough space at all, it returns a null pointer and
leaves the previous region allocated.† Therefore, you usually don’t want to
immediately assign the new pointer to the old variable. Instead, use a temporary
pointer:
#include <stdio.h>
#include <stdlib.h>
int *newarray =
dynarray = newarray;
else {
*Nor, for that matter, is realloc always able to shrink regions in place.
†Beware, though, that some pre-ANSI implementations were not always able to preserve the prior region when
realloc failed.
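The temporary-pointer idiom can be packaged as a small helper. The name resize_ints and its interface are assumptions for illustration; newarray plays the same role as in the fragment above.

```c
#include <stdlib.h>

/* Resize *arrayp to hold newcount ints (newcount assumed nonzero).
   Returns 1 on success.  On failure returns 0 and leaves *arrayp
   pointing at the original, still allocated, region. */
int resize_ints(int **arrayp, size_t newcount)
{
    int *newarray = realloc(*arrayp, newcount * sizeof **arrayp);
    if (newarray == NULL)
        return 0;            /* old block untouched; caller still owns it */
    *arrayp = newarray;      /* the block may have moved */
    return 1;
}
```

Writing dynarray = realloc(dynarray, ...) directly would overwrite the only copy of the old pointer with a null pointer on failure, leaking the original block.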
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int tmpoffset;
p = malloc(10);
tmpoffset = p2 - p;
if(newp != NULL) {
strcpy(p2, ", world!");
printf("%s\n", p);
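Only part of this example survives above. It appears to save the offset of a second pointer (p2) before reallocating, so that p2 can be recomputed afterward. A complete sketch along those lines follows; the initial string "Hello,", the function name hello_world, and the use of ptrdiff_t in place of int are assumptions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Build "Hello, world!" by growing a 10-byte block to 20 bytes.
   Returns the block (the caller frees it) or NULL on failure. */
char *hello_world(void)
{
    char *p, *p2, *newp;
    ptrdiff_t tmpoffset;

    p = malloc(10);
    if (p == NULL)
        return NULL;
    strcpy(p, "Hello,");          /* 6 characters plus '\0' fit in 10 */
    p2 = strchr(p, ',');          /* p2 points into the block */

    tmpoffset = p2 - p;           /* remember p2 relative to p */
    newp = realloc(p, 20);
    if (newp == NULL) {
        free(p);                  /* the old block is still allocated */
        return NULL;
    }
    p = newp;                     /* the block may have moved */
    p2 = p + tmpoffset;           /* recompute p2 from the saved offset */
    strcpy(p2, ", world!");       /* 9 bytes at offset 5: within 20 */
    return p;
}
```

Any pointer into a block that may be reallocated needs the same treatment, since its old value becomes unusable if the block moves.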
7.30
Question: Is it legal to pass a null pointer as the first argument to realloc()?
Why would you want to?
Answer: ANSI C sanctions this usage (and the related realloc(..., 0),
which frees), although several earlier implementations do not support it, so it
may not be fully portable. Passing an initially null pointer to realloc can
make it easier to write a self-starting incremental allocation algorithm.
For example, here is a function that reads an arbitrarily long line into
dynamically allocated memory, reallocating the input buffer as necessary.
(The caller must free the returned pointer when it is no longer needed.)
#include <stdio.h>
#include <stdlib.h>
{
char *retbuf = NULL;
size_t nchmax = 0;
register int c;
size_t nchread = 0;
char *newbuf;
nchmax += 20;
/* +1 for \0 */
if(newbuf == NULL) {
free(retbuf);
return NULL;
retbuf = newbuf;
if(c == '\n')
break;
retbuf[nchread++] = c;
}
if(retbuf != NULL) {
retbuf[nchread] = '\0';
if(newbuf != NULL)
retbuf = newbuf;
return retbuf;
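Several lines of the function above are missing. The following is a self-contained sketch of the same idea, assuming an ANSI realloc that accepts a null first argument; the function name read_long_line is invented here, while the variable names follow the fragment.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from fp into malloc'ed memory, without the
   trailing newline.  Returns NULL at end of file or if memory runs out. */
char *read_long_line(FILE *fp)
{
    char *retbuf = NULL;
    char *newbuf;
    size_t nchmax = 0;       /* characters the buffer holds, excluding '\0' */
    size_t nchread = 0;
    int c;

    while ((c = getc(fp)) != EOF) {
        if (nchread >= nchmax) {
            nchmax += 20;
            newbuf = realloc(retbuf, nchmax + 1);   /* +1 for '\0' */
            if (newbuf == NULL) {
                free(retbuf);
                return NULL;
            }
            retbuf = newbuf;
        }
        if (c == '\n')
            break;
        retbuf[nchread++] = (char)c;
    }

    if (retbuf != NULL) {
        retbuf[nchread] = '\0';
        newbuf = realloc(retbuf, nchread + 1);      /* trim unused space */
        if (newbuf != NULL)
            retbuf = newbuf;
    }
    return retbuf;
}
```

Because retbuf starts out null, the first pass through the loop allocates the buffer and later passes grow it, with no separate malloc call needed to get started.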
7.31
Question: What’s the difference between calloc and malloc? Which
should I use? Is it safe to take advantage of calloc’s zero-filling? Does free
work on memory allocated with calloc, or do you need a cfree?
p = malloc(m * n);
memset(p, 0, m * n);
There is no important difference between the two other than the number of
arguments and the zero fill.*
Use whichever function is convenient. Don’t rely on calloc’s zero fill too
much; usually, it’s best to initialize data structures yourself, on a field-by-field
basis, especially if there are pointer fields. Since calloc’s zero fill is all bits
zero, it is guaranteed to yield the value 0 for all integral types (including \0
for character types). But it does not guarantee useful null pointer values (see
Chapter 5) or floating-point zero values.
*It is sometimes argued that calloc’s 0-fill guarantees early (nonlazy) allocation; see question 7.14.
Yes, free is properly used to free the memory allocated by calloc; there
is no standard cfree function.
One imagined distinction that is not significant between malloc and
calloc is whether a single element or an array of elements is being allo¬
cated. Although calloc’s two-argument calling convention suggests that it is
supposed to be used to allocate an array of m items of size n, there is no such
requirement; it is perfectly permissible to allocate one item with calloc (by
passing one argument as 1) or to allocate an array with malloc (by doing
the multiplication yourself; see, for example, the code fragment in question
6.14). (Nor does structure padding enter into the question; any padding nec¬
essary to make arrays of structures work correctly is always handled by the
compiler and is reflected by sizeof. See question 2.13.)
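The malloc-plus-memset equivalent of calloc can be written as a function and compared with the real thing. The name zeroed_malloc is an assumption for illustration, and this sketch does not guard against m * n overflowing size_t.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate m objects of n bytes each and set every byte to zero. */
void *zeroed_malloc(size_t m, size_t n)
{
    void *p = malloc(m * n);
    if (p != NULL)
        memset(p, 0, m * n);     /* all bits zero, as with calloc */
    return p;
}
```

Memory obtained either way is released with free. For integer types the zero fill yields the value 0; pointer or floating-point members are better assigned explicitly, as the answer above advises.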
7.32
Question: What is alloca and why is its use discouraged?
*Although an “almost portable” implementation of alloca exists in the public domain, its author intended
it as a stopgap measure and recommends against use of alloca in new code.
†If a call to alloca tries to allocate some memory on the stack in the middle of the preparation, on the same
stack, of the argument list for another function call (fgets, in this case), the argument list may well be perturbed.
Characters and Strings
character set. Because these representations are laid bare and are visible to C
ters and strings are manipulated. The downside is that to some extent, programs
have to exert this control: The programmer must remember whether a
question 8.6) and must remember to maintain arrays (and allocated blocks of
See also questions 13.1 through 13.7, which cover library functions for
string handling.
136 CHAPTER 8
Answer: Characters and strings are very different, and strcat concatenates
strings.
A character constant like '!' represents a single character. A string literal
between double quotes usually represents multiple characters. Although a
string literal like "!" seems to represent a single character, it contains two:
the ! you requested and the \0 that terminates all strings in C.
Characters in C are represented by small integers corresponding to their
character set values (see also question 8.6). Strings are represented by arrays
of characters; you usually manipulate a pointer to the first character of the
array. It is never correct to use one when the other is expected. To append !
to a string, use
strcat(string, "!");
char *string;
if(string == "value") {
To compare two strings, you generally use the library function strcmp:
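A minimal sketch of such a comparison, assuming the string is to be tested against the literal "value" as in the fragment above; the function name is_value is invented for illustration.

```c
#include <string.h>

/* Return nonzero if string has the same contents as "value". */
int is_value(const char *string)
{
    /* strcmp returns 0 when the two strings are equal, so test against 0. */
    return strcmp(string, "value") == 0;
}
```

The == operator applied to two char * values compares the pointers, not the characters they point to, which is why a library call is needed.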
8.3
Question: If I can say
char a[14];
a = "Hello, world!";
Answer: Strings are arrays, and you can’t assign arrays directly. Use
strcpy instead:
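The copy can be sketched as follows, using the 14-element array from the question; the function name fill_greeting is an assumption for illustration.

```c
#include <string.h>

/* Copy "Hello, world!" into a, which must have room for at least
   14 chars: 13 characters plus the terminating '\0'. */
void fill_greeting(char *a)
{
    strcpy(a, "Hello, world!");
}
```

With char a[14]; in the caller, fill_greeting(a) leaves the array holding the string. The array is exactly large enough, so a longer literal would overflow it.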
8.5
Question: What is the difference between these initializations?
Question: How can I get the numeric value (i.e., ASCII or other character
set code) corresponding to a character, or vice versa?
prints
A 65 A 65
Question: Does C have anything like the “substr” (extract substring) routine
present in other languages?
Question: I’m reading strings typed by the user into an array and then
printing them out later. When the user types a sequence like \n, why isn’t it
being handled properly?
8.10
Question: I’m starting to think about multinational character sets. Should I
worry about the implications of making sizeof(char) be 2 so that 16-bit
character sets can be represented?
Answer: If type char were made 16 bits, sizeof(char) would still be 1,
and CHAR_BIT in <limits.h> would be 16, and it would simply be
impossible to declare (or allocate with malloc) a single 8-bit object.
C provides no formal, built-in Boolean type. Boolean values are just integers
(though with greatly reduced range!), so they can be held in any integral type.
C interprets a zero value as “false” and any nonzero value as “true.” The
relational and logical operators, such as ==, !=, <, >=, &&, and ||, return the
Question: What is the right type to use for Boolean values in C? Why isn’t
it a standard type? Should I use #defines or enums for the true and false
values?
Answer: C does not provide a standard Boolean type, in part because picking
one involves a space/time tradeoff that can best be decided by the programmer.
(Using an int may be faster, whereas using char may save data
space.* Smaller types may make the generated code bigger or slower, though,
if they require lots of conversions to and from int.)
*Bitfields may be even more compact; see also question 2.26. An unsigned bitfield would be required; a 1-bit
signed bitfield cannot portably hold the value +1.
142 CHAPTER 9
or
or
These don’t buy anything (see question 9.2; see also questions 5.12 and 10.2).
Answer: Even though any nonzero value is considered true in C, this applies
only “on input,” i.e., where a Boolean value is expected. When a Boolean
BOOLEAN EXPRESSIONS AND VARIABLES 143
if((a == b) == TRUE)
would work as expected (as long as TRUE is 1), but it is obviously silly. In
general, explicit tests against TRUE and FALSE are inappropriate. In particular,
and unlike the built-in operators, some library functions (notably
isupper, isalpha, etc.) return, on success, a nonzero value that is not necessarily
1, so comparing their return values against a single value, such as
TRUE, is quite risky and likely not to work.
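The safe way to use such a function is to test its result directly. A short sketch follows; the function name count_upper is an assumption for illustration.

```c
#include <ctype.h>

/* Count the uppercase letters in s, testing isupper's result directly. */
int count_upper(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++)
        if (isupper((unsigned char)*s))   /* any nonzero result means true */
            n++;
    return n;
}
```

Had the test been written isupper(...) == TRUE with TRUE defined as 1, it would silently miss every letter on an implementation whose isupper returns some other nonzero value.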
(Besides, if you believe that
if((a == b) == TRUE)
is an improvement over
if(a == b)
or even
See also Lewis Carroll’s essay “What the Tortoise Said to Achilles.”)
Given that if (a == b) is a perfectly legitimate conditional, so is this:
#include <ctype.h>
if(isupper(c))
{ ... }
if(isvegetable)
{ ... }
or
if(fileexists(outfile))
{ ... }
if(isvegetable == TRUE)
or
if(fileexists(outfile) == YES)
are not really any improvement. (They can be thought of as “safer” or “better
style,” but they can also be thought of as risky or poor style. They certainly
don’t read as smoothly. See question 17.10.)
A good rule of thumb is to use TRUE and FALSE (or the like) only for
assignment to a Boolean variable or function parameter or as the return value
from a Boolean function, never in a comparison.
See also question 5.3.
Question: Should I use symbolic names, such as TRUE and FALSE, for
Boolean constants or use plain 1 and 0?
Answer: It’s your choice. Preprocessor macros like these are used for code
readability, not because the underlying values might ever change. It’s a matter
of style, not correctness, whether to use symbolic names or raw 1/0 values.
(The same argument applies to the NULL macro. See also questions 5.10 and
17.10.)
On the one hand, using a symbolic name (e.g., TRUE or FALSE) reminds
the reader that a Boolean value is involved. On the other hand, Boolean values
and definitions can evidently be confusing, and some programmers feel
that TRUE and FALSE macros only compound the confusion. (See also question
5.9.)
Question: A third-party header file I just started using is defining its own
TRUE and FALSE values incompatibly with the code I’ve already developed.
What can I do?
unlike the rest of C in several respects. As its name suggests, the preprocessor
know about the structure of the code as seen by the rest of the compiler, it
The first part of this chapter is arranged around the major preprocessor
10.6 through 10.11), and #if (questions 10.12 through 10.19). Questions
10.20 through 10.25 cover fancier macro replacement, and finally questions
10.26 and 10.27 cover a particular set of problems relating to the preproces¬
THE C PREPROCESSOR 147
Macro Definitions
#define square(x) x * x
1 / square(n)
would expand to
1 / n * n
1 / (n * n)
In this case, the problem is one of associativity rather than precedence, but
the effect is the same.
2. Within the macro definition, all occurrences of the parameters must be
parenthesized to protect any low-precedence operators in the actual arguments
from the rest of the macro expansion. Again given the square()
macro, the invocation
square(n + 1)
would expand to
n + 1 * n + 1
(n + 1) * (n + 1)
148 CHAPTER 10
3. If a parameter appears several times in the expansion, the macro may not
work properly if the actual argument is an expression with side effects. Yet
again given the square() macro, the invocation
square(i++)
would expand to
i++ * i++
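The first two rules can be sketched as follows. The names SQUARE and square_int are illustrative only; the function is one possible way around the side-effect problem, since a function evaluates its argument exactly once.

```c
/* Illustrative name: every use of the parameter, and the whole body,
 * is wrapped in parentheses. */
#define SQUARE(x) ((x) * (x))

/* When the argument may have side effects (e.g., i++), a real function
 * is safer, because the argument is evaluated only once. */
static int square_int(int x)
{
    return x * x;
}
```

With this definition, 1 / SQUARE(n) expands to 1 / ((n) * (n)), and SQUARE(n + 1) expands to ((n + 1) * (n + 1)).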
10.2
Question: Here are some cute preprocessor macros:
#define begin {
#define end }
With these, I can write C code that looks more like Pascal.
10.3
Question: How can I write a generic macro to swap two values?
Answer: This question has no good answer. If the values are integers, a well-known
trick using exclusive-OR could perhaps be used, but it will not work
for floating-point values or pointers or if the two values are the same variable.*
If the macro is intended to be used on values of arbitrary type (the
usual goal), any solution involving a temporary variable is problematic,
because:
• It’s difficult to give the temporary a name that won’t clash with anything.
Any name you pick might be the actual name of one of the variables being
swapped. If you tried using ## to concatenate the names of the two actual
arguments, to ensure that it won’t match either one, it might still not be
unique if the concatenated name is longer than 31 characters,† and it
wouldn’t let you swap things (such as a[i]) that aren’t simple identifiers.
You could probably get away with using a name like _tmp in the “no
man’s land” between the user and implementation namespaces; see question
1.29.
*Also, the “obvious” supercompressed implementation for integral types a^=b^=a^=b is illegal, due to multiple
side effects; see question 3.2.
†The C standard does not require compilers to look beyond the first 31 characters of an identifier.
• Either the temporary can’t be declared with the right type (because standard
C does not provide a typeof operator), or (if it copies objects byte
by byte, perhaps with memcpy, to a temporary array sized with sizeof)
the macro can’t be used on operands that are declared register.
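If a macro is wanted anyway, one compromise is to have the caller pass the type as an extra argument. The sketch below is only that: the macro name SWAP and the temporary's name swap_tmp_ are arbitrary choices made for this example, and the temporary could still clash with a caller's identifier, as described above.

```c
/*
 * Illustrative SWAP macro: the caller names the type, so no typeof
 * operator is needed. The temporary's name is an arbitrary choice.
 */
#define SWAP(type, a, b)      \
    do {                      \
        type swap_tmp_ = (a); \
        (a) = (b);            \
        (b) = swap_tmp_;      \
    } while (0)
```

It would be invoked as SWAP(int, x, y); or SWAP(double, a[i], a[j]);. Both arguments are evaluated twice, so they should not have side effects.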
MACRO(arg1, arg2);
This means that the “caller” will be supplying the final semicolon, so the
macro body should not. The macro body cannot therefore be a simple brace-
enclosed compound statement, because of the possibility that the macro could
be used as the if branch of an if/else statement with an explicit else
clause:
if(cond)
    MACRO(arg1, arg2);
if(cond)
    {stmt1; stmt2;};
/*...*/ \
} while(0) /* (no trailing ; ) */
When the caller appends a semicolon, this expansion becomes a single state¬
ment regardless of context. (An optimizing compiler will remove any “dead”
tests or branches on the constant condition 0, although lint may complain.)
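A complete instance of the do/while(0) form might look like the following sketch. The macro LOG_AND_COUNT and the function classify are made up for this illustration.

```c
#include <stdio.h>

/* Illustrative macro: two statements wrapped so they act as one. */
#define LOG_AND_COUNT(msg, counter) \
    do {                            \
        puts(msg);                  \
        (counter)++;                \
    } while (0) /* (no trailing ;) */

/* The macro works as the if branch of an if/else without extra braces. */
static int classify(int n)
{
    int negatives = 0;

    if (n < 0)
        LOG_AND_COUNT("negative", negatives);
    else
        puts("non-negative");
    return negatives;
}
```

Had the macro body been a bare brace-enclosed block, the semicolon after the invocation would have ended the if statement and left the else dangling.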
Another possibility might be
stmt1; \
stmt2; \
} else
This is inferior, however, since it quietly breaks the surrounding code if the
caller happens to forget to append the semicolon on invocation.
If all of the statements in the intended macro are simple expressions, with
no declarations or loops, another technique is to write a single, parenthesized
expression using one or more comma operators:
For an example, see the first DEBUG() macro in question 10.26. This technique
also allows a value (in this case, expr3) to be “returned.”
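As a small sketch of the comma-operator form (the macro name BUMP_BOTH is invented for this example):

```c
/* Illustrative macro: three expressions joined by comma operators.
 * The value of the whole expression is that of the last one. */
#define BUMP_BOTH(a, b) ((a)++, (b)++, (a) + (b))
```

Because each comma operator is a sequence point, the two increments are complete before the sum is computed, and the macro can be used wherever an expression is allowed, for example sum = BUMP_BOTH(i, j);.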
Some compilers, e.g., gcc, are also able to expand compact functions in
line, either automatically or at the programmer’s request, perhaps with a nonstandard
“inline” keyword or other extension.
10.5
Question: What’s the difference between using a typedef or a preprocessor
macro for a user-defined type?
Header Files
Question: I’m splitting up a program into multiple source files for the first
time. What should I put in .c files and what should I put in .h files? (What
does “.h” mean, anyway?)
Answer: As a general rule, you should put these things in header (.h) files:
#ifndef HFILENAME_USED
#define HFILENAME_USED
(A different bracketing macro name is, of course, used for each header file.)
Finally, automated Makefile maintenance tools (which are a virtual necessity
in large projects anyway; see question 18.1) handle dependency generation in
the face of nested #include files easily.
See also question 17.10.
10.8
Question: Where are header (“#include”) files searched for?
* Strictly speaking, <> headers do not have to exist as files at all. The <> syntax is usually reserved for system-
defined headers.
then (if not found) in the same standard places. (This last rule, that "" files
are additionally searched for as if they were <> files, is the only rule specified
by the standard.)
Another distinction is the definition of “current directory” for "" files.
Traditionally (especially under UNIX compilers), the current directory is
taken to be the directory containing the file containing the #include directive.
Under other compilers, however, the current directory is the directory in
which the compiler was initially invoked. (Compilers running on systems
without directories or without the notion of a current directory may, of
course, use still different rules.)
It is also common for there to be a way (usually a command line option
involving capital I or maybe an environment variable) to add additional
directories to the list of standard places to search. Check your compiler
documentation.
10.9
Question: Why am I getting strange syntax errors on the very first declaration
in a file?
Answer: Perhaps there’s a missing semicolon at the end of the last declaration
in the last header file you’re including. See also questions 2.18, 11.29,
and 16.1.
10.10
Question: I’m using header files that accompany two different third-party
libraries, and they are “helpfully” defining common macros, such as TRUE,
FALSE, Min(), and Max(), but the definitions clash with each other and
with definitions I’d already established in my own header files. What can I
do?
10.11
Question: I seem to be missing the system header file <sgtty.h>. Where
can I get a copy?
Conditional Compilation
10.12
Question: How can I construct preprocessor #if expressions that compare
strings?
Answer: You can’t do it directly; preprocessor #if arithmetic uses only integers.
You can define several manifest constants, however, and implement conditionals
on those:
#define RED 1
#define BLUE 2
#define GREEN 3
(Standard C specifies a new #elif directive that makes if/else chains like
these a bit cleaner.)
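A fuller sketch of the idea, using #elif, might look like this. The COLOR and COLOR_NAME macros are assumptions of the example; in practice COLOR would typically be set in a configuration header or with a compiler command line option.

```c
#define RED   1
#define BLUE  2
#define GREEN 3

/* COLOR would normally come from a header or the compiler command line. */
#ifndef COLOR
#define COLOR BLUE
#endif

#if COLOR == RED
#define COLOR_NAME "red"
#elif COLOR == BLUE
#define COLOR_NAME "blue"
#elif COLOR == GREEN
#define COLOR_NAME "green"
#else
#error "COLOR must be RED, BLUE, or GREEN"
#endif
```

The comparison is between integers; the strings appear only in the code selected by the conditional.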
See also question 20.17.
10.13
10.14
Question: Can I use #ifdef in a #define line to define something in two
different ways, like this?
#define a b \
#ifdef whatever
c d
#else
e f g
#endif
Answer: No. You can’t “run the preprocessor on itself,” so to speak. What
you can do is use one of two completely separate #define lines, depending
on the #ifdef setting:
#ifdef whatever
#define a b c d
#else
#define a b e f g
#endif
10.15
Question: Is there anything like an #ifdef for typedefs?
Answer: Unfortunately, no. (There can’t be, because types and typedefs
haven’t been parsed at preprocessing time.) See also questions 1.13 and
10.13.
10.16
Question: How can I use a preprocessor #if expression to tell whether a
machine’s byte order is big-endian or little-endian?
Answer: You probably can’t. The usual techniques for detecting endianness
involve pointers or arrays of char or maybe unions, but preprocessor arithmetic
uses only long integers, and there is no concept of addressing. Furthermore,
the integer formats used in preprocessor #if expressions are not
necessarily the same as those that will be used at run time.
Are you sure you need to know the machine’s endianness explicitly? Usually,
it’s better to write code that doesn’t care (see, for example, the code fragments
in question 12.42). See also question 20.9.
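Both approaches can be sketched at run time. The function names here are invented for the illustration: the first inspects the lowest-addressed byte of an integer through an unsigned char pointer, and the second shows code that does not care, because it assembles a value from bytes in a stated order.

```c
/* Returns 1 if the lowest-addressed byte of an unsigned int is its
 * least significant byte (little-endian), and 0 otherwise. */
static int is_little_endian(void)
{
    unsigned int one = 1;

    return *(unsigned char *)&one == 1;
}

/* Byte-order-independent: reads a 16-bit value that was stored most
 * significant byte first, whatever the host's own byte order. */
static unsigned int read_be16(const unsigned char *buf)
{
    return ((unsigned int)buf[0] << 8) | buf[1];
}
```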
10.17
Question: Why am I getting strange syntax errors inside lines I’ve used
#ifdef to disable?
10.18
Question: I inherited some code that contains far too many #ifdefs for
my taste. How can I preprocess the code to leave only one conditional compilation
set, without running it through the preprocessor and expanding all of
the #includes and #defines as well?
10.19
Fancier Processing
Macro replacement can get fairly complicated; sometimes too complicated.
For two somewhat popular tricks that used to work (if at all) by accident,
namely, “token pasting” and replacement inside string literals, ANSI C intro¬
Question: I have some old code that tries to construct identifiers with a
macro like
Here is one other method you could try for pasting tokens under a pre-
ANSI compiler:
#define XPaste(s) s
#define Paste(a, b) XPaste(a)b
10.21
Question: I have an old macro
tchars.t_eofc = CTRL(D);
based on the assumption that the actual value of the parameter c will be substituted
even inside the single quotes of a character constant. Preprocessing
was never supposed to work this way, however; it was somewhat of an accident
that a CTRL() macro like this ever worked. ANSI C defines a new
“stringizing” operator (see question 11.17), but there is no corresponding
“charizing” operator.
The best solution to this problem is probably to move the single quotes
from the definition to the invocation, by rewriting the macro as
and invoking it as
CTRL('D')
or
#define CTRL(c) (#c[0] & 037)
Neither of these would work as well as the original, however, since they
wouldn’t be valid in case labels or as global variable initializers. (Global
variable initializers and case labels require various flavors of constant
expressions in which string literals and indirection are not allowed.)
See also question 11.18.
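With the quotes moved to the invocation, the macro might be sketched as follows. The mask 037 is taken from the variant shown above; the function is_eof_key is invented here to show that CTRL('D') remains a constant expression and so can still be used in a case label. The value 4 assumes an ASCII character set.

```c
/* The caller supplies the quotes; 037 keeps the low five bits. */
#define CTRL(c) ((c) & 037)

/* Illustrative use: CTRL('D') is an integer constant expression,
 * so it is acceptable as a case label. */
static int is_eof_key(int ch)
{
    switch (ch) {
    case CTRL('D'):
        return 1;
    default:
        return 0;
    }
}
```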
10.22
Question: Why is the macro
10.23
Question: How can I use a macro argument inside a string literal in the
macro expansion?
10.24
Question: I’m trying to use the ANSI “stringizing” preprocessing operator
‘#’ to insert the value of a symbolic constant into a message, but it keeps
stringizing the macro’s name rather than its value. Why?
10.25
Question: How can I do this really tricky preprocessing?
of arguments (the canonical example is printf; see also Chapter 15). For
the same sorts of reasons, it’s sometimes wished that a function-like macro
10.26
Question: How can I write a macro that takes a variable number of arguments
or use the preprocessor to “turn off” a function call with a variable
number of arguments?
Answer: One popular trick is to define and invoke the macro with a single,
parenthesized “argument,” which in the macro expansion becomes the entire
argument list, parentheses and all, for a function such as printf:
The obvious disadvantage is that the caller must always remember to use the
extra parentheses. Another problem is that the macro expansion cannot insert
any additional arguments (that is, DEBUG() couldn’t expand to something
like fprintf(debugfd, ...)).
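The double-parenthesis trick can be sketched like this. The "DEBUG: " prefix and the function report are choices made for the example.

```c
#include <stdio.h>

/* The single macro "argument" is the entire parenthesized printf
 * argument list, which is pasted directly after the name printf. */
#define DEBUG(args) (printf("DEBUG: "), printf args)

static int report(int n)
{
    /* Note the doubled parentheses at the call site. The value of the
     * expression is the return value of the second printf. */
    return DEBUG(("n is %d\n", n));
}
```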
The GNU C compiler has an extension that allows a function-like macro
to accept a variable number of arguments, but it’s not standard. Other possible
solutions are:
#define _ ,
DEBUG("i = %d" _ i)
(These all require care on the part of the user, and all of them are rather ugly.)
It is often better to use a true function, which can take a variable number
of arguments in a well-defined way. See questions 15.4 and 15.5.
When you want to turn the debugging printouts off, you can use a different
version of your debug macro:
Or, if you’re using real function calls, you can use still more preprocessor
tricks to remove the function name but not the arguments, such as:
These tricks are predicated on the assumption that a good optimizer will
remove any “dead” printf calls or degenerate cast-to-void parenthesized
comma expressions. See also question 10.14.
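One way to arrange the switch is sketched below. The symbol DEBUG_ON is a hypothetical name for this example, and the disabled version expands to ((void)0) rather than to nothing, which is one choice among several.

```c
#include <stdio.h>

/* Define DEBUG_ON (a hypothetical switch) to enable the printouts. */
#ifdef DEBUG_ON
#define DEBUG(args) (printf("DEBUG: "), printf args)
#else
#define DEBUG(args) ((void)0)   /* an expression that does nothing */
#endif

static int twice(int n)
{
    DEBUG(("doubling %d\n", n));
    return 2 * n;
}
```

Keeping the disabled form an expression means the invocation remains valid anywhere the enabled form was.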
10.27
Question: How can I include expansions of the __FILE__ and __LINE__
macros in a general-purpose debugging macro?
#include <stdio.h>
#include <stdarg.h>
expands to
{
dbgfile = file;
dbgline = line;
return debug;
}
With these definitions, debug("i is %d", i); gets expanded to:
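The general idea can be sketched in a simpler form than the fragments above: a helper records the file and line in static variables, and a comma expression then calls the variable-argument function. This is an illustration under stated assumptions, not necessarily the exact code intended here; the macro name DEBUG and the message format are choices made for the sketch.

```c
#include <stdio.h>
#include <stdarg.h>

static const char *dbgfile;
static int dbgline;

/* Records where the macro was invoked. */
static void dbginfo(int line, const char *file)
{
    dbgfile = file;
    dbgline = line;
}

/* Prints the recorded location, then the caller's formatted message. */
static void debug(const char *fmt, ...)
{
    va_list argp;

    fprintf(stderr, "DEBUG: \"%s\", line %d: ", dbgfile, dbgline);
    va_start(argp, fmt);
    vfprintf(stderr, fmt, argp);
    va_end(argp);
    fprintf(stderr, "\n");
}

/* The caller's argument list attaches to the trailing name debug. */
#define DEBUG dbginfo(__LINE__, __FILE__), debug
```

With this sketch, DEBUG("i is %d", i); expands to dbginfo(__LINE__, __FILE__), debug("i is %d", i);.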
by ISO 9899:1990 and its ongoing revisions) marked a major step in C’s
guities in the language, but it introduced a few new features and definitions
with pre-ANSI compilers try to use code written since the standard became
widely adopted.
Institute, so it’s often called “ANSI C.” The ANSI C Standard was adopted
sometimes called “ISO C.” ANSI eventually adopted the ISO version (superseding
the original), so it’s now often called “ANSI/ISO C.” Unless you’re
making a distinction about the wording of the original ANSI standard before
168 CHAPTER 11
subject of C is implicit in the discussion, it’s also common to use the word
The Standard
11.1
Question: What is the “ANSI C Standard?”
• ISO Sales
Case Postale 56
CH-1211 Geneve 20
Switzerland
At the time of this writing, the cost is $130.00 from ANSI or $410.00
from Global. Copies of the original X3.159 (including the Rationale) may
still be available at $205.00 from ANSI or $162.50 from Global. Note that
ANSI derives revenues to support its operations from the sale of printed standards,
so electronic copies are not available.
In the United States, it may be possible to get a copy of the original ANSI
X3.159 (including the Rationale) as “FIPS PUB 160” from:
Function Prototypes
The most significant introduction in ANSI C is the function prototype (bor¬
acceptable, which makes the rules for prototypes somewhat more complicated.
11.3
Question: Why does my ANSI compiler complain about a mismatch when
it sees
int func(x)
float x;
{ ... }
short integers are promoted to int. (For old-style function definitions, the
values are automatically converted back to the corresponding narrower types
within the body of the called function, if they are declared that way there.)
Therefore, the old-style definition in the question actually says that func
takes a double (which will be converted to float inside the function).
This problem can be fixed in two ways. One is to use new-style syntax
consistently in the definition:
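A minimal sketch of the consistent new-style form follows; the function body is invented for the example. (The other usual fix, not shown here, is to leave the old-style definition alone and declare the prototype with a double parameter to match it.)

```c
/* Prototype and definition agree: func really takes a float. */
int func(float x);

int func(float x)
{
    return (int)(x * 2);
}
```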
11.4
Answer: Doing so is perfectly legal and can be useful for backward compatibility,
as long as you’re careful (see especially question 11.3). Note, however,
that old-style syntax is marked as obsolescent, so official support for it
may be removed some day.
*Changing a parameter’s type may require additional changes if the address of that parameter is taken and
must have a particular type.
11.5
Question: Why does the declaration
struct x;
printf("%d", n) ;
where n was actually a long int. Aren’t ANSI function prototypes supposed
to guard against argument type mismatches like this?
11.7
Question: I heard that you have to include <stdio.h> before calling
printf. Why?
tem: type qualifiers. Type qualifiers can modify pointer types in several ways
(affecting either the pointer or the object pointed to), so qualified pointer declarations
can be tricky. (The questions in this section refer to const, but
11.8
Question: Why can’t I use const values in initializers and array dimensions,
as in:
const int n = 5;
int a[n];
11.9
Question: What’s the difference between const char *p, char const *p,
and char * const p?
Answer: The first two are interchangeable; they declare a pointer to a constant
character (which means that you can’t change the character). On the
other hand, char * const p declares a constant pointer to a (variable)
character (i.e., you can’t change the pointer). Read these declarations “inside
out” to understand them; see question 1.21.
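The three declarations can be compared side by side in a short sketch. The function and variable names are invented for the example, and the two rejected assignments are left as comments so that the fragment compiles.

```c
static int const_demo(void)
{
    char buf[] = "ab";
    char other[] = "cd";

    const char *p = buf;    /* pointer to const char: *p is read-only */
    char const *q = buf;    /* the same type as p, spelled differently */
    char * const r = buf;   /* const pointer to char: r cannot be reseated */

    p = other;              /* OK: the pointer itself may change */
    q = other;              /* OK for the same reason */
    *r = 'X';               /* OK: the pointed-to character may change */
    /* *p = 'X';   rejected: the character is const when seen through p */
    /* r = other;  rejected: the pointer r is const */

    return buf[0] == 'X' && *p == 'c' && *q == 'c';
}
```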
11.10
Question: Why can’t I pass a char ** to a function that expects a
const char **?
Answer: You can use a pointer to T (for any type T) where a pointer to
const T is expected. However, the rule (an explicit exception) that permits
slight mismatches in qualified pointer types is not applied recursively but only
at the top level. (Since const char ** is pointer to pointer to const char,
the exception does not apply.)
The reason you cannot assign a char ** value to a const char **
pointer is somewhat obscure. Given that the const qualifier exists at all, the
compiler would like to help you keep your promises not to modify const
values. That’s why you can assign a char * to a const char * but not the
other way around: It’s clearly safe to “add” const-ness to a simple pointer,
but it would be dangerous to take it away. However, suppose that you performed
the following, more complicated series of assignments:
*p2 = &c; /* 4 */
*p1 = 'X'; /* 5 */
ANSI/ISO STANDARD C 175
11.11
Question: I’ve got the declarations
Answer: Typedef substitutions are not purely textual. (This is one of the
advantages of typedefs; see question 1.13.) In the declaration
const charp p;
p is const for the same reason that const int i declares i as const. The
declaration of p does not “look inside” the typedef to see that a pointer is
involved.
*C++ has more complicated rules for assigning const-qualified pointers, allowing you to make more kinds of
assignments without incurring warnings but still protecting against inadvertent attempts to modify const values.
C++ would still not allow assigning a char ** to a const char **, but it would let you get away with
assigning a char ** to a const char * const *.
Using main()
main, the declaration of main is unique because it has two acceptable argument
lists, and the rest of the declaration (in particular, the return type) is dictated
by a factor outside of the program’s control, namely, the startup code
11.12
Question: Can I declare main as void to shut off these warnings about
main not returning a value?
int main(void)
int main(int argc, char **argv)
int main()
Finally, the int return value can be omitted, since int is the default (see
question 1.25).
If you’re calling exit but still getting warnings, you may have to insert a
redundant return statement (or use some kind of “not reached” directive, if
available).
Declaring a function as void does not merely shut off or rearrange warnings;
it may also result in a different function call/return sequence, incompatible
with what the caller (in main’s case, the C run-time startup code)
expects. That is, if the calling sequences for void- and int-valued functions
differ, the startup code is going to be calling main using specifically the
11.13
Question: What about main’s third argument, envp?
11.14
Question: I believe that declaring void main() can’t fail, since I’m calling
exit instead of returning. Anyway, my operating system ignores a program’s
exit/return status.
11.15
Question: But why do all my books declare main as void?
Answer: They’re wrong, or they’re assuming that everyone writes code for
systems on which it happens to work.
11.16
Question: Is exit(status) truly equivalent to returning the same
status from main?
Answer: Yes and no. The standard says that a return from the initial call to
main is equivalent to calling exit. However, a few older, nonconforming
systems may have problems with one or the other form. Also, a return from
main cannot be expected to work if data local to main might be needed during
cleanup; see also question 16.4. (Finally, the two forms are obviously not
equivalent in a recursive call to main.)
Preprocessor Features
ANSI C introduced a few new features into the C preprocessor, including the
#define Str(x) #x
#define OP plus
This code sets opname to "plus" rather than to "OP". (It works because the
Xstr() macro expands its argument, and then Str() stringizes it.)
An equivalent circumlocution is necessary with the token-pasting operator
## when the values (rather than the names) of two macros are to be concatenated.
Note that both # and ## operate only during preprocessor macro expansion.
You cannot use them in normal source code, only in macro definitions.
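Both two-level tricks can be sketched together. Str and OP are taken from the fragment above; the definition of Xstr is the conventional one and is assumed here, while Paste, Xpaste, PREFIX, SUFFIX, and the variable names are invented for the illustration.

```c
#define Str(x) #x
#define Xstr(x) Str(x)            /* extra level: x is expanded first */
#define Paste(a, b) a##b
#define Xpaste(a, b) Paste(a, b)  /* the same trick for token pasting */

#define OP plus
#define PREFIX my
#define SUFFIX var

static const char *opname = Xstr(OP);    /* the string "plus" */
static const char *rawname = Str(OP);    /* the string "OP" */
static int Xpaste(PREFIX, SUFFIX) = 42;  /* declares a variable named myvar */
```

Calling Paste(PREFIX, SUFFIX) directly would instead produce the single identifier PREFIXSUFFIX, because operands of ## are not macro-expanded before pasting.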
11.18
Question: What does the message “warning: macro replacement within a
string literal” mean?
printf("TRACE: i = %d\n", i) ;
In other words, macro parameters were expanded even inside string literals
and character constants. (This interpretation may even have been an accident
of early implementations, but it can prove useful for macros like this.)
Macro expansion is not defined in this way by K&R or by the C standard.
(It can be dangerous and confusing; see question 10.22.) When you do want
to turn macro arguments into strings, you can use the new # preprocessing
operator, along with string literal concatenation (another new ANSI feature):
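A sketch of such a macro, matching the TRACE output shown earlier, might read as follows; the wrapper function trace_demo is invented for the example.

```c
#include <stdio.h>

/* #var and #fmt become string literals, and adjacent string literals
 * are then concatenated into a single format string. */
#define TRACE(var, fmt) printf("TRACE: " #var " = " #fmt "\n", var)

static int trace_demo(void)
{
    int i = 7;

    /* Expands to printf("TRACE: " "i" " = " "%d" "\n", i); */
    return TRACE(i, %d);
}
```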
11.19
Question: Why am I getting strange syntax errors inside lines I’ve used
#ifdef to disable?
Answer: Under ANSI C, the text inside a “turned off” #if, #ifdef, or
#ifndef must still consist of “valid preprocessing tokens.” This means that
there must be no newlines inside quotes and no unterminated comments or
quotes. (Note particularly that an apostrophe within a contracted word looks
like the beginning of a character constant.) Therefore, natural-language comments
and pseudocode should always be written between the “official” comment
delimiters /* and */. (But see questions 20.20 and also 10.25.)
11.20
Question: What is the #pragma directive and what is it good for?
11.21
Question: What does “#pragma once” mean? I found it in some header
files.
11.22
Question: Is char a[3] = "abc"; legal? What does it mean?
11.23
Question: Since array references decay into pointers, if array is an array,
what’s the difference between array and &array?
11.24
Question: Why can’t I perform arithmetic on a void * pointer?
Answer: The compiler doesn’t know the size of the pointed-to objects.
(Remember that pointer arithmetic is always in terms of the pointed-to size;
see also question 4.4.) Therefore, arithmetic on a void * is disallowed
(although some compilers allow it as an extension). Before performing arithmetic,
convert the pointer either to char * or to the pointer type you’re trying
to manipulate (but see also questions 4.5 and 16.7).
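For byte offsets, the conversion to char * might be sketched like this (the function name is invented for the example):

```c
#include <stddef.h>

/* Advances a generic pointer by nbytes by going through char *,
 * whose pointed-to size is 1 by definition. */
static void *advance_bytes(void *p, size_t nbytes)
{
    return (char *)p + nbytes;
}
```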
11.25
Question: What’s the difference between memcpy and memmove?
dp += n; sp += n;
while(n-- > 0)
*--dp = *--sp;
}
return dest;
}
The problem with this code is in that additional test; the pointer comparison
(dp < sp) is not quite portable (it compares two pointers that do not necessarily
point within the same object) and may not be as cheap as it looks. On
some machines, particularly segmented architectures, it may be tricky and significantly
less efficient* to implement.
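For reference, a complete version of the kind of implementation being discussed might look like the sketch below. It is named my_memmove to avoid colliding with the library function, and it contains the same not-quite-portable (dp < sp) comparison that the text warns about.

```c
#include <stddef.h>

/* Copies n bytes, choosing the direction so that overlapping source
 * and destination regions are handled correctly. */
static void *my_memmove(void *dest, const void *src, size_t n)
{
    char *dp = dest;
    const char *sp = src;

    if (dp < sp) {
        /* Destination is lower: copy forward. */
        while (n-- > 0)
            *dp++ = *sp++;
    } else {
        /* Destination is higher (or equal): copy backward. */
        dp += n;
        sp += n;
        while (n-- > 0)
            *--dp = *--sp;
    }
    return dest;
}
```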
11.26
Question: What should malloc(0) do: return a null pointer or a pointer
to 0 bytes?
Answer: The ANSI/ISO C Standard says that it may do either; the behavior
is implementation-defined (see question 11.33). Portable code must either
take care not to call malloc(0) or be prepared for the possibility of a null
return.
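One way of taking care not to call malloc(0) is a small wrapper, sketched here under the assumption that rounding a zero-byte request up to one byte is acceptable to the caller. The name alloc_nonzero is invented for the example.

```c
#include <stdlib.h>

/* Illustrative wrapper: never asks malloc for 0 bytes, so a null
 * return always means that the allocation failed. */
static void *alloc_nonzero(size_t n)
{
    return malloc(n ? n : 1);
}
```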
11.27
Question: Why does the ANSI standard not guarantee more than six case-
insensitive characters of external identifier significance?
Answer: The problem is older linkers, which are under the control of neither
the ANSI/ISO C Standard nor the C compiler developers on the systems
that have them. The limitation is only that identifiers be significant in the first
six characters, not that they be restricted to six characters in length. This limitation
is annoying, but certainly not unbearable, and is marked in the standard
as “obsolescent,” i.e., a future revision will likely relax it.
*For example, a correct test under a segmented architecture might require pointer normalization.
This concession to current, restrictive linkers really had to be made (the
Rationale notes that its retention was “most painful”). Several tricks have
been proposed by which a compiler burdened with a restrictive linker could
present the C programmer with the appearance of more significance in external
identifiers; the excellently worded §3.1.2 in the X3.159 Rationale (see
question 11.1) discusses some of these schemes and explains why they could
not be mandated. You can rely on uniqueness of longer or mixed-case identifiers
if your environment supports them, but be prepared for extra work if
you ever have to port such code to a more restrictive environment.
11.28
Question: What was noalias and what ever happened to it?
Answer: The type qualifier noalias (in the same syntactic class as const
and volatile) was intended to assert that an object was not pointed to
(“aliased”) by other pointers. The primary application would have been for
the formal parameters of functions designed to perform computations on
large arrays. A compiler cannot usually take advantage of vectorization or
other parallelization hardware (on supercomputers that have it) unless it can
ensure that the source and destination arrays do not overlap.
The noalias keyword was not backed up by any “prior art,” and it was
introduced late in the review and approval process. It was surprisingly difficult
to define precisely and explain coherently, and sparked widespread, acrimonious
debate. It had far-ranging implications, particularly for several standard
library interfaces, for which easy fixes were not readily apparent.
Because of the criticism and the difficulty of defining noalias well, the
committee declined to adopt it, in spite of its superficial attractions. (When
writing a standard, features cannot be introduced halfway; their full integration,
and all implications, must be understood.) The need for an explicit
mechanism to support parallel implementation of nonoverlapping operations
remains unfilled, although some work is being done on the problem.
cient new functionality that ANSI code is not necessarily acceptable to older
or may accept (and therefore seem to condone) code that the standard says is
suspect.
11.29
{
return 0;
would be rewritten as
10. Cross your fingers. (In other words, the steps listed here are not always
sufficient; more complicated changes, not covered by any cookbook conversions,
may be required.)
11.30
Question: Why are some ANSI/ISO C Standard library functions showing
up as undefined, even though I’ve got an ANSI compiler?
Answer: It’s possible to have a compiler available that accepts ANSI syntax
but not to have ANSI-compatible header files or run-time libraries installed.
(In fact, this situation is rather common when using a non-vendor-supplied
compiler, such as gcc.) See also questions 11.29, 13.25, and 13.26.
11.31
Answer: Two programs, protoize and unprotoize, convert back and forth
between prototyped and old-style function definitions and declarations.
(These programs do not handle full-blown translation between classic C
and ANSI C.) These programs are part of the FSF’s GNU C compiler distribution;
see question 18.3.
The unproto program (pub/unix/unproto5.shar.Z on ftp.win.tue.nl) is a filter
that sits between the preprocessor and the next compiler pass, converting
most of ANSI C to traditional C on the fly.
The GNU GhostScript package comes with a little program called
ansi2knr.
Before converting ANSI C back to old style, beware that such a conversion
cannot always be made both safely and automatically. ANSI C introduces
new features and complexities not found in K&R C. You’ll especially need to
be careful of prototyped function calls; you’ll probably need to insert explicit
casts. See also questions 11.3 and 11.29.
Several prototype generators exist, many as modifications to lint. A program
called CPROTO was posted to comp.sources.misc in March 1992.
There is another program, called “cextract.” Many vendors supply simple
utilities like these with their compilers. See also question 18.16. (But be careful
when generating prototypes for old functions with “narrow” parameters;
see question 11.3.)
Finally, are you sure you really need to convert lots of old code to ANSI
C? The old-style function syntax is still acceptable, and a hasty conversion
can easily introduce bugs. (See question 11.3.)
11.32
Question: Why won’t my C compiler, which claims to be ANSI compliant,
accept this code? I know that the code is ANSI, because gcc accepts it.
Compliance
Obviously, the whole point of having a standard is so that programs and
compilers can be compatible with it (and therefore with each other). Com¬
compliance, and the scope of the standard’s definitions is not always as com¬
issues are not precisely specified; portable programs must simply avoid
11.33
Answer: First of all, all three of these represent areas in which the C standard
does not specify exactly what a particular construct, or a program that
uses it, must do. This looseness in C’s definition is traditional and deliberate:
It permits compiler writers to (1) make choices that allow efficient code to be
generated by arranging that various constructs are implemented as “however
the hardware does them” (see also question 14.4), and (2) ignore (that is,
avoid worrying about generating correct code for) certain marginal constructs
that are too difficult to define precisely and that probably aren’t useful to
well-written programs, anyway (see, for example, the code fragments in questions
3.1, 3.2, and 3.3).
The three variations on “not precisely defined by the standard” are defined
as:
11.34
Question: I’m appalled that the ANSI standard leaves so many issues undefined.
Isn’t a standard’s whole job to standardize these things?
11.35
Somebody told me that in basketball you can’t hold the ball and run. I got a basketball
and tried it and it worked just fine. He obviously didn’t understand basketball.
A program isn’t very useful unless you can tell it what to do and it can tell
you what it has done. Almost any program must therefore do some I/O. C’s
library*—and these functions are therefore some of the most used in C’s
libraries.
from, and write to files. Files are treated as sequential character streams,
although seeking is possible. You can make a distinction between text and
binary files if it’s meaningful. You name a file by a string that represents an
Chapter 19). Three predefined I/O streams are opened for your program
implicitly: You can read from stdin, which is often an interactive keyboard,
and you can write to stdout or stderr, both of which are often the user’s
*When we refer to “the stdio library,” we really mean “the stdio functions within the standard C run-time
library,” or “the functions described by <stdio.h>.”
192 CHAPTER 12
screen. There is, however, very little defined functionality concerning the
and scanf (questions 12.12 through 12.20). Questions 12.21 through 12.26
cover other stdio functions. When you need access to a specific file, you can
either open it with fopen (questions 12.27 through 12.32) or redirect a stan¬
text I/O, you can resort to “binary” streams (questions 12.37 through 12.42).
Before delving into all those particulars, however, here are a few simple,
Basic I/O
char c;
Answer: For one thing, the variable to hold getchar’s return value must be
an int. EOF is an “out of band” return value from getchar, distinct from
all possible char values that getchar can return. (On modern systems, it
does not reflect any actual end-of-file character stored in a file; it is a signal
that no more characters are available.) The values returned by getchar must
be stored in a variable larger than char so that it can hold all possible char
values and EOF.
Two failure modes are possible if, as in the preceding fragment, getchar’s
return value is assigned to a char.
1. If type char is signed and if EOF is defined (as is usual) as -1, the
character with the decimal value 255 ('\377' or '\xff' in C) will be
sign-extended and will compare equal to EOF, prematurely terminating the
input.*
*The value 255 assumes that type char is 8 bits. On some systems chars are larger, but the possibility of
analogous failure modes remains.
THE STANDARD I/O LIBRARY 193
2. If type char is unsigned, an EOF value will be truncated (by having its
higher-order bits discarded, probably resulting in 255 or 0xff) and will
not be recognized as EOF, resulting in effectively infinite input.*
The bug can go undetected for a long time, however, if chars are signed and
if the input is all 7-bit characters. (Whether plain char is signed or unsigned
is implementation-defined.)
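As a minimal sketch of the point above (the helper name copy_stream is hypothetical, not from the text), the variable receiving the return value of getc or getchar is declared as an int, so that a data byte of 255 and the EOF indication stay distinguishable:

```c
#include <stdio.h>

/* Copy every character from in to out; returns the number copied.
   copy_stream is a hypothetical helper name used for illustration. */
long copy_stream(FILE *in, FILE *out)
{
    int c;              /* int, not char: must hold every char value plus EOF */
    long count = 0;

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        count++;
    }
    return count;
}
```

Calling copy_stream(stdin, stdout) gives the classic character-copying loop; with c declared as char, one of the two failure modes described above would be likely to appear.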
12.2
Question: Why does the simple line-copying loop
while(!feof(infp)) {
fputs(buf, outfp);
Answer: In C, EOF is indicated only after an input routine has tried to read
and has reached end of file. (In other words, C’s I/O is not like Pascal’s.)
Usually, you should just check the return value of the input routine:
fputs(buf, outfp);
Generally, you don’t need to use feof at all. (Occasionally, feof or its
companion ferror is useful after a stdio call has returned EOF or NULL, to
distinguish between an end-of-file condition and a read error.)
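A line-copying loop written in the style this answer recommends might look like the following sketch. The names copy_lines and MAXLINE (and the buffer size) are assumptions made for illustration. The loop tests the return value of fgets itself, and ferror is consulted only afterward to tell end of file from a read error:

```c
#include <stdio.h>

#define MAXLINE 256   /* hypothetical line-buffer size */

/* Copy lines from infp to outfp. Returns the number of successful
   fgets calls, or -1 if the loop ended because of a read error. */
long copy_lines(FILE *infp, FILE *outfp)
{
    char buf[MAXLINE];
    long nlines = 0;

    /* Test the input routine's own return value, not feof() beforehand. */
    while (fgets(buf, MAXLINE, infp) != NULL) {
        fputs(buf, outfp);
        nlines++;
    }
    /* fgets has returned NULL; ferror says whether that was an error. */
    return ferror(infp) ? -1 : nlines;
}
```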
*As in the previous paragraph, the value 255 assumes that type char is 8 bits. On some systems chars are
larger, but the possibility of analogous failure modes remains.
12.3
Question: I’m using fgets to read lines from a file into an array of pointers.
Why do all the lines end up containing copies of the last line?
12.4
12.5
Question: How can I read one character at a time without waiting for the
Return key?
*Another possibility might be to use setbuf or setvbuf to turn off buffering of the output stream, but
buffering is a Good Thing, and completely disabling it can lead to crippling inefficiencies.
printf Formats
12.6
Question: How can I print a '%' character in a printf format string? I
tried \%, but it didn’t work.
printf("%d\n", n);
Answer: Whenever you print long ints, you must use the l (lowercase
letter “ell”) modifier in the printf format (e.g., %ld). Since printf can’t
know the types of the arguments you’ve passed to it, you must let it know by
using the correct format specifiers.
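For instance, a small sketch (format_long is a hypothetical helper, and the caller is assumed to supply a large enough buffer) that passes a long int together with the matching %ld specifier:

```c
#include <stdio.h>

/* Format a long into buf, which the caller must make large enough.
   Returns the number of characters written, as sprintf does. */
int format_long(char *buf, long n)
{
    return sprintf(buf, "%ld", n);   /* %ld matches the long argument */
}
```

Using plain %d here would make printf fetch an int where a long was passed, which is exactly the mismatch the answer warns about.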
12.8
Question: Aren’t ANSI function prototypes supposed to guard against argument
type mismatches?
12.9
Question: Someone told me that it is wrong to use %lf with printf. How
can printf use %f for type double if scanf requires %lf?
Answer: It’s true that printf’s %f specifier works with both float and
double arguments.* Due to the “default argument promotions” (which
apply in variable-length argument lists,† such as printf’s, whether or not
prototypes are in scope), values of type float are promoted to double, and
printf therefore sees only doubles. See also question 15.2.
This situation is completely different for scanf, which accepts pointers
for which no such promotions apply. Storing into a float (via a pointer) is
very different from storing into a double, so scanf distinguishes between
%f and %lf.
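The asymmetry can be seen in a short sketch (roundtrip is a hypothetical helper; sscanf and sprintf stand in for scanf and printf so that the example needs no interactive input). The scanning side needs %f for the float pointer and %lf for the double pointer, while the printing side uses %f for both values:

```c
#include <stdio.h>

/* Read a float and a double from text, then print both with %f into out.
   Returns the number of values sscanf converted. */
int roundtrip(const char *text, char *out)
{
    float f = 0.0f;
    double d = 0.0;
    int n = sscanf(text, "%f %lf", &f, &d);  /* pointers: the formats must differ */

    sprintf(out, "%.2f %.2f", f, d);         /* f is promoted to double here */
    return n;
}
```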
The following table lists the argument types expected by printf and
scanf for the various format specifiers.
*Everything said here is equally true of %e and %g and the corresponding scanf formats %le and %lg.
†In fact, the default argument promotions apply only in the variable-length part of variable-length argument
lists; see Chapter 15.
Format        printf          scanf
%c            int             char *
%d, %i        int             int *
%o, %u, %x    unsigned int    unsigned int *
%s            char *          char *
%p            void *          void **
%n            int *           int *
%%            none            none
12.10
Question: How can I implement a variable field width with printf? That
is, instead of something like %8d, I want the width to be specified at run time.
Answer: Use printf("%*d", width, n). The asterisk in the format
specifier indicates that an int value from the argument list will be used for
the field width. (Note that in the argument list, the width precedes the value
to be printed.) See also question 12.15.
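A brief sketch of the run-time width (format_width is a hypothetical wrapper, and sprintf is used only so the result can be inspected):

```c
#include <stdio.h>

/* Print n into buf right-justified in a field of the given width.
   The width argument comes before the value, matching "%*d". */
int format_width(char *buf, int width, int n)
{
    return sprintf(buf, "%*d", width, n);
}
```

With a width of 5 and a value of 42 the result is three spaces followed by 42. A negative width argument is treated as a - flag followed by a positive width, so -5 left-justifies the value instead.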
12.11
Question: How can I print numbers with commas separating the thousands?
What about currency-formatted numbers?
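#include <locale.h>

char *commaprint(unsigned long n)
{
    static int comma = '\0';
    static char retbuf[30];
    char *p = &retbuf[sizeof(retbuf)-1];
    int i = 0;

    if(comma == '\0') {     /* first call: look up the locale's separator */
        struct lconv *lcp = localeconv();
        if(lcp != NULL && lcp->thousands_sep != NULL &&
                *lcp->thousands_sep != '\0')
            comma = *lcp->thousands_sep;
        else
            comma = ',';
    }

    *p = '\0';

    do {                    /* build the digits from right to left */
        if(i%3 == 0 && i != 0)
            *--p = comma;
        *--p = '0' + n % 10;
        n /= 10;
        i++;
    } while(n != 0);

    return p;
}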
A better implementation would use the grouping field of the lconv structure
rather than assuming groups of three digits. A safer size for retbuf
might be 4*(sizeof(long)*CHAR_BIT+2)/3/3 + 1. See question 12.21.
scanf Formats
12.12
Question: Why doesn’t the call scanf("%d", i) work?
Answer: The arguments you pass to scanf must always be pointers. For
each value converted, scanf “returns” it by filling in one of the locations
you’ve passed pointers to. (See also question 20.1.) To fix the preceding
fragment, change it to scanf("%d", &i).
12.13
double d;
scanf("%f", &d);
Answer: Unlike printf, scanf uses %lf for values of type double and
%f for float.* The %f format tells scanf to expect a pointer to float, not
the pointer to double you gave it. Either use %lf or declare the receiving
variable as a float. See also question 12.9.
12.14
Question: Why doesn’t this code work?
short int s;
scanf("%d", &s);
#define WIDTH 3
#define Str(x) #x
If the width is a run-time variable, though, you’ll have to build the format
specifier at run time, too:
char fmt[10];
(Such scanf formats are unlikely when reading from standard input but
might find some usefulness with fscanf or sscanf.)
See also questions 11.17 and 12.10.
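Building the format at run time might be sketched as follows. The function name scan_fixed_width, the larger fmt buffer, and the range check on width are assumptions added for illustration, and sscanf is used so the example is self-contained:

```c
#include <stdio.h>

/* Convert at most width characters of text as a decimal int.
   Returns sscanf's result (1 on success), or 0 for an unusable width. */
int scan_fixed_width(const char *text, int width, int *out)
{
    char fmt[32];

    if (width < 1 || width > 999)
        return 0;
    sprintf(fmt, "%%%dd", width);   /* a width of 3 yields the format "%3d" */
    return sscanf(text, fmt, out);
}
```

The doubled %% in the sprintf format produces the single % that begins the generated scanf specifier.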
12.16
Question: How can I read data from data files with particular formats?
How can I read 10 floats without having to use a jawbreaker scanf format
mentioning %f 10 times? How can I read an arbitrary number of fields
from a line into an array?
Answer: In general, there are three main ways of parsing data lines:
1234ABC5.678
could be read with "%d%3s%f". (See also the last example in question
12.19.)
2. Break the line into fields separated by whitespace (or some other delimiter),
using strtok or the equivalent (see question 13.6); then deal with
each field individually, perhaps with functions such as atoi and atof.
(Once the line is broken up, the code for handling the fields is much like
the traditional code in main () for handling the argv array; see question
20.3.) This method is particularly useful for reading an arbitrary (i.e., not
known in advance) number of fields from a line into an array.
Here is a simple example that copies a line of up to 10 floating-point
numbers (separated by whitespace) into an array:
#include <stdlib.h>
#define MAXARGS 10
char *av[MAXARGS];
int ac, i;
double array[MAXARGS];
When possible, design data files and input formats so that they don’t
require arcane manipulations, but can instead be parsed with easier techniques,
such as 1 and 2. Dealing with the files will then be much more pleasant
all around.
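Method 2 above can be sketched in a self-contained form as follows. This is one possible rendering, not necessarily the author's code: parse_numbers is a hypothetical name, strtok does the splitting, and strtod is used in place of atof. Note that strtok modifies the line it is given:

```c
#include <stdlib.h>
#include <string.h>

#define MAXARGS 10

/* Split line on whitespace and convert up to max fields to double.
   Returns the number of values stored. The line is modified in place. */
int parse_numbers(char *line, double array[], int max)
{
    int n = 0;
    char *field;

    for (field = strtok(line, " \t\n"); field != NULL && n < max;
         field = strtok(NULL, " \t\n"))
        array[n++] = strtod(field, NULL);   /* convert one field */
    return n;
}
```

A caller would typically read the line with fgets into a char array and pass that array, together with a double array of MAXARGS elements.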
scanf Problems
Though it seems to be an obvious complement to printf, scanf has a
12.17
Question: When I read numbers from the keyboard with scanf and a
"%d\n" format, like this:
int n;
scanf("%d\n", &n);
printf("you typed %d\n", n);
12 3
or
1
2
3
(By way of comparison, source code in such languages as C, Pascal, and LISP
is free-format, whereas traditional BASIC and FORTRAN are not.)
If you’re insistent, scanf can be told to match a newline, using the
“scanset” directive:
scanf("%d%*[\n]", &n);
Scansets, though powerful, won’t solve all scanf problems, however. See
also question 12.20.
12.18
Question: I’m reading a number with scanf and %d and then a string with
gets():
int n;
char str[80];
scanf("%d", &n);
printf("enter a string: ");
gets(str);
printf("you typed %d and \"%s\"\n", n, str);
but the compiler seems to be skipping the call to gets()! Why?
Answer: If, in response to the program in the question, you type the two
lines
42
a string
scanf will read the 42 but not the newline following it. That newline will
remain on the input stream, where it will immediately satisfy gets(), which
will therefore seem to read a blank line. The second line, “a string”, will not
be read at all.
If you had typed both the number and the string on the same line:
42 a string
12.19
Question: I figured I could use scanf more safely if I checked its return
value to make sure that the user typed the numeric values I expect:
int n;
while(1) {
*Don’t try the code fragment in the question unless you have a working control-C key or are willing to reboot.
123CODE
You might want to parse this data file with scanf, using the format string
"%d%s". But if the %d conversion did not leave the unmatched character on
the input stream, %s would incorrectly read "ODE" instead of "CODE". (The
problem is a standard one in lexical analysis: When scanning an arbitrary-
length numeric constant or alphanumeric identifier, you never know where it
ends until you’ve read “too far.” This is one reason that ungetc exists.)
See also question 12.20.
12.20
Question: Why does everyone say not to use scanf? What should I use
instead?
Answer: As noted in questions 12.17, 12.18, and 12.19, scanf has a number
of problems. Also, its %s format has the same problem that gets() has
(see question 12.23): it’s difficult to guarantee that the receiving buffer won’t
overflow.*
*An explicit field width, as in %20s, may help; see also question 12.15.
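The explicit field width mentioned in the footnote to question 12.20 can be illustrated with a short sketch (read_word and WORDLEN are hypothetical names; sscanf stands in for scanf). The width given in the format is one less than the size of the receiving array, leaving room for the terminating \0:

```c
#include <stdio.h>

#define WORDLEN 20

/* Read one whitespace-delimited word of at most WORDLEN characters.
   Returns sscanf's result: 1 if a word was stored. */
int read_word(const char *text, char word[WORDLEN + 1])
{
    word[0] = '\0';
    return sscanf(text, "%20s", word);  /* the 20 must match WORDLEN */
}
```

Input longer than the width is not lost; it simply remains to be read by the next conversion.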
12.21
Question: How can I tell how much destination buffer space I’ll need for an
arbitrary sprintf call? How can I avoid overflowing the destination buffer
with sprintf?
consists of one or two %s’s, you can count the fixed characters in the format
string yourself (or let sizeof count them for you) and add in the result of
calling strlen on the string(s) to be inserted. For example, to compute the
buffer size that the call
or
int bufsize = sizeof("You typed \"%s\"") + strlen(answer);
followed by
You can conservatively estimate the size that %d will expand to with code
such as:
#include <limits.h>
char buf[(sizeof(int) * CHAR_BIT + 2) / 3 + 1 + 1];
sprintf(buf, "%d", n);
This code computes the number of characters required for a base-8 representation
of a number; a base-10 expansion is guaranteed to take as much
room or less. (The +2 takes care of truncation if the size is not a multiple of
3, and the +1 + 1 leaves room for a leading - and a trailing \0.) An analogous
technique could, of course, be used for long int, and the same buffer can
obviously be used with %u, %o, and %x formats as well.
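As a worked instance of the estimate, here is a sketch in which the buffer-size expression is given a name (INT_BUFSIZE and format_int are hypothetical). On a system with 32-bit ints the expression evaluates to (32 + 2) / 3 + 2 = 13, while the longest possible result, -2147483648, needs 11 characters plus the \0:

```c
#include <limits.h>
#include <stdio.h>

/* Octal digits needed for an int, plus room for a '-' and the '\0'. */
#define INT_BUFSIZE ((sizeof(int) * CHAR_BIT + 2) / 3 + 1 + 1)

/* Format n into a buffer of INT_BUFSIZE chars; returns the length written. */
int format_int(char buf[INT_BUFSIZE], int n)
{
    return sprintf(buf, "%d", n);
}
```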
When the format string is more complicated or is not even known until
run time, predicting the buffer size becomes as difficult as reimplementing
sprintf and correspondingly error prone (and inadvisable). A last-ditch
technique sometimes suggested is to use fprintf to print the same text to a
temporary file and then to look at fprintf’s return value or the size of the
file (but see question 19.12). (Using a temporary file for this application is
admittedly clumsy and inelegant, but it’s the only portable solution besides
writing an entire sprintf format interpreter. If your system provides one,
you can use a null or “bit bucket” device, such as /dev/null or NUL,
instead of a temporary file.)
If there’s any chance that the buffer might not be big enough, you won’t
want to call sprintf without some guarantee that the buffer will not overflow
and overwrite some other part of memory. Several versions of the stdio
library (including those in GNU and 4.4BSD) provide the obvious snprintf
function, which can be used like this:
and we can hope that a future revision of the ANSI/ISO C Standard will
include this function. (It’s tremendously needed and no more difficult to
implement than sprintf itself.) For computing the buffer size in the first
place, it’s possible that sprintf could be extended to accept a null pointer
buffer argument, safely returning the correct size without storing anything.
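Where a C99-or-later library is available (an assumption beyond the text above), snprintf is part of the standard library and may be called with a null buffer and a size of 0 to learn the required length. A sketch of sizing and then filling a buffer that way, with format_answer as a hypothetical helper name:

```c
#include <stdio.h>
#include <stdlib.h>

/* Return a malloc'ed string holding: You typed "answer"
   The caller frees the result; NULL is returned on failure. */
char *format_answer(const char *answer)
{
    /* First pass: nothing is stored, only the length is computed. */
    int len = snprintf(NULL, 0, "You typed \"%s\"", answer);
    char *buf;

    if (len < 0)
        return NULL;
    buf = malloc((size_t)len + 1);
    if (buf != NULL)
        snprintf(buf, (size_t)len + 1, "You typed \"%s\"", answer);
    return buf;
}
```

Some older, pre-C99 implementations of snprintf return different values on truncation, so this two-pass idiom should not be relied on there.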
12.22
Question: What’s the deal on sprintf’s return value? Is it an int or a
char *?
Answer: The standard says that it returns an int (the number of characters
written, just like printf and fprintf). Once upon a time, in some C
libraries, sprintf returned the char * value of its first argument, pointing
to the completed result (i.e., analogous to strcpy’s return value).
12.23
Question: Why does everyone say not to use gets()?
Answer: Unlike fgets(), gets() cannot be told the size of the buffer it’s
to read into, so it cannot be prevented from overflowing that buffer if an
input line is longer than expected, and Murphy’s Law says that sooner or
later, a larger than expected input line will occur.* As a general rule, always
use fgets(). (It’s possible to convince yourself that for one reason or
another, input lines longer than a particular maximum are impossible, but it’s
also possible to be mistaken,† and in any case, it’s just as easy to use fgets.)
One other difference between fgets() and gets() is that fgets()
retains the '\n', but it is straightforward to strip it out. See question 7.1 for
a code fragment illustrating the replacement of gets() with fgets().
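One way to strip the retained newline is sketched below. This is not the fragment from question 7.1; read_line is a hypothetical name:

```c
#include <stdio.h>
#include <string.h>

/* Read one line of at most size-1 characters into buf and remove the
   trailing '\n' if one was read. Returns buf, or NULL at EOF or error. */
char *read_line(char *buf, int size, FILE *fp)
{
    char *nl;

    if (fgets(buf, size, fp) == NULL)
        return NULL;
    nl = strchr(buf, '\n');
    if (nl != NULL)
        *nl = '\0';     /* no '\n' means a long line or an unterminated last line */
    return buf;
}
```

A call such as read_line(line, sizeof line, stdin) then behaves much like gets(line), except that it cannot overflow line.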
12.24
Question: I thought I’d check errno after a long string of printf calls to
see whether any of them had failed:
errno = 0;
printf("This\n");
printf("is\n");
printf("a\n");
printf("test.\n");
if(errno != 0)
    fprintf(stderr, "printf failed: %s\n", strerror(errno));
*When discussing the drawbacks of gets(), it is customary to point out that the 1988 “Internet worm”
exploited a call to gets() in the UNIX finger daemon as one of its methods of attack. It overflowed gets’s
buffer with carefully contrived binary data that overwrote a return address on the stack such that control flow
transferred into the binary data.
†You may think that your operating system imposes a maximum length on keyboard input lines, but what if
input is redirected from a file?
precisely, errno is meaningful only after a library function that sets errno
on error has returned an error code.)
In general, it’s best to detect errors by checking a function’s return value.
To check for any accumulated error after a long string of stdio calls, you can
use ferror. See also questions 12.2 and 20.4.
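A sketch of the ferror approach follows (print_test is a hypothetical name, and the explicit fflush is an addition made here so that buffered output is actually written before the error indicator is examined):

```c
#include <stdio.h>

/* Write four lines to fp; returns 0 on success, -1 if any write failed. */
int print_test(FILE *fp)
{
    fprintf(fp, "This\n");
    fprintf(fp, "is\n");
    fprintf(fp, "a\n");
    fprintf(fp, "test.\n");
    fflush(fp);                     /* push buffered output so errors surface */
    return ferror(fp) ? -1 : 0;     /* the error indicator is sticky */
}
```

Because the stream's error indicator stays set until it is cleared, one check at the end covers all of the preceding calls.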
12.25
Question: What’s the difference between fgetpos/fsetpos and
ftell/fseek? What are fgetpos and fsetpos good for?
Answer: The newer fgetpos and fsetpos functions use a special typedef,
fpos_t, for representing offsets (positions) in a file. The type behind this
typedef, if chosen appropriately, can represent arbitrarily large offsets,
allowing fgetpos and fsetpos to be used with arbitrarily huge files. In contrast,
ftell and fseek use long int and are therefore limited to offsets that
can be represented in a long int. (Type long int is not guaranteed to
hold values larger than 2^31-1, limiting the maximum offset to 2 gigabytes.)
See also question 1.4.
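A minimal sketch of saving and restoring a file position with these functions (read_line_twice is a hypothetical helper that reads the same line two times):

```c
#include <stdio.h>

/* Read the next line into first, return to the saved position, and read
   the same line again into second. Returns 0 on success, -1 on failure. */
int read_line_twice(FILE *fp, char *first, char *second, int size)
{
    fpos_t pos;     /* opaque position; not necessarily an integer */

    if (fgetpos(fp, &pos) != 0)
        return -1;
    if (fgets(first, size, fp) == NULL)
        return -1;
    if (fsetpos(fp, &pos) != 0)
        return -1;
    if (fgets(second, size, fp) == NULL)
        return -1;
    return 0;
}
```

An fpos_t value should only be handed back to fsetpos on the same stream; unlike a long from ftell, it cannot portably be printed or used in arithmetic.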
12.26
Question: How can I flush pending input so that a user’s typeahead isn’t
read at the next prompt? Will fflush(stdin) work?
Answer: In standard C, fflush is defined only for output streams. Since its
definition of “flush” is to complete the writing of buffered characters (not to
discard them), discarding unread input would not be an analogous meaning
for fflush on input streams.
12.27
Question: I wrote this function, which opens a file:
{
fp = fopen(filename, "r");
FILE *infp;
myfopen("filename.dat", infp);
the infp variable in the caller doesn’t get set properly. Why not?
For this example, one fix is to change myfopen to return a FILE *:
FILE *myfopen(char *filename)
{
    FILE *fp = fopen(filename, "r");
    return fp;
}

FILE *infp;
infp = myfopen("filename.dat");
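Alternatively, myfopen can accept a pointer to a FILE * and fill it in:
void myfopen(char *filename, FILE **fpp)
{
    FILE *fp = fopen(filename, "r");
    *fpp = fp;
}

FILE *infp;
myfopen("filename.dat", &infp);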
12.28
Question: I can’t even get a simple fopen call to work! What’s wrong with
this call?
The problem is that fopen’s mode argument must be a string, such as "r",
not a character ('r'). See also question 8.1.
12.29
Question: Why can’t I open a file by its explicit path? This call is failing:
fopen("c:\newdir\file.dat", "r")
12.30
Question: I’m trying to update a file in place by using fopen mode "r+",
reading a certain string, and writing back a modified string, but it’s not working.
Why not?
Answer: Be sure to call fseek before you write, both to seek back to the
beginning of the string you’re trying to overwrite and because an fseek or
fflush is always required between reading and writing in the read/write
"+" modes. Also, remember that you can overwrite characters only with the
same number of replacement characters; there is no way to insert or delete
characters in place (see also question 19.14).
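The read, seek, write sequence might be sketched as follows. The function name patch_in_place and the 256-byte line buffer are assumptions, and the stream is assumed to have been opened in a binary update mode such as "rb+", so that the offset arithmetic on ftell's result is meaningful:

```c
#include <stdio.h>
#include <string.h>

/* Overwrite the first occurrence of oldstr with newstr, which must have
   the same length. Returns 1 if patched, 0 if not found, -1 on error. */
int patch_in_place(FILE *fp, const char *oldstr, const char *newstr)
{
    char line[256];
    long start;
    char *p;

    if (strlen(oldstr) != strlen(newstr))
        return -1;              /* cannot insert or delete in place */
    rewind(fp);
    for (;;) {
        start = ftell(fp);      /* remember where this line begins */
        if (start < 0 || fgets(line, sizeof line, fp) == NULL)
            return 0;
        p = strstr(line, oldstr);
        if (p != NULL)
            break;
    }
    /* The required fseek between reading and writing. */
    if (fseek(fp, start + (long)(p - line), SEEK_SET) != 0)
        return -1;
    fwrite(newstr, 1, strlen(newstr), fp);
    return fflush(fp) == 0 ? 1 : -1;
}
```

The sketch only finds strings that lie within a single fgets chunk; a string straddling two chunks would be missed.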
12.31
Question: How can I insert or delete a line (or record) in the middle of a file?
12.32
Question: How can I recover the file name given an open stream?
12.33
Question: How can I redirect stdin or stdout to a file from within a program?
f();
12.34
Question: Once I’ve used freopen, how can I get the original stdout (or
stdin) back?
Answer: There isn’t a good way. If you need to switch back, the best solution
is not to have used freopen in the first place. Try using your own
explicit output (or input) stream variable, which you can reassign at will,
while leaving the original stdout (or stdin) undisturbed. For example,
declare a global
FILE *ofp;
and replace all calls to printf(...) with fprintf(ofp, ...). (Obviously,
you’ll have to check for calls to putchar and puts, too.) Then you
can set ofp to stdout or to anything else.
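A small sketch of the idea (report is a hypothetical function standing in for code that formerly called printf):

```c
#include <stdio.h>

FILE *ofp;      /* where output currently goes; set it before first use */

/* Formerly printf("value: %d\n", n); now routed through ofp. */
void report(int n)
{
    fprintf(ofp, "value: %d\n", n);
}
```

A program would typically execute ofp = stdout; early in main, later point ofp at a stream returned by fopen, and then simply assign stdout back when the redirection is no longer wanted.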
You might wonder whether you could skip freopen entirely and do
something like
Code like this is not likely to work, because stdout (and stdin and
stderr) are typically constants, which cannot be reassigned (which is why
freopen exists in the first place).
It is barely possible to save away information about a stream before calling
freopen to open another file in its place, such that the original stream
can later be restored, but the methods involve system-specific calls, such as
dup, or copying or inspecting the contents of a FILE structure, which is
exceedingly nonportable and unreliable.
Under some systems, you can explicitly open the controlling terminal (see
question 12.36), but this isn’t necessarily what you want, since the original
input or output (i.e., what stdin or stdout had been before you called
freopen) could have been redirected from the command line.
If you’re trying to capture the result of a subprogram execution, freopen
probably won’t work anyway; see question 19.30 instead.
12.35
Answer: You can’t tell directly, but you can usually look at a few other
things to make whatever decision you need to. If you want your program to
take input from stdin when not given any input files, you can do so if argv
doesn’t mention any input files (see question 20.3) or perhaps if you’re given
a placeholder, such as "-", instead of a file name. If you want to suppress
prompts if input is not coming from an interactive terminal, on some systems
(e.g., UNIX, and usually MS-DOS), you can use isatty(0) or
isatty(fileno(stdin)) to make the determination.
12.36
Question: I’m trying to write a program like “more.” How can I get back
to the interactive keyboard if stdin is redirected?
Answer: There is no portable way of doing this. Under UNIX, you can open
the special file /dev/tty. Under MS-DOS, you can try opening the “file”
CON or use routines or BIOS calls, such as getch, that may go to the keyboard
whether or not input is redirected.
"Binary" I/O
A normal stream is assumed to consist of printable text and may undergo cer¬
tem. When you want to read and write arbitrary bytes exactly, without any
12.37
Question: I want to read and write numbers directly between files and
memory a byte at a time, not as formatted characters the way fprintf and
fscanf do. How can I do this?
Answer: What you’re trying to do is usually called “binary” I/O. First, make
sure that you are calling fopen with the "b" modifier ("rb", "wb", etc.; see
question 12.38). Then use the & and sizeof operators to get a handle on the
sequences of bytes you are trying to transfer. Usually, the fread and fwrite
functions are what you want to use; see question 2.11 for an example.
Note, though, that fread and fwrite do not necessarily imply binary
I/O. If you’ve opened a file in binary mode, you can use any I/O calls on it
(see, for example, the examples in question 12.42); if you’ve opened it in text
mode, you can use fread or fwrite if they’re convenient.
Finally, note that binary data files are not very portable; see question 20.5.
See also question 12.40.
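A minimal sketch of the & and sizeof technique with fwrite and fread (write_ints and read_ints are hypothetical helpers; the stream is assumed to have been opened in a binary mode such as "wb" or "rb"):

```c
#include <stdio.h>

/* Write n ints from a to fp as raw bytes; returns the count written. */
size_t write_ints(FILE *fp, const int *a, size_t n)
{
    return fwrite(a, sizeof a[0], n, fp);
}

/* Read up to n ints from fp into a; returns the count actually read. */
size_t read_ints(FILE *fp, int *a, size_t n)
{
    return fread(a, sizeof a[0], n, fp);
}
```

For a single variable x the corresponding call would be fwrite(&x, sizeof x, 1, fp). Files written this way can generally be read back only by a program built for a machine with the same int size and byte order.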
12.38
Question: How can I read a binary data file properly? I’m occasionally seeing
0x0a and 0x0d values getting garbled, and it seems to hit EOF prematurely
if the data contains the value 0x1a.
Answer: When you’re reading a binary data file, you should specify "rb"
mode when calling fopen, to make sure that text file translations do not
occur. Similarly, when writing binary data files, use "wb". (Under operating
systems such as UNIX that don’t distinguish between text and binary files,
"b" may not be required but is harmless.)
Note that the text/binary distinction is made when you open the file. Once
a file is open, it doesn’t matter which I/O calls you use on it. See also
questions 12.40, 12.42, and 20.5.
12.39
Question: I’m writing a “filter” for binary files, but stdin and stdout are
preopened as text streams. How can I change their mode to binary?
system’s newline conventions: When a C program writes a '\n', the stdio
library writes the appropriate end-of-line indication, and when the stdio
library detects an end of line while reading, it returns a single '\n' to the
calling program.*
In binary mode, on the other hand, bytes are read and written between the
program and the file without any interpretation. (On MS-DOS systems,
binary mode also turns off testing for control-Z as an in-band end-of-file
character.)
Text-mode translations also affect the apparent size of a file as it’s read.
Because the characters read from and written to a file in text mode do not
necessarily match exactly the characters stored in the file, the size of the file
on disk may not always match the number of characters that can be read
from it. Furthermore, for analogous reasons, the fseek and ftell functions
do not necessarily deal in pure byte offsets from the beginning of the file.
(Strictly speaking, in text mode, the offset values used by fseek and ftell
should not be interpreted at all; a value returned by ftell should be used
only as a later argument to fseek, and only values returned by ftell
should be used as arguments to fseek.)
In binary mode, fseek and ftell do use pure byte offsets. However,
some systems may have to append a number of null bytes at the end of a
binary file to pad it out to a full record.
See also questions 12.37 and 19.12.
12.41
*Some systems may represent lines in text files as space-padded records. On these systems, trailing spaces are
necessarily trimmed when lines are read in text mode, so any trailing spaces that were explicitly written are
12.42
Question: How can I write code to conform to these old, binary data file
formats?
Answer: It’s difficult, because of word size and byte-order differences,
floating-point formats, and structure padding. To get the control you need over
these particulars, you may have to read and write things a byte at a time,
shuffling and rearranging as you go. (This isn’t always as bad as it sounds
and gives you both code portability and complete control.)
For example, suppose that you want to read a data structure, consisting of
a character, a 32-bit integer, and a 16-bit integer, from the stream fp into the
C structure
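struct mystruct {
    char c;
    long int i32;
    int i16;
};

You could use code like this:

s.c = getc(fp);

s.i32 = (long)getc(fp) << 24;
s.i32 |= (long)getc(fp) << 16;
s.i32 |= (unsigned)(getc(fp) << 8);
s.i32 |= getc(fp);

s.i16 = getc(fp) << 8;
s.i16 |= getc(fp);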
This code assumes that getc reads 8-bit characters and that the data is
stored most significant byte first (“big endian”). The casts to (long) ensure
that the 16- and 24-bit shifts operate on long values (see question 3.14), and
the cast to (unsigned) guards against sign extension. (In general, it’s safer
to use all unsigned types when writing code like this, but see question
3.19.)
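Following the remark that unsigned types are generally safer for this kind of code, here is a sketch of a byte-at-a-time reader that uses them throughout and also checks for EOF. The names struct record and read_record are hypothetical; as in the text, 8-bit bytes and big-endian data are assumed:

```c
#include <stdio.h>

struct record {
    char c;
    unsigned long i32;      /* holds the 32-bit field */
    unsigned int i16;       /* holds the 16-bit field */
};

/* Read a character, a 32-bit and a 16-bit big-endian integer from fp.
   Returns 0 on success, -1 if the input ends early. */
int read_record(FILE *fp, struct record *r)
{
    int b[7];
    int i;

    for (i = 0; i < 7; i++) {
        b[i] = getc(fp);
        if (b[i] == EOF)
            return -1;
    }
    r->c = (char)b[0];
    r->i32 = ((unsigned long)b[1] << 24) | ((unsigned long)b[2] << 16) |
             ((unsigned long)b[3] << 8) | (unsigned long)b[4];
    r->i16 = ((unsigned)b[5] << 8) | (unsigned)b[6];
    return 0;
}
```

Converting each byte to an unsigned type before shifting avoids both overflow of a 16-bit int and unwanted sign extension.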
putc(s.c, fp);
Once upon a time, a specific run-time library was not a formal part of the C
run-time library (including the stdio functions of Chapter 12) also became
standard.
Some particularly important library functions have their own chapters, see
(malloc, free, etc.) and Chapter 12 for information on the “standard I/O”
The last few questions (13.25 through 13.28) concern problems (e.g.,
222 CHAPTER 13
String Functions
(Don’t worry that sprintf may be overkill, potentially wasting run time or
code space; it works well in practice.) See also the examples in the answer to
question 7.5 and also question 12.21.
You can obviously use sprintf to convert long or floating-point numbers
to strings as well (using %ld or %f); in other words, sprintf can also
be thought of as the opposite of atol and atof. In addition, you have quite
a bit of control over the formatting. (It’s for these reasons that C supplies
sprintf as a general solution, and not itoa.)
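As a small illustration of that control over formatting, a sketch that converts a double to a string with a caller-chosen number of decimal places (double_to_string is a hypothetical helper; the caller supplies a large enough buffer):

```c
#include <stdio.h>

/* Convert x to text with the given number of digits after the decimal
   point. Returns the number of characters written. */
int double_to_string(char *buf, double x, int places)
{
    return sprintf(buf, "%.*f", places, x);   /* precision taken from the argument list */
}
```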
If you simply must write an itoa function, here are some things to consider:
Question: Why does strncpy not always place a '\0' terminator in the
destination string?
LIBRARY FUNCTIONS 223
Answer: Since it was first designed to handle a now obsolete data structure,
the fixed-length, not necessarily \0-terminated “string,”* strncpy is admittedly
a bit cumbersome to use in other contexts; you must often append a
'\0' to the destination string by hand. You can get around the problem by
using strncat instead of strncpy. If the destination string starts out
empty, strncat does what you probably wanted strncpy to do:
Strictly speaking, however, this is guaranteed to work only for n < 509.
When arbitrary bytes (as opposed to strings) are being copied, memcpy is
usually a more appropriate function to use than strncpy.
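Both approaches described in this answer can be sketched side by side (the function names are hypothetical; in each case dest must have room for n characters plus the terminator):

```c
#include <string.h>

/* Copy at most n characters of source into dest, always terminating it. */
char *copy_with_strncat(char *dest, const char *source, size_t n)
{
    dest[0] = '\0';                     /* start out with an empty string */
    return strncat(dest, source, n);    /* strncat always appends a '\0' */
}

/* The same result with strncpy and a terminator added by hand. */
char *copy_with_strncpy(char *dest, const char *source, size_t n)
{
    strncpy(dest, source, n);           /* no '\0' is stored if source is too long */
    dest[n] = '\0';
    return dest;
}
```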
Question: Does C have anything like the “substr” (extract substring) routine
present in other languages?
char dest[LEN+1];
strncpy(dest, &source[POS], LEN);
dest[LEN] = '\0'; /* ensure \0 termination */
or
char dest[LEN+1] = "";
strncat(dest, &source[POS], LEN);
*For example, early C compilers and linkers used 8-character fixed-length strings in their symbol tables and
many versions of UNIX still use 14-character file names. A related quirk of strncpy’s is that it pads short
strings with multiple \0’s, out to the specified length; this can allow more efficient string comparisons, since
they can blindly compare n bytes without also looking for '\0'.
13.4
The C standard, however, says that toupper and tolower must work
correctly on all characters, i.e., characters that don’t need changing are left
alone.
13.6
Answer: The only standard function available for this kind of “tokenizing”
is strtok, although it can be tricky to use* and may not do everything you
want it to. (For instance, it does not handle quoting.) Here is a usage example
that simply prints each field as it’s extracted:
#include <string.h>
char string[] = "this is a test"; /* not char *; see Q16.6 */
char *p;
for(p = strtok(string, " \t\n"); p != NULL;
p = strtok(NULL, " \t\n"))
printf("\"%s\"\n", p);
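#include <ctype.h>

int makeargv(char *string, char *argv[], int argvsize)
{
    char *p = string;
    int i;
    int argc = 0;

    for(i = 0; i < argvsize; i++) {
        /* skip leading whitespace */
        while(isspace((unsigned char)*p))
            p++;

        if(*p != '\0')
            argv[argc++] = p;
        else {
            argv[argc] = 0;
            break;
        }

        /* scan over arg */
        while(*p != '\0' && !isspace((unsigned char)*p))
            p++;
        /* terminate arg: */
        if(*p != '\0' && i < argvsize-1)
            *p++ = '\0';
    }

    return argc;
}

char *av[10];
int i, ac = makeargv(string, av, 10);
for(i = 0; i < ac; i++)
    printf("\"%s\"\n", av[i]);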
#include <string.h>
char *p = string;
All the code fragments presented here modify the input string by inserting
\0’s to terminate each field. If you’ll need the original string later, make a
copy before breaking it up.
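Making that copy might look like the following sketch, in which a hypothetical count_fields function tokenizes a private copy so that the caller's string is left untouched:

```c
#include <stdlib.h>
#include <string.h>

/* Count the whitespace-separated fields in string without modifying it.
   Returns -1 if memory for the working copy cannot be allocated. */
int count_fields(const char *string)
{
    char *copy = malloc(strlen(string) + 1);
    char *p;
    int n = 0;

    if (copy == NULL)
        return -1;
    strcpy(copy, string);       /* strtok will write '\0's into the copy only */
    for (p = strtok(copy, " \t\n"); p != NULL; p = strtok(NULL, " \t\n"))
        n++;
    free(copy);
    return n;
}
```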
Question: Where can I get some code to do regular expression and wildcard
matching?
• Classic regular expressions, variants of which are used in such UNIX utilities
as ed and grep. In regular expressions, a dot (.) usually matches any
single character, and the sequence .* usually matches any string of characters.
(Of course, full-blown regular expressions have several more features
as well.)
• Filename wildcards, variants of which are used by most operating systems.
There is considerably more variation here, but it is often the case that ?
matches any single character and that * matches any string of characters.
system services for listing or opening files specified by wildcards. Check your
compiler/library documentation.
Here is a quick little wildcard matcher by Arjan Kenter:
match(pat+1,str+1) ;
}
}
With this definition, the call match("a*b.c", "aplomb.c") would
return 1.
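For readers who want something complete to experiment with, here is a sketch of a recursive wildcard matcher with the behavior described above (? matches any single character, * matches any string of characters). It is written for illustration under the hypothetical name wild_match and should not be taken as the matcher quoted in the text:

```c
/* Return 1 if str matches the wildcard pattern pat, else 0. */
int wild_match(const char *pat, const char *str)
{
    switch (*pat) {
    case '\0':
        return *str == '\0';            /* pattern used up: str must be too */
    case '*':
        /* either * matches nothing, or it absorbs one more character */
        return wild_match(pat + 1, str) ||
               (*str != '\0' && wild_match(pat, str + 1));
    case '?':
        return *str != '\0' && wild_match(pat + 1, str + 1);
    default:
        return *pat == *str && wild_match(pat + 1, str + 1);
    }
}
```

The recursion on * can take exponential time on patterns with many asterisks, so a production matcher would usually be written iteratively.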
Sorting
13.8
Question: I’m trying to sort an array of strings with qsort, using strcmp
as the comparison function, but it’s not working. Why not?
#include <stdlib.h>
char *strings[NSTRINGS];
int nstrings;
/* nstrings cells of strings[] are to be sorted */
qsort(strings, nstrings, sizeof(char *), pstrcmp);
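The qsort call above passes a helper named pstrcmp whose definition does not appear in this fragment. A plausible sketch, offered as an assumption about what such a helper does, is a function that receives pointers to the array elements (that is, pointers to char *) and dereferences them before calling strcmp; sort_strings is a hypothetical wrapper:

```c
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparison function pointers to the elements being
   sorted; the elements are char *, so the arguments point to char *. */
static int pstrcmp(const void *p1, const void *p2)
{
    return strcmp(*(char * const *)p1, *(char * const *)p2);
}

/* Sort an array of nstrings string pointers into strcmp order. */
void sort_strings(char *strings[], size_t nstrings)
{
    qsort(strings, nstrings, sizeof(char *), pstrcmp);
}
```

Passing strcmp itself would compare the pointer objects as if they were character arrays, which is why the extra level of indirection is needed.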
13.9
struct mystruct {
int year, month, day;
};
#include <stdlib.h>
If, on the other hand, you’re sorting pointers to structures, you’ll need
indirection, as in question 13.8. The comparison function would begin with
*The mystructcmp function uses explicit comparisons rather than the more obvious subtractions to decide
whether to return a negative, zero, or positive value. In general, it’s safer to write comparison functions this
way: Subtraction can easily overflow (and cause either an abort or a quiet wrong answer) when a very large
positive number is compared with a very large negative number. (In this case, of course, overflow would
be unlikely in any case.)
the cast (int (*) (const void *, const void *)) would do nothing
except, perhaps, silence the message from the compiler telling you that this
comparison function may not work with qsort. The implications of any cast
you use when calling qsort will have been forgotten by the time qsort gets
around to calling your comparison function: It will call them with const
void * arguments, so that is what your function must accept. No prototype
mechanism exists that could operate down inside qsort to convert
the void pointers to struct mystruct pointers just before calling
mywrongstructcmp.
In general, it is a bad idea to insert casts just to “shut the compiler up.”
Compiler warnings are usually trying to tell you something, and unless you
really know what you’re doing, you ignore or muzzle them at your peril. See
also question 4.9.
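A comparison function for an array of date structures that accepts const void * arguments, as qsort requires, and uses explicit comparisons rather than subtraction might look like the following sketch. The function name mystructcmp is assumed here; the field names follow the structure declared in the question:

```c
#include <stdlib.h>

struct mystruct {
    int year, month, day;
};

/* Order dates chronologically. The arguments arrive as const void *
   and are converted inside the function, not by a cast at the call. */
int mystructcmp(const void *p1, const void *p2)
{
    const struct mystruct *sp1 = p1;
    const struct mystruct *sp2 = p2;

    if (sp1->year != sp2->year)
        return sp1->year < sp2->year ? -1 : 1;
    if (sp1->month != sp2->month)
        return sp1->month < sp2->month ? -1 : 1;
    if (sp1->day != sp2->day)
        return sp1->day < sp2->day ? -1 : 1;
    return 0;
}
```

An array dates of ndates structures would then be sorted with qsort(dates, ndates, sizeof(struct mystruct), mystructcmp), with no cast on the function argument.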
13.10
Question: How can I sort a linked list?
Answer: Sometimes, it’s easier to keep the list in order as you build it (or
perhaps to use a tree instead). Algorithms such as insertion sort and merge
sort lend themselves ideally to use with linked lists. If you want to use a
standard library function, you can allocate a temporary array of pointers, fill it in
with pointers to all your list nodes, call qsort, and finally rebuild the list
pointers based on the sorted array.
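The qsort-based approach can be sketched as follows. The node layout and the names struct node, nodecmp, and sort_list are assumptions made for the example:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Compare two array elements, each of which is a pointer to a node. */
static int nodecmp(const void *p1, const void *p2)
{
    const struct node *a = *(struct node * const *)p1;
    const struct node *b = *(struct node * const *)p2;
    return (a->value > b->value) - (a->value < b->value);
}

/* Sort a singly linked list by value; returns the new head. If the
   temporary array cannot be allocated, the list is returned unchanged. */
struct node *sort_list(struct node *head)
{
    size_t n = 0, i;
    struct node *p, **array;

    for (p = head; p != NULL; p = p->next)
        n++;
    if (n < 2)
        return head;
    array = malloc(n * sizeof *array);
    if (array == NULL)
        return head;
    for (i = 0, p = head; p != NULL; p = p->next)
        array[i++] = p;                 /* fill in pointers to all nodes */
    qsort(array, n, sizeof *array, nodecmp);
    for (i = 0; i + 1 < n; i++)
        array[i]->next = array[i + 1];  /* rebuild the links in sorted order */
    array[n - 1]->next = NULL;
    head = array[0];
    free(array);
    return head;
}
```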
13.11
Question: How can I sort more data than will fit in memory?
Answer: You want an “external sort,” which you can read about in Knuth,
Volume 3. The basic idea is to sort the data in chunks (as much as will fit in
memory at one time), write each sorted chunk to a temporary file, and then
merge the files. If your operating system provides a general-purpose sort utility,
you can try invoking it from within your program; see questions 19.27
and 19.30 and the example in question 19.28.
13.12
Question: How can I get the current date or time of day in a C program?
Answer: Just use the time, ctime, and/or localtime functions. (These
functions have been around for years and are in the ANSI standard.) Here is
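a simple example:*
#include <stdio.h>
#include <time.h>
int main(void)
{
    time_t now;
    char fmtbuf[30];
    time(&now);
    printf("It's %.24s.\n", ctime(&now));
    strftime(fmtbuf, sizeof fmtbuf, "%A, %B %d, %Y", localtime(&now));
    printf("Today is %s.\n", fmtbuf);
    return 0;
}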
(Note that these functions take a pointer to the time_t variable, even when
they will not be modifying it.†)
*Note, though, that according to ANSI, time can fail, returning (time_t)(-1).
†These pointers are basically a holdover from the earliest days of C, before type long was invented; back then,
an array of two ints was used to hold time values.
13.13
Question: I know that the library function localtime will convert a
time_t into a broken-down struct tm and that ctime will convert a
time_t to a printable string. How can I perform the inverse operations of
converting a struct tm or a string into a time_t?
13.14
Question: How can I add n days to a date? How can I find the difference
between two dates?
Answer: The ANSI/ISO Standard C mktime and difftime functions provide
some support for both problems. Nonnormalized dates are acceptable to
mktime, so it is straightforward to take a filled-in struct tm, add or
subtract from the tm_mday field, and call mktime to normalize the year, month,
and day fields (and, incidentally, convert to a time_t value). The difftime
function computes the difference, in seconds, between two time_t values;
mktime can be used to compute time_t values for two dates to be
subtracted.
These solutions are guaranteed to work correctly only for dates in the
range that can be represented as time_ts. The tm_mday field is an int, so
day offsets of more than 32,736 or so may cause overflow. Note also that at
daylight saving time changeovers, local days are not 24 hours long, so be
careful if you try to divide by 86,400 seconds/day.
Here is a code fragment to compute the date 90 days past October 24,
1994:
#include <stdio.h>
#include <time.h>
struct tm tm1;
tm1.tm_mon = 10 - 1;
tm1.tm_mday = 24;
tm1.tm_year = 1994 - 1900;
tm1.tm_hour = tm1.tm_min = tm1.tm_sec = 0;
tm1.tm_isdst = -1;
tm1.tm_mday += 90;
if(mktime(&tm1) == -1)
    fprintf(stderr, "mktime failed\n");
else printf("%d/%d/%d\n",
    tm1.tm_mon+1, tm1.tm_mday, tm1.tm_year+1900);
Here is a fragment to compute the difference in days between February 28 and
March 1, 2000:
struct tm tm1, tm2;
time_t t1, t2;
tm1.tm_mon = 2 - 1;
tm1.tm_mday = 28;
tm1.tm_year = 2000 - 1900;
tm1.tm_hour = tm1.tm_min = tm1.tm_sec = 0;
tm1.tm_isdst = -1;
tm2.tm_mon = 3 - 1;
tm2.tm_mday = 1;
tm2.tm_year = 2000 - 1900;
tm2.tm_hour = tm2.tm_min = tm2.tm_sec = 0;
tm2.tm_isdst = -1;
t1 = mktime(&tm1);
t2 = mktime(&tm2);
if(t1 == -1 || t2 == -1)
    fprintf(stderr, "mktime failed\n");
else {
    long d = (difftime(t2, t1) + 86400L/2) / 86400L;
    printf("%ld\n", d);
}
(The addition of 86400L/2 rounds the difference to the nearest day; see also
question 14.6.)
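The two mktime techniques can be wrapped up as a pair of helper functions. What follows is only a sketch: the names date_plus_days and days_between, and the choice of noon as the time of day (to stay clear of daylight saving time changeovers), are assumptions made for this illustration.

```c
#include <time.h>

/* Adds ndays (which may be negative) to the given calendar date, in place.
   Returns 0 on success, -1 if mktime cannot represent the result. */
int date_plus_days(int *year, int *month, int *day, int ndays)
{
    struct tm t = {0};

    t.tm_year = *year - 1900;
    t.tm_mon = *month - 1;
    t.tm_mday = *day + ndays;   /* possibly out of range; mktime normalizes */
    t.tm_hour = 12;             /* noon avoids DST edge effects */
    t.tm_isdst = -1;

    if (mktime(&t) == (time_t)-1)
        return -1;

    *year = t.tm_year + 1900;
    *month = t.tm_mon + 1;
    *day = t.tm_mday;
    return 0;
}

/* Stores in *ndays the number of days from date 1 to date 2.
   Returns 0 on success, -1 if either date cannot be converted. */
int days_between(int y1, int m1, int d1, int y2, int m2, int d2, long *ndays)
{
    struct tm a = {0}, b = {0};
    time_t t1, t2;
    double secs;

    a.tm_year = y1 - 1900; a.tm_mon = m1 - 1; a.tm_mday = d1;
    b.tm_year = y2 - 1900; b.tm_mon = m2 - 1; b.tm_mday = d2;
    a.tm_hour = b.tm_hour = 12;
    a.tm_isdst = b.tm_isdst = -1;

    t1 = mktime(&a);
    t2 = mktime(&b);
    if (t1 == (time_t)-1 || t2 == (time_t)-1)
        return -1;

    /* Round to the nearest whole day, in either direction. */
    secs = difftime(t2, t1);
    *ndays = (long)((secs + (secs >= 0 ? 43200.0 : -43200.0)) / 86400.0);
    return 0;
}
```

As with the fragments above, both helpers work only for dates that can be represented as time_t values on the system at hand.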
Another approach to both problems is to use “Julian day” numbers, or the
number of days since January 1, 4713 BC.* It’s convenient to declare a pair of
Julian day conversion functions:
int n = 90;
* Specifically, since noon GMT on that date. Note that the Julian day number is different from the “Julian
dates” sometimes used in data processing and that neither one has anything to do with dates in the Julian
calendar.
Random Numbers
13.15
Question: How can I generate random numbers?
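#define a 16807
#define m 2147483647
#define q (m / a)
#define r (m % a)
static long int seed = 1;
long int PMrand()
{
    long int hi = seed / q;
    long int lo = seed % q;
    long int test = a * lo - r * hi;
    if(test > 0)
        seed = test;
    else seed = test + m;
    return seed;
}
This is a linear congruential generator, of the general form
X ← (aX + c) mod m
with c = 0. To have it return floating-point values between 0 and 1 instead,
change the declaration to
double PMrand()
and the last line to
return (double)seed / m;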
For slightly better statistical properties, Park and Miller now recommend
using a = 48271.
13.16
Question: How can I get random integers in a certain range?
rand() % N /* POOR */
(which tries to return numbers from 0 to N-1) is poor, because the low-order
bits of many random number generators are distressingly nonrandom. (See
question 13.18.) A better method is something like
rand() / (RAND_MAX / N + 1)
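If N does not divide RAND_MAX+1 evenly, some outputs will occur slightly more
often than others; one remedy is to call rand repeatedly, discarding certain
values:
unsigned int x = (RAND_MAX + 1u) / N;
unsigned int y = x * N;
unsigned int r;
do {
    r = rand();
} while(r >= y);
return r / x;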
For any of these techniques, it’s straightforward to shift the range, if
necessary; numbers in the range [M, N] could be generated with something like
M + rand() / (RAND_MAX / (N - M + 1) + 1)
(Note, by the way, that RAND_MAX is a constant telling you what the fixed
range of the C library rand function is. You cannot set RAND_MAX to some
other value, and there is no way of requesting that rand return numbers in
some other range.)
If you’re starting with a random number generator that returns floating-point
values between 0 and 1 (such as the last version of PMrand alluded to
in question 13.15 or drand48 in question 13.21), all you have to do to get
integers from 0 to N-1 is multiply the output of that generator by N:
(int)(drand48() * N)
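The discard-and-retry technique and the range shift can be combined into a small pair of functions. This is a sketch only; the names randrange and randbetween are invented for the example, and both functions assume that the size of the requested range does not exceed RAND_MAX.

```c
#include <stdlib.h>

/* Returns an integer in [0, n-1], each value equally likely
   (to the extent that rand itself is uniform).
   Assumes 0 < n <= RAND_MAX. */
int randrange(int n)
{
    unsigned int x = (RAND_MAX + 1u) / n;   /* width of each bucket */
    unsigned int y = x * n;                 /* values at or above y are discarded */
    unsigned int r;

    do {
        r = rand();
    } while (r >= y);

    return r / x;
}

/* Returns an integer in [lo, hi].
   Assumes lo <= hi and hi - lo < RAND_MAX. */
int randbetween(int lo, int hi)
{
    return lo + randrange(hi - lo + 1);
}
```

A call such as randbetween(1, 6) would then simulate one throw of a die.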
13.17
Question: Each time I run my program, I get the same sequence of numbers
back from rand. Why?
#include <stdlib.h>
#include <time.h>
srand((unsigned int)time((time_t *)NULL));
(Note also that it’s rarely useful to call srand more than once during a run
of a program; in particular, don’t try calling srand before each call to rand,
in an attempt to get “really random” numbers.)
13.18
Question: I need a random true/false value, so I’m just taking rand () % 2,
but it’s alternating 0, 1, 0, 1, 0.... Why?
are written, the low-order n bits repeat with period 2^n.) For this reason, it’s
preferable to use the higher-order bits: see question 13.16.
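Applied to the true/false case, the higher-order-bits technique of question 13.16 with N equal to 2 comes out as below. The function name coinflip is an assumption made for the example.

```c
#include <stdlib.h>

/* Returns 0 or 1, derived from the high-order end of rand's range
   rather than from its lowest bit. */
int coinflip(void)
{
    return rand() / (RAND_MAX / 2 + 1);
}
```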
13.19
Question: How can I return a sequence of random numbers that don’t
repeat at all?
a[i] = i + 1;
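One common way to get a nonrepeating sequence is to fill an array with the values wanted and then shuffle it. The sketch below uses the Fisher-Yates shuffle; the function name fill_shuffled is an assumption, and the text's own code for this answer may differ in detail.

```c
#include <stdlib.h>

/* Fills a[0..n-1] with the integers 1..n in random order.
   Assumes n <= RAND_MAX. */
void fill_shuffled(int a[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        a[i] = i + 1;

    /* Fisher-Yates: swap each element with a randomly chosen
       element at or before it, working from the end. */
    for (i = n - 1; i > 0; i--) {
        int j = rand() / (RAND_MAX / (i + 1) + 1);   /* j in [0, i] */
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

Stepping through the shuffled array then yields each value exactly once.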
13.20
Question: How can I generate random numbers with a normal, or Gaussian,
distribution?
1. Exploit the Central Limit Theorem (“law of large numbers”) and add up
several uniformly distributed random numbers:
#include <stdlib.h>
#include <math.h>
#define NSUM 25
double gaussrand()
{
    double x = 0;
    int i;
    for(i = 0; i < NSUM; i++)
        x += (double)rand() / RAND_MAX;
    x -= NSUM / 2.0;
    x /= sqrt(NSUM / 12.0);
    return x;
}
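2. Use the Box-Muller transformation:
#include <stdlib.h>
#include <math.h>
#define PI 3.141592654
double gaussrand()
{
    static double U, V;
    static int phase = 0;
    double Z;
    if(phase == 0) {
        U = (rand() + 1.) / (RAND_MAX + 2.);
        V = rand() / (RAND_MAX + 1.);
        Z = sqrt(-2 * log(U)) * sin(2 * PI * V);
    } else
        Z = sqrt(-2 * log(U)) * cos(2 * PI * V);
    phase = 1 - phase;
    return Z;
}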
3. Use the polar form of the same transformation, which avoids the
trigonometric functions:
#include <stdlib.h>
#include <math.h>
double gaussrand()
{
    static double V1, V2, S;
    static int phase = 0;
    double X;
    if(phase == 0) {
        do {
            double U1 = (double)rand() / RAND_MAX;
            double U2 = (double)rand() / RAND_MAX;
            V1 = 2 * U1 - 1;
            V2 = 2 * U2 - 1;
            S = V1 * V1 + V2 * V2;
        } while(S >= 1 || S == 0);
        X = V1 * sqrt(-2 * log(S) / S);
    } else
        X = V2 * sqrt(-2 * log(S) / S);
    phase = 1 - phase;
    return X;
}
These methods all generate numbers with mean 0 and standard deviation
1. (To adjust to another distribution, multiply by the standard deviation and
add the mean.) Method 1 is poor “in the tails” (especially if NSUM is small),
but methods 2 and 3 perform quite well. See the references for more
information.
13.21
Question: I’m porting a program, and it calls a function drand48, which
my library doesn’t have. What is it?
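#include <stdlib.h>
double drand48()
{
    return rand() / (RAND_MAX + 1.);
}
To come closer to the 48 bits of precision that the name implies, you can
combine several calls to rand:
#define PRECISION 2.82e14 /* 2**48, rounded up */
double drand48()
{
    double x = 0;
    double denom = RAND_MAX + 1.;
    double need;
    for(need = PRECISION; need > 1; need /= (RAND_MAX + 1.)) {
        x += rand() / denom;
        denom *= RAND_MAX + 1.;
    }
    return x;
}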
Before using code like this, though, beware that it is numerically suspect,
particularly if (as is usually the case) the period of rand is on the order of
RAND_MAX. (If you have a longer-period random number generator available,
such as BSD random, definitely use it when simulating drand48.)
13.22
13.23
13.24
index?     strchr.
rindex?    strrchr.
bcmp?      memcmp.
bzero?     memset, with a second argument of 0.
If, on the other hand, you’re using an older system that is missing the functions in
the second column, you may be able to implement them in terms of, or substitute,
the functions in the first. See also questions 12.22 and 13.21.
13.25
Question: I keep getting errors due to library functions being undefined,
even though I’m including all the right header files.
Answer: In general, a header file gives you only the declarations of library
functions, not the library functions themselves. Header files happen at
compile time; libraries happen at link time.
In some cases (especially if the functions are nonstandard), you may have
to explicitly ask for the correct libraries to be searched when you link the
program. (Some systems may be able to arrange that whenever you include a
header, its associated library, if nonstandard, is automatically requested at
link time, but such a facility is not widespread.) See also questions 11.30,
13.26, and 14.3.
13.26
Question: I’m still getting errors due to library functions being undefined,
even though I’m explicitly requesting the right libraries while linking.
Answer: Many linkers make one pass over the list of object files and
libraries you specify and extract from libraries only those modules that satisfy
references that have so far come up as undefined. Therefore, the order in
which libraries are listed with respect to object files (and one another) is
significant; usually, you want to search the libraries last.
For example, under UNIX, a command line such as
cc -lm myprog.c
usually won’t work. Instead, put any -l options at the end of the command
line:
cc myprog.c -lm
If you list a library first, the linker doesn’t know that it needs anything out
of it yet and passes it by. See also question 13.28.
13.27
Question: Why is my simple program, which hardly does more than print
“Hello, world!” in a window, compiling to such a huge executable (several
hundred K)? Should I include fewer header files?
Answer: What you’re seeing is the current (poor) state of the “art” in
library design. Run-time libraries tend to accumulate more and more features
(especially having to do with graphical user interfaces). When one library
function calls another library function to do part of its job (which ought to
be a Good Thing; that’s what library functions are for), it can happen that
calling anything in the library (particularly something relatively powerful
such as printf) eventually pulls in practically everything else, leading to
horribly bloated executables.
Including fewer header files probably won’t help, because declaring a few
functions that you don’t call (which is mostly all that happens when you
include a header you don’t need) shouldn’t result in those functions being
placed in your executable, unless they do in fact get called. See also question
13.25.
You may be able to track down and derail a chain of unnecessarily coupled
functions that are bloating your executable or perhaps complain to your vendor
to clean up the libraries.
13.28
Question: What does it mean when the linker says that _end is undefined?
Answer: That message is a quirk of the old UNIX linkers. You get an error
about _end being undefined only when other things are undefined, too. Fix
the others, and the error about _end will disappear. (See also questions 13.25
and 13.26.)
Floating Point
and the problems are a bit worse in C because it has not traditionally been
Question: When I set a float variable to, say, 3.1, why is printf printing
it as 3.0999999?
*Converting binary floating-point numbers to and from base 10 without discrepancies is an interesting problem;
two excellent papers on the subject by Clinger, Steele, and White are mentioned in the bibliography.
FLOATING POINT 249
Question: I’m trying to take some square roots, and I’ve simplified the code
down to
main()
{
    printf("%f\n", sqrt(144.));
}
Answer: Make sure that you have included <math.h> and correctly
declared other functions returning double. (Another library function to be
careful with is atof, which is declared in <stdlib.h>.) See also questions
1.25, 14.3, and 14.4.
14.3
Answer: Make sure that you’re actually linking with the math library. For
instance, under UNIX, you usually need to use the -lm option, at the end of the
command line, when compiling/linking. See also questions 13.25 and 13.26.
14.4
laws do not hold completely; that is, order of operation may be important, and
repeated addition is not necessarily equivalent to multiplication. Underflow,
cumulative precision loss, and other anomalies are often troublesome.
Don’t assume that floating-point results will be exact, and especially don’t
assume that floating-point values can be compared for equality. (Don’t throw
haphazard “fuzz factors” in, either; see question 14.5.) Beware that some
machines have more precision available in floating-point computation registers
than in double values stored in memory, which can lead to floating-point
inequalities when it would seem that two values just have to be equal.
These problems are no worse for C than for any other computer language.
Certain aspects of floating point are usually defined as “however the processor
does them” (see also questions 11.33 and 11.34); otherwise, a compiler
for a machine without the “right” model would have to do prohibitively
expensive emulations.
This book cannot begin to list the pitfalls associated with, and
workarounds appropriate for, floating-point work. A good numerical programming
text should cover the basics; see also the references. (Beware,
though, that subtle problems can occupy numerical analysts for years.)
References: Kernighan and Plauger, The Elements of Programming Style §6 pp. 115-8
Knuth, Volume 2 chapter 4
Goldberg, “What Every Computer Scientist Should Know about Floating-Point Arithmetic”
14.5
double a, b;
if(a == b) /* WRONG */
Instead, use something like
#include <math.h>
if(fabs(a - b) <= epsilon * fabs(a))
for a suitably chosen epsilon. The value of epsilon may still have to be
chosen with care: Its appropriate value may be quite small and related only
to the machine’s floating-point precision, or it may be larger if the numbers
being compared are inherently less accurate or are the result of a chain of
calculations that compounds accuracy losses over several steps. (Also, you may
have to make the threshold a function of b or of both a and b.)
A decidedly inferior approach, not generally recommended, would be to
use an absolute threshold:
if(fabs(a - b) < 0.001) /* POOR */
Absolute “fuzz factors,” such as 0.001, never seem to work for very long,
however. As the numbers being compared change, it’s likely that two small
numbers that should be taken as different happen to be within 0.001 of each
other or that two large numbers, which should have been treated as equal,
differ by more than 0.001. (And, of course, the problems merely shift around
and do not go away when the fuzz factor is tweaked to 0.005 or 0.0001 or
any other absolute number.)
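To see why a relative threshold travels better than an absolute one, consider a comparison scaled by the larger of the two magnitudes. The sketch below is one way to write it; the names absval and nearly_equal, and the decision to treat two zeros as equal, are assumptions made for the example.

```c
/* Absolute value without <math.h>, so that no math library is needed. */
static double absval(double x)
{
    return x < 0 ? -x : x;
}

/* Returns nonzero if a and b differ by no more than tol times the
   larger of their magnitudes. */
int nearly_equal(double a, double b, double tol)
{
    double ma = absval(a);
    double mb = absval(b);
    double larger = ma > mb ? ma : mb;

    if (larger == 0.0)
        return 1;       /* both are exactly zero */

    return absval(a - b) <= tol * larger;
}
```

With a tolerance of 1e-9, this treats 0.1 + 0.2 and 0.3 as equal, still distinguishes 1e-10 from 2e-10, and accepts two values near 1e20 that differ only in their last few bits; no single absolute fuzz factor gets all three cases right.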
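Doug Gwyn suggests using the following relative difference function. It
returns the relative difference of two real numbers: 0.0 if they are exactly the
same; otherwise, the ratio of the difference to the larger of the two.
#define Abs(x) ((x) < 0 ? -(x) : (x))
#define Max(a, b) ((a) > (b) ? (a) : (b))
double RelDif(double a, double b)
{
    double c = Abs(a);
    double d = Abs(b);
    d = Max(c, d);
    return d == 0.0 ? 0.0 : Abs(a - b) / d;
}
Typical usage is
if(RelDif(a, b) <= TOLERANCE) ...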
Answer: The simplest and most straightforward way is with code such as
(int)(x + 0.5)
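Because the cast truncates toward zero, the expression above rounds correctly only for nonnegative x. A variant that also handles negative values might look like the following sketch; the function name roundnearest is an assumption, and the result is assumed to fit in a long.

```c
/* Rounds x to the nearest integer, with halves rounded away from zero.
   Assumes the rounded value is representable as a long. */
long roundnearest(double x)
{
    return (long)(x < 0 ? x - 0.5 : x + 0.5);
}
```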
Answer: One reason is probably that few processors have a built-in
exponentiation instruction. C has a pow function (declared in <math.h>) for
performing exponentiation, although explicit multiplication is often better for
small positive integral exponents.* In other words, pow(x, 2.) is probably
inferior to x * x. (If you’re tempted to make a Square() macro, though,
check question 10.1 first.)
*In particular, not all implementations of pow yield the expected results when both arguments are integral. For
example, on some systems, (int)pow(2., 3.) gives 7 due to truncation; see also question 14.6.
14.8
#ifndef M_PI
#define M_PI 3.1415926535897932385
#endif
to provide your own definition only if a system header file has not.)
14.9
Question: How do I set variables to or test for IEEE NaN (“Not a Number”)
and other special values?
*The concern here is one of “namespace pollution”; see also question 1.29.
254 CHAPTER 14
Don’t be too surprised, though, if these don’t work or if they abort the
compiler with a floating-point exception.
(The most reliable way of setting up these special values would use a hex
representation of their internal bit patterns, but initializing a floating-point
value with a bit pattern would require using a union or some other type punning
mechanism and would obviously be machine dependent.)
See also question 19.39.
14.10
14.11
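typedef struct {
    double real;
    double imag;
} complex;
#define Real(c) (c).real
#define Imag(c) (c).imag
complex cpx_make(double real, double imag)
{
    complex ret;
    ret.real = real;
    ret.imag = imag;
    return ret;
}
complex cpx_add(complex a, complex b)
{
    return cpx_make(Real(a) + Real(b), Imag(a) + Imag(b));
}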
14.12
14.13
Question: I’m having trouble with a Turbo C program that crashes and says
something like “floating point formats not linked.” What am I missing?
atively rare but are vital in the context of C’s printf function and in related
because formal support for them arose only under the ANSI C Standard
fixed part and a variable-length part. Thus, we find ourselves using bombas¬
ment list.” (You will also see the terms “variadic” and “varargs” used: Both
point to the beginning of the argument list by calling va_start. Next, argu¬
ments are retrieved from the variable argument list by calling va_arg, which
258 CHAPTER 15
typedef that hides the details of the actual data structure used.)
Varargs functions may use special calling mechanisms, different from the
must always be in scope before a varargs call (see question 15.1). However, a
prototype obviously cannot specify the number and type(s) of the variable
promotions” (see question 15.2), and no type checking can be performed (see
question 15.3).
15.1
Question: I heard that you have to include <stdio.h> before calling
printf. Why?
15.2
Question: How can %f be used for both float and double arguments in
printf? Aren’t they different types?
printf("%d", n);
where n was actually a long int. Aren’t ANSI function prototypes supposed
to guard against argument type mismatches like this?
15.4
Question: How can I write a function that takes a variable number of
arguments?
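if(first == NULL)
    return NULL;
len = strlen(first);
va_start(argp, first);
while((p = va_arg(argp, char *)) != NULL)
    len += strlen(p);
va_end(argp);
retbuf = malloc(len + 1); /* +1 for trailing \0 */
if(retbuf == NULL)
    return NULL; /* error */
(void)strcpy(retbuf, first);
va_start(argp, first); /* restart for second scan */
while((p = va_arg(argp, char *)) != NULL)
    (void)strcat(retbuf, p);
va_end(argp);
return retbuf;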
(Note that a second call to va_start is needed to restart the scan when the
argument list is processed a second time. Note the calls to va_end: They’re
important for portability, even if they don’t seem to do anything.)
A call to vstrcat looks something like this:
char *str = vstrcat("Hello, ", "world!", (char *)NULL);
Note the cast on the last argument; see questions 5.2 and 15.3. (Also note
that the caller is responsible for freeing the allocated memory.)
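For reference, here is a complete, compilable version of a vstrcat-style function, put together from the steps described above (measure on a first scan, allocate, then copy on a second scan). Treat it as a sketch under those stated assumptions rather than as the text's exact listing.

```c
#include <stdlib.h>   /* malloc, free, NULL */
#include <stdarg.h>   /* va_list and friends */
#include <string.h>   /* strlen, strcpy, strcat */

/* Concatenates all of its string arguments into freshly allocated memory.
   The argument list must end with (char *)NULL. Returns NULL if first is
   NULL or if memory cannot be allocated. */
char *vstrcat(char *first, ...)
{
    size_t len;
    char *retbuf;
    va_list argp;
    char *p;

    if (first == NULL)
        return NULL;

    /* First scan: add up the lengths. */
    len = strlen(first);
    va_start(argp, first);
    while ((p = va_arg(argp, char *)) != NULL)
        len += strlen(p);
    va_end(argp);

    retbuf = malloc(len + 1);   /* +1 for the terminating \0 */
    if (retbuf == NULL)
        return NULL;

    /* Second scan: copy the pieces. */
    strcpy(retbuf, first);
    va_start(argp, first);
    while ((p = va_arg(argp, char *)) != NULL)
        strcat(retbuf, p);
    va_end(argp);

    return retbuf;
}
```

The caller releases the result with free when done with it.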
The preceding example was of a function that accepts a variable number
of arguments, all of type char *. Here is an example that accepts a variable
number of arguments of different types; it is a stripped-down version of the
familiar printf function. Note that each invocation of va_arg () specifies
the type of the argument being retrieved from the argument list.
(The miniprintf function here uses baseconv from question 20.10 to
format numbers. It is significantly imperfect in that it will not usually be able
to print the smallest integer, INT_MIN, properly.)
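#include <stdio.h>
#include <stdarg.h>
extern char *baseconv(unsigned int, int);
void
miniprintf(char *fmt, ...)
{
    char *p;
    int i;
    unsigned u;
    char *s;
    va_list argp;
    va_start(argp, fmt);
    for(p = fmt; *p != '\0'; p++) {
        if(*p != '%') {
            putchar(*p);
            continue;
        }
        switch(*++p) {
        case 'c':
            i = va_arg(argp, int);
            /* *not* va_arg(argp, char); see question 15.10 */
            putchar(i);
            break;
        case 'd':
            i = va_arg(argp, int);
            if(i < 0) {
                /* won't handle INT_MIN */
                i = -i;
                putchar('-');
            }
            fputs(baseconv(i, 10), stdout);
            break;
        case 'o':
            u = va_arg(argp, unsigned int);
            fputs(baseconv(u, 8), stdout);
            break;
        case 's':
            s = va_arg(argp, char *);
            fputs(s, stdout);
            break;
        case 'u':
            u = va_arg(argp, unsigned int);
            fputs(baseconv(u, 10), stdout);
            break;
        case 'x':
            u = va_arg(argp, unsigned int);
            fputs(baseconv(u, 16), stdout);
            break;
        case '%':
            putchar('%');
            break;
        }
    }
    va_end(argp);
}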
15.5
Question: How can I write a function that, like printf, takes a format
string and a variable number of arguments, and passes them to printf to do
most of the work?
#include <stdio.h>
#include <stdarg.h>
void error(char *fmt, ...)
{
va_list argp;
fprintf(stderr, "error: ");
va_start(argp, fmt);
vfprintf(stderr, fmt, argp);
va_end(argp);
fprintf(stderr, "\n");
}
References: K&R2 §8.3 p. 174, §B1.2 p. 245
ANSI §4.9.6.7, §4.9.6.8, §4.9.6.9
ISO §7.9.6.7, §7.9.6.8, §7.9.6.9
H&S §15.12 pp. 379-80
PCS §11 pp. 186-7
15.6
Question: How can I write a function analogous to scanf, i.e., that accepts
similar arguments, and calls scanf to do most of the work?
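Answer: Unfortunately, vscanf and the like are not part of the original
ANSI/ISO C Standard, so under such a compiler you’re on your own. (The later
C99 revision of the Standard added vscanf, vfscanf, and vsscanf to <stdio.h>.)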
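Where a C99 library is available, a scanf-like wrapper can be written on the same plan as the printf-like error function of question 15.5. The sketch below assumes C99's vsscanf; the function name scanline and its expected-count parameter are inventions for the example.

```c
#include <stdio.h>
#include <stdarg.h>

/* Scans the string line according to fmt, just as sscanf would, and
   complains on stderr if the number of items converted is not the
   number the caller expected. Returns the number of items converted. */
int scanline(const char *line, int expected, const char *fmt, ...)
{
    va_list argp;
    int n;

    va_start(argp, fmt);
    n = vsscanf(line, fmt, argp);   /* requires a C99 library */
    va_end(argp);

    if (n != expected)
        fprintf(stderr, "scanline: expected %d items, got %d\n", expected, n);

    return n;
}
```

A call such as scanline(buf, 2, "%d %d", &a, &b) then behaves like sscanf with a built-in diagnostic.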
15.7
Question: I have a pre-ANSI compiler, without <stdarg.h>. What can I
do?
Answer: An older header, <varargs.h>, offers about the same functionality.
Here is the vstrcat function from question 15.4, rewritten to use
<varargs.h>:
VARIABLE-LENGTH ARGUMENT LISTS 265
#include <stdio.h>
#include <varargs.h>
#include <string.h>
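extern char *malloc();
char *vstrcat(va_alist)
va_dcl /* no semicolon */
{
    int len = 0;
    char *retbuf;
    va_list argp;
    char *p;
    va_start(argp);
    while((p = va_arg(argp, char *)) != NULL) /* includes first */
        len += strlen(p);
    va_end(argp);
    retbuf = malloc(len + 1); /* +1 for trailing \0 */
    if(retbuf == NULL)
        return NULL; /* error */
    retbuf[0] = '\0';
    va_start(argp); /* restart for second scan */
    while((p = va_arg(argp, char *)) != NULL) /* includes first */
        strcat(retbuf, p);
    va_end(argp);
    return retbuf;
}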
(Note that there is no semicolon after va_dcl, and that in this case, no special
treatment for the first argument is necessary.) You may also have to
declare the string functions by hand rather than using <string.h>.
If you can manage to find a system with vfprintf but without
<stdarg.h>, the following is a version of the error function (from question
15.5) using <varargs.h>.
#include <stdio.h>
#include <varargs.h>
void error(va_alist)
va_dcl /* no semicolon */
{
char *fmt;
va_list argp;
fprintf(stderr, "error: ");
va_start(argp);
fmt = va_arg(argp, char *);
vfprintf(stderr, fmt, argp);
va_end(argp);
fprintf(stderr, "\n");
}
15.8
Question: How can I discover how many arguments were used to call a
function?
tinel value (often 0, -1, or an appropriately cast null pointer) at the end of
the list (see the execl and vstrcat examples in questions 5.2 and 15.4).
Finally, if their types are predictable, you can pass an explicit count of the number
of variable arguments (although it’s usually a nuisance for the caller to generate).
15.9
Question: My compiler isn’t letting me declare a function
int f(...)
{
}
15.10
Question: I have a varargs function that accepts a float parameter. Why
isn’t the call va_arg(argp, float) extracting it correctly?
15.11
Question: I can’t get va_arg to pull in an argument of type pointer to
function. Why not?
the expansion of
va_arg(argp, funcptr)
is
Harder Problems
You can pick apart variable-length argument lists at run time, as we’ve seen.
But you can create them only at compile time. (We might say that strictly
speaking, there are no truly variable-length argument lists; every actual
argument list has some fixed number of arguments. A varargs function merely has
the capability of accepting a different length of argument list with each call.)
If you want to call a function with a list of arguments created on the fly at
15.12
Question: How can I write a function that takes a variable number of
arguments and passes them to another function (which takes a variable number of
arguments)?
void faterror(char *fmt, ...)
{
    error(fmt, what goes here?);
    exit(EXIT_FAILURE);
}
but it’s not obvious how to hand faterror’s arguments off to error.
Proceed as follows. First, split up the existing error function to create a
new verror that accepts not a variable argument list but a single va_list
pointer. (Note that doing so is little extra work, because verror contains
much of the code that used to be in error, and the new error becomes a
simple wrapper around verror.)
#include <stdio.h>
#include <stdarg.h>
void verror(char *fmt, va_list argp)
{
    fprintf(stderr, "error: ");
    vfprintf(stderr, fmt, argp);
    fprintf(stderr, "\n");
}
void error(char *fmt, ...)
{
    va_list argp;
    va_start(argp, fmt);
    verror(fmt, argp);
    va_end(argp);
}
Now you can write faterror and have it call verror, too:
#include <stdlib.h>
void faterror(char *fmt, ...)
{
va_list argp;
va_start(argp, fmt);
verror(fmt, argp);
va_end(argp);
exit(EXIT_FAILURE);
}
Note that the relation between error and verror is exactly that which
holds between, for example, printf and vprintf. In fact, as Chris Torek
has observed, whenever you find yourself writing a varargs function, it’s a
good idea to write two versions of it: one (like verror) that accepts a
va_list and does the work, the other (like the revised error) that is a simple
wrapper. The only real restriction on this technique is that a function like
verror can scan the arguments just once; it has no way to reinvoke
va_start.
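The same two-layer arrangement carries over to any varargs function. As a further sketch, here is a hypothetical pair, vlogmsg and logmsg (names invented for this example), in which the va_list version does the work and the "..." version is a thin wrapper, so that further wrappers can be added later without touching the worker.

```c
#include <stdio.h>
#include <stdarg.h>

/* Worker: writes "tag: message\n" to fp. Because it takes a va_list,
   any varargs function can hand its arguments on to it.
   Returns the number of characters written (output errors are not checked). */
int vlogmsg(FILE *fp, const char *tag, const char *fmt, va_list argp)
{
    int n = fprintf(fp, "%s: ", tag);
    n += vfprintf(fp, fmt, argp);
    n += fprintf(fp, "\n");
    return n;
}

/* Wrapper: collects the variable arguments and passes them to the worker. */
int logmsg(FILE *fp, const char *tag, const char *fmt, ...)
{
    va_list argp;
    int n;

    va_start(argp, fmt);
    n = vlogmsg(fp, tag, fmt, argp);
    va_end(argp);
    return n;
}
```

A call such as logmsg(stderr, "note", "%d items", 3) would write the line "note: 3 items".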
If you do not have the option of rewriting the lower-level function (error,
in this example) to accept a va_list, such that you find yourself needing to
pass the variable arguments that one function (e.g., faterror) receives on to
another as actual arguments, no portable solution is possible. (The problem
void faterror(char *fmt, ...)
{
    va_list argp;
    va_start(argp, fmt);
    error(fmt, argp); /* WRONG */
    va_end(argp);
    exit(EXIT_FAILURE);
}
void faterror(fmt, a1, a2, a3, a4, a5, a6)
char *fmt;
int a1, a2, a3, a4, a5, a6;
{
    error(fmt, a1, a2, a3, a4, a5, a6); /* VERY WRONG */
    exit(EXIT_FAILURE);
}
This example is presented only for the purpose of urging you not to use it;
please don’t try it just because you saw it here.
15.13
Question: How can I call a function with an argument list built up at run
time?
It’s not even worth asking the rhetorical question, Have you ever had a baffling
bug that you just couldn’t track down? Of course you have; everyone
has. C has a number of splendid “gotchas” lurking in wait for the unwary;
this chapter discusses a few of them. (In fact, any language powerful enough
16.1
Question: Why is this loop always executing once?
for(i = start; i < end; i++);
{
    printf("%d\n", i);
}
Answer: The accidental extra semicolon hiding at the end of the line containing
the for constitutes a null statement, which is, as far as the compiler
is concerned, the loop body. The following brace-enclosed block, which you
thought (and the indentation suggests) was a loop body, is actually the next
STRANGE PROBLEMS 273
Question: I’m getting strange syntax errors on the very first declaration in a
file, but it looks fine.
16.3
Question: This program crashes before it even runs! (When single stepping
with a debugger, it dies before the first statement in main.) Why?
Answer: You probably have one or more very large (kilobyte or more) local
arrays. Many systems have fixed-size stacks, and those that perform dynamic
stack allocation automatically (e.g., UNIX) can be confused when the stack
tries to grow by a huge chunk all at once. It is often better to declare large
arrays with static duration (unless, of course, you need a fresh set with
each recursive call, in which case you could dynamically allocate them with
malloc; see also question 1.31).
Other possibilities are that your program has been linked incorrectly (combining
object modules compiled with different compilation options or using
improper dynamic libraries), that run-time dynamic library linking is failing
for some reason, or that you have somehow misdeclared main.
See also questions 11.12, 16.4, 16.5, and 18.4.
16.4
Question: I have a program that seems to run correctly, but it crashes as it’s
exiting, after the last statement in main(). What could be causing this?
274 CHAPTER 16
(The second and third problems are closely related to question 7.5; see also
question 11.16.)
16.5
Question: This program runs perfectly on one machine, but I get weird
results on another. Stranger still, adding or removing debugging printouts
changes the symptoms. What’s wrong?
Answer: Lots of things could be going wrong; here are a few of the more
common things to check:
*On a stack-based machine, at least, the value that an uninitialized local variable happens to receive tends to
depend on what is on the stack and hence what has been called recently. That’s why inserting or removing
debugging printouts can make a bug go away; printf is a large function, so calling it or not can make a large
difference in what’s left on the stack.
Proper use of function prototypes can catch several of these problems; lint
would catch several more. See also questions 16.3, 16.4, and 18.4.
16.6
Question: Why does this code crash?
char *p = "hello, world!";
p[0] = 'H';
Answer: String constants are in fact constant. The compiler may place them
in nonwritable storage, and it is therefore not safe to modify them. When you
need writable strings, you must allocate writable memory for them, either by
declaring an array or by calling malloc. Try
char a[] = "hello, world!";
*A bug of this sort in the author’s own formatting software delayed an already late manuscript of this book
by another few days, nearly costing him the goodwill of his editor.
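By the same argument, a typical invocation of the old UNIX mktemp function
char *tmpfile = mktemp("/tmp/tmpXXXXXX");
is nonportable; the proper usage is
char tmpfile[] = "/tmp/tmpXXXXXX";
mktemp(tmpfile);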
Question: I’ve got some code that’s trying to unpack external structures, but
it’s crashing with a message about an “unaligned access.” What does this
mean? The code looks like this:
struct mystruct {
char c;
long int i32;
int i16;
};
Answer: The problem is that you’re playing too fast and loose with your
pointers. Some machines require that data values be stored at appropriately
aligned addresses. For instance, 2-byte short ints might be constrained to
sit at even addresses and 4-byte long ints at multiples of 4. (See also question
2.12.) By converting a char * (which can point to any byte) to an
int * or long int * and then indirecting on it, you can end up asking the
s.c = *p++;
s.i16 = *p++ << 8;
s.i16 |= *p++;
This code also gives you control over byte order. (This example, though,
assumes that a char is 8 bits and that the long int and int being
unpacked from the “external structure” are 32 and 16 bits, respectively.) See
question 12.42 (which contains some similar code) for a few explanations
and caveats.
See also question 4.5.
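Put together as a function, byte-at-a-time unpacking might look like the sketch below. It assumes the external layout is one char, then a 32-bit value, then a 16-bit value, both stored most significant byte first, and it treats both as unsigned quantities (no sign extension is attempted). The function name unpack is an assumption made for the example.

```c
struct mystruct {
    char c;
    long int i32;
    int i16;
};

/* Unpacks 7 bytes of external data into *s, one byte at a time, so that
   no misaligned access can occur and the byte order is under our control. */
void unpack(const unsigned char *p, struct mystruct *s)
{
    unsigned long u32;
    unsigned int u16;

    s->c = (char)*p++;

    u32 = (unsigned long)*p++ << 24;
    u32 |= (unsigned long)*p++ << 16;
    u32 |= (unsigned long)*p++ << 8;
    u32 |= (unsigned long)*p++;
    s->i32 = (long)u32;

    u16 = (unsigned int)*p++ << 8;
    u16 |= *p++;
    s->i16 = (int)u16;
}
```

Working through unsigned char and unsigned intermediate values keeps the shifts well defined regardless of whether plain char is signed.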
16.8
Question: What do “segmentation violation” and “bus error” mean?
What’s a “core dump”?
• inadvertent use of null pointers (see also questions 5.2 and 5.20)
• uninitialized, misaligned, or otherwise improperly allocated pointers (see
questions 7.1, 7.2, and 16.7)
• stale aliases to memory that has been relocated (see question 7.29)
"■Yes, the name “core” derives ultimately from old ferrite core memories.
Computer programs are written not only to be processed by computers, but
other than simple acceptability to the compiler. Style considerations are nec¬
code style, like those on religion, can be debated endlessly. Good style is a
worthy goal and can usually be recognized, but it cannot be rigorously codi¬
trywide consensus on what constitutes good style, does not mean that pro¬
17.1
Question: What’s the best style for code layout in C?
Answer: Kernighan and Ritchie, while providing the example most often
copied, also supply a good excuse for disregarding it:
280 CHAPTER 17
The position of braces is less important, although people hold passionate beliefs.
We have chosen one of several popular styles. Pick a style that suits you, then use
it consistently.
It is more important that the layout chosen be consistent (with itself and with
nearby or common code) than that it be “perfect.” If your coding environment
(i.e., local custom or company policy) does not suggest a style and you
don’t feel like inventing your own, just copy K&R.
Each of the various popular styles has its good and bad points. Putting the
open brace on a line by itself wastes vertical space; combining it with the
following line makes it cumbersome to edit; combining it with the previous line
prevents it from lining up with the close brace and may make it more difficult
to see.
Indenting by eight columns per level is most common but often gets you
uncomfortably close to the right margin (which may be a hint that you should
break up the function). If you indent by one tab but set tab stops at something
other than eight columns, you’re requiring other people to read your
code with the same software setup that you used.
The elusive quality of “good style” involves much more than mere code
layout details; don’t spend time on formatting to the exclusion of more
substantive code quality issues.
See also question 17.2.
17.2
Question: How should functions be apportioned among source files?
Answer: Usually, related functions are put together in one file. Sometimes
(as when developing libraries), it is appropriate to have exactly one source file
(and, consequently, one object module) per independent function. Other
times, and especially for some programmers, numerous source files can be
cumbersome, and it may be tempting (or even appropriate) to put most or all
of a program in a few big source files. When the scope of certain functions or
global variables is to be limited by using the static keyword, source file
layout becomes more constrained: The static functions and variables and the
functions sharing access to them must all be in the same file.
In other words, there are a number of tradeoffs, so it is difficult to give
general rules. See also questions 1.7, 1.9, 10.6, and 10.7.
17.3
Question: Here’s a neat trick for checking whether two strings are equal:
if(!strcmp(s1, s2))
if(Streq(s1, s2))
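The second form in the question relies on a macro that hides the comparison with zero. A minimal sketch of how such a macro might be defined follows; the definition of Streq and the helper is_quit_command are assumptions made for illustration, not code taken from this book.

```c
#include <string.h>

/* Assumed definition: nonzero when the two strings are equal. */
#define Streq(s1, s2) (strcmp((s1), (s2)) == 0)

/* Hypothetical helper showing the macro in use. */
int is_quit_command(const char *cmd)
{
	return Streq(cmd, "quit") || Streq(cmd, "exit");
}
```

Parenthesizing the macro arguments and the whole expansion keeps the macro safe inside larger expressions.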
17.4
Question: Why do some people write if(0 == x) instead of if(x == 0)?
if(x = 0)
If you’re in the habit of writing the constant before the ==, the compiler will
complain if you accidentally type
if(0 = x)
17.5
Question: I came across some code that puts a (void) cast before each call
to printf. Why?
*Note also that the reversed test convention is not sufficient, as it will not catch if(a = b)
17.7
Question: Should I use symbolic names, such as TRUE and FALSE, for
Boolean constants or plain 1 and 0?
17.9
Question: Where can I get the “Indian Hill Style Guide” and other coding
standards?
cs.washington.edu pub/cstyle.tar.Z
(the updated Indian Hill guide)
ftp.cs.toronto.edu doc/programming
(including Henry Spencer’s
“10 Commandments for C Programmers”)
ftp.cs.umd.edu pub/style-guide
(The Indian Hill guide is also available from SSC, P.O. Box 55549, Seattle,
WA 98155, (206) 782-7733, [email protected] .)
You may also be interested in the books The Elements of Programming
Style, Plum Hall Programming Guidelines, and C Style: Standards and
Guidelines; see the bibliography. (The Standards and Guidelines book is not
in fact a style guide but rather a set of guidelines on selecting and creating
style guides.)
17.10
Question: Some people say that goto statements are evil and that I should
never use them. Isn’t that a bit extreme?
It is an old observation that the best writers sometimes disregard the rules of
rhetoric. When they do, however, the reader will usually find in the sentence some
compensating merit, attained at the cost of the violation. Unless he is certain of
doing as well, he will probably do best to follow the rules.
17.11
Question: People always say that good style is important, but when they go
out of their way to use clear techniques and make their programs readable,
they seem to end up with less efficient programs. Since efficiency is so
important, isn’t it necessary to sacrifice some style and readability?
Answer: It’s true that grossly inefficient programs are a problem, but the
blind zeal with which many programmers often chase efficiency is also a
problem. Cumbersome, obscure programming tricks not only destroy
readability and maintainability but may also lead to slimmer long-term efficiency
improvements than would more appropriate design or algorithm choices.
With care, it is possible to design code that is both clean and efficient.
See also question 20.13.
Tools and Resources
need a compiler, and certain other tools can be extremely handy as well. This
chapter discusses several tools, with a focus on lint, and also talks about
Several of the tools and resources mentioned here can be found on the
Internet. Be aware that site names and file locations can change; the addresses
printed here, though correct at the time of this writing, may not work by the
Tools
Question: Where can I find: Answer: Look for programs (see also
question 18.16) named:
(This list of tools is by no means complete; if you know of tools not
mentioned, you’re welcome to contact the author.)
Other lists of tools, and discussion about them, can be found in the Usenet
newsgroups comp.compilers and comp.software-eng.
See also questions 18.3 and 18.16.
18.3
Question: What’s a free or cheap C compiler I can use?
lint
C was developed along with the UNIX operating system and shares its
philosophy of “each tool should do exactly one job and do it well.”
Traditionally, a C compiler’s job was to generate machine code from source code, not
nique. That task was reserved for a separate program, named lint (after the
bits of fluff it supposedly picks from programs). Although lint has waned
in significance over the years, newer compilers have not always picked up on
the same diagnostic tasks in its stead, so there may still be a place for it in the
18.4
Question: I just typed in this program, and it’s acting strangely. What can
be wrong with it?
Answer: See if you can run lint first (perhaps with the -a, -c, -h, -p or
other options*). Many C compilers are really only half-compilers, taking the
attitude that it’s not their problem if you didn’t say what you meant or if
what you said is virtually guaranteed not to work. (But do also see whether
your compiler has extra warning levels that can be optionally requested.)
See also questions 16.5 and 16.8.
18.5
Question: How can I shut off the “warning: possible pointer alignment
problem” message that lint gives me for each call to malloc?
Answer: The problem is that traditional versions of lint do not know, and
cannot be told, that malloc “returns a pointer to space suitably aligned for
storage of any type of object.” It is possible to provide a
pseudoimplementation of malloc, using a #define inside of #ifdef lint, which effectively
shuts this warning off, but a simpleminded definition will also suppress
meaningful messages about truly incorrect invocations. It may be easier
simply to ignore the message, perhaps in an automated way with grep -v. (But
don’t get in the habit of ignoring too many lint messages; otherwise, one
day you’ll overlook a significant one.)
18.6
Question: Can I declare main as void to shut off these annoying “main
returns no value” messages?
18.7
Question: Where can I get an ANSI-compatible lint?
Answer: Products called PC-Lint and FlexeLint (in “shrouded source form,” for
compilation on almost any system) are available from Gimpel Software, 3207
Hogarth Ln., Collegeville, PA 19426 (610) 584-4261, or [email protected].
The ANSI-compatible UNIX System V release 4 lint is available separately
(bundled with other C tools) from UNIX Support Labs or from System V
resellers.
A redistributable, ANSI-compatible lint may be available from
ftp.eskimo.com in u/s/scs/ansilint/.
18.8
Question: Don’t ANSI function prototypes render lint obsolete?
Answer: Not really. Prototypes work only if they are present and correct; an
inadvertently incorrect prototype is worse than useless. Furthermore, lint
checks consistency across multiple source files and checks both data
declarations and functions. Finally, an independent program like lint will probably
always be more scrupulous at enforcing compatible, portable coding practices
than will any particular, implementation-specific, feature- and
extension-laden compiler.
If you do want to use function prototypes instead of lint for cross-file
consistency checking, make sure that you set the prototypes up correctly in
header files. See questions 1.7 and 10.6.
Resources
Once again, remember that the Internet is always in a state of flux, so some
of the network addresses listed in this section may have changed by the time
Finally, on some UNIX machines, you can try typing learn c at the shell
prompt.
(Disclaimer: I have not reviewed these tutorials; I have heard that at least one
of them contains a number of errors. Also, this sort of information rapidly
becomes out of date; these addresses may not work by the time you read this
and try them.)
Several of these tutorials, along with pointers to a great deal of other
information about C, are accessible via the web at
http://www.lysator.liu.se/c/index.html.
Vinit Carpenter maintains a list of resources for learning C and C++; it is
posted to comp.lang.c and comp.lang.c++ and archived where the on-line
versions of this book are (see question 20.40). A hypertext version is at
http://vinny.csd.mu.edu/.
See also question 18.10.
18.10
Question: What’s a good book for learning C?
Answer: There are far too many books on C to list here; it’s impossible to
rate them all. Many people believe that the best one was also the first: The C
Programming Language, by Brian Kernighan and Dennis Ritchie (“K&R,”
now in its second edition). Opinions vary on K&R’s suitability as an initial
programming text: Many of us did learn C from it and learned it well; some,
however, feel that it is a bit too clinical as a first tutorial for those without
much programming background.
18.11
Question: Where can I find answers to the exercises in K&R?
Answer: They have been written up in The C Answer Book; see the
bibliography.
18.12
Question: Does anyone know where the source code from books like
Numerical Recipes in C, Plauger’s The Standard C Library, or Kernighan
and Pike’s The UNIX Programming Environment is available on line?
web pages.
Some of the routines from Numerical Recipes have been released to
the public domain; look on ftp.std.com in the directory vendors/
Numerical-Recipes/Public-Domain/.
The source code in this book, though copyrighted, is explicitly made
available to use in your programs in any way that you wish. (Naturally, the author
and publisher would appreciate it if you acknowledged this book as your
source.) The larger code fragments, plus related material, can be found at
aw.com in directory cseng/authors/summit/cfaq/.
18.13
Question: Where can I find the sources of the standard C libraries?
Answer: One source (though not public domain) is The Standard C Library,
by P.J. Plauger (see the bibliography). Implementations of all or part of the C
library have been written and are readily available as part of the NetBSD and
GNU (also Linux) projects. See also question 18.16.
18.14
Question: I need code to parse and evaluate expressions. Where can I find
some?
Answer: Two available packages are “defunc,” available via ftp from
sunsite.unc.edu in pub/packages/development/libraries/defunc-1.3.tar.Z, and
“parse,” at lamont.ldgo.columbia.edu. Other options include the S-Lang
interpreter, available from amy.tch.harvard.edu in pub/slang, and the
shareware Cmm (“C-minus-minus” or “C minus the hard stuff”). See also question
18.16.
Some parsing/evaluation code can be found in Software Solutions in C
(Chapter 12, pp. 235-55).
18.15
Question: Where can I get a BNF or YACC grammar for C?
Answer: The definitive grammar is, of course, the one in the ANSI standard;
see question 11.2. Another grammar (along with one for C++) by Jim
Roskind is in pub/c++grammar1.1.tar.Z at ics.uci.edu. A fleshed-out,
working instance of the ANSI grammar (due to Jeff Lee) is on ftp.uu.net (see
question 18.16) in usenet/net.sources/ansi.c.grammar.Z (including a companion
lexer). The FSF’s GNU C compiler contains a grammar, as does the appendix
to K&R2.
18.16
Question: Where and how can I get copies of all these freely distributable
programs?
command, or you can send the mail message “help” to the address
archie@archie.cs.mcgill.ca for information.
If you have access to Usenet, see the regular postings in newsgroups
comp.sources.unix and comp.sources.misc, which describe the archiving
policies for those groups and how to access their archives. The group
comp.archives contains numerous announcements of anonymous ftp
availability of various items. Finally, comp.sources.wanted has an FAQ list, “How
to find sources,” with more information.
See also question 14.12.
18.17
Question: Where are the on-line versions of this book?
guage specifies all of the details about every interaction a program might
want to have with its environment, but it is natural for a programmer who is
the particular language being used. Real programs frequently need to perform
peripheral I/O devices, networking, etc. C’s definition, however, is silent on all
of these.
The specific techniques required to perform these tasks vary widely across
laundry list of things you can’t do in portable C: Most of the answers boil
down to “It’s system dependent.” (When the brief answers in this chapter
categories:
19.1
Question: How can I read a single character from the keyboard without
waiting for the Return key? How can I stop characters from being echoed on
the screen as they’re typed?
• If you can use the “curses” library, you can call cbreak* (and perhaps
noecho), after which calls to getch will return characters immediately.
• If all you’re trying to do is read a short password without echo, you may
be able to use a function called getpass, if it’s available. (Another
possibility for hiding typed passwords is to select black characters on a black
background.)
• Under “classic” versions of UNIX, use ioctl and the TIOCGETP and
TIOCSETP (or TIOCSETN) requests on file descriptor 0 to manipulate the
sgttyb structure, defined in <sgtty.h> and documented in tty(4). In the
sg_flags field, set the CBREAK (or RAW) bit, and perhaps clear the
ECHO bit.
• Under System V UNIX, use ioctl and the TCGETA and TCSETAW
requests on file descriptor 0 to manipulate the termio structure, defined in
<termio.h>. In the c_lflag field, clear the ICANON (and perhaps
ECHO) bits. Also, set c_cc[VMIN] to 1 and c_cc[VTIME] to 0.
• Under any operating system (UNIX or otherwise) offering POSIX
compatibility, use the tcgetattr and tcsetattr calls on file descriptor 0 to
manipulate the termios structure, defined in <termios.h>. In the
c_lflag field, clear the ICANON (and perhaps ECHO) bits. Also, set
c_cc[VMIN] to 1 and c_cc[VTIME] to 0.
• In a pinch, under UNIX, use system (see question 19.27) to invoke the
stty command to set terminal driver modes (as in the preceding three
items).
• Under MS-DOS, use getch or getche or the corresponding BIOS
interrupts.
• Under VMS, try the Screen Management (SMG$) routines, or curses, or
issue low-level $QIO’s with the IO$_READVBLK function code (and
perhaps IO$M_NOECHO, and others) to ask for one character at a time. (It’s
also possible to set character-at-a-time or “pass through” modes in the
VMS terminal driver.)
• Under other operating systems, you’re on your own.
*In some old versions of curses, the function to request character-at-a-time input is crmode, not cbreak.
(As an aside, note that simply using setbuf or setvbuf to set stdin to
unbuffered will not generally serve to allow character-at-a-time input.)
If you change terminal modes, save a copy of the initial state and be sure
to restore it no matter how your program terminates.
If you’re trying to write a portable program, a good approach is to define
your own suite of three functions to (1) set the terminal driver or input
system into character-at-a-time mode (if necessary), (2) get characters, and (3)
return the terminal driver to its initial state when the program is finished.
As an example, here is a tiny test program that prints the decimal values
of the next 10 characters as they are typed, without waiting for Return. It is
written in terms of three functions, as described, and is followed by
implementations of the three functions for curses, classic UNIX, System V UNIX,
and MS-DOS. (The on-line archives associated with this book contain a more
complete set of functions.)
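#include <stdio.h>

/* the three terminal functions, implemented separately for each system */
int tty_break(void);
int tty_getchar(void);
int tty_fix(void);

int main()
{
	int i;
	if(tty_break() != 0)
		return 1;
	for(i = 0; i < 10; i++)
		printf(" = %d\n", tty_getchar());
	tty_fix();	/* restore the terminal before exiting */
	return 0;
}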
/* curses version */
#include <curses.h>

int tty_break()
{
	initscr();
	cbreak();
	return 0;
}

int tty_getchar()
{
	return getch();
}

int tty_fix()
{
	endwin();
	return 0;
}
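/* classic UNIX version */
#include <stdio.h>
#include <sgtty.h>
static struct sgttyb savemodes;
static int havemodes = 0;
int tty_break()
{
	struct sgttyb modmodes;
	if(ioctl(fileno(stdin), TIOCGETP, &savemodes) < 0)
		return -1;
	havemodes = 1;
	modmodes = savemodes;
	modmodes.sg_flags |= CBREAK;
	return ioctl(fileno(stdin), TIOCSETN, &modmodes);
}

int tty_getchar()
{
	return getchar();
}

int tty_fix()
{
	if(!havemodes)
		return 0;
	return ioctl(fileno(stdin), TIOCSETN, &savemodes);
}

/* System V UNIX version */
#include <stdio.h>
#include <termio.h>
static struct termio savemodes;
static int havemodes = 0;
int tty_break()
{
	struct termio modmodes;
	if(ioctl(fileno(stdin), TCGETA, &savemodes) < 0)
		return -1;
	havemodes = 1;
	modmodes = savemodes;
	modmodes.c_lflag &= ~ICANON;
	modmodes.c_cc[VMIN] = 1;
	modmodes.c_cc[VTIME] = 0;
	return ioctl(fileno(stdin), TCSETAW, &modmodes);
}

int tty_getchar()
{
	return getchar();
}

int tty_fix()
{
	if(!havemodes)
		return 0;
	return ioctl(fileno(stdin), TCSETAW, &savemodes);
}

/* MS-DOS version: no mode change is needed */
#include <conio.h>
int tty_break() { return 0; }
int tty_getchar()
{
	return getche();
}
int tty_fix() { return 0; }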
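The POSIX item in the list above has no listing of its own here, so the following is a minimal sketch of the same three functions written with tcgetattr and tcsetattr. It assumes a POSIX system; the choice of TCSANOW and the use of STDIN_FILENO are assumptions of this sketch rather than anything the text prescribes.

```c
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

static struct termios savemodes;	/* terminal state to restore later */
static int havemodes = 0;

/* Put the terminal on standard input into character-at-a-time mode. */
int tty_break(void)
{
	struct termios modmodes;
	if(tcgetattr(STDIN_FILENO, &savemodes) < 0)
		return -1;	/* e.g., standard input is not a terminal */
	havemodes = 1;
	modmodes = savemodes;
	modmodes.c_lflag &= ~ICANON;	/* turn off line-at-a-time input */
	modmodes.c_cc[VMIN] = 1;	/* read returns after one character */
	modmodes.c_cc[VTIME] = 0;	/* with no timeout */
	return tcsetattr(STDIN_FILENO, TCSANOW, &modmodes);
}

int tty_getchar(void)
{
	return getchar();
}

/* Restore the modes saved by tty_break, if any were saved. */
int tty_fix(void)
{
	if(!havemodes)
		return 0;
	return tcsetattr(STDIN_FILENO, TCSANOW, &savemodes);
}
```

To suppress echo as well, the same sketch could also clear the ECHO bit in c_lflag before calling tcsetattr.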
19.2
Question: How can I find out whether characters are available for reading
(and if so, how many)? Alternatively, how can I do a read that will not block
if no characters are available?
Answer: These, too, are entirely specific to the operating system. Some
versions of curses have a nodelay function. Depending on your system, you
may also be able to use “nonblocking I/O,” a system call named select or
poll, the FIONREAD ioctl, c_cc[VTIME], kbhit, rdchk, or the
O_NDELAY option to open or fcntl. You can also try setting an alarm to
cause a blocking read to time out after a certain interval (under UNIX, look
at alarm, signal, and maybe setitimer).
If what you’re trying to do is read input from several sources without
blocking, you will definitely want to use some kind of a “select” call, because
a busy-wait, polling loop is terribly inefficient on a multitasking system.
See also question 19.1.
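As an illustration of the “select” approach, here is a minimal sketch for systems that provide the POSIX select call. The function name input_ready and its microsecond timeout parameter are hypothetical, chosen only for this example.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Return 1 if a read on fd would not block (data or end-of-file is
 * pending), 0 if nothing is available, or -1 on error.
 * Waits at most usec microseconds; pass 0 for a pure poll.
 */
int input_ready(int fd, long usec)
{
	fd_set readfds;
	struct timeval tv;
	int r;

	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);
	tv.tv_sec = usec / 1000000L;
	tv.tv_usec = usec % 1000000L;
	r = select(fd + 1, &readfds, NULL, NULL, &tv);
	if(r < 0)
		return -1;
	return r > 0 && FD_ISSET(fd, &readfds);
}
```

Note that select looks at the file descriptor, not at stdio's buffer, so characters already read into a FILE buffer by an earlier getchar call are not reported.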
19.3
Question: How can I display a percentage-done indication that updates
itself in place or show one of those “twirling baton” progress indicators?
Answer: These simple things, at least, you can do fairly portably. Printing
the character '\r' will usually give you a carriage return without a line feed,
so that you can overwrite the current line. The character '\b' is a backspace
and will usually move the cursor one position to the left.
Using these characters, you can print a percentage-done indicator:
for(i = 0; i < lotsa; i++) {
	printf("\r%3d%%", (int)(100L * i / lotsa));
	fflush(stdout);
	do_timeconsuming_work();
}
printf("\ndone.\n");
or a baton:
printf("working: ");
for(i = 0; i < lotsa; i++) {
	printf("%c\b", "|/-\\"[i%4]);
	fflush(stdout);
	do_timeconsuming_work();
}
printf("done.\n");
19.4
Question: How can I clear the screen? How can I print things in inverse
video? How can I move the cursor to a specific x, y position?
Answer: Such things depend on the terminal type (or display) you’re using.
You will have to use a library such as termcap, terminfo, or curses, or some
system-specific routines to perform these operations.
Functions in the curses library to look for are clear, move, standout/
standend, and attron/attroff/attrset; the last three work with
attribute codes such as A_REVERSE. MS-DOS libraries typically have
functions named gotoxy and clrscr or _clearscreen; you can also use the
ANSI.SYS driver or low-level interrupts. Under termcap or terminfo, use
tgetstr to retrieve strings like cl, so/se, and cm for clear screen, standout
mode, and cursor motion, respectively; then output the strings (using cm
additionally requires calling tgoto). Some baroque terminals require
attention to other “capabilities” as well; study the documentation carefully. Be
aware that some older terminals may not support the desired capabilities at
all.
For clearing the screen, a halfway portable solution is to print a form-feed
character ('\f'), which will cause some displays to clear. Even more
portable would be to print enough newlines to scroll everything away. As a
last resort, you could use system (see question 19.27) to invoke an
operating system clear-screen command.
19.5
Question: How do I read the arrow keys? What about function keys?
Other I/O
19.7
Question: How can I do serial (“comm”) port I/O?
Answer: It’s system dependent. Under UNIX, you typically open, read, and
write a device file in /dev and use the facilities of the terminal driver to
adjust its characteristics. (See also questions 19.1 and 19.2.) Under MS-DOS,
you can use the predefined stream stdaux, a special file such as COM1,
some primitive BIOS interrupts, or (if you require high performance) any
number of interrupt-driven serial I/O packages.
19.8
Question: How can I direct output to the printer?
Answer: Under UNIX, either use popen (see question 19.30) to write to the
lp or lpr program, or perhaps open a special file such as /dev/lp. Under
MS-DOS, write to the (nonstandard) predefined stdio stream stdprn or
open the special files PRN or LPT1.
Answer: If you can figure out how to send characters to the device at all (see
question 19.8), it’s easy enough to send escape sequences. In ASCII, the ESC
code is 033 (27 decimal), so code like
fprintf(ofp, "\033[J");
19.10
Question: How can I do graphics?
Answer: Once upon a time, UNIX had a fairly nice little set of
device-independent plot routines described in plot(3) and plot(5), but they’ve largely
fallen into disuse.
If you’re programming for MS-DOS, you’ll probably want to use libraries
conforming to the VESA or BGI standards.
If you’re trying to talk to a particular plotter, making it draw is usually a
matter of sending it the appropriate escape sequences; see also question 19.9.
The vendor may supply a C-callable library, or you may be able to find one
on the Internet.
If you’re programming for a particular window system (Macintosh, X
windows, Microsoft Windows), you will use its facilities; see the relevant
documentation or newsgroup or FAQ list.
19.11
Question: How can I check whether a file exists? I want to warn the user if
a requested input file is missing.
19.12
Question: How can I find out the size of a file prior to reading it in?
19.13
Question: How can a file be shortened in place without completely clearing
or rewriting it?
19.14
Question: How can I insert or delete a line (or record) in the middle of a
file?
*If your operating system provides nonsequential, record-oriented files, it probably has insert/delete
operations, but C provides no particular support.
When you find yourself needing to insert data into an existing file, here are
a few alternatives you can try:
• Rearrange the data file so that you can append the new information at the
end.
• Put the information in a second file.
• Leave some blank space (e.g. a line of 80 spaces or a field like
0000000000) in the file when it is first written and overwrite it later with
the final information (see also question 12.30).
19.15
Question: How can I recover the file name given an open stream or file
descriptor?
19.16
Question: How can I delete a file?
19.17
Question: Why can’t I open a file by its explicit path? This call is failing:
fopen("c:\newdir\file.dat", "r")
fopen("c:\\newdir\\file.dat", "r");
Alternatively, under MS-DOS, it turns out that forward slashes are also
accepted as directory separators, so you could use
fopen("c:/newdir/file.dat", "r");
*There is a slight semantic difference between remove and unlink: unlink is guaranteed (on UNIX,
anyway) to work even on open files; remove has no such guarantee.
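For question 19.17, the failing call goes wrong because \n and \f inside a string literal are the escape sequences for newline and form feed, so the name passed to fopen is not the one that appears in the source. A minimal sketch that makes this visible; the helper path_has_control_chars is hypothetical.

```c
#include <string.h>

/*
 * Return 1 if path contains a newline or form feed, which is what
 * "\n" and "\f" turn into when a backslash is not doubled.
 */
int path_has_control_chars(const char *path)
{
	return strchr(path, '\n') != NULL || strchr(path, '\f') != NULL;
}
```

Applied to the three spellings in this question, only the single-backslash version contains control characters; the doubled-backslash and forward-slash versions name the intended file.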
19.18
Question: I’m getting an error: “Too many open files.” How can I increase
the allowable number of simultaneously open files?
Answer: There are at least two resource limitations on the number of
simultaneously open files: the number of low-level “file descriptors” or “file
handles” available in the operating system and the number of FILE structures
available in the stdio library. Both must be sufficient. Under MS-DOS
systems, you can control the number of operating system file handles with a line
in CONFIG.SYS. Some compilers come with instructions (and perhaps a
source file or two) for increasing the number of stdio FILE structures.
19.19
Question: How can I find out how much free space is available on disk?
Answer: There is no portable way. Under some versions of UNIX, you can
call statfs. Under MS-DOS, use interrupt 0x21 subfunction 0x36 or
perhaps a function such as diskfree. Another possibility is to use popen (see
question 19.30) to invoke and read the output of a “disk free” command
(e.g., df on UNIX).
(Note that the amount of free space apparently available on a disk may not
match the size of the largest file you can store, for all sorts of reasons.)
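On systems that follow POSIX, a close relative of statfs named statvfs is declared in <sys/statvfs.h>. The following is a minimal sketch using it; statvfs is not mentioned in the text above, so treat its use here as an assumption, and the function name disk_free_bytes is hypothetical.

```c
#include <sys/statvfs.h>

/*
 * Return the number of free bytes available to unprivileged users on
 * the file system holding path, or -1.0 if the query fails.
 * A double is used so that large disks do not overflow a long.
 */
double disk_free_bytes(const char *path)
{
	struct statvfs sv;
	if(statvfs(path, &sv) != 0)
		return -1.0;
	return (double)sv.f_bavail * (double)sv.f_frsize;
}
```

As the note above says, the figure is only an estimate of what you can actually store.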
19.20
Question: How can I read a directory in a C program?
Answer: See whether you can use the opendir and readdir functions,
which are part of the POSIX standard and are available on most UNIX
variants. Implementations also exist for MS-DOS, VMS, and other systems.
(MS-DOS also has FINDFIRST and FINDNEXT routines, which do essentially the
same thing.) The readdir function returns only file names; if you need more
information about the file, try calling stat. To match filenames to some
wildcard pattern, see question 13.7.
Here is a tiny example that lists the files in the current directory:
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>

int main()
{
	struct dirent *dp;
	DIR *dfd = opendir(".");
	if(dfd != NULL) {
		while((dp = readdir(dfd)) != NULL)
			printf("%s\n", dp->d_name);
		closedir(dfd);
	}
	return 0;
}
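Since readdir returns only names, getting more information means calling stat on each entry. A minimal sketch for POSIX systems follows; the function name list_sizes, the 1024-byte path buffer, and the use of snprintf (a C99 function) are assumptions of this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>

/*
 * Print the size and name of each entry in dirname.
 * Returns the number of entries listed, or -1 if the directory
 * cannot be opened.
 */
int list_sizes(const char *dirname)
{
	char path[1024];
	struct dirent *dp;
	struct stat st;
	int n = 0;
	DIR *dfd = opendir(dirname);

	if(dfd == NULL)
		return -1;
	while((dp = readdir(dfd)) != NULL) {
		/* stat needs the directory prefix, not just the bare name */
		if(snprintf(path, sizeof path, "%s/%s", dirname, dp->d_name)
				>= (int)sizeof path)
			continue;	/* name too long for the buffer; skip it */
		if(stat(path, &st) == 0) {
			printf("%10ld %s\n", (long)st.st_size, dp->d_name);
			n++;
		}
	}
	closedir(dfd);
	return n;
}
```

Forgetting the directory prefix is a common mistake: stat(dp->d_name, ...) works only when the directory being listed is the current one.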
19.21
Answer: If your operating system supports these services, they are likely to
be provided in C via functions named mkdir and rmdir. Removing a
directory’s contents as well will require listing them (see question 19.20) and
calling remove (see also question 19.16). If you don’t have these C functions
available, try system (see question 19.27) along with your operating
system’s delete command(s).
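The list-then-remove procedure can be sketched as follows for POSIX systems, where mkdir takes a second, permissions argument. The function names make_dir and remove_dir_and_files are hypothetical, and the sketch handles only one level: a nonempty subdirectory makes it fail.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>

/* Create a directory; the mode is further restricted by the umask. */
int make_dir(const char *dirname)
{
	return mkdir(dirname, 0777);
}

/*
 * Remove the entries in dirname, then the directory itself.
 * Returns 0 on success, -1 if anything could not be removed.
 */
int remove_dir_and_files(const char *dirname)
{
	char path[1024];
	struct dirent *dp;
	int failed = 0;
	DIR *dfd = opendir(dirname);

	if(dfd == NULL)
		return -1;
	while((dp = readdir(dfd)) != NULL) {
		if(strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
			continue;	/* never try to remove these two */
		if(snprintf(path, sizeof path, "%s/%s", dirname, dp->d_name)
				>= (int)sizeof path || remove(path) != 0)
			failed = 1;
	}
	closedir(dfd);
	if(failed)
		return -1;
	return rmdir(dirname);
}
```

A version that descends into subdirectories would call itself recursively on each entry that stat reports as a directory.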
19.22
Answer: Your operating system may provide a function that returns this
information, but it’s quite system dependent. (Also, the number may vary
over time.) If you’re trying to predict whether you’ll be able to allocate a
certain amount of memory, just try it: call malloc (requesting that amount)
and check the return value.
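The “just try it” advice amounts to very little code. A minimal sketch, with the hypothetical name can_allocate:

```c
#include <stdlib.h>

/* Return 1 if a block of n bytes can be allocated right now, 0 if not. */
int can_allocate(size_t n)
{
	void *p = malloc(n);
	if(p == NULL)
		return 0;
	free(p);
	return 1;
}
```

In a real program it is usually better to keep the pointer that malloc returns than to free it and ask again, since the answer may change between the two calls.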
19.23
Question: How can I allocate arrays or structures bigger than 64K?
“flat” compiler (e.g., djgpp; see question 18.3), some kind of a DOS
extender, or another operating system.
19.24
Question: What does the error message “DGROUP data allocation exceeds
64K” mean, and what can I do about it? I thought that using large model
meant that I could use more than 64K of data!
19.25
Then, *magicloc refers to the location you want. (If you want to refer to a
byte at a certain address rather than to a word, use unsigned char *.)
Under MS-DOS, you may find a macro like MK_FP() handy for working
with segments and offsets. As suggested by Gary Blaine, you can also declare
tricky array pointers that allow you to access screen memory using array
you can access the character and attribute byte at row i, column j with
videomem[i] [j].
Many operating systems execute user-mode programs in a protected mode
where direct access to I/O devices (or to any address outside the running
process) is simply not possible. In such cases, you will have to ask the
operating system to carry out I/O operations for you.
See also questions 4.14 and 5.19.
19.26
Question: How can I access an interrupt vector located at the machine’s
location 0? If I set a pointer to 0, the compiler might translate it to a nonzero
internal null pointer value.
"System" Commands
19.27
Answer: Use the library function system, which does exactly that.
Some systems also provide a family of spawn routines that accomplish
approximately the same thing. These are not as portable as system, which is
required under the ANSI C Standard, although in any case, the interpretation
of the command string—its syntax and the set of commands accepted—will
obviously vary tremendously.
19.28
Question: How can I call system when parameters (filenames, etc.) of the
executed command aren’t known until run time?
Answer: Just use sprintf (or perhaps strcpy and strcat) to build the
command string in a buffer; then call system with that buffer. (Make sure
that the buffer is allocated with enough space; see also questions 7.2 and
12.21.)
Here is a contrived example suggesting how you might build a data file,
then sort it (assuming the existence of a sort utility and UNIX- or MS-DOS-
style input/output redirection):
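char cmdbuf[50];
FILE *fp = fopen(datafile, "w");
/* ...write to fp to build data file... */
fclose(fp);
sprintf(cmdbuf, "sort < %s > %s", datafile, sortedfile);
system(cmdbuf);
fp = fopen(sortedfile, "r");
/* ...now read sorted data from fp... */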
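The warning about buffer space can be turned into a check. The following minimal sketch builds the same command with snprintf (a C99 function, assumed to be available) and refuses to run a truncated command; the function names build_sort_command and sort_file and the 256-byte buffer are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "sort < datafile > sortedfile" in buf.
 * Returns 0 on success, -1 if the command would not fit in bufsize bytes.
 */
int build_sort_command(char *buf, size_t bufsize,
		const char *datafile, const char *sortedfile)
{
	int n = snprintf(buf, bufsize, "sort < %s > %s", datafile, sortedfile);
	if(n < 0 || (size_t)n >= bufsize)
		return -1;
	return 0;
}

/* Run the sort; returns whatever system returns, or -1 if the command is too long. */
int sort_file(const char *datafile, const char *sortedfile)
{
	char cmdbuf[256];
	if(build_sort_command(cmdbuf, sizeof cmdbuf, datafile, sortedfile) != 0)
		return -1;
	return system(cmdbuf);
}
```

File names containing spaces or shell metacharacters would still need quoting before being handed to system; this sketch does not attempt that.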
19.29
19.30
Question: How can I invoke another program or command and trap its
output?
Answer: UNIX and some other systems provide a popen function, which
sets up a stdio stream on a pipe connected to the process running a command,
so that the calling program can read the output (or alternatively supply the
input). Using popen, the last example from question 19.28 would look like
fp = popen(cmdbuf, "r");
pclose(fp);
(Do be sure to call pclose, as shown; leaving it out will seem to work at
first but may eventually run you out of processes.)
If you can’t use popen, you may be able to use system, with the output
going to a file that you then open and read, as the code in question 19.28 was
doing already.*
If you’re using UNIX and popen isn’t sufficient, you can learn about
pipe, dup, fork, and exec.
(One thing that probably would not work, by the way, would be to use
freopen.)
*Using system and a temporary file assumes that you don’t need the called program to run concurrently with
the main program.
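For systems that do provide popen, here is a minimal, self-contained sketch of trapping a command's output. The function name first_line_of is hypothetical, and the _POSIX_C_SOURCE definition is an assumption about what a strictly conforming compiler setup needs in order to declare popen.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for the popen/pclose declarations */
#include <stdio.h>

/*
 * Run cmd and copy the first line of its output into buf.
 * Returns 0 on success, -1 if the command could not be started
 * or produced no output.
 */
int first_line_of(const char *cmd, char *buf, int bufsize)
{
	FILE *fp = popen(cmd, "r");
	int ok;

	if(fp == NULL)
		return -1;
	ok = fgets(buf, bufsize, fp) != NULL;
	/* a stream from popen must be closed with pclose, not fclose */
	if(pclose(fp) == -1)
		return -1;
	return ok ? 0 : -1;
}
```

Reading the rest of the output is a matter of calling fgets in a loop before the pclose.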
Process Environment
19.31
Answer: The string in argv[0] may represent all or part of the pathname,
or it may be empty. You may be able to duplicate the command language
interpreter’s search path logic to locate the executable if the name in
argv[0] is present but incomplete. However, there is no guaranteed
solution.
19.32
19.33
Question: How can a process change an environment variable in its caller?
19.34
Question: How can I open files mentioned on the command line and parse
option flags?
19.35
19.36
Question: How can I read in an object file and jump to functions in it?
about object file formats, relocation, etc., and this approach can’t work if
code and data reside in separate address spaces or if code is otherwise
privileged.
Under BSD UNIX, you could use system and ld -A to do the linking for
you. Many versions of SunOS and System V have the -ldl library containing
functions such as dlopen and dlsym, which allow object files to be
dynamically loaded. Under VMS, use LIB$FIND_IMAGE_SYMBOL. The GNU
project has a package called “dld.”
19.37
Question: How can I implement a delay or time a user’s response with
subsecond resolution?
For really brief delays, it’s tempting to use a do-nothing loop like
long int i;
for(i = 0; i < 1000000; i++)
	;
but resist this temptation if at all possible! For one thing, your carefully
calculated delay loops will stop working next month when a faster processor
comes out. Perhaps worse, a clever compiler may notice that the loop does
nothing and optimize it away completely.
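Where the system offers a real sleep call, it is a safer choice than a counting loop. The following minimal sketch assumes a POSIX system with nanosleep, which the text above does not name; the function delay_ms is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for the nanosleep declaration */
#include <errno.h>
#include <time.h>

/* Sleep for ms milliseconds. Returns 0 on success, -1 on error. */
int delay_ms(long ms)
{
	struct timespec req, rem;

	req.tv_sec = ms / 1000;
	req.tv_nsec = (ms % 1000) * 1000000L;
	while(nanosleep(&req, &rem) != 0) {
		if(errno != EINTR)
			return -1;
		req = rem;	/* interrupted by a signal: sleep the remainder */
	}
	return 0;
}
```

Unlike the loop, this delay does not depend on processor speed and cannot be optimized away, although the actual resolution still depends on the system's clock.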
19.38
Question: How can I trap or ignore keyboard interrupts like control-C?
#include <signal.h>
signal(SIGINT, SIG_IGN);
*Actually, there may be several different keyboard interrupts. On many systems, the “interrupt signal”
referred to as SIGINT is generated by control-C. UNIX systems also have SIGQUIT, usually generated by
control-\ (although both SIGINT and SIGQUIT can in fact be bound to any key). MS-DOS systems also have
control-Break, which usually results in SIGINT. On the Macintosh, SIGINT is sometimes generated by
command-period.
The test and extra call ensure that a keyboard interrupt typed in the
foreground won’t inadvertently interrupt a program running in the background
(and it doesn’t hurt to code calls to signal this way on any system).*
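The “test and extra call” idiom just described can be sketched as follows: SIGINT is first set to be ignored, and the handler is installed only if the signal was not already being ignored. The names catch_sigint, on_sigint, and got_sigint are hypothetical.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t is safe to assign from a handler. */
volatile sig_atomic_t got_sigint = 0;

/* Handler: just record that the interrupt arrived. */
void on_sigint(int sig)
{
	(void)sig;
	got_sigint = 1;
}

/*
 * Install func for SIGINT unless the signal was already being ignored.
 * Returns 1 if the handler was installed, 0 if SIGINT stays ignored.
 */
int catch_sigint(void (*func)(int))
{
	if(signal(SIGINT, SIG_IGN) != SIG_IGN) {
		signal(SIGINT, func);
		return 1;
	}
	return 0;
}
```

A program would call catch_sigint(on_sigint) once at startup and then check got_sigint at convenient points in its main loop.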
On some systems, keyboard interrupt handling is also a function of the
mode of the terminal-input subsystem; see question 19.1. On some systems,
checking for keyboard interrupts is performed only when the program is
reading input, and keyboard interrupt handling may therefore depend on
which input routines are being called (and whether any input routines are
active at all). On MS-DOS systems, setcbrk or ctrlbrk functions may
also be involved.
19.39
Answer: On many systems, you can define a function matherr, which will
be called when there are certain floating-point errors, such as errors in the
math functions in <math.h>. You may also be able to use signal (see ques¬
tion 19.38) to catch SIGFPE. See also question 14.9.
Answers: All of these questions are outside of the scope of this book and
have much more to do with the networking facilities you have available than
they do with C. Good books on the subject are Douglas Comer’s three-volume
*On modern UNIX systems using job control, background processes are in separate process groups and so
don't receive keyboard interrupts, but it still doesn’t hurt to code calls to signal this way, and any remain¬
Retrospective
19.41
19.42
Question: Why isn’t any of this standardized in C? Any real program has to
do some of these things.
Answer: In fact, some standardization has occurred along the way. In the
beginning, C did not have a standard library at all; programmers always had
to “roll their own” utility functions. After several abortive attempts, a certain
set of library functions (including the str* and stdio families of functions)
became a de facto standard, at least on UNIX systems, but the library was
not yet a formal part of the language. Vendors could (and occasionally did)
provide completely different functions along with their compilers.
In the ANSI/ISO C Standard, a library definition (based on the 1984
/usr/group standard and largely compatible with the traditional UNIX
library) was adopted with as much standing as the language itself. The standard
C library’s treatment of file and device I/O is, however, rather minimal.
It states how streams of characters are written to and read from files, and it
provides a few suggestions about the display behavior of control characters,
such as \b, \r, and \t, but beyond that it is silent.
This chapter, as its name implies, covers a variety of topics that don’t fit into
any of the other chapters. The first two sections cover miscellaneous programming
techniques and the manipulation of individual bits and bytes. Next
of C’s features are as they are and why C doesn’t have a few features people
sometimes wish for. It leads into some questions involving C and other languages.
Whole books have been written about algorithms, and this is not one of
them, but the section on algorithms covers a few questions that seem to come
up all the time among C programmers. Finally, the last section closes with
some trivia and information about the on-line versions of this book.
The questions and sections in this chapter are broken down as follows:
MISCELLANEOUS 327
Algorithms 20.28-20.33
Trivia 20.34-20.40
Miscellaneous Techniques
Answer: There are several ways of doing this. (These examples show hypo¬
thetical polar-to-rectangular coordinate conversion functions, which must
return both an x and a y coordinate.)
• Pass pointers to several locations that the function can fill in:
#include <math.h>
double x, y;
polar_to_rectangular(1., 3.14, &x, &y);
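• Have the function return a structure containing the desired values:
struct xycoord { double x, y; };
struct xycoord
polar_to_rectangular(double rho, double theta)
{
	struct xycoord ret;
	ret.x = rho * cos(theta);
	ret.y = rho * sin(theta);
	return ret;
}
struct xycoord c = polar_to_rectangular(1., 3.14);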
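The same two techniques apply to any function with more than one result. As a small sketch that needs no math library, here is a hypothetical quotient-and-remainder function written both ways; the names divmod_ptr, divmod_struct, and struct qr are invented for illustration.

```c
/* Result type for the structure-returning variant. */
struct qr { int quot, rem; };

/* Technique 1: the caller passes pointers to the locations
   that the function fills in. den must be nonzero. */
void divmod_ptr(int num, int den, int *quotp, int *remp)
{
	*quotp = num / den;
	*remp = num % den;
}

/* Technique 2: the function returns a structure holding
   both results. den must be nonzero. */
struct qr divmod_struct(int num, int den)
{
	struct qr ret;
	ret.quot = num / den;
	ret.rem = num % den;
	return ret;
}
```

The pointer version lets a caller receive results directly into existing variables; the structure version keeps the results together and can be used in an initializer.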
20.2
Question: What’s a good data structure to use for storing lines of text? I
started to use fixed-size arrays of arrays of char, but they’re too restrictive.
Answer: One good way of doing this is with a pointer (simulating an array)
to a set of pointers (each simulating an array) of char. This data structure is
sometimes called a “ragged array” and looks something like this:
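[Figure: p points to an array of four pointers to char, which point in turn to the separately stored strings "this", "is", "a", and "test", each terminated by \0.]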
You could set up the tiny array in the figure with these simple declarations:
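char *a[4] = {"this", "is", "a", "test"};
char **p = a;
(where p is the pointer-to-pointer-to-char and a is an intermediate array
used to allocate the four pointers-to-char). To really do dynamic allocation,
you’d of course have to call malloc:
#include <stdlib.h>
#include <string.h>
char **p = malloc(4 * sizeof(char *));
if(p != NULL) {
	p[0] = malloc(5);
	p[1] = malloc(3);
	p[2] = malloc(2);
	p[3] = malloc(5);
	if(p[0] && p[1] && p[2] && p[3]) {
		strcpy(p[0], "this");
		strcpy(p[1], "is");
		strcpy(p[2], "a");
		strcpy(p[3], "test");
	}
}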
(Some libraries have a strdup function that would streamline the inner
malloc and strcpy calls. It’s not standard, but it’s obviously trivial to
implement something like it.)
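A strdup-like function might look like the following sketch. The name my_strdup is hypothetical, chosen so that it cannot collide with a strdup the library may already provide.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of the string s, or NULL if memory is
   exhausted. The caller is responsible for calling free() on the
   result. */
char *my_strdup(const char *s)
{
	size_t len = strlen(s) + 1;	/* +1 for the terminating '\0' */
	char *copy = malloc(len);
	if(copy != NULL)
		memcpy(copy, s, len);
	return copy;
}
```

With such a function, each malloc-and-strcpy pair collapses to a single call, such as p[0] = my_strdup("this").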
Here is a code fragment that reads an entire file into memory, using the
same kind of ragged array. This code is written in terms of the agetline
function from question 7.30.
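#include <stdio.h>
#include <stdlib.h>
extern char *agetline(FILE *);
FILE *ifp;	/* assume ifp is open on the input file */
char **lines = NULL, *p;
size_t nalloc = 0, nlines = 0;
while((p = agetline(ifp)) != NULL) {
	if(nlines >= nalloc) {	/* pointer array is full: grow it */
		nalloc += 50;
		lines = realloc(lines, nalloc * sizeof(char *));
		if(lines == NULL) exit(EXIT_FAILURE);	/* out of memory */
	}
	lines[nlines++] = p;
}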
20.3
Question: How can I open files mentioned on the command line and parse
option flags?
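#include <stdio.h>
#include <string.h>
#include <errno.h>
int main(int argc, char *argv[])
{
	int argi;
	int aflag = 0;
	char *bval = NULL;
	for(argi = 1; argi < argc && argv[argi][0] == '-'; argi++) {
		char *p;
		for(p = &argv[argi][1]; *p != '\0'; p++) {
			switch(*p) {
			case 'a':
				aflag = 1;
				printf("-a seen\n");
				break;
			case 'b':
				bval = argv[++argi];
				printf("-b seen (\"%s\")\n", bval);
				break;
			default:
				fprintf(stderr,
					"unknown option -%c\n", *p);
			}
		}
	}
	if(argi >= argc) {
		/* no filename arguments; process stdin */
		printf("processing standard input\n");
	} else {
		/* process filename arguments */
		for(; argi < argc; argi++) {
			FILE *ifp = fopen(argv[argi], "r");
			if(ifp == NULL) {
				fprintf(stderr, "can't open %s: %s\n",
					argv[argi], strerror(errno));
				continue;
			}
			printf("processing %s\n", argv[argi]);
			fclose(ifp);
		}
	}
	return 0;
}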
(This code assumes that fopen sets errno when it fails, which is not guaranteed
but usually works and makes error messages much more useful. See
also question 20.4.)
Several canned functions are available for doing command line parsing in
a standard way; the most popular one is getopt (see also question 18.16).
Here is the previous example, rewritten to use getopt:
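#include <unistd.h>	/* getopt, optarg, optind (POSIX) */
int main(int argc, char *argv[])
{
	int aflag = 0;
	char *bval = NULL;
	int c;
	while((c = getopt(argc, argv, "ab:")) != -1)
		switch(c) {
		case 'a':
			aflag = 1;
			printf("-a seen\n");
			break;
		case 'b':
			bval = optarg;
			printf("-b seen (\"%s\")\n", bval);
			break;
		}
	if(optind >= argc) {
		printf("processing standard input\n");
	} else {
		/* process filename arguments */
		for(; optind < argc; optind++) {
			FILE *ifp = fopen(argv[optind], "r");
			if(ifp == NULL) {
				fprintf(stderr, "can't open %s: %s\n", argv[optind], strerror(errno));
				continue;
			}
			printf("processing %s\n", argv[optind]);
			fclose(ifp);
		}
	}
	return 0;
}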
20.4
Question: What’s the right way to use errno?
Answer: In general, you should detect errors by checking return values and
use errno only to distinguish among the various causes of an error, such as
“File not found” or “Permission denied.” (Typically, you use perror or
strerror to print these discriminating error messages.) It’s necessary to
detect errors with errno only when a function does not have a unique,
unambiguous, out-of-band error return (i.e., because all of its possible return
values are valid; one example is atoi). In these cases (and in these cases only;
check the documentation to be sure whether a function allows this), you can
detect errors by setting errno to 0, calling the function, then testing errno.
(Setting errno to 0 first is important, as no library function ever does that
for you.)
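The set-errno-to-0-then-test pattern can be sketched with strtol, which the C Standard documents as returning LONG_MAX or LONG_MIN and setting errno to ERANGE when the value is out of range. Since LONG_MAX and LONG_MIN are also valid results, errno is the only way to tell the cases apart. The wrapper name parse_long is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert the decimal string s to a long.
   Returns 1 on success and stores the value in *out;
   returns 0 if s contains no digits or the value is out of range. */
int parse_long(const char *s, long *out)
{
	char *end;
	long val;

	errno = 0;		/* no library function clears it for us */
	val = strtol(s, &end, 10);
	if(end == s)
		return 0;	/* no conversion was performed */
	if(errno == ERANGE)
		return 0;	/* the return value alone would be ambiguous */
	*out = val;
	return 1;
}
```

Note that the out-of-band test (end == s) is made first, and errno is consulted only for the one case that the return values cannot distinguish.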
To make error messages useful, they should include all relevant informa¬
tion. Besides the strerror text derived from errno, it may also be appro¬
priate to print the name of the program, the operation that failed (preferably
in terms that will be meaningful to the user), the name of the file for which
the operation failed, and, if some input file (script or source file) is being read,
the name and current line number of that file.
See also question 12.24.
20.5
Question: How can I write data files that can be read on other machines
with different word size, byte order, or floating-point formats?
Answer: The most portable solution is to use text files (usually ASCII), written
with fprintf and read with fscanf or the like. (Similar advice also
applies to network protocols.) Be skeptical of arguments that imply that text
files are too big or that reading and writing them is too slow. Not only is their
efficiency frequently acceptable in practice, but the advantages of being able
to interchange them easily between machines and to manipulate them with
standard tools can be overwhelming.
If you must use a binary format, you can improve portability, and perhaps
take advantage of prewritten I/O libraries, by making use of standardized formats
such as Sun’s XDR (RFC 1014), OSI’s ASN.1 (referenced in CCITT
X.409 and ISO 8825 “Basic Encoding Rules”), CDF, netCDF, or HDF. See
also questions 2.12, 12.38, and 12.42.
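If a binary format is unavoidable and none of the libraries above is at hand, one common approach (an illustration, not something the text prescribes) is to fix the file's byte order yourself and move values a byte at a time, so that the machine's own word size and byte order never reach the file. The sketch below writes a 32-bit quantity most-significant byte first, the same order XDR uses; put_u32 and get_u32 are hypothetical names.

```c
#include <stdio.h>

/* Write the low 32 bits of val to fp, most-significant byte first.
   fp should be open in binary mode.
   Returns 0 on success, -1 on a write error. */
int put_u32(unsigned long val, FILE *fp)
{
	int shift;
	for(shift = 24; shift >= 0; shift -= 8)
		if(putc((int)((val >> shift) & 0xff), fp) == EOF)
			return -1;
	return 0;
}

/* Read four bytes written by put_u32 and reassemble the value.
   Returns 0 on success, -1 on end-of-file or a read error. */
int get_u32(unsigned long *valp, FILE *fp)
{
	unsigned long val = 0;
	int i, c;
	for(i = 0; i < 4; i++) {
		if((c = getc(fp)) == EOF)
			return -1;
		val = (val << 8) | (unsigned long)c;
	}
	*valp = val;
	return 0;
}
```

Because the shifts operate on values rather than on memory, the same file is produced on big-endian and little-endian machines alike.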
20.6
Question: If I have a char * variable pointing to the name of a function,
how can I call that function? Code like
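or
int func(), anotherfunc();
struct { char *name; int (*funcptr)(); } symtab[] = {
	"func", func,
	"anotherfunc", anotherfunc,
};
Then, search the table for the name and call via the associated function pointer
with code like this:
#include <stddef.h>
#include <string.h>
int (*findfunc(char *name))()
{
	int i;
	for(i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if(strcmp(name, symtab[i].name) == 0)
			return symtab[i].funcptr;
	return NULL;
}
char *funcname = "func";
int (*funcp)() = findfunc(funcname);
if(funcp != NULL)
	(*funcp)();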
The callable functions should all have compatible argument and return types.
(Ideally, the function pointers would also specify the argument types.)
It is sometimes possible for a program to read its own symbol table if it is still
present, but it must first be able to find its own executable (see question 19.31),
and it must know how to interpret the symbol table (some UNIX C libraries
provide an nlist function for this purpose). See also questions 2.15 and 19.36.
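As the parenthetical remark about argument types suggests, the table is safer when the function pointers carry prototypes. Here is a self-contained sketch along those lines; the two callable functions and their return values are placeholders invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder functions to be called by name. */
static int func(void)        { return 1; }
static int anotherfunc(void) { return 2; }

/* Correspondence table of names and prototyped function pointers. */
static const struct {
	const char *name;
	int (*funcptr)(void);
} symtab[] = {
	{"func", func},
	{"anotherfunc", anotherfunc},
};

/* Return the function registered under name, or NULL if there is none. */
int (*findfunc(const char *name))(void)
{
	size_t i;
	for(i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if(strcmp(name, symtab[i].name) == 0)
			return symtab[i].funcptr;
	return NULL;
}
```

With the prototype in place, the compiler can diagnose a table entry whose signature differs from the others, which the unprototyped form silently accepts.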
20.7
Question: How can I manipulate individual bits?
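Answer: Bit manipulation is straightforward in C. To extract (test) a bit, use
the bitwise AND (&) operator with a mask:
value & 0x04
To set a bit, use the bitwise OR (| or |=) operator:
value |= 0x04
To clear a bit, use the bitwise complement (~) and the AND (& or &=) operators:
value &= ~0x04
(The preceding three examples all manipulate the third-least significant, or 2²,
bit, expressed as the constant bitmask 0x04.)
To manipulate an arbitrary bit, use the shift-left operator (<<) to generate
the mask you need:
value & (1 << bitnumber)
value |= (1 << bitnumber)
value &= ~(1 << bitnumber)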
To avoid surprises involving the sign bit, it is often a good idea to use
unsigned integral types in code that manipulates bits and bytes.
See also questions 9.2 and 20.8.
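The three operations can be wrapped in small functions, as in this sketch. The names bit_test, bit_set, and bit_clear are hypothetical; unsigned types and the constant 1u are used throughout, following the advice above about the sign bit.

```c
/* Bit numbers count from 0 at the least-significant end and must be
   less than the number of bits in an unsigned int. */

/* Return 1 if the given bit of value is set, 0 otherwise. */
unsigned int bit_test(unsigned int value, unsigned int bitnumber)
{
	return (value >> bitnumber) & 1u;
}

/* Return value with the given bit set. */
unsigned int bit_set(unsigned int value, unsigned int bitnumber)
{
	return value | (1u << bitnumber);
}

/* Return value with the given bit cleared. */
unsigned int bit_clear(unsigned int value, unsigned int bitnumber)
{
	return value & ~(1u << bitnumber);
}
```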
References: K&R1 §2.9 pp. 44-5
K&R2 §2.9 pp. 48-9
ANSI §3.3.3.3, §3.3.7, §3.3.10, §3.3.12
ISO §6.3.3.3, §6.3.7, §6.3.10, §6.3.12
H&S §7.5.5 p. 197, §7.6.3 pp. 205-6, §7.6.6 p. 210
Answer: Use arrays of char or int with a few macros to access the desired
bit in the proper cell of the array. Here are some simple macros to use with
arrays of char:
(If you don’t have <limits.h>, try using 8 for CHAR_BIT.)
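One common formulation of such macros is sketched here. BITSET and BITNSLOTS are the names used in the examples in this answer; BITMASK, BITSLOT, BITCLEAR, and BITTEST are assumed companions, and the definitions are an illustration rather than a definitive set.

```c
#include <limits.h>	/* for CHAR_BIT */

/* Mask selecting bit b within its char-sized cell. */
#define BITMASK(b)	(1 << ((b) % CHAR_BIT))
/* Index of the cell that holds bit b. */
#define BITSLOT(b)	((b) / CHAR_BIT)
/* Set, clear, or test bit b of the bit array a. */
#define BITSET(a, b)	((a)[BITSLOT(b)] |= BITMASK(b))
#define BITCLEAR(a, b)	((a)[BITSLOT(b)] &= ~BITMASK(b))
#define BITTEST(a, b)	((a)[BITSLOT(b)] & BITMASK(b))
/* Number of chars needed to hold nb bits, rounding up. */
#define BITNSLOTS(nb)	(((nb) + CHAR_BIT - 1) / CHAR_BIT)
```

BITTEST yields zero or nonzero rather than strictly 0 or 1, so it is best used directly as a condition.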
Here are some usage examples:
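• To declare an “array” of 47 bits:
char bitarray[BITNSLOTS(47)];
• To set the 23rd bit:
BITSET(bitarray, 23);
• To compute the union of two bit arrays and place it in a third array (with
all three arrays as previously declared):
for(i = 0; i < BITNSLOTS(47); i++)
	array3[i] = array1[i] | array2[i];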
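As a more complete example, here is a quick implementation of the Sieve of
Eratosthenes, for computing prime numbers:
#include <stdio.h>
#include <string.h>
#define MAX 10000
int main()
{
	char bitarray[BITNSLOTS(MAX)];
	int i, j;
	memset(bitarray, 0, BITNSLOTS(MAX));
	for(i = 2; i < MAX; i++) {
		if(!BITTEST(bitarray, i)) {
			printf("%d\n", i);
			for(j = i + i; j < MAX; j += i)
				BITSET(bitarray, j);
		}
	}
	return 0;
}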
20.9
Question: How can I determine whether a machine’s byte order is big-
endian or little-endian?
Answer: The usual techniques are to use a pointer:
int x = 1;
if(*(char *)&x == 1)
printf("little-endian\n");
else printf("big-endian\n");
or a union:
union {
int i;
char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
printf("little-endian\n");
else printf("big-endian\n");
20.10
Question: How can I convert integers to binary or hexadecimal?
Answer: Make sure that you really know what you’re asking. Integers are
stored internally in binary, although for most purposes, it is not incorrect to
think of them as being in octal, decimal, or hexadecimal, whichever is conve¬
nient. The base in which a number is expressed matters only when that num¬
ber is read in from or written out to the outside world, either in the form of
a source code constant or in the form of I/O performed by a program.
In source code, a nondecimal base is indicated by a leading 0 or 0x (for
octal or hexadecimal, respectively). During I/O, the base of a formatted number
is controlled in the printf and scanf family of functions by the choice
of format specifier (%d, %o, %x, etc.) and in the strtol and strtoul functions
by the third argument. During binary I/O, however, the base again
becomes immaterial: If numbers are being read or written as individual bytes
(typically with getc or putc) or as multibyte words (typically with fread
or fwrite), it is meaningless to ask what “base” they are in.
If what you need is formatted binary conversion, it’s easy enough to do.
Here is a little function for formatting a number in a requested base:
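#include <stddef.h>
char *
baseconv(unsigned int num, int base)
{
	static char retbuf[33];
	char *p;
	if(base < 2 || base > 16)
		return NULL;
	p = &retbuf[sizeof(retbuf)-1];
	*p = '\0';
	do {
		*--p = "0123456789abcdef"[num % base];
		num /= base;
	} while(num != 0);
	return p;
}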
(Note that this function, as written, returns a pointer to static data, such that
only one of its return values can be used at a time; see question 7.5. A better
size for the retbuf array would be sizeof (int) *CHAR_BIT+1; see ques¬
tion 12.21.)
For more information about “binary” I/O, see questions 2.11, 12.37, and
12.42. See also questions 8.6 and 13.1.
20.11
Question: Can I use base-2 constants (something like 0b101010)? Is there
a printf format for binary?
Answer: No, on both counts. You can convert base-2 string representations
to integers with strtol. If you need to print numbers out in base 2, see the
example code in question 20.10.
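The strtol conversion mentioned above amounts to passing 2 as the third (base) argument. A minimal sketch, with the hypothetical wrapper name from_binary:

```c
#include <stdlib.h>

/* Convert a string of '0' and '1' characters, such as "101010",
   to the integer it represents. Conversion stops at the first
   character that is not a binary digit. */
long from_binary(const char *s)
{
	return strtol(s, NULL, 2);
}
```

A production version would also pass a non-null second argument and check that the whole string was consumed.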
Efficiency
20.12
Question: What is the most efficient way to count the number of bits that
are set in a value?
Answer: Many “bit-fiddling” problems like this one can be sped up and
streamlined using lookup tables (but see question 20.13). Here is
a little function that computes the number of bits in a value, 4 bits at a time.
static int bitcounts[] =
	{0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};
int bitcount(unsigned int u)
{
	int n = 0;
	for(; u != 0; u >>= 4)
		n += bitcounts[u & 0x0f];
	return n;
}
20.13
Question: How can I make my code more efficient?
• Sprinkle the code liberally with register declarations for oft-used vari¬
ables; place them in inner blocks, if applicable. (On the other hand, most
modern compilers ignore register declarations, on the assumption that
they can perform register analysis and assignment better than the pro¬
grammer can.)
• Check the algorithm carefully. Exploit symmetries where possible to reduce
the number of explicit cases.
• Examine the control flow: Make sure that common cases are checked for
first and handled more easily. If one side of an expression involving && or
|| will usually determine the outcome, make it the left-hand side, if possible.
(See also question 3.6.)
• Use memcpy instead of memmove, if appropriate (see question 11.25).
• Use machine- and vendor-specific routines and #pragmas.
• Manually place common subexpressions in temporary variables. (Good
compilers do this for you.)
• Move critical, inner-loop code out of functions and into macros or in-line
functions (and out of the loop, if invariant). If the termination condition of
a loop is a complex but loop-invariant expression, precompute it and place
it in a temporary variable. (Good compilers do these for you.)
• Change recursion to iteration, if possible.
• Unroll small loops.
• Discover whether while, for, or do/while loops produce the best code
under your compiler and whether incrementing or decrementing the loop
control variable works best.
• Remove goto statements—some compilers can’t optimize as well in their
presence.
• Use pointers rather than array subscripts to step through arrays (but see
question 20.14).
• Reduce precision. (Using float instead of double may result in faster,
single-precision arithmetic under an ANSI compiler, although older compilers
convert everything to double, so using float can also be slower.)
• Replace time-consuming trigonometric and logarithmic functions with your
own, tailored to the range and precision you need, and perhaps using table
lookup. (Be sure to give your versions different names; see question 1.29.)
• Cache or precompute tables of frequently needed values. (See also question
20.12.)
• Use standard library functions in preference to your own. (Sometimes, the
compiler inlines or specially optimizes its own functions.) On the other
hand, if your program’s calling patterns are particularly regular, your own
special-purpose implementation may be able to beat the library’s general-
purpose version. (Again, if you do write your own version, give it a differ¬
ent name.)
• As a last, last resort, hand code critical routines in assembly language (or
hand tune the compiler’s assembly language output). Use asm directives, if
possible.
20.14
Question: Are pointers really faster than arrays? How much do function
calls slow things down? Is ++i faster than i = i + 1?
20.15
Question: Is it worthwhile to replace multiplications and divisions with
shift operators?
*If it’s difficult to measure, it may suggest that you don’t have to worry about the difference after all.
switch Statements
20.16
Question: Which is more efficient: a switch statement or an if/else
chain?
Answer: The differences, if any, are likely to be slight. The switch state¬
ment was designed to be efficiently implementable, although the compiler
may use the equivalent of an if/else chain (as opposed to a compact jump
table) if the case labels are sparsely distributed.
Do use switch when you can: It’s definitely cleaner and perhaps more
efficient (and certainly should never be any less efficient).
See also questions 20.17 and 20.18.
20.17
Question: Is there a way to switch on strings?
#define CODE_APPLE 1
#define CODE_ORANGE 2
#define CODE_NONE 0
switch(classifyfunc(string)) {
case CODE_APPLE:
	...
case CODE_ORANGE:
	...
case CODE_NONE:
	...
}
where classifyfunc looks something like
static struct lookuptab {
	char *string;
	int code;
} tab[] = {
	{"apple", CODE_APPLE},
	{"orange", CODE_ORANGE},
};
int classifyfunc(char *string)
{
int i;
for(i =0; i < sizeof(tab) / sizeof(tab[0]); i++)
if(strcmp(tab[i].string, string) == 0)
return tab[i].code;
return CODE_NONE;
}
Otherwise, of course, you can fall back on a conventional if/else chain:
if(strcmp(string, "apple") == 0) {
}
(A macro like Streq() from question 17.3 can make these comparisons a
bit more convenient.)
See also questions 10.12, 20.16, 20.18, and 20.29.
20.18
Question: Is there a way to have nonconstant case labels (i.e., ranges or
arbitrary expressions)?
20.19
Question: Are the outer parentheses in return statements really optional?
Answer: Yes.
Long ago, in the early days of C, they were required, and just enough peo¬
ple learned C then, and wrote code that is still in circulation, that the notion
that they might still be required is widespread.
(As it happens, parentheses are optional with the sizeof operator, too, as
long as its operand is a variable or a unary expression.)
20.20
Question: Why don’t C comments nest? How am I supposed to comment
out code containing comments? Are comments legal inside quoted strings?
Answer: C comments don’t nest mostly because PL/I’s comments, which C’s
are borrowed from, don’t either. Therefore, it is usually better to “comment
out” large sections of code, which might contain comments, with #ifdef or
#if 0 (but see question 11.19).
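As a brief sketch of the #if 0 technique, the hypothetical function below has a region disabled even though that region contains comments of its own:

```c
/* Return twice x. The block between #if 0 and #endif is discarded
   by the preprocessor, so the comments inside it cause no nesting
   problem. */
int scale(int x)
{
	int result = x * 2;
#if 0
	/* experimental adjustment, currently disabled */
	result = result + 1;	/* bias the result upward */
#endif
	return result;
}
```

Changing the 0 to 1 re-enables the region without touching any of the comments.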
The character sequences /* and */ are not special within double-quoted
strings and do not therefore introduce comments, because a program (particularly
one generating C code as output) might want to print them. (It is difficult
to imagine why anyone would want or need to place a comment inside
a quoted string. It is easy to imagine a program needing to print "/*".)
Note also that // comments, as in C++, are not legal in ANSI C (C89) and
were added to the language only in C99, so it’s not a good idea to use them
in C programs that must compile under older compilers (even if your compiler
supports them as an extension).
20.21
Question: Why isn’t C’s set of operators more complete? A few operators,
such as ^^, &&=, and ->=, seem to be missing.
The first is straight from the definition but is poor because it may evaluate
its arguments multiple times (see question 10.1). The second and third “normalize”
their operands* to strict 0/1 by negating them twice; the second then
applies bitwise exclusive or (to the single remaining bit); the third one implements
exclusive-or as !=. The fourth and fifth are based on an elementary
identity in Boolean algebra, namely, that
$a \oplus b = \bar{a} \oplus \bar{b}$
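Macros matching the five descriptions in this paragraph might be written as in the sketch below. The numbered names XOR1 through XOR5 are hypothetical, used here only so that all five forms can coexist in one file.

```c
/* 1: straight from the definition; evaluates each argument twice */
#define XOR1(a, b)	(((a) && !(b)) || (!(a) && (b)))
/* 2: normalize to 0/1 with !!, then bitwise exclusive or */
#define XOR2(a, b)	(!!(a) ^ !!(b))
/* 3: normalize to 0/1 with !!, then compare for inequality */
#define XOR3(a, b)	(!!(a) != !!(b))
/* 4: one negation suffices, since negating both operands
      leaves the exclusive or unchanged */
#define XOR4(a, b)	(!(a) ^ !(b))
/* 5: the same identity, using != */
#define XOR5(a, b)	(!(a) != !(b))
```

All five yield 1 when exactly one operand is nonzero and 0 otherwise; only the first evaluates its arguments more than once.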
20.22
Question: If the assignment operator were :=, wouldn’t it then be harder to
accidentally write things like if(a = b)?
Answer: Yes, but it would also be just a little bit more cumbersome to type
all of the assignment statements a typical program contains.
* Normalization is important if the XOR() macro is to mimic the operation of the other Boolean operators in
C, namely, that the true/false interpretation of the operands is based on whether they are nonzero or zero (see
question 9.2).
In any case, it’s really too late to be worrying about this sort of thing now.
The choices of = for assignment and == for comparison were made, rightly or
wrongly, over two decades ago and are not likely to be changed. (With
respect to the question, many compilers and versions of lint will warn
about if (a = b) and similar expressions; see also question 17.4.)
As a point of historical interest, the choices were made based on the observation
that assignment is more frequent than comparison and so deserves
fewer keystrokes. In fact, using = for assignment in C and its predecessor B
represented a change from B’s own predecessor BCPL, which did use := as its
assignment operator. (See also question 20.38.)
20.23
Question: Does C have an equivalent to Pascal’s with statement?
Answer: No. The way in C to get quick and easy access to the fields of a
structure is to declare a little local structure pointer variable (which, it must
be admitted, is not quite as notationally convenient as a with statement and
doesn’t save quite as many keystrokes, though it is probably safer). That is, if
you have something unwieldy like
structarray[complex_expression].a =
structarray[complex_expression].b +
structarray[complex_expression].c;
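The local-pointer technique described in this answer can be sketched as follows. The structure type struct whatever, its members, and the function update are hypothetical, and a plain index i stands in for the complex expression.

```c
struct whatever { int a, b, c; };

/* Set the a member of element i to the sum of its b and c members.
   The element is named once, through the local pointer p, instead of
   repeating the subscript expression three times. */
void update(struct whatever *structarray, int i)
{
	struct whatever *p = &structarray[i];
	p->a = p->b + p->c;
}
```

Besides saving keystrokes, this evaluates the subscript expression only once, which matters if it is expensive or has side effects.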
20.24
Question: Why doesn’t C have nested functions?
Answer: It’s not trivial to implement nested functions such that they have
the proper access to local variables in the containing function(s), so they were
deliberately left out of C as a simplification. (However, gcc does allow them,
as an extension.) For many potential uses of nested functions (e.g., qsort
Other Languages
20.25
Question: How can I call FORTRAN (C++, BASIC, Pascal, Ada, LISP)
functions from C? (And vice versa?)
Answer: The answer is entirely dependent on the machine and the specific
calling sequences of the various compilers in use and may not be possible at
all. Read your compiler documentation very carefully; sometimes, there is a
“mixed-language programming guide,” although the techniques for passing
arguments and ensuring correct run-time startup are often arcane. The on-line
versions of this book (see question 20.40) contain pointers to some more
information about interlanguage calling.
In C++, a "C" modifier in an external function declaration indicates that
the function is to be called using C calling conventions.
20.26
various translations available; the three most commonly mentioned are p2c,
ptoc, and f2c. The electronic FAQ list associated with this book contains a
bit more information. See questions 18.16 and 20.40.
20.27
Question: Is C++ a superset of C? What are the differences between C and
C++? Can I use a C++ compiler to compile C code?
Answer: C++ was derived from C and is largely based on it, but some legal
C constructs are not legal C++. Conversely, ANSI C inherited several features
from C++, including prototypes and const, so neither language is really a
subset or superset of the other.
The most important feature of C++ not found in C is, of course, the
extended structure known as a class, which, along with operator overload¬
ing, makes object-oriented programming convenient. There are several other
differences and new features: Variables may be declared anywhere in a block;
const variables may be true compile-time constants; structure tags are
implicitly typedeffed; an & in a parameter declaration requests pass by refer¬
ence; and the new and delete operators, along with per-object constructors
and destructors, simplify dynamic data structure management. Classes and
object-oriented programming introduce a host of new mechanisms: inheri¬
tance, friends, virtual functions, templates, etc. (This list of C++ features is
not intended to be complete; C++ programmers will notice many omissions.)
Some features of C that keep it from being a strict subset of C++ (that is,
that keep C programs from necessarily being acceptable to C++ compilers)
are that main may be called recursively, character constants are of type int,
prototypes are not required, and void * implicitly converts to other pointer
types. Also, every keyword in C++ that is not a keyword in C is available in
C as an identifier; C programs that use words such as class and friend as
ordinary identifiers will be rejected by C++ compilers.
In spite of the differences, many C programs will compile correctly in a
C++ environment, and many recent compilers offer both C and C++ compi¬
lation modes.
Reference: H&S p. xviii, §1.1.5 p. 6, §2.8 pp. 36-7, §4.9 pp. 104-7
Algorithms
20.28
Question: I need a sort of an “approximate” strcmp routine for comparing
two strings for close, but not necessarily exact, equality. What’s a good way
to do that?
20.29
Question: What is hashing?
unsigned hash(char *str)
{
	unsigned int h = 0;
	while(*str != '\0')
		h += *str++;
	return h % NBUCKETS;
}
A somewhat better hash function is
unsigned hash(char *str)
{
	unsigned int h = 0;
	while(*str != '\0')
		h = (256 * h + *str++) % NBUCKETS;
	return h;
}
20.30
20.31
Question: How can I find the day of the week given the date?
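1. Use mktime. Here is a code fragment that computes the day of the week for
February 29, 2000:
#include <stdio.h>
#include <time.h>
char *wday[] = {"Sunday", "Monday", "Tuesday", "Wednesday",
		"Thursday", "Friday", "Saturday"};
struct tm tm;
tm.tm_mon = 2 - 1;
tm.tm_mday = 29;
tm.tm_year = 2000 - 1900;
tm.tm_hour = tm.tm_min = tm.tm_sec = 0;
tm.tm_isdst = -1;
if(mktime(&tm) != (time_t)-1)
	printf("%s\n", wday[tm.tm_wday]);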
When using mktime like this, it’s usually important to set tm_isdst to
-1, as shown (especially if tm_hour is 0); otherwise, a daylight saving
time correction could push the time past midnight into another day.
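2. Use Zeller’s congruence,* which says that if
J is the number of the century (i.e., the year / 100),
K the year within the century (i.e., the year % 100),
m the month,
q the day of the month, and
h the day of the week (where 1 is Sunday),
and if January and February are taken as months 13 and 14 of the previous
year (affecting both J and K), h for the Gregorian calendar is the
remainder when the sum
q + 26(m + 1) / 10 + K + K/4 + J/4 - 2J
is divided by 7, with all intermediate remainders discarded. The translation
into C is straightforward:
h = (q + 26 * (m + 1) / 10 + K + K/4 + J/4 + 5*J) % 7;
(where we use +5*J instead of -2*J to make sure that both operands of
the modulus operator % are positive; this bias totaling 7*J will obviously
not change the final value of h, modulo 7).
*At least one slightly modified form of Zeller’s congruence has been widely circulated; the formulation shown
here is the original.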
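Wrapped in a function, the congruence might look like the sketch below. The name zeller and its parameter order are hypothetical; the function performs the January/February adjustment itself, so callers pass an ordinary calendar date.

```c
/* Day of the week by Zeller's congruence, Gregorian calendar.
   year is the full year, month is 1-12, day is 1-31.
   Returns h in the range 0-6, where 1 is Sunday, 2 is Monday,
   and so on; Saturday comes out as 0. */
int zeller(int year, int month, int day)
{
	int J, K;

	if(month < 3) {		/* January and February count as */
		month += 12;	/* months 13 and 14 of the previous year */
		year--;
	}
	J = year / 100;		/* century */
	K = year % 100;		/* year within the century */
	/* +5*J rather than -2*J keeps the left operand of % positive */
	return (day + 26 * (month + 1) / 10 + K + K/4 + J/4 + 5*J) % 7;
}
```

For February 29, 2000, the function returns 3, that is, Tuesday.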
3. Use this elegant code by Tomohiko Sakamoto:
dayofweek(y, m, d)	/* 0 = Sunday */
int y, m, d;		/* 1 <= m <= 12, y > 1752 or so */
{
	static int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
	y -= m < 3;
	return (y + y/4 - y/100 + y/400 + t[m-1] + d) % 7;
}
20.32
Question: Will 2000 be a leap year? Is (year % 4 == 0) an accurate test
for leap years?
Answer: Yes and no, respectively. The rules for the present Gregorian calen¬
dar are that leap years occur every four years but not every 100 years, except
that they do occur every 400 years, after all. In C, these rules can be
expressed as:
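A sketch of that test follows, wrapped in a hypothetical helper named is_leap_year so that it can be exercised on its own:

```c
/* Gregorian leap-year rule: every 4th year is a leap year, except
   every 100th, except that every 400th is one after all.
   Returns 1 for leap years, 0 otherwise. */
int is_leap_year(int year)
{
	return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}
```

By this rule 2000 is a leap year (divisible by 400), but 1900 is not (divisible by 100 and not by 400).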
*Make sure that you check a good reference; some are wrong when it comes to calendar rules or mention the
existence of a 4000-year rule that has not been adopted and won’t be needed for another 2000 years, anyway.
If you trust the implementor of the C library, you can use mktime to
determine whether a given year is a leap year; see the code fragments in ques¬
tions 13.14 or 20.31 for hints.
Note also that the transition from the Julian to the Gregorian calendar
involved deleting several days to make up for accumulated errors. (The tran¬
sition was first made in Catholic countries under Pope Gregory XIII in Octo¬
ber 1582 and involved deleting 10 days. In the British Empire, 11 days were
deleted when the Gregorian calendar was adopted in September 1752. A few
countries didn’t switch until the 20th century.) Calendar code that has to
work for historical dates must therefore be especially careful.
20.33
Question: Why can tm_sec in the tm structure range from 0 to 61, sug¬
gesting that there can be 62 seconds in a minute?
Trivia
20.34
Question: Here’s a good puzzle: How do you write a program that produces
its own source code as its output?
(This program, like many of the genre, assumes that the double-quote char¬
acter " has the value 34, as it does in ASCII.)
20.35
Question: What is “Duff’s Device”?
switch (count % 8)
{
case 0: do { *to = *from++;
In this loop, count bytes are to be copied from the array pointed to by
from to the memory location pointed to by to (which is a memory-mapped
device output register, which is why to isn’t incremented). It solves the problem
of handling the leftover bytes (when count isn’t a multiple of 8) by interleaving
a switch statement with the loop, which copies bytes 8 at a time.
(Believe it or not, it is legal to have case labels buried within blocks nested
in a switch statement like this. In his announcement of the technique to C’s
developers and the world, Duff noted that C’s switch syntax, in particular
its “fall through” behavior, had long been controversial and that “this code
forms some sort of argument in that debate, but I’m not sure whether it’s for
or against.”)
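For readers who want to try the technique, here is a complete sketch of a variant. It is not Duff's original code: here to is incremented, so the function copies into ordinary memory rather than a device register, and a zero count is handled explicitly. The name duff_copy is hypothetical.

```c
#include <stddef.h>

/* Copy count bytes from from[] to to[], eight per pass of the loop.
   The switch jumps into the middle of the do/while body to dispose
   of the count % 8 leftover bytes on the first pass. */
void duff_copy(char *to, const char *from, size_t count)
{
	size_t n;

	if(count == 0)
		return;		/* the loop body would otherwise run once */
	n = (count + 7) / 8;	/* number of passes, rounding up */
	switch(count % 8) {
	case 0:	do {	*to++ = *from++;
	case 7:		*to++ = *from++;
	case 6:		*to++ = *from++;
	case 5:		*to++ = *from++;
	case 4:		*to++ = *from++;
	case 3:		*to++ = *from++;
	case 2:		*to++ = *from++;
	case 1:		*to++ = *from++;
		} while(--n > 0);
	}
}
```

In everyday code memcpy is the better choice; the sketch is of interest mainly for what it shows about where case labels may appear.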
20.36
Question: When will the next International Obfuscated C Code Contest
(IOCCC) be held? How do I submit contest entries? Who won this year’s
IOCCC? How can I get a copy of the current and previous winning entries?
Answer: The contest schedule is tied to the dates of the USENIX conferences
at which the winners are announced. At the time of this writing, it is
expected that the yearly contest will open in October. To obtain a current
copy of the rules and guidelines, send e-mail with the Subject: line “send
rules” to [email protected]. (Note that this is not the address for submitting
entries.)
The rules, guidelines, and timetables tend to change from year to year.
Make sure that you have the current contest’s announcement prior to sub¬
mitting entries.
Contest winners should be announced at the winter USENIX conference in
January and are posted to the Internet sometime thereafter. Winning entries
from previous years (back to 1984) are archived at ftp.uu.net (see question
18.16) under the directory pub/ioccc/.
As a last resort, previous winners may be obtained by sending e-mail to the
judges’ address with the string “send year winners” in the Subject: line, where
year is a single four-digit year, a year range, or the word “all.”
20.37
Question: What was the entry keyword mentioned in K&R1?
20.38
Question: Where does the name “C” come from, anyway?
20.39
Question: How do you pronounce “char”?
Answer: You can pronounce the C keyword “char” in at least three ways:
like the English words “char,” “care,” or “car”; the choice is arbitrary.
Answer: This book is an expanded version of the FAQ list from the Usenet
newsgroup comp.lang.c. A copy of the on-line list may be obtained from
aw.com in directory cseng/authors/summit/cfaq or ftp.eskimo.com in direc¬
tory u/s/scs/C-faq/. You can also retrieve it from Usenet; it is normally posted
to comp.lang.c on the first of each month, with an Expires: line that should
keep it around all month. A parallel, abridged version is available (and
posted), as is a list of changes accompanying each significantly updated ver¬
sion. (These on-line versions, though, do not contain nearly as much material
as this book does.)
The various versions of the on-line list are also posted to the newsgroups
comp.answers and news.answers. Several sites archive news.answers postings
and other FAQ lists, including comp.lang.c’s; two sites are rtfm.mit.edu
(directories pub/usenet/news.answers/C-faq/ and pub/usenet/comp.lang.c/)
and ftp.uu.net (directory usenet/news.answers/C-faq/). An archie server (see
question 18.16) should help you find others; the command “find C-faq”
should list some of them. If you don’t have ftp access, a mailserver at
rtfm.mit.edu can mail you FAQ lists: Send a message containing the single
word help to [email protected] for more information.
Finally, a hypertext version of this book is available on the World Wide
Web. The hypertext version will be updated to correct any errors and may
expand to include even more questions and answers. The URL of the hypertext
version is http://www.aw.com/cseng/authors/summit/cfaq/cfaq.html.
Glossary
These definitions are of terms as they are used in this book. Some of these
terms have more formal, slightly different definitions; this glossary is not an
authoritative dictionary. Many of these terms are from the ANSI/ISO C Stan¬
big-endian adj. Refers to storage of a multibyte quantity with the most-significant byte
at the lowest address. See also byte order.
binary adj. 1. Base two. 2. Refers to I/O done in a byte-for-byte or bit-for-bit way,
without formatting or interpretation, i.e., a direct copy operation between internal
memory and external storage. 3. Refers to a file that is to be interpreted as a
sequence of raw bytes, in which any byte values may appear. Compare text. See
questions 12.38, 12.40, and 20.5. 4. Refers to an operator taking two operands.
Compare unary.
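Sense 2 of binary can be illustrated with a minimal sketch that copies the bytes of an int to a file and back with fwrite and fread. The helper name binary_roundtrip is hypothetical and not from this book, and a file written this way is not portable between machines with different sizes or byte orders.

```c
#include <stdio.h>

/* Hypothetical helper: writes one int to a temporary binary file and
   reads it back.  Returns 0 on success, -1 on any I/O failure. */
int binary_roundtrip(int value, int *result)
{
    FILE *fp = tmpfile();   /* opened in binary update mode ("wb+") */
    int ok;

    if (fp == NULL)
        return -1;

    /* The bytes of the object are copied as-is, with no formatting. */
    ok = fwrite(&value, sizeof value, 1, fp) == 1;

    /* A positioning call is required between writing and reading. */
    rewind(fp);
    ok = ok && fread(result, sizeof *result, 1, fp) == 1;

    fclose(fp);
    return ok ? 0 : -1;
}
```

A text-mode equivalent would instead use fprintf and fscanf, converting the value to and from a sequence of digit characters.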
bind vt, vi. Informally, to “stick to” or “stick together”; usually used to indicate which
operand(s) are associated with which operator, based on precedence rules.
bitmask n. A mask, sense 1.
byte n. A unit of storage suitable for holding one character. Compare octet. See question
8.10. See ANSI §1.6 or ISO §3.4.
byte order n. The characteristic ordering of multibyte quantities (usually integral) in
memory, on disk, or in a network or other bytewise I/O stream. The two common
byte orders (most-significant-first and least-significant-first) are often called big-
endian and little-endian.
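As a rough sketch of the two ideas above, the first function below inspects the byte at the lowest address of an integer to guess the machine's byte order, and the second stores a 32-bit value in big-endian order regardless of the machine. The names is_big_endian and put_be32 are illustrative assumptions, not functions from this book or from any library.

```c
/* Returns 1 if the most-significant byte of an unsigned int is stored
   at the lowest address (big-endian), 0 otherwise. */
int is_big_endian(void)
{
    unsigned int probe = 1;
    const unsigned char *bytes = (const unsigned char *)&probe;

    /* On a little-endian machine the low-order byte (value 1) comes first. */
    return bytes[0] == 0;
}

/* Stores the low 32 bits of v into out[], most-significant byte first.
   Shifting and masking works the same way on any host byte order. */
void put_be32(unsigned char out[4], unsigned long v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}
```

Writing data files with explicit shifts, as in put_be32, avoids depending on the result of is_big_endian at all.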
canonical mode n. The mode of a terminal driver in which input is collected a line at
a time, allowing the user to correct mistakes with the backspace/delete/rubout or
other keys. See question 19.1.
.c file n. A source file, sense 2. (See questions 1.7 and 10.6.)
cast n. The syntax
( type-name )
K&R n. 1. The book The C Programming Language (see the bibliography for a complete
citation). 2. That book’s authors, Brian Kernighan and Dennis Ritchie. adj.
Refers to the early version of C described in the first edition (“K&R1”) of the book.
lhs n. The left-hand side, usually of an assignment, or more generally, of any binary
operator.
lint n. A program written by Steve Johnson as companion to his pcc, for performing
cross-file and other error checking not normally performed by C compilers. The
name supposedly derives from the bits of fluff it picks from programs. vt. To check
a program with lint.
little-endian adj. Refers to storage of a multibyte quantity with the least-significant
byte at the lowest address. See also byte order.
a = b;
a is an lvalue and is not fetched but is written to. Compare rvalue. See also question
6.7. See ANSI §3.2.2.1 (especially footnote 31) or ISO §6.2.2.1.
mask 1. n. An integer value interpreted specifically as a pattern of 1s and 0s with
which to perform bitwise operations (&, |, etc.). 2. vt. To select certain bits using a
mask (sense 1) and a bitwise operator. See question 20.7.
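Both senses of mask can be sketched in a few lines. The flag names and helper functions below are hypothetical, chosen only for illustration.

```c
/* Masks (sense 1): each constant has exactly one bit set. */
#define FLAG_READ  0x01u
#define FLAG_WRITE 0x02u
#define FLAG_EXEC  0x04u

/* Turn on every bit that is set in mask. */
unsigned set_bits(unsigned value, unsigned mask)
{
    return value | mask;
}

/* Turn off every bit that is set in mask. */
unsigned clear_bits(unsigned value, unsigned mask)
{
    return value & ~mask;
}

/* Masking (sense 2): select the bits named by mask and report
   whether any of them is set. */
int test_bits(unsigned value, unsigned mask)
{
    return (value & mask) != 0;
}
```

For example, set_bits(0, FLAG_READ | FLAG_EXEC) yields 0x05, and test_bits of that result against FLAG_WRITE is 0.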
member n. One of the typed components of a structure or a union.
namespace n. A context within which names (identifiers) may be defined. There are
several namespaces in C; for example, an ordinary identifier can have the same
name as a structure tag, without ambiguity. See question 1.29.
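A small sketch of the example just mentioned: the name point is used both as a structure tag and as an ordinary variable, and the compiler keeps the two apart. The function name point_sum is an assumption made for illustration.

```c
/* "point" here lives in the namespace of tags. */
struct point {
    int x;
    int y;
};

int point_sum(void)
{
    /* "point" here is an ordinary identifier; because a tag is always
       preceded by struct, union, or enum, there is no ambiguity. */
    struct point point;

    point.x = 3;
    point.y = 4;
    return point.x + point.y;
}
```

That the language permits this does not make it good style; reusing a name this way can still confuse human readers.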
narrow adj. Refers to a type that is widened under the default argument promotions:
char, short, or float. See questions 11.3 and 15.2.
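The widening can be observed in the variable-length part of an argument list, where the default argument promotions always apply. In this sketch (the function name sum_promoted is hypothetical), the caller passes a char and a float, but the function must fetch an int and a double.

```c
#include <stdarg.h>

/* Expects two variable arguments: one that was originally a char or
   short, and one that was originally a float. */
double sum_promoted(int unused, ...)
{
    va_list ap;
    int c;
    double f;

    va_start(ap, unused);
    c = va_arg(ap, int);      /* a char or short argument arrives as int */
    f = va_arg(ap, double);   /* a float argument arrives as double */
    va_end(ap);

    return c + f;
}
```

Writing va_arg(ap, char) or va_arg(ap, float) here would be an error, because no argument of a narrow type ever reaches the function.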
nonreentrant adj. Refers to a piece of code that makes use of static memory or temporarily
leaves global data structures in an inconsistent state, such that it cannot
safely be called while another instance of itself is already active. (That is, it cannot
be called from an interrupt handler, because it might have been the code interrupted.)
“notreached” interj. A directive indicating to lint or some other program checker
that control flow cannot reach a particular point and that certain warnings (e.g.,
“control falls off end of function without return”) should therefore be suppressed.
See question 11.12.
null pointer n. A distinguished pointer value that is not the address of any object or
function. See question 5.1.
null pointer constant n. An integral constant expression with value 0 (or such an
expression cast to void *), used to request a null pointer. See question 5.2.
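As a brief sketch, both the macro NULL and a plain 0 act as null pointer constants when they appear in a pointer context. The helper names first_nonzero and is_null are illustrative assumptions.

```c
#include <stddef.h>

/* Returns a pointer to the first nonzero element of a[0..n-1],
   or a null pointer if there is none. */
const int *first_nonzero(const int *a, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (a[i] != 0)
            return &a[i];
    return NULL;    /* NULL expands to a null pointer constant */
}

/* The 0 below is compared with a pointer, so the compiler treats it
   as a null pointer constant, not as the integer zero. */
int is_null(const int *p)
{
    return p == 0;
}
```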
order of evaluation n. The order in which the operations implied by an expression are
actually carried out by the processor. Compare precedence. See question 3.4.
out-of-band adj. Refers to a sentinel or otherwise exceptional value that is distinct
from all normal values that can appear in some context (e.g., in a set of function
return values, etc.). Compare in-band. Example: EOF (see question 12.1).
main()
{
    f(5);
    return 0;
}
f(int i)
{
}
• any assignment context in which the destination (left-hand side) has pointer
type;
• an == or != comparison in which one side has pointer type;
• the second and third operands of the ? : operator, when one of them has pointer
type; and
• the operand of a pointer cast, such as (char *) or (void *).
ragged array n. An array, usually simulated with pointers, in which the rows are not
necessarily of the same length. See also dope vector. See questions 6.16 and 20.2.
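One possible way to simulate a ragged array is an array of pointers, each pointing to a separately allocated row. The sketch below assumes that nrows and every entry of lengths[] are at least 1; the function names ragged_alloc and ragged_free are hypothetical.

```c
#include <stdlib.h>

/* Allocates nrows rows of int, where row i has lengths[i] elements,
   all initialized to zero.  Returns NULL if any allocation fails. */
int **ragged_alloc(size_t nrows, const size_t *lengths)
{
    size_t i;
    int **rows = malloc(nrows * sizeof *rows);

    if (rows == NULL)
        return NULL;

    for (i = 0; i < nrows; i++) {
        rows[i] = calloc(lengths[i], sizeof **rows);
        if (rows[i] == NULL) {
            /* Undo the rows allocated so far. */
            while (i > 0)
                free(rows[--i]);
            free(rows);
            return NULL;
        }
    }
    return rows;
}

/* Releases everything obtained from ragged_alloc. */
void ragged_free(int **rows, size_t nrows)
{
    size_t i;

    if (rows == NULL)
        return;
    for (i = 0; i < nrows; i++)
        free(rows[i]);
    free(rows);
}
```

Elements are then accessed with ordinary subscripts, as in rows[2][1], although the caller has to remember each row's length separately.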
reentrant adj. Refers to code that can safely be called from interrupts or in other circumstances
in which it is possible that another instance of the code is simultaneously
active. Reentrant code has to be very careful of the way it manipulates data:
All data must either be local or else protected by semaphores or the like.
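The contrast with nonreentrant code can be sketched with two versions of the same conversion routine. Both function names are hypothetical, and snprintf is assumed to be available (it was standardized in C99).

```c
#include <stdio.h>

/* Nonreentrant: every call shares one static buffer, so a second call
   (or an interrupting call) overwrites the first result. */
const char *int_to_str(int n)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "%d", n);
    return buf;
}

/* Reentrant: all state lives in storage supplied by the caller, so
   simultaneous calls with different buffers cannot interfere. */
char *int_to_str_r(int n, char *buf, size_t size)
{
    snprintf(buf, size, "%d", n);
    return buf;
}
```

After int_to_str(1) and then int_to_str(2), both returned pointers refer to the same buffer, which now holds "2"; the second version has no such surprise.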
RFC n. An Internet Request For Comments document, available by anonymous ftp from
ds.internic.net and many other sites.
rhs n. The right-hand side, usually of an assignment, or more generally, of any binary
operator.
rvalue n. Originally, an expression that could appear on the right-hand side of an
assignment operator. More generally, any value that can participate in an expression
or be assigned to some other variable. In the assignment
a = b;
b is an rvalue and has its value fetched. Compare lvalue. See ANSI §3.2.2.1 (especially
footnote 31) or ISO §6.2.2.1. See also questions 3.16 and 4.5.
scope n. The region over which a declaration is active. adj. “In scope”: visible. See
question 1.29.
semantics n. The meaning behind a program: the interpretation that the compiler (or
other translator) places on the various source code constructs. Compare syntax.
tag n. The (optional) name for a particular structure, union, or enumeration. See question
2.1.
token n. 1. The smallest syntactic unit generally seen by a compiler or other translator:
a keyword, identifier, binary operator (including multicharacter operators such
as += and &&), etc. 2. A whitespace-separated word within a string (see question
13.6).
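Tokens in sense 2 can be picked out with the standard strtok function. This is only a sketch: the function name count_words is hypothetical, and the choice of space, tab, and newline as separators is an assumption.

```c
#include <string.h>

/* Counts the whitespace-separated tokens in s.
   Note that strtok modifies the string it scans, so s must be
   writable (not a string literal). */
size_t count_words(char *s)
{
    const char *separators = " \t\n";
    size_t n = 0;
    char *tok;

    for (tok = strtok(s, separators); tok != NULL;
         tok = strtok(NULL, separators))
        n++;
    return n;
}
```

Because strtok keeps its position in hidden static storage, it is itself nonreentrant, and two scans cannot be interleaved.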
translation unit n. The set of source files seen by the compiler and translated as a unit:
generally one .c file (that is, source file, sense 2), plus all header files mentioned in
#include directives.
undefined adj. Refers to behavior that is not specified by the standard, for which an
implementation is not required to do anything reasonable. Example: the behavior of
the expression i = i++. See questions 3.3 and 11.33.
terminal driver n. That portion of the system software responsible for character-based
input and output, usually interactive; originally from and to a serially connected terminal,
now more generally any virtual terminal, such as a window or network login
session. See question 19.1.
text adj. Refers to a file or I/O mode intended for handling human-readable text,
specifically, printable characters arranged into lines. Compare binary, sense 3. See
question 12.40.
translator n. A program (compiler, interpreter, lint, etc.) that parses and interprets
semantic meaning in C syntax.
unary adj. Refers to an operator taking one operand. Compare binary, sense 4.
unroll vt. To replicate the body of a loop one or more times (while correspondingly
reducing the number of iterations) to improve efficiency by reducing loop control
overhead (but at the expense of increased code size).
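As an illustration, here is a simple summing loop next to a version unrolled by a factor of four. The function names are hypothetical, and whether the unrolled form is actually faster depends on the compiler and machine; many optimizing compilers perform this transformation themselves.

```c
#include <stddef.h>

/* Straightforward loop: one test and one increment per element. */
long sum_simple(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Unrolled loop: four copies of the body per iteration, so the loop
   control overhead is paid only once for every four elements. */
long sum_unrolled(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Handle the zero to three elements left over. */
    for (; i < n; i++)
        total += a[i];
    return total;
}
```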
unsigned preserving adj. Refers to a set of rules, common in pre-ANSI implementations,
for promoting signed and unsigned types that meet across binary operators
and for promoting narrow unsigned types in general. Under the unsigned preserving
rules, promotion is always to an unsigned type. Compare value preserving. See
question 3.19.
unspecified adj. Refers to behavior that is not fully specified by the standard, for
which each implementation must choose some behavior, though it need not be documented
or even consistent. Example: the order of evaluation of function arguments
and other subexpressions. See question 11.33.
value preserving adj. Refers to a set of rules, mandated by the ANSI C Standard and
also present in some pre-ANSI implementations, for promoting signed and unsigned
types that meet across binary operators and for promoting narrow unsigned types
in general. Under the value preserving rules, promotion is to a signed type if it is
large enough to preserve all values; otherwise to an unsigned type. Compare
unsigned preserving. See question 3.19.
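The difference between the two sets of rules shows up in a short sketch like the following. It assumes an implementation in which int is wider than char, and the function names are hypothetical.

```c
/* Returns 1 if an unsigned char operand was promoted to (signed) int,
   as the value preserving rules require when int can represent every
   unsigned char value.  Under the unsigned preserving rules the
   subtraction would be done in unsigned int, and this would return 0. */
int narrow_promotes_to_signed(void)
{
    unsigned char uc = 1;

    return (uc - 2) < 0;    /* int arithmetic gives -1 */
}

/* Returns 1 under either set of rules: int cannot represent every
   unsigned int value, so the int operand 2 is converted to unsigned
   int and the result wraps around to a large positive number. */
int wide_stays_unsigned(void)
{
    unsigned int u = 1;

    return (u - 2) > 0;
}
```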
varargs adj. 1. Refers to a function that accepts a variable number of arguments, e.g.,
printf. (A synonym for variadic.) 2. Refers to one of the arguments in the variable-length
part of a variable-length argument list.
variadic adj. Refers to a function that accepts a variable number of arguments, e.g.,
printf. (A synonym for varargs, sense 1.)
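A minimal variadic function, written with the standard <stdarg.h> macros, might look like the sketch below. The name sum_ints and the convention of passing the argument count first are assumptions made for illustration.

```c
#include <stdarg.h>

/* Adds up `count` int arguments taken from the variable-length part
   of the argument list.  The caller must pass exactly `count`
   additional arguments, all of type int. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);            /* count is the last fixed parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next varargs argument */
    va_end(ap);

    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) returns 6. Nothing in the language checks that the count matches the arguments actually passed, so that remains the caller's responsibility.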
wrapper n. A function (or macro) that is “wrapped around” another, providing a bit
of added functionality. For example, a wrapper around malloc might check
malloc’s return value.
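A wrapper of the kind just described could be sketched as follows. The name checked_malloc and the decision to print a message and exit on failure are assumptions; a real program might prefer some other recovery strategy.

```c
#include <stdio.h>
#include <stdlib.h>

/* Calls malloc and checks its return value, so that callers need not.
   A request for zero bytes may legitimately yield a null pointer and
   is therefore not treated as an error. */
void *checked_malloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size != 0) {
        fprintf(stderr, "checked_malloc: out of memory (%lu bytes)\n",
                (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Memory obtained through the wrapper is released with the ordinary free function.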
X3.159 n. The original ANSI C Standard, ANSI X3.159-1989. See question 11.1.
X3J11 n. The committee charged by ANSI with the task of drafting the C Standard.
X3J11 now functions as the U.S. Technical Advisory Group to the ISO C standardization
working group WG14. See question 11.1.
Bibliography
Darwin, Ian F. Checking C Programs with lint, O’Reilly, 1988, ISBN 0-937175-30-7.
Dijkstra, E. “Go To Statement Considered Harmful,” Communications of the ACM,
Vol. 11 #3, March 1968, pp. 147-8.
Goldberg, David. “What Every Computer Scientist Should Know about Floating-Point
Arithmetic,” ACM Computing Surveys, Vol. 23 #1, March 1991, pp. 5-48.
Harbison, Samuel P., and Guy L. Steele, Jr. C: A Reference Manual, Fourth Edition,
Prentice-Hall, 1995, ISBN 0-13-326224-3. [H&S]
Horton, Mark R. Portable C Software, Prentice-Hall, 1990, ISBN 0-13-868050-7.
[PCS]
Kernighan, Brian W., and P.J. Plauger. The Elements of Programming Style, Second
Edition, McGraw-Hill, 1978, ISBN 0-07-034207-5.
Kernighan, Brian W., and Dennis M. Ritchie. The C Programming Language, Prentice-Hall,
1978, ISBN 0-13-110163-3. [K&R1]
Kernighan, Brian W., and Dennis M. Ritchie. The C Programming Language, Second
Edition, Prentice-Hall, 1988, ISBN 0-13-110362-8, 0-13-110370-9. [K&R2]
Knuth, Donald E. The Art of Computer Programming. Volume 1: Fundamental Algorithms,
Second Edition, Addison-Wesley, 1973, ISBN 0-201-03809-9. Volume 2:
Seminumerical Algorithms, Second Edition, Addison-Wesley, 1981, ISBN
0-201-03822-6. Volume 3: Sorting and Searching, Addison-Wesley, 1973, ISBN
0-201-03803-X. [Knuth]
Libes, Don. Obfuscated C and Other Mysteries, Wiley, 1993, ISBN 0-471-57805-3.
Marsaglia, G., and T.A. Bray. “A Convenient Method for Generating Normal Variables,”
SIAM Review, Vol. 6 #3, July 1964.
Park, Stephen K., and Keith W. Miller. “Random Number Generators: Good Ones are
Hard to Find,” Communications of the ACM, Vol. 31 #10, October 1988, pp.
1192-1201 (also technical correspondence August 1989, pp. 1020-4, and July
1993, pp. 108-10).
Plauger, P.J. The Standard C Library, Prentice-Hall, 1992, ISBN 0-13-131509-9.
Plum, Thomas. C Programming Guidelines, Second Edition, Plum Hall, 1989, ISBN
0-911537-07-4.
Press, William H., Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery.
Numerical Recipes in C, Second Edition, Cambridge University Press, 1992, ISBN
0-521-43108-5.
Ritchie, Dennis M. “The Development of the C Language,” 2nd ACM HOPL Conference,
April 1993.
Tondo, Clovis L., and Scott E. Gimpel. The C Answer Book, Second Edition, Prentice-
Hall, 1989, ISBN 0-13-109653-2.
U.S. Naval Observatory Nautical Almanac Office, The Astronomical Almanac for the
year 1994.
Wu, Sun, and Udi Manber. “AGREP—A Fast Approximate Pattern-Matching Tool,”
USENIX Conference Proceedings, Winter 1992, pp. 153-62.
There is another bibliography in the revised Indian Hill style guide (see question 17.9).
See also question 18.10.
C/Programming Languages
C Programming FAQs
STEVE SUMMIT
Summit furnishes you with answers to the most frequently asked questions
in C. Extensively revised from his popular FAQ on the Internet, more
than 400 questions are addressed with comprehensive examples to illustrate
key points and to provide practical guidelines for programmers.
C Programming FAQs is a welcomed reference for all C programmers,
providing accurate answers, insightful explanations, and clarification of
fine points with numerous code examples.
Highlights
• How-to manual covering the C language in a practical, nuts-and-bolts way
• Concise, definitively correct answers to more than 400 frequently asked
questions
• Description of real problems that crop up when writing actual programs
• Clarification of widely misunderstood issues: subtle portability problems,
proper language usage, system-specific issues
Addison-Wesley Publishing Company ISBN 0-201-84519-9