More On GCC
Spring 2024
Introduction to gcc
Abhijit Das
Pralay Mitra
What you do not know about gcc
The four-stage compilation process
Preprocessing: This stage processes the # directives. Examples:
• The #include’d files are inserted in your code.
• The #define’d macros are literally substituted throughout your code.
Compiling: The input to this stage is the preprocessed C file, and the output is
assembly-language code targeted to the architecture of your machine.
Assembling: The assembly-language code generated by compiling is converted to machine
code, stored in an object file. The external functions (like printf and sqrt) are still
undefined.
Linking: The object file(s) are combined into an executable file in this stage. At
this point, the external functions from the C runtime library and other libraries are
included in the executable file.
Loading: This is a runtime step, not one of the four compilation stages. Some functions
available in shared (or dynamic) libraries are loaded at runtime from shared object files.
The compilation process in a nutshell
Input source (.c, .h)
   ↓ Preprocessing (cpp)
Headers and macros processed (.i)
   ↓ Compiling (gcc -S)
Assembly code (.s)
   ↓ Assembling (as)
Machine code (.o)
   ↓ Linking (ld), together with static libraries (.a)
Executable machine code (a.out)
#include <stdio.h>
#include <stdlib.h>
#define TEN 10
#define TWENTY 20
int main ( )
{
   int a, b, c;
   a = TEN;
   b = a + TWENTY;
   c = a * b;
   printf("c = %d\n", c);
   exit(0);
}
Preprocessing
a = 10;
b = a + 20;
c = a * b;
printf("c = %d\n", c);
exit(0);
}
$
Compiling
This requires invoking gcc with the -S flag, which generates a file with extension .s.
$ gcc -S demo.i
$ cat demo.s
        .file   "demo.c"
        .text
        .section        .rodata
.LC0:
        .string "c = %d\n"
        .text
        .globl  main
        .type   main, @function
main:
.LFB6:
        .cfi_startproc
        endbr64
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $16, %rsp
        movl    $10, -12(%rbp)
        movl    -12(%rbp), %eax
        addl    $20, %eax
        movl    %eax, -8(%rbp)
        movl    -12(%rbp), %eax
        imull   -8(%rbp), %eax
        movl    %eax, -4(%rbp)
        movl    -4(%rbp), %eax
        movl    %eax, %esi
        leaq    .LC0(%rip), %rdi
        movl    $0, %eax
        call    printf@PLT
        movl    $0, %edi
        call    exit@PLT
        .cfi_endproc
...
$
PLT means Procedure Linkage Table. These functions are for runtime loading.
Assembling
• printf and exit are loaded from shared object(s) during runtime.
$ ldd a.out
linux-vdso.so.1 (0x00007ffe80ff2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f98b5e19000)
/lib64/ld-linux-x86-64.so.2 (0x00007f98b602d000)
$
• If you want these functions to be in your executable, compile with the -static flag.
• This creates a huge a.out.
• You can see printf and exit defined in the executable.
$ gcc -static demo.o
$ ldd a.out
not a dynamic executable
$ nm a.out | grep printf
...
0000000000410bb0 T printf
...
$
CS29206 Systems Programming Laboratory
Multi-file Applications
Break your code across multiple files
• Modular programming is a good practice, and is needed in any large coding project.
• Large source files take a long time to recompile.
• If the code is broken down in pieces, then only the pieces that are changed need recompilation.
• Large software development is a two-stage process.
• Generate object files from individual modules.
• Merge the object files into a single executable file.
typedef struct {
nodep front; // Pointer to the beginning of the linked list
nodep back; // Pointer to the end of the linked list
} queue;
The file stack.c
#include <stdio.h>
#include <stdlib.h>
#include "defs.h"
#include "stack.h"
stack initstack ( )
{
   stack S;
   S = (stack)malloc(sizeof(node));
   S -> data = 0; S -> next = NULL;
   return S;
}
···
stack destroystack ( stack S )
{
   node *p;
   while (S) {
      p = S; S = S -> next; free(p);
   }
   return NULL;
}
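The listing uses the types node and stack without showing defs.h and stack.h. As a minimal sketch, assuming that a stack is a pointer to a singly linked list whose first node is a dummy header (which is what initstack suggests), the types and two of the elided functions might look as follows. The type definitions and the bodies of push and top are illustrative guesses, not the actual contents of the files.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed contents of defs.h: a list node and a pointer type for it. */
typedef struct node {
    int data;
    struct node *next;
} node;
typedef node *nodep;

/* Assumed contents of stack.h: a stack is a pointer to a dummy header node. */
typedef nodep stack;

/* As in the listing, plus a check on malloc: allocate the dummy header. */
stack initstack(void)
{
    stack S = (stack)malloc(sizeof(node));
    if (S == NULL) { fprintf(stderr, "out of memory\n"); exit(1); }
    S->data = 0;
    S->next = NULL;
    return S;
}

/* Sketch: insert the new element right after the dummy header. */
stack push(stack S, int x)
{
    node *p = (node *)malloc(sizeof(node));
    if (p == NULL) { fprintf(stderr, "out of memory\n"); exit(1); }
    p->data = x;
    p->next = S->next;
    S->next = p;
    return S;
}

/* Sketch: the top element sits in the node after the header.
   The caller must make sure that the stack is not empty. */
int top(stack S)
{
    return S->next->data;
}

/* As in the listing: free every node, including the header. */
stack destroystack(stack S)
{
    node *p;
    while (S) { p = S; S = S->next; free(p); }
    return NULL;
}
```

With these definitions, a sequence such as S = initstack(); S = push(S, 5); S = push(S, 7); leaves 7 on top, and destroystack(S) returns NULL, matching the way staquecheck.c uses the return values.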
The file queue.c
#include <stdio.h>
#include <stdlib.h>
#include "defs.h"
#include "queue.h"
queue initqueue ( )
{
   queue Q;
   node *p;
   p = (node *)malloc(sizeof(node));
   p -> data = 0;
   p -> next = NULL;
   Q.front = Q.back = p;
   return Q;
}
···
queue destroyqueue ( queue Q )
{
   node *p;
   while (Q.front) {
      p = Q.front;
      Q.front = (Q.front) -> next;
      free(p);
   }
   Q.front = Q.back = NULL;
   return Q;
}
The application staquecheck.c
#include <stdio.h>
#include <stdlib.h>
#include "defs.h"
#include "stack.h"
#include "queue.h"
#define ITER_CNT 10
int main ( )
{
   stack S;
   queue Q;
   int i;
   S = initstack();
   for (i=0; i<ITER_CNT; ++i) { S = push(S, rand() % 100); printstack(S); }
   S = destroystack(S);
   Q = initqueue();
   for (i=0; i<ITER_CNT; ++i) { Q = enqueue(Q, rand() % 100); printqueue(Q); }
   Q = destroyqueue(Q);
   exit(0);
}
Compile in one shot
$ gcc -Wall staquecheck.c stack.c queue.c
$ ls -l
total 48
-rwxr-xr-x 1 abhij abhij 17640 Dec 23 20:40 a.out
-rw-r--r-- 1 abhij abhij 152 Dec 23 19:43 defs.h
-rw-r--r-- 1 abhij abhij 1262 Dec 23 19:45 queue.c
-rw-r--r-- 1 abhij abhij 360 Dec 23 19:43 queue.h
-rw-r--r-- 1 abhij abhij 1098 Dec 23 19:45 stack.c
-rw-r--r-- 1 abhij abhij 315 Dec 23 19:43 stack.h
-rw-r--r-- 1 abhij abhij 983 Dec 23 20:34 staquecheck.c
$ ./a.out
...
$
• Always put the executable name immediately after -o. A C source file name written after -o
may be overwritten by the output.
Generating individual object files
• Header files residing in non-default directories should be included by the #include "..."
directive.
• You can add to the list of default include directories by the -I option.
$ gcc -Wall -c -I. stack.c
$ gcc -Wall -c -I. queue.c
$ gcc -Wall -o myapp -I. staquecheck.c stack.o queue.o
• These compilations add the current directory to the list of include directories.
• You can now use #include <defs.h>, #include <stack.h>, and #include <queue.h>
in the source files.
The environment variable C_INCLUDE_PATH
Introduction
Static libraries
• Prefix: lib
• Extension: .a
• The static math library has the name libm.a
• Functions from static libraries are inserted in the executable during linking
Shared (dynamic) libraries
• Prefix: lib
• Extension: .so (may be followed by . and a version number)
• The shared math library has the name libm.so
• Functions from shared libraries are not inserted in the executable during linking
• The functions are read from the .so objects at runtime
Building the static staque library
• We have the files defs.h, stack.h, queue.h, stack.c, and queue.c as before.
• We want to build the static library libstaque.a. This will contain all the stack and queue
functions as listed earlier.
• The library is not meant to contain any main function.
• Application programs like staquecheck.c will contain the main functions as needed.
• Compile individual source files with the -c option to generate the object files.
• Combine the object files into an archive libstaque.a using the command ar.
Generate libstaque.a
$ nm libstaque.a

stack.o:
00000000000001c9 T destroystack
0000000000000036 T emptystack
                 U free
                 U fwrite
                 U _GLOBAL_OFFSET_TABLE_
0000000000000000 T initstack
                 U malloc
00000000000000f4 T pop
                 U printf
000000000000016a T printstack
00000000000000a8 T push
                 U putchar
                 U stderr
0000000000000055 T top

queue.o:
0000000000000144 T dequeue
0000000000000242 T destroyqueue
000000000000004a T emptyqueue
00000000000000dd T enqueue
                 U free
0000000000000076 T front
                 U fwrite
                 U _GLOBAL_OFFSET_TABLE_
0000000000000000 T initqueue
                 U malloc
                 U printf
00000000000001d6 T printqueue
                 U putchar
                 U stderr
$
How to use the library
• The linker ld does not search the current directory for libraries.
• The -L option tells the linker to add directories to the library search path.
$ gcc -Wall -L. staquecheck.c -lstaque
$ ls -l
-rwxr-xr-x 1 abhij abhij 17536 Dec 24 18:52 a.out
-rw-r--r-- 1 abhij abhij 152 Dec 23 19:43 defs.h
-rw-r--r-- 1 abhij abhij 7046 Dec 24 18:25 libstaque.a
-rw-r--r-- 1 abhij abhij 1262 Dec 23 19:45 queue.c
-rw-r--r-- 1 abhij abhij 360 Dec 23 19:43 queue.h
-rw-r--r-- 1 abhij abhij 3424 Dec 24 18:23 queue.o
-rw-r--r-- 1 abhij abhij 1098 Dec 23 19:45 stack.c
-rw-r--r-- 1 abhij abhij 315 Dec 23 19:43 stack.h
-rw-r--r-- 1 abhij abhij 3248 Dec 24 18:23 stack.o
-rw-r--r-- 1 abhij abhij 473 Dec 24 18:52 staquecheck.c
-rw-r--r-- 1 abhij abhij 144 Dec 23 19:43 staque.h
$
How to avoid -L?
• A user with superuser privileges can copy the header files to one of these directories.
• Using subdirectories is a good option.
Installing the libstaque headers
• The functions declared in the header files are not implemented by your code.
• These functions are implemented in external library/libraries.
• The key word extern directs the compiler to wait for these implementations.
The header file queue.h
extern queue initqueue ( ) ;
extern int emptyqueue ( queue ) ;
extern int front ( queue ) ;
extern queue enqueue ( queue , int ) ;
extern queue dequeue ( queue ) ;
extern void printqueue ( queue ) ;
extern queue destroyqueue ( queue ) ;
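To illustrate what such a declaration buys you: the compiler accepts a call as soon as it has seen the declaration, and the definition can arrive later. In the sketch below, the hypothetical function square stands in for a library function. Here its definition sits in the same file for the sake of a self-contained example; with a real library, it would come from a separate object file at link time (or from a shared object at runtime).

```c
/* Declaration only, as in a header file. For functions, extern is the
   default linkage, so the keyword is optional but states the intent. */
extern int square(int);

/* The caller compiles against the declaration alone. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

/* The definition. In a library setting, this would live in another .c
   file, and the linker would connect the calls above to it. */
int square(int x)
{
    return x * x;
}
```

If the definition of square were removed, the file would still compile with -c, and the error would appear only at the linking stage as an undefined reference.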
Building the shared staque library
• We again need only the files defs.h, stack.h, queue.h, stack.c, and queue.c.
• We plan to generate libstaque.so.
• Compile individual source files with the -c option to generate the object files.
• Use the option -fPIC to generate position-independent code.
• Combine the objects into the shared library using gcc -shared.
$ gcc -Wall -fPIC -c stack.c
$ gcc -Wall -fPIC -c queue.c
$ gcc -shared -o libstaque.so stack.o queue.o
$ ls -l
...
-rwxr-xr-x 1 abhij abhij 16928 Dec 24 20:51 libstaque.so
...
$
• You can run nm libstaque.so to list all the defined and undefined symbols.
How to link libstaque.so?
• The linker does not copy the stack and queue functions into the application.
• These functions will be read from libstaque.so at runtime.
• Again you need the -L option to add the path of the library.
• If you copy libstaque.so to a system directory (as superuser), then you do not need -L.
$ sudo cp libstaque.so /usr/local/lib/
$ gcc -Wall staquecheck.c -lstaque
$ ls -l
-rwxr-xr-x 1 abhij abhij 17064 Dec 24 21:05 a.out
...
$
Libstaque functions are undefined in your a.out
$ nm a.out | grep " U "
U destroyqueue
U destroystack
U enqueue
U exit@@GLIBC_2.2.5
U initqueue
U initstack
U __libc_start_main@@GLIBC_2.2.5
U printqueue
U printstack
U push
U putchar@@GLIBC_2.2.5
U rand@@GLIBC_2.2.5
$
Some useful gcc options
-W   -Wall includes the following (among others). Some of these have many
     subcategories.
        -Wcomment       Warn about nested comments.
        -Wformat        Warn about type mismatches in scanf and printf.
        -Wunused        Warn about unused variables.
        -Wimplicit      Warn about functions used before declaration.
        -Wreturn-type   Warn when a function with a non-void return type returns without a value.
     -Wall does not include the following (among others).
        -Wconversion    Warn about implicit type conversions.
        -Wshadow        Warn about shadowed variables.
        -Werror         Convert warnings to errors.
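As a small illustration of an option outside -Wall, the following sketch contains a shadowed variable. The function name shadow_demo is hypothetical. The expectation is that gcc -Wall -c accepts the file silently, that adding -Wshadow reports the inner declaration, and that adding -Werror as well turns that report into an error.

```c
/* Returns x unchanged. The inner block declares a second variable named
   total, which hides the outer one for the duration of the block. */
int shadow_demo(int x)
{
    int total = x;
    {
        int total = 2 * x;   /* shadows the outer total: -Wshadow warns here */
        (void)total;         /* mark it as used so that -Wunused stays quiet */
    }
    return total;            /* refers to the outer total again */
}
```

Shadowing is legal C, which is why it is not an error by default; the warning exists because the inner assignment is often a mistake for an update of the outer variable.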
Some useful gcc options (contd.)
The C Preprocessor
The C preprocessor
exit(0);
}
Redefine MYFLAG from command line
$ gcc -Wall -DMYFLAG macros.c
$ ./a.out
MYFLAG is defined
MYFLAG is undefined here
$
$ cpp -DMYFLAG macros.c
...
int main ()
{
printf("MYFLAG is defined\n");
exit(0);
}
$
#endif
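The listing of macros.c is not shown in full here. The following is a minimal sketch of the kind of conditional compilation that the output suggests, assuming an #ifdef test followed later by an #undef. The function names are hypothetical, and the functions return flags instead of printing so that the behavior is easy to check.

```c
/* Returns 1 if MYFLAG was defined when this function was compiled
   (for example by gcc -DMYFLAG), and 0 otherwise. */
int myflag_at_start(void)
{
#ifdef MYFLAG
    return 1;
#else
    return 0;
#endif
}

#undef MYFLAG   /* from this line on, MYFLAG is undefined regardless of -D */

/* Always returns 0, because the #undef above precedes this function. */
int myflag_after_undef(void)
{
#ifdef MYFLAG
    return 1;
#else
    return 0;
#endif
}
```

The preprocessor works strictly top to bottom, so the same #ifdef test can give different answers at different places in one file.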
Use of macros as values for substitution
int main ()
{
   if (EXPR1 == EXPR2) printf("EXPR1 is equal to EXPR2\n");
   else printf("EXPR1 is not equal to EXPR2\n");
   if (EXPR1 == EXPR3) printf("EXPR1 is equal to EXPR3\n");
   else printf("EXPR1 is not equal to EXPR3\n");
   if (EXPR1 == EXPR4 * EXPR4) printf("EXPR1 is equal to EXPR4 * EXPR4\n");
   else printf("EXPR1 is not equal to EXPR4 * EXPR4\n");
   exit(0);
}
• This program cannot be compiled as it stands, because EXPR3 and EXPR4 are not defined.
• We define these macros by the -D option.
Examples of macros as values for substitution
$ gcc -Wall -DEXPR3="50 + 50" -DEXPR4="5 + 5" macroval.c
$ ./a.out
EXPR1 is equal to EXPR2
EXPR1 is equal to EXPR3
EXPR1 is not equal to EXPR4 * EXPR4
$
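The last line of the output is a consequence of literal substitution: with EXPR4 defined as 5 + 5, the text EXPR4 * EXPR4 becomes 5 + 5 * 5 + 5, which evaluates to 35. The sketch below contrasts this with a parenthesized definition. The macro EXPR4_PAREN and the two function names are hypothetical additions for illustration.

```c
#define EXPR4 5 + 5            /* the text that -DEXPR4="5 + 5" supplies */
#define EXPR4_PAREN (5 + 5)    /* hypothetical parenthesized variant */

/* Expands to 5 + 5 * 5 + 5, so the multiplication binds first. */
int product_plain(void)
{
    return EXPR4 * EXPR4;
}

/* Expands to (5 + 5) * (5 + 5). */
int product_paren(void)
{
    return EXPR4_PAREN * EXPR4_PAREN;
}
```

Here product_plain() yields 35 and product_paren() yields 100, which is why macro bodies that are expressions are usually wrapped in parentheses.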
int main ()
{
printf("Welcome %s\n", MYNAME);
exit(0);
}
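For this program to compile, MYNAME must expand to a string literal, quotes included. As a sketch, a source file can also protect itself against a missing -D by supplying a fallback. The fallback value "guest" and the function name welcome are assumptions made for illustration and are not part of the original program.

```c
#include <stdio.h>

#ifndef MYNAME
#define MYNAME "guest"   /* hypothetical default used when -DMYNAME=... is absent */
#endif

/* Writes the welcome line into buf (at most n bytes, including the
   terminating null) and returns the value reported by snprintf. */
int welcome(char *buf, size_t n)
{
    return snprintf(buf, n, "Welcome %s\n", MYNAME);
}
```

When the macro is given on the command line, the double quotes have to survive the shell. In a POSIX-style shell, one way is to wrap the whole value in single quotes, as in -DMYNAME='"Some Name"'.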
Talking with the shell
• You run your compiled executable (like a.out) from the shell.
• You may add one or more command-line arguments.
• These arguments should somehow go to your C program.
• When the program finishes execution, it should return something to the shell.
• The return value conventionally indicates successful/unsuccessful termination.
The fully decorated main() function
int main ( int argc, char *argv[] )
{
   ···
}
The shell talks to your program
• argc is the count of arguments including the program name (like ./a.out).
• argv is a null-terminated array of strings storing the command-line arguments.
• Each argument is a string.
• Use the library functions atoi, atol, atof, . . . (declared in stdlib.h) to convert arguments to int,
long int, double, . . . .
• For example, if you run ./a.out 2022 -name "Sad Tijihba" 6.32, then we have
• argc = 5,
• argv[0] = "./a.out",
• argv[1] = "2022",
• argv[2] = "-name",
• argv[3] = "Sad Tijihba",
• argv[4] = "6.32", and
• argv[5] = NULL.
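The conversion step can be sketched as follows for the example command line above. The structure options and the function parse_args are hypothetical names introduced here; a real main would simply pass its own argc and argv to such a function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for the example command line
   ./a.out 2022 -name "Sad Tijihba" 6.32 */
typedef struct {
    int year;
    const char *name;
    double value;
} options;

/* Returns 0 on success, and 1 if the arguments do not have the expected shape. */
int parse_args(int argc, char *argv[], options *opt)
{
    if (argc != 5) return 1;                      /* program name + 4 arguments */
    if (strcmp(argv[2], "-name") != 0) return 1;  /* the flag must be in place */
    opt->year = atoi(argv[1]);                    /* "2022" becomes the int 2022 */
    opt->name = argv[3];                          /* already a string, no conversion */
    opt->value = atof(argv[4]);                   /* "6.32" becomes a double */
    return 0;
}
```

Note that the shell has already removed the double quotes around Sad Tijihba, so argv[3] is a single string containing a space. atoi and atof do not report malformed input; strtol and strtod are the usual choice when error checking is needed.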
Your program talks to the shell
if (argc != 6) {
fprintf(stderr, "*** Incorrect number of arguments\n");
exit(1);
}
printf("The equation of the circle: x^2 + y^2 %c %dx %c %dy %c %d = 0\n", s1, t1, s2, t2, s3, t3);
if ((x - c) * (x - c) + (y - d) * (y - d) <= r * r) printf("(%d,%d) is inside the circle\n", x, y);
else printf("(%d,%d) is outside the circle\n", x, y);
exit(0);
}
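Only part of the circle program is visible above. The sketch below shows one way the argument check and the exit status could fit together, under the assumption that the five arguments are the center (c,d), the radius r, and the point (x,y), in that order. The function name circle_check is hypothetical, and the printing of the equation is left out.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns the value that main would hand to exit():
   1 for a wrong argument count, 0 for a completed check. */
int circle_check(int argc, char *argv[])
{
    int c, d, r, x, y;

    if (argc != 6) {
        fprintf(stderr, "*** Incorrect number of arguments\n");
        return 1;                      /* nonzero: unsuccessful termination */
    }
    /* Assumed order of the arguments: c d r x y */
    c = atoi(argv[1]); d = atoi(argv[2]); r = atoi(argv[3]);
    x = atoi(argv[4]); y = atoi(argv[5]);
    if ((x - c) * (x - c) + (y - d) * (y - d) <= r * r)
        printf("(%d,%d) is inside the circle\n", x, y);
    else
        printf("(%d,%d) is outside the circle\n", x, y);
    return 0;                          /* zero: successful termination */
}
```

A main function could then consist of the single statement exit(circle_check(argc, argv)); so that the shell receives 0 or 1 as the exit status.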
A chat transcript
Note: What the shell does with the values returned by exit() will become clear once you gain
familiarity with the shell.
Practice exercises
1. Suppose that a C file myfile.c uses a function myfunc() that is defined in a static library libfunc.a. What new
files will be created, if any, if the following command is executed? Assume that the library path is set correctly.
gcc -lfunc myfile.c -o outfile
2. An application program mathapp.c needs two libraries libalgebra.so and libgeometry.so. Each of these
libraries uses a library libarithmetic.so. Moreover, libalgebra.so additionally uses libbasicmath.so.
Finally, libarithmetic.so and libbasicmath.so use the standard math library libm.so. Assume that the runtime
library path is appropriately set so that all these libraries can be located by the compiler and the runtime linker. Show
how you can compile mathapp.c.
3. Two C files file1.c and file2.c are to be compiled to form an executable file outfile. Both the files use a
static library libgraph.a stored in the directory /home/foobar/graph/lib and a static library libstring.a
stored in the directory /home/foobar/strings/lib. In order to access libgraph.a and libstring.a properly,
the C files also need to include some header files stored in the directories /home/foobar/graph/include and
/home/foobar/strings/include. All the header files are to be accessed from the C files using #include <...>
format. Write a single gcc command to do this.
4. Repeat the last exercise assuming that the shared libraries libgraph.so and libstring.so are available in the
directory mentioned. Set LD_LIBRARY_PATH, and then use a single gcc command.
5. Copy the shared library libstaque.so (see the slides) to a non-system directory /home/foobar/personal/lib.
The environment variable LD_LIBRARY_PATH is not set to include this directory. You have an application program
dfsbfs.c in the directory /home/foobar/algolab, that uses the stack and queue functions of the staque library.
Figure out what extra compilation-time option you should supply to gcc so that ldd a.out shows that
libstaque.so is available in the directory /home/foobar/personal/lib and the runtime linker does not need
LD_LIBRARY_PATH to be set.
6. You are currently in the directory /home/userx/foobar. This directory contains three subdirectories include, foo,
and bar. The subdirectory include contains three header files common.h, foo.h and bar.h. The subdirectory foo
contains three source files foo1.c, foo2.c, and foo3.c, whereas the subdirectory bar contains two source files
bar1.c and bar2.c. The foo source files require the header files common.h and foo.h, whereas the bar source
files require the header files common.h and bar.h. The required header files are included in the source file in the
format #include "../include/...". The five source files are to be compiled to a single foobar library. Describe
how you can do this in the following two cases: (i) you want a static library libfoobar.a, (ii) you want a dynamic
library libfoobar.so. These libraries should be built in your current directory /home/userx/foobar.
7. A number-theory library and application programs using that library need an array of the primes < 20. So you plan
to use an int array storing these numbers in a header file for the library. However, header files are not the right place
for declaring global variables and arrays. Figure out what problem(s) you face if you have the following line in the
header file. What is the reason behind the problem(s)?
int SMALLPRIMES[] = { 2, 3, 5, 7, 11, 13, 17, 19 };
How can you overcome the problem(s)? You need to have this array in the header file both during the compilation of
the library and during the compilation of the application programs that use the library.
8. Consider the following program fragment.
unsigned short s;
int i, j;
scanf("%d%d", &i, &j);
s = i / j;
printf("%hu\n", s);
There is an obvious problem with this program. Find it, and show the gcc compilation options such that
(i) gcc will only warn about the problem during compilation,
(ii) gcc will give an error and not compile the program.
9. Suppose that your C program has the following diagnostic printf statements.
printf("++: ...");
printf("+: ...");
printf("++: ...");
printf("+++: ...");
The printf starting with a single + is always to be printed. The printf’s starting with only two + are printed if the user
wants verbose output. The printf’s starting with two and three + are printed if the user wants very verbose output.
The user decides during compilation time whether (s)he uses the normal or the verbose or the very verbose mode.
Modify the above code (without deleting any printf and without using any extra variables) so that the user can select
the printing mode using appropriate compilation options. Show both the modified code and the compilation options.
10. Consider the following C program with undefined symbols N and A.
int main ()
{
int cnt = N, i, arr[N] = A;
for (i=0; i<cnt; ++i) printf("%d\n", arr[i]);
}
How can you define N and A as macros during compilation so that gcc successfully compiles the file?