A lean retargetable C compiler
Chris Fraser, Bell Labs
Dave Hanson, Princeton
Copyright 1995 by C. W. Fraser and D. R. Hanson 4/5/96 1
Optimize our time
◆ Minimize source code
◆ Compile fast
◆ Emit satisfactory code
◆ One literate program emits two
outputs:
– A Retargetable C Compiler: Design and
Implementation. Addison Wesley.
– [Link]
One source
The string table is an array of 1,024 hash buckets:
<<data>>=
static struct string {
char *str;
int len;
struct string *link;
} *buckets[1024];
@ Each bucket heads a list of strings that share
a hash value.
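As a rough illustration of how such a bucket array is typically used, the sketch below interns strings so that equal strings share one stored copy. The `hash` function and the name `intern` are assumptions made for this example; the slide shows only the data declaration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024              /* bucket count from the slide */

static struct string {
    char *str;
    int len;
    struct string *link;
} *buckets[NBUCKETS];

/* Assumed hash function: multiply-by-31 over the bytes, masked to a bucket. */
static unsigned hash(const char *s, int len) {
    unsigned h = 0;
    for (int i = 0; i < len; i++)
        h = h * 31 + (unsigned char)s[i];
    return h & (NBUCKETS - 1);     /* valid because NBUCKETS is a power of two */
}

/* Return the single stored copy of s[0..len-1], adding it on first sight. */
const char *intern(const char *s, int len) {
    assert(s != NULL && len >= 0);
    unsigned h = hash(s, len);
    struct string *p;
    for (p = buckets[h]; p != NULL; p = p->link)
        if (p->len == len && memcmp(p->str, s, (size_t)len) == 0)
            return p->str;         /* already on this bucket's list */
    p = malloc(sizeof *p);
    char *copy = malloc((size_t)len + 1);
    if (p == NULL || copy == NULL)
        abort();
    memcpy(copy, s, (size_t)len);
    copy[len] = '\0';
    p->str = copy;
    p->len = len;
    p->link = buckets[h];          /* push onto the head of the list */
    buckets[h] = p;
    return p->str;
}
```

Because every distinct string is stored once, callers can compare interned strings by pointer instead of by content.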
Sizes
◆ 12K lines target-independent
◆ Plus 1K lburg
◆ Plus ~700 lines per target:
– tree grammar
– code for proc entry/exit, data ...
◆ 400KB code segment includes 3
real targets + 2 for debugging.
Compile/execution times
◆ Compiles itself in half the time
of gcc
◆ Emitted code generally within
20% of gcc’s
Code generation
interface: Dags
◆ Shared data structures
◆ 36 base opcodes:
– ADD INDIR JUMP …
◆ 9 base types:
– I D C …
◆ but only 108 combos:
– ADDI INDIRC ...
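One way to picture the operator/type split is to pack a type suffix into the low bits of each opcode, so that `ADDI` is simply `ADD + I`. The sketch below does that. The numeric values, the four operators chosen, and the legality table are illustrative assumptions, not the full set of 108 combinations out of the 36 × 9 = 324 possible pairs.

```c
#include <assert.h>

/* Type suffixes in the low 4 bits (assumed values). */
enum { F = 1, D, C, S, I, U, P, V, B };

/* A few base operators in the bits above (assumed values). */
enum { CNST = 1 << 4, INDIR = 2 << 4, ADD = 3 << 4, JUMP = 4 << 4 };

#define generic(op) ((op) & ~15)   /* strip the type suffix */
#define optype(op)  ((op) & 15)    /* keep only the type suffix */
#define BIT(t)      (1u << (t))

/* Illustrative subset of the legal operator/type pairs. */
static unsigned legal_types(int base) {
    switch (base) {
    case CNST:  return BIT(C)|BIT(D)|BIT(F)|BIT(I)|BIT(P)|BIT(S)|BIT(U);
    case INDIR: return BIT(B)|BIT(C)|BIT(D)|BIT(F)|BIT(I)|BIT(P)|BIT(S);
    case ADD:   return BIT(D)|BIT(F)|BIT(I)|BIT(P)|BIT(U);
    case JUMP:  return BIT(V);
    default:    return 0;
    }
}

/* Nonzero if op is one of the combinations this sketch treats as legal. */
int is_legal_op(int op) {
    assert(op >= 0);
    return (int)((legal_types(generic(op)) >> optype(op)) & 1u);
}
```

With this layout a back end can switch on `generic(op)` when the type does not matter and on the full opcode when it does.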
Interface functions
◆ begin/end module, function,
block
◆ select/emit code
◆ define symbol
◆ emit initialized data
◆ change segment
Interface record
typedef struct interface {
unsigned little_endian:1;
void (*defsymbol)(Symbol);
…
} Interface;
lcc -Wf-target=x86-linux foo.c
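To make the connection between the record and the `-target=` option concrete, here is a minimal sketch of selecting an interface record by name. The `bindings` table, `find_target`, the `symbol` fields, the `plain_defsymbol` hook, and the `demo-big-endian` target are all hypothetical names introduced for this example; only `little_endian`, `defsymbol`, and `x86-linux` come from the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct symbol {
    const char *name;    /* source-level name */
    const char *xname;   /* name as the target's assembler sees it */
} *Symbol;

typedef struct interface {
    unsigned little_endian:1;
    void (*defsymbol)(Symbol);
} Interface;

/* Hypothetical per-target hook: this one uses the source name unchanged. */
static void plain_defsymbol(Symbol p) { p->xname = p->name; }

static Interface x86_linux = { 1, plain_defsymbol };
static Interface big_endian_demo = { 0, plain_defsymbol };

/* Assumed table mapping target names to interface records. */
static struct binding {
    const char *name;
    Interface *ir;
} bindings[] = {
    { "x86-linux", &x86_linux },
    { "demo-big-endian", &big_endian_demo },   /* made-up target name */
};

/* Return the interface record for a target name, or NULL if unknown. */
Interface *find_target(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof bindings / sizeof bindings[0]; i++)
        if (strcmp(bindings[i].name, name) == 0)
            return bindings[i].ir;
    return NULL;
}
```

The front end then calls through the selected record, for example `ir->defsymbol(p)`, without knowing which target is behind it.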
Code generation specs
◆ Tree grammars match IR and
emit asm code
◆ Sample rules:
reg: ADDI(reg,con) "addu $%c,$%0,%1\n" 1
addr: ADDI(reg,con) "%1($%0)" 0
◆ Specs: ~200 rules
◆ Hard-coded, bottom-up, optimal
tree matchers, ~2000 lines
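The sketch below shows the general idea behind bottom-up, cost-based tree matching: each node records the cheapest cost of deriving every nonterminal, computed from its children. It interprets the rules at run time, whereas the slide describes hard-coded matchers, so treat it as an illustration of the technique only. The two `ADDI` rules are the sample rules above; the `REGI` leaf, the `con: CNSTI` rule, and the `reg: con` chain rule are assumptions added so the example is complete.

```c
#include <assert.h>
#include <stddef.h>

enum { CNSTI, REGI, ADDI };                 /* REGI is a made-up leaf */
enum { NT_REG, NT_CON, NT_ADDR, NT_COUNT }; /* nonterminals */
#define INF 10000                           /* cost meaning "no match" */

typedef struct node {
    int op;
    struct node *kid[2];
    int cost[NT_COUNT];    /* cheapest cost to derive each nonterminal */
} Node;

/* Keep the cheaper of the old and new cost for nonterminal nt. */
static void record(Node *p, int nt, int cost) {
    if (cost < p->cost[nt])
        p->cost[nt] = cost;
}

/* Label the tree rooted at p, children first. */
void label(Node *p) {
    assert(p != NULL);
    for (int nt = 0; nt < NT_COUNT; nt++)
        p->cost[nt] = INF;
    switch (p->op) {
    case CNSTI:
        record(p, NT_CON, 0);               /* con:  CNSTI          0 (assumed) */
        break;
    case REGI:
        record(p, NT_REG, 0);               /* reg:  REGI           0 (assumed) */
        break;
    case ADDI: {
        label(p->kid[0]);
        label(p->kid[1]);
        int kids = p->kid[0]->cost[NT_REG] + p->kid[1]->cost[NT_CON];
        record(p, NT_REG, kids + 1);        /* reg:  ADDI(reg,con)  1 */
        record(p, NT_ADDR, kids);           /* addr: ADDI(reg,con)  0 */
        break;
    }
    }
    record(p, NT_REG, p->cost[NT_CON] + 1); /* reg:  con            1 (assumed) */
}
```

After labeling, a second top-down pass (not shown) would pick the rule that achieved each recorded cost and emit its template.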
Twists
◆ Link-time CG: Fernandez
◆ Run-time CG: Poletto, Engler,
Kaashoek
◆ Emit Java, even C: Fraser,
Huelsbergen
◆ Debuggers: Hanson, Ramsey,
Raghavachari
◆ Optimize battery life: Tiwari
More twists
◆ Compress code: Fraser,
Proebsting
◆ Program directors: Sosic
◆ Browse code: Fraser, Pike
◆ Audit trees: Proebsting
Code compression
Proebsting and Fraser
◆ Accept a C program
◆ Emit:
– a custom interpreter
– postfix bytecodes
◆ Suits ROM, Java, optimizing
linkers?
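For readers unfamiliar with postfix bytecodes, a tiny stack interpreter gives the flavor: operands are pushed first and each operator pops its inputs. The opcode set and encoding below are invented for illustration and are not the instruction set the compressor generates.

```c
#include <assert.h>

enum { OP_PUSH, OP_ADD, OP_MUL, OP_RET };   /* made-up opcodes */
#define STACK_MAX 64

/* Execute postfix bytecode and return the value on top of the stack at OP_RET. */
int run(const unsigned char *code) {
    int stack[STACK_MAX];
    int sp = 0;
    assert(code != 0);
    for (;;) {
        switch (*code++) {
        case OP_PUSH:                       /* one-byte immediate operand follows */
            assert(sp < STACK_MAX);
            stack[sp++] = *code++;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_RET:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

In this encoding the expression `(2+3)*4` is the byte sequence `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, RET`.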
Organization
[Diagram; box labels and their examples:]
– program to compress: "i+1"
– trees as ASCII: "ADDI(..., CNSTI[1])"
– tree patterns: "ADDI(*,CNSTI[*])"
– trees as C initializer
– driver code generator
– instruction-set generator
– interpreter and interpretive code
Assigning opcodes
◆ Enumerate all trees:
– ADDI(INDIRI(ADDRGP[i]),CNSTI[1])
◆ Patternize, up to some limit:
– ADDI(*,CNSTI[*])
– ADDI(*,CNSTI[1]) ...
◆ Generate a huge code generator
… continues
◆ Assign codes to all IR ops used
by the program at hand
◆ With leftover codes, pick
pattern that saves the most, then
loop
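The greedy loop for the leftover codes might look roughly like the sketch below. It makes a strong simplifying assumption: each pattern's saving is a fixed number that does not change when another pattern is chosen. In practice overlapping patterns would presumably need their savings recomputed on each iteration. The `Candidate` type and `assign_leftover` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *pattern;   /* e.g. "ADDI(*,CNSTI[1])" */
    int saving;            /* assumed fixed estimate of bytes saved */
    int chosen;            /* set to 1 once the pattern gets its own opcode */
} Candidate;

/* Spend up to `leftover` opcodes on the most profitable unchosen patterns.
   Returns the total estimated saving of the patterns picked in this call. */
int assign_leftover(Candidate *c, int n, int leftover) {
    assert(c != NULL && n >= 0);
    int total = 0;
    while (leftover-- > 0) {
        int best = -1;
        for (int i = 0; i < n; i++)
            if (!c[i].chosen && c[i].saving > 0 &&
                (best < 0 || c[i].saving > c[best].saving))
                best = i;
        if (best < 0)
            break;                 /* nothing left that saves anything */
        c[best].chosen = 1;
        total += c[best].saving;
    }
    return total;
}
```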
Results
Run-time CG
Poletto, Engler, Kaashoek
◆ Construct code to sum n int args:
void cspec ConstructSum(int n) {
int k, cspec c = `0;
for (k = 0; k < n; k++) {
int vspec v = (int vspec)
param(k, TC_I);
c = `(@c + @v);
}
return `{return @c};
}
Translate C to Java
Huelsbergen, Fraser
class FromLCC {
  public static int _main() {
    int pc = 0;
    [Link] -= 16;
    while(true) switch (pc) {
      ...
      case 3: [Link](([Link]+4), 0);                 // i=0
      case 6: [Link]((([Link](
                ([Link]+4))<<2)+_rows), 1);          // rows[i]=1
      case 7: [Link](([Link]+4),
                ([Link](([Link]+4))+1));             // i++
              if ([Link](([Link]+4)) < 8) {
                pc=6; continue; }; ...               // if(i<8)goto case 6
    }
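The translated code simulates jumps with a `pc` variable and a `switch` inside a loop, since the target language has no `goto`. The same shape can be sketched in plain C for the loop `for (i = 0; i < 8; i++) rows[i] = 1;`. The function name `fill_rows` and the use of ordinary locals instead of a simulated stack are simplifications made for this example.

```c
#include <assert.h>

int rows[8];

/* Run the loop as a pc-driven state machine; returns the final value of i. */
int fill_rows(void) {
    int i = 0;
    int pc = 3;                    /* start at the first labeled statement */
    for (;;) {
        switch (pc) {
        case 3:
            i = 0;
            /* fall through to the next label */
        case 6:
            assert(i >= 0 && i < 8);
            rows[i] = 1;
            /* fall through to the next label */
        case 7:
            i++;
            if (i < 8) {
                pc = 6;            /* "goto case 6" */
                continue;          /* re-enter the switch at the new pc */
            }
            return i;
        default:
            return -1;             /* unknown label */
        }
    }
}
```

Straight-line code costs nothing extra because control falls through from one `case` to the next; only taken branches pay for the trip back through the `switch`.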
Program directors
Sosic
◆ Mix interpretive, compiled code
◆ Interpreter sends a (filtered)
stream of events from the
executor to the director
– time, pc, result, ...
◆ Director watches and ...
– animates calls,
– watches for corrupt state, ...
Audit trees
Proebsting
◆ Some trees make no sense:
– INDIRC(ADDF(*,*))
◆ One “back end” emits only Yes
or No but matches with a
grammar that specifies the valid
trees. We run it.
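A hand-written checker conveys what such an auditing grammar decides, even though the slide's version is expressed as a tree grammar rather than as code. In the sketch below each operator states what kind of value its operands must produce, and a tree is valid only if every node finds a matching rule. The operator subset, the `kind` categories, and the operand order for `ADDP` are assumptions for illustration, and arity checks are omitted.

```c
#include <assert.h>
#include <stddef.h>

enum { ADDRGP, CNSTI, ADDF, ADDP, INDIRC, INDIRF };
enum kind { K_NONE, K_INT, K_CHAR, K_FLOAT, K_PTR };

typedef struct tree {
    int op;
    const struct tree *kid[2];
} Tree;

/* Return the kind of value t produces, or K_NONE if no rule matches. */
static enum kind derive(const Tree *t) {
    if (t == NULL)
        return K_NONE;
    enum kind l = derive(t->kid[0]);
    enum kind r = derive(t->kid[1]);
    switch (t->op) {
    case ADDRGP: return K_PTR;                                   /* address of a global */
    case CNSTI:  return K_INT;
    case ADDF:   return l == K_FLOAT && r == K_FLOAT ? K_FLOAT : K_NONE;
    case ADDP:   return l == K_PTR && r == K_INT ? K_PTR : K_NONE;
    case INDIRC: return l == K_PTR ? K_CHAR : K_NONE;            /* fetch needs an address */
    case INDIRF: return l == K_PTR ? K_FLOAT : K_NONE;
    default:     return K_NONE;
    }
}

/* The "Yes or No" answer: nonzero if the whole tree is valid. */
int audit(const Tree *t) {
    assert(t != NULL);
    return derive(t) != K_NONE;
}
```

Here `INDIRC(ADDF(*,*))` is rejected because `ADDF` yields a float, and `INDIRC` requires an address.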
Big mistakes
◆ Need ASTs
◆ Need flow graphs
◆ “Economized” on long and
void* metrics for too long
◆ Need interface pickle (now
plural)
◆ Need better modularization:
– Half the patches create a new
error. See Dave’s coming book.
Smaller mistakes
◆ A graph-coloring register
allocator
◆ Instruction scheduling
◆ Peephole optimization
What we like
◆ Simple and thus good
infrastructure
◆ Fast
◆ Portable
◆ Complete
◆ Validated and kept that way
◆ We’d miss flexibility and fast
compiles more than global opts