Type Checking in Compiler Design
of Programming Languages
by
Ravi Sethi
Data representation 1
Types: data representation
The Role of Types
Basic Types
Arrays: Sequences of Elements
Records: Named Fields
Unions and Variant Records
Sets
Pointers: Efficiency and Dynamic Allocation
Types and Error Checking
The role of types
Data object:
Refers to something meaningful to the application
Data representation:
Refers to the organization of values in a program
Objects in an application have corresponding
representations in a program
Example: An application uses “days” as an object (thus, “January 22”, “May 6”,
“tomorrow(d)”)
In the program days are represented as integers (22, 126, n+1)
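A minimal sketch of this representation in C, assuming a non-leap year. The names day_of_year and tomorrow are illustrative, and tomorrow ignores the wrap-around at the end of the year:

```c
/* Days that precede each month in a non-leap year (index 0 = January). */
static const int days_before_month[12] = {
    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
};

/* Represent a calendar day (the data object) as an integer 1..365. */
int day_of_year(int month, int day)
{
    return days_before_month[month - 1] + day;
}

/* "tomorrow(d)" on the representation is simply n + 1. */
int tomorrow(int d)
{
    return d + 1;
}
```

With this mapping, "January 22" becomes 22 and "May 6" becomes 126, as on the slide.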
Values and their types
In imperative languages, data
representations are built from values
that can be manipulated directly by
the underlying machine
Basic types
int, char, float, pointer, …
Structured types
Arrays, records, sets, …
Type expressions
Examples:
int temp[100];
typedef struct {
    char name[20];
    char address[64];
} person;
Unions and variant records
Largely eclipsed by object-oriented
concepts
Define record types that share
common properties
Variant record: a part common to all
records of that type and a variant
part
Union: a special case of a variant
record with an empty common part
Unions and variant records (cont’d)
Layout of Variant Records
1. Fixed Part
2. Tag Field
3. Variant Part
Variant records can compromise
type safety: the variant part can be
read as a variant other than the one
named by the tag field
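The three-part layout can be sketched in C as a struct that holds a tag and a union. This is only an illustration; Shape, ShapeKind and shape_area are hypothetical names, not taken from the textbook:

```c
typedef enum { CIRCLE, RECTANGLE } ShapeKind;

typedef struct {
    int id;                 /* 1. fixed part: common to every Shape */
    ShapeKind kind;         /* 2. tag field: says which variant is live */
    union {                 /* 3. variant part: the variants share storage */
        struct { double radius; } circle;
        struct { double width, height; } rectangle;
    } as;
} Shape;

/* Consult the tag before touching the variant part. */
double shape_area(const Shape *s)
{
    switch (s->kind) {
    case CIRCLE:
        return 3.141592653589793 * s->as.circle.radius * s->as.circle.radius;
    case RECTANGLE:
        return s->as.rectangle.width * s->as.rectangle.height;
    }
    return 0.0; /* unknown tag */
}
```

Nothing in C stops a program from reading s->as.circle while the tag says RECTANGLE, which is the kind of type-safety hole variant records allow.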
Sets
Set Values: in Pascal, all elements must be
of the same simple type
Set Types: Type set of S represents all
possible subsets of S
Example: var S : set of 1..3
S can denote one of the following sets:
[ ],[1],[2],[3],[1,2],[1,3],[2,3],[1,2,3]
A set over a base type of n elements is
implemented as a bit vector of length n
The basic operation on a set is a membership
test
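A minimal C sketch of the bit-vector idea for the base type 1..3. The names SmallSet, set_add, set_member and set_union are assumptions made for illustration; element x is stored in bit x-1:

```c
#include <stdbool.h>
#include <stdint.h>

/* A value of type "set of 1..3": bit (x - 1) is 1 when x is in the set. */
typedef uint8_t SmallSet;

SmallSet set_add(SmallSet s, int x)
{
    return (SmallSet)(s | (1u << (x - 1)));
}

/* Membership test: a shift and a mask. */
bool set_member(SmallSet s, int x)
{
    return ((s >> (x - 1)) & 1u) != 0;
}

/* Union is a bitwise OR of the two vectors. */
SmallSet set_union(SmallSet a, SmallSet b)
{
    return (SmallSet)(a | b);
}
```

The eight possible bit patterns of the three low bits correspond to the eight subsets listed above.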
Pointers
Pointers: provide indirect access to elements of a
known type
It is more efficient to move or copy a pointer than the data
structure it points to
Necessary to implement dynamic data structures
Lists,
Trees,
Graphs,
…
Size and layout of storage for pointers are known statically
Dynamic data structures can grow/shrink at run time
by allocating/deallocating fixed size memory chunks
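As a sketch of a dynamic data structure built from fixed-size chunks, here is a singly linked list in C. The names Node, push_front, list_length and free_list are illustrative only:

```c
#include <stdlib.h>

/* Each node is a fixed-size chunk; the list grows one chunk at a time. */
typedef struct Node {
    int value;
    struct Node *next;   /* pointer: indirect access to another Node */
} Node;

/* Allocate one node and link it in front of head. */
Node *push_front(Node *head, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;     /* allocation failed: list is unchanged */
    n->value = value;
    n->next = head;
    return n;
}

int list_length(const Node *head)
{
    int len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/* Shrink the list back to nothing, one chunk at a time. */
void free_list(Node *head)
{
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}
```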
Dangling pointers, garbage and
memory leaks
A pointer that still points to a storage
area that has been deallocated is left
“dangling”
Storage that is still allocated but
is no longer accessible (through a
pointer to it) is called “garbage”
Programs that create garbage are
said to have “memory leaks”
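A small C sketch of both problems and one common defensive habit. The helper names make_cell and release are hypothetical:

```c
#include <stdlib.h>

/* Allocate one int cell on the heap; returns NULL if allocation fails. */
int *make_cell(int v)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = v;
    return p;
}

/* Free the cell and null the caller's pointer so it cannot dangle. */
void release(int **p)
{
    free(*p);
    *p = NULL;
}

/*
 * Garbage:   int *p = make_cell(1); p = make_cell(2);
 *            The first cell is still allocated but unreachable.
 * Dangling:  int *q = make_cell(3); free(q);
 *            q still holds the address of deallocated storage.
 */
```

Nulling the pointer in release only protects that one variable; any other copy of the pointer is still left dangling.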
Types of expressions
Types extend from values to expressions:
the type of an expression x + y can be
inferred from the types of x and y
Types of variable bindings
1. Static or early bindings
2. Dynamic or late bindings
C, Pascal, … have static bindings of types
and dynamic bindings of values to
variables.
Lisp, Smalltalk have dynamic binding of
both values and types
Type systems
Language design principle:
Every expression must have a type that is
known (at the latest, at run time)
Type system: a set of rules for associating
a type with an expression
It allows one to determine the appropriate use of the
operators in an expression
Basic rule of type checking: if f has type S → T and a has
type S, then f(a) has type T
Concepts that extend the basic rule:
1. Overloading: one symbol with multiple meanings
2. Coercion: implicit conversion from one type to another
3. Polymorphism: parameterized types
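Overloading and coercion can both be seen in a few lines of C. In this sketch (function names are illustrative), the operator / is overloaded: it means integer division for int operands and floating-point division when either operand is a double:

```c
/* Both operands are int, so "/" is integer division; the int result
 * is then coerced to double by the return. */
double mean_truncated(int sum, int count)
{
    return sum / count;
}

/* The cast makes one operand double, so count is coerced to double
 * and "/" is floating-point division. */
double mean_coerced(int sum, int count)
{
    return (double)sum / count;
}
```

For sum = 7 and count = 2 the first function yields 3.0 and the second 3.5, so where the coercion happens changes the result.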
Types and error checking
Static and Dynamic Checking
A type error occurs if an operation is
improperly applied
Static checking is done on the program text,
before execution
Dynamic checking is done during
program execution
Strong typing ensures freedom from type
errors
Miscellaneous
Short-circuit evaluation of Boolean
expressions
Type coercion
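A sketch of why short-circuit evaluation matters, using a linear search in C (index_of is a hypothetical helper). Because && evaluates its right operand only when the left one is true, a[i] is never read once i reaches n:

```c
#include <stddef.h>

/* Return the index of the first x in a[0..n-1], or -1 if x is absent. */
int index_of(const int *a, size_t n, int x)
{
    size_t i = 0;
    /* The bounds test guards the array access on its right. */
    while (i < n && a[i] != x)
        i++;
    return i < n ? (int)i : -1;
}
```

Without short-circuit evaluation, the loop condition would read a[n] on an unsuccessful search.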
Chapter 5
of Programming Languages
Ravi Sethi
Procedures
• Introduction to Procedures
• Parameter Passing Methods
• Scope Rules for Names
• Nested Scope in the Source Text
INTRODUCTION TO PROCEDURES
Procedures are constructs for giving a
name to a piece of code (the body)
When the name is called, the body is
executed.
Elements of a procedure
A name for the declared procedure
A body consisting of local declarations and statements
The formal parameters, which are placeholders for the actual parameters
An optional result type
Example (Pascal)
function square ( x : integer): integer;
begin
  square := x * x
end ;
Example (C)
int square ( int x)
{
int sq;
sq = x * x;
return sq;
}
RECURSION : MULTIPLE ACTIVATION
Activation - Each execution of a procedure body is referred to as an activation
of the procedure
Recursion - A procedure is recursive if it can be activated from within its own
procedure body
Example- Factorial function
function f( n : integer) : integer;
begin
if n = 0 then f := 1 else
f := n * f ( n - 1 )
end ;
f(n) is computed in terms of f(n-1), f(n-1) in terms of f(n-2), and so on.
For n = 3 the sequence of activations is as follows:
f(3) = 3 * f(2)
f(2) = 2 * f(1)
f(1) = 1 * f(0)
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 6
5.2 PARAMETER PASSING METHODS
• Call by Value
• Call by Reference
• Call by Value-Result
Value Parameter
[Figure: a call from main to future_value. The actual parameters in main are expressions (total, rate, year2-year1); their values are copied into the formal parameters of future_value (initial_balance, p, nyear).]
Example: call by reference (Pascal var parameters)
procedure swap(var x : integer; var y : integer);
var z : integer;
begin
  z := x; x := y; y := z
end;
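C passes every argument by value, so the effect of the Pascal var parameters has to be obtained by passing pointers explicitly. The sketch below contrasts the two; the function names are illustrative:

```c
/* Call by value: x and y are copies, so the caller sees no change. */
void swap_by_value(int x, int y)
{
    int z = x;
    x = y;
    y = z;
    (void)x;   /* the swapped copies are discarded on return */
    (void)y;
}

/* Call by reference, simulated with pointers to the caller's variables. */
void swap_by_reference(int *x, int *y)
{
    int z = *x;
    *x = *y;
    *y = z;
}
```

Calling swap_by_value(a, b) leaves a and b untouched, while swap_by_reference(&a, &b) exchanges them.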
OBSERVATIONS
• When a function completes, the flow of control returns to the place that called it.
SCOPE RULES FOR NAMES
procedure swap(var x, y: T)
The procedure declaration also contains binding occurrences of the procedure name swap and of the
formal parameters x and y. The scope of the formal parameters x and y, and of the local
variable z, is the procedure body.
LEXICAL AND DYNAMIC SCOPES
Lexical Scope
• Also called Static Scope
• Binding of name occurrences to declarations is done statically, at compile time
• A variable that is free in a procedure gets its value from the environment
in which the procedure is defined, rather than from where the procedure is
called
• The binding of a variable is determined by the structure of the program, not by what
happens at run time.
[Figure: blocks and the names they declare. Block A declares V, W, X; block B declares V, Y; block C declares V, W, Z.]
Dynamic Scope
• A free variable gets its value from the environment from which the procedure is
called, rather than from the environment in which the procedure is defined.
program L;
var n : char;            { n declared in L }
procedure W;
begin
  writeln(n)             { Occurrence of n in W }
end;
procedure D;
var n : char;            { n redeclared in D }
begin
  n := 'D';
  W                      { W called within D }
end;
begin { L }
  n := 'L';
  W;                     { W called from the main program L }
  D
end.
NESTED SCOPES - PROCEDURE DECLARATION IN PASCAL
Program Nested (Input, Output);
var X, Y : Real;                  { scope of Y: the whole program }
  Procedure Outer (var X : Real);
  var M, N : Integer;             { scope of M: Outer, including Inner }
    Procedure Inner (Z : Real);
    var N, O : Integer;           { scope of Z: the body of Inner }
    begin { Inner }
      ...
    end; { Inner }
  begin { Outer }
    ...
  end; { Outer }
begin { Nested }
  ...
end. { Nested }
Activation Records
Each execution of a procedure body is called
an activation of the body
Associated with each activation of a
body is storage for the variables
declared in the body, called an
activation record
Mapping or Binding Times
Compile time
Activation time
Run time
Compile Time
Binding of name occurrences to
declarations is defined in terms of
lexical context
{
int i;
{
int i,j; …
}
…
}
Activation Time
Binding of declarations to locations is
done at activation time - this is
important in recursive procedures
[Figure: name → declaration (scope), declaration → location (activation), location → value (state)]
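The point about recursion can be checked with a small C sketch. The single declaration of local below is bound to a fresh location in every activation; distinct_locals is a hypothetical helper that compares the address of its own local with the address of its caller's local while both activations are still live:

```c
#include <stddef.h>

/*
 * Returns 1 if, at every level of the recursion, this activation's
 * "local" occupies a different location from the caller's "local".
 * Pass NULL for caller_local on the outermost call.
 */
int distinct_locals(int depth, const int *caller_local)
{
    int local = depth;   /* one declaration, one location per activation */
    int differs = (caller_local == NULL) || (&local != caller_local);

    if (depth == 0)
        return differs;
    return differs && distinct_locals(depth - 1, &local);
}
```

A call such as distinct_locals(3, NULL) creates four activations, and each one has its own location for local.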
Run Time
The binding of locations to values is
done dynamically at run time and can
be changed by assignments
Control Flow Between Activations
In a sequential language, only one procedure
executes at a time
P calls Q: P is put on hold and Q is activated;
when Q finishes, execution resumes with
P
Elements of an Activation Record
Control link: points to the activation record of the caller
Access link: static link, used to implement lexically scoped languages
Saved state
Parameters
Function result
Local variables
Results can be different under
lexical and dynamic scope
Lexical: follow the access link to the block that
contains the declaration
Dynamic: follow the control links to
the nearest binding
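The two lookup strategies can be sketched in C with a simplified activation record that keeps only the two links and a single variable n. The sketch replays the call chain L calls D, D calls W from the program L example in these slides; Frame, lookup_lexical, lookup_dynamic and w_called_from_d are hypothetical names:

```c
#include <stddef.h>

typedef struct Frame {
    const char *proc;             /* procedure this record belongs to */
    struct Frame *control_link;   /* activation record of the caller */
    struct Frame *access_link;    /* record of the enclosing declaration */
    int declares_n;               /* does this procedure declare n? */
    char n;                       /* value of n, if declared here */
} Frame;

/* Lexical scope: walk the access links to the declaring block. */
char lookup_lexical(const Frame *f)
{
    while (f != NULL && !f->declares_n)
        f = f->access_link;
    return f != NULL ? f->n : '?';
}

/* Dynamic scope: walk the control links to the nearest binding. */
char lookup_dynamic(const Frame *f)
{
    while (f != NULL && !f->declares_n)
        f = f->control_link;
    return f != NULL ? f->n : '?';
}

/* Build the records for L -> D -> W and look up n from inside W. */
void w_called_from_d(char *lexical, char *dynamic)
{
    Frame L = { "L", NULL, NULL, 1, 'L' };
    Frame D = { "D", &L, &L, 1, 'D' };   /* called by L, declared in L */
    Frame W = { "W", &D, &L, 0, 0 };     /* called by D, declared in L */

    *lexical = lookup_lexical(&W);
    *dynamic = lookup_dynamic(&W);
}
```

When W is called from D, the lexical lookup finds the n of L and the dynamic lookup finds the n of D, which is why the same program can print different results under the two rules.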
Heap
Storage area for activation records
the records stay here as long as they are
needed
pieces are allocated and freed in some
relatively unstructured manner
problems of storage allocation, recovery,
compaction and reuse may be severe
garbage collection - technique to reclaim
storage that is no longer needed
Stack
Activation records held in a stack
storage reused efficiently
storage is allocated when activation
begins and released when ends
a stack imposes restrictions on
language design, for example on
functions as parameters
Memory Layout
Displays
Optimization technique for obtaining
faster access to nonlocals
Array of pointers to activation
records, indexed by lexical nesting
depth
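A minimal C sketch of a display, assuming a fixed maximum nesting depth and a toy Frame type with four local slots (all names here are hypothetical). On entry a procedure saves the old display entry for its depth and installs its own record; on exit it restores the saved entry:

```c
#include <stddef.h>

#define MAX_DEPTH 8

typedef struct Frame {
    int depth;       /* lexical nesting depth of the procedure */
    int locals[4];   /* local variables of this activation */
} Frame;

/* display[d] points to the most recent activation at nesting depth d. */
static Frame *display[MAX_DEPTH];

/* Procedure entry: install f, return the entry it replaced. */
Frame *enter(Frame *f)
{
    Frame *saved = display[f->depth];
    display[f->depth] = f;
    return saved;
}

/* Procedure exit: restore the entry saved by enter. */
void leave(const Frame *f, Frame *saved)
{
    display[f->depth] = saved;
}

/* A nonlocal access is one array index plus one offset. */
int nonlocal(int depth, int offset)
{
    return display[depth]->locals[offset];
}
```

Compared with following a chain of access links, the access cost here does not depend on how many nesting levels separate the use from the declaration.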
Homework
Problem 5.4 – page 199 of textbook
Consider the following procedure parens, which reads strings such as []([]){[]} and checks whether the
opening parentheses match the closing parentheses.
void parens(void) {
    for ( ; ; ) {
        switch(lookahead) {
        case '{':
            M('{'); parens(); M('}'); continue;
        case '(':
            M('('); parens(); M(')'); continue;
        case '[':
            M('['); parens(); M(']'); continue;
        default: return;
        }
    }
}
• Complete the program by giving an implementation for procedure M and by supplying an appropriate
main program. The program should output the string “OK” iff the input string consists of balanced
parentheses.
• How would procedure parens handle strings like abc[a(b+d)f]gh ?
• How would you change procedure parens so that strings like the one above are considered OK?