Unit 4 Symbol Table
Symbol Tables, Hashing, and Hash Tables – 1 Compiler Design – © Muhammed Mudawwar
Symbol Table Interface
The basic operations defined on a symbol table include:
allocate – to allocate a new empty symbol table
free – to remove all entries and free the storage of a symbol table
insert – to insert a name in a symbol table and return a pointer to its entry
lookup – to search for a name and return a pointer to its entry
set_attribute – to associate an attribute with a given entry
get_attribute – to get an attribute associated with a given entry
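The six operations above can be sketched in C as follows. This is a minimal illustration, not the slides' implementation: the `st_` prefix (which avoids a clash with the standard `free`), the `Entry` and `SymbolTable` types, and the single opaque attribute per entry are all assumptions, and an unsorted linked list is used only to keep the sketch short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: one opaque attribute per entry, for brevity. */
typedef struct Entry {
    char *name;            /* owned copy of the identifier */
    void *attribute;       /* attribute set by st_set_attribute */
    struct Entry *next;
} Entry;

typedef struct SymbolTable {
    Entry *head;           /* unsorted singly linked list of entries */
} SymbolTable;

/* allocate: create a new empty symbol table (NULL on failure). */
SymbolTable *st_allocate(void) {
    return calloc(1, sizeof(SymbolTable));
}

/* lookup: return the entry for name, or NULL if it is absent. */
Entry *st_lookup(SymbolTable *t, const char *name) {
    for (Entry *e = t->head; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0) return e;
    return NULL;
}

/* insert: return the existing entry, or add a new one at the front. */
Entry *st_insert(SymbolTable *t, const char *name) {
    Entry *e = st_lookup(t, name);
    if (e != NULL) return e;
    e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) { free(e); return NULL; }
    strcpy(e->name, name);
    e->attribute = NULL;
    e->next = t->head;
    t->head = e;
    return e;
}

/* set_attribute / get_attribute: store and fetch the entry's attribute. */
void st_set_attribute(Entry *e, void *attr) { e->attribute = attr; }
void *st_get_attribute(const Entry *e) { return e->attribute; }

/* free: remove all entries and release the table itself. */
void st_free(SymbolTable *t) {
    Entry *e = t->head;
    while (e != NULL) {
        Entry *next = e->next;
        free(e->name);
        free(e);
        e = next;
    }
    free(t);
}
```

With this layout lookup is O(n); the hash-based organizations discussed later keep the same interface while changing only how entries are found.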
Ordered List
If an array is sorted, it can be searched using binary search – O(log₂ n)
Insertion into a sorted array is expensive – O(n) on average
Useful when the set of names is known in advance, e.g. a table of reserved words
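As an illustration of the reserved-word case, the sketch below binary-searches a small sorted array. The word list and the name `reserved_index` are assumptions chosen for the example; a real compiler would list its own language's keywords.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword list; it must stay sorted in strcmp order. */
static const char *const reserved[] = {
    "break", "case", "do", "else", "for", "if", "return", "while"
};

/* Return the index of word in reserved[], or -1 if it is not reserved. */
int reserved_index(const char *word) {
    size_t lo = 0, hi = sizeof reserved / sizeof reserved[0];
    while (lo < hi) {                      /* search the range [lo, hi) */
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(word, reserved[mid]);
        if (cmp == 0) return (int)mid;
        if (cmp < 0) hi = mid;             /* continue in the left half */
        else lo = mid + 1;                 /* continue in the right half */
    }
    return -1;
}
```

Because the table never changes after it is written, the O(n) insertion cost never arises and every lookup takes at most about log₂ n comparisons.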
Hash Tables and Hash Functions
A hash table is an array with index range: 0 to TableSize – 1
Hash Functions
Hash functions can be defined in many ways . . .
Implementing a Hash Function
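// Hash string s of length n (characters s[0] .. s[n-1])
// Hash value = s[n-1] + 16*(s[n-2] + ... + 16*(s[1] + 16*s[0]))
// Return hash value (independent of table size)
unsigned hash(const char* s) {
    unsigned hval = 0;
    while (*s != '\0') {
        // Cast so that characters above 127 are not sign-extended
        hval = (hval << 4) + (unsigned char) *s;
        s++;
    }
    return hval;
}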
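Since the hash value is independent of the table size, it still has to be reduced to an index in the range 0 to TableSize – 1. A minimal sketch, assuming reduction by the modulo operator and a hypothetical table size of 211 (neither is stated in the slides):

```c
#define TABLE_SIZE 211u   /* hypothetical table size, chosen for illustration */

/* The shift-and-add hash: multiply by 16, then add the next character. */
unsigned hash(const char *s) {
    unsigned hval = 0;
    while (*s != '\0') {
        hval = (hval << 4) + (unsigned char)*s;
        s++;
    }
    return hval;
}

/* Reduce the size-independent hash value to a valid table index. */
unsigned table_index(const char *s) {
    return hash(s) % TABLE_SIZE;
}
```

For example, `hash("ab")` is 97·16 + 98 = 1650 in ASCII, so `table_index("ab")` is 1650 mod 211 = 173. Note that with a 32-bit `unsigned`, each 4-bit shift eventually pushes early characters out of the word, so for long names only roughly the last eight characters influence the result.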
Another Hash Function
// Treat string s as an array of unsigned integers
// Fold array into an unsigned integer using addition
// Return hash value (independent of table size)
// Assumes a 4-byte unsigned
#include <string.h>   // memcpy
unsigned hash(const char* s) {
    unsigned hval = 0;
    while (s[0] != 0 && s[1] != 0 && s[2] != 0 && s[3] != 0) {
        unsigned u = 0;
        memcpy(&u, s, 4);   // avoids an unaligned read through a cast pointer
        hval += u;
        s += 4;
    }
    // The last (up to 3) characters are handled in a special way
    if (s[0] == 0) return hval;
    hval += (unsigned char) s[0];
    if (s[1] == 0) return hval;
    hval += (unsigned) (unsigned char) s[1] << 8;
    if (s[2] == 0) return hval;
    hval += (unsigned) (unsigned char) s[2] << 16;
    return hval;
}
Resolving Collisions – Open Addressing
A collision occurs when h(name1) = h(name2) and name1 ≠ name2
Collisions are inevitable because
The name space of identifiers is much larger than the table size
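One common form of open addressing is linear probing: on a collision, try the next slot, wrapping around at the end of the array. The sketch below assumes that variant (the slide title does not say which probe sequence is meant) and uses a hypothetical 8-slot table so that collisions are easy to provoke. The names `oa_insert` and `oa_lookup` are illustrative, and the table stores the caller's pointers rather than copies.

```c
#include <string.h>

#define TABLE_SIZE 8u                   /* hypothetical, deliberately small */

static const char *slots[TABLE_SIZE];   /* NULL marks an empty slot */

static unsigned hash(const char *s) {
    unsigned hval = 0;
    while (*s != '\0') hval = (hval << 4) + (unsigned char)*s++;
    return hval;
}

/* Return the slot holding name, inserting it if absent; -1 if the table is full. */
int oa_insert(const char *name) {
    unsigned start = hash(name) % TABLE_SIZE;
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        unsigned k = (start + i) % TABLE_SIZE;    /* linear probe sequence */
        if (slots[k] == NULL) { slots[k] = name; return (int)k; }
        if (strcmp(slots[k], name) == 0) return (int)k;
    }
    return -1;
}

/* Return the slot holding name, or -1 if it is not in the table. */
int oa_lookup(const char *name) {
    unsigned start = hash(name) % TABLE_SIZE;
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        unsigned k = (start + i) % TABLE_SIZE;
        if (slots[k] == NULL) return -1;          /* an empty slot ends the search */
        if (strcmp(slots[k], name) == 0) return (int)k;
    }
    return -1;
}
```

In ASCII, "a" (97) and "i" (105) both hash to slot 1 of this table, so inserting "a" and then "i" places "i" in slot 2.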
Chaining by Separate Lists
Drawbacks of open addressing:
As the array fills, collisions become more frequent – reduced performance
Table size is an issue – dynamically increasing the table size is difficult
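Chaining sidesteps both drawbacks by keeping a linked list per array slot, so the number of names is not bounded by the array size. A minimal sketch, assuming a hypothetical 8-bucket table and the illustrative names `chain_insert` and `chain_lookup`:

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8u                 /* hypothetical, deliberately small */

typedef struct Node {
    char *name;                     /* owned copy of the identifier */
    struct Node *next;              /* next name with the same bucket index */
} Node;

static Node *buckets[NBUCKETS];     /* each bucket heads a separate list */

static unsigned hash(const char *s) {
    unsigned hval = 0;
    while (*s != '\0') hval = (hval << 4) + (unsigned char)*s++;
    return hval;
}

/* Return the node for name, or NULL if it is absent. */
Node *chain_lookup(const char *name) {
    for (Node *n = buckets[hash(name) % NBUCKETS]; n != NULL; n = n->next)
        if (strcmp(n->name, name) == 0) return n;
    return NULL;
}

/* Return the existing node, or link a new one at the front of its bucket. */
Node *chain_insert(const char *name) {
    Node *n = chain_lookup(name);
    if (n != NULL) return n;
    n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->name = malloc(strlen(name) + 1);
    if (n->name == NULL) { free(n); return NULL; }
    strcpy(n->name, name);
    unsigned b = hash(name) % NBUCKETS;
    n->next = buckets[b];
    buckets[b] = n;
    return n;
}
```

Colliding names simply share a list: inserting "a" and then "i" (both bucket 1 in ASCII) leaves "i" at the head of that bucket with "a" behind it.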
Definition
Symbol table: a data structure used by a compiler to keep
track of the semantics of names.
• Data type.
• Where it is used: scope.
. The effective context where a name is valid.
• Where it is stored: storage address.
Operations:
• Search: whether a name has been used.
• Insert: add a name.
• Delete: remove a name when its scope is closed.
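The interplay of the three operations with scope can be sketched as follows. This is one possible arrangement, not necessarily the one the slides intend: names are kept in a list with the most recent declaration first, each tagged with a scope level, and all function and type names (`declare`, `search`, `open_scope`, `close_scope`, `Sym`) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Sym {
    char *name;
    int level;                 /* scope depth at which the name was declared */
    struct Sym *next;
} Sym;

static Sym *symbols = NULL;    /* most recent declaration first */
static int current_level = 0;

void open_scope(void) { current_level++; }

/* Insert: add a name to the current scope. */
Sym *declare(const char *name) {
    Sym *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) { free(s); return NULL; }
    strcpy(s->name, name);
    s->level = current_level;
    s->next = symbols;
    symbols = s;
    return s;
}

/* Search: return the innermost visible declaration of name, or NULL. */
Sym *search(const char *name) {
    for (Sym *s = symbols; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0) return s;
    return NULL;
}

/* Delete: drop every name declared in the scope being closed. */
void close_scope(void) {
    while (symbols != NULL && symbols->level == current_level) {
        Sym *dead = symbols;
        symbols = dead->next;
        free(dead->name);
        free(dead);
    }
    current_level--;
}
```

Because inner declarations sit in front of outer ones, a redeclared name shadows the outer one until its scope closes, after which the outer declaration becomes visible again.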
Hash table:
. most commonly used;
. very efficient provided the memory space is adequately larger than the number
of variables;
. performance may be bad if unlucky or if the table is saturated;
. coding is not too difficult.
• Quadratic rehashing:
. try (h(n) + 1²) mod m, and then
. try (h(n) + 2²) mod m, . . .,
. try (h(n) + i²) mod m.
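The probe sequence above can be sketched as follows, starting at h(n) itself (i = 0). The table size m = 11, the shift-and-add hash, and the names `quad_probe` and `qr_insert` are assumptions made for the example; the table stores the caller's pointers rather than copies.

```c
#include <string.h>

#define M 11u                        /* hypothetical table size m */

static const char *table[M];         /* NULL marks an empty slot */

static unsigned hash(const char *s) {
    unsigned hval = 0;
    while (*s != '\0') hval = (hval << 4) + (unsigned char)*s++;
    return hval;
}

/* i-th slot of the quadratic probe sequence: (h + i*i) mod m. */
unsigned quad_probe(unsigned h, unsigned i) {
    return (h % M + (i * i) % M) % M;
}

/* Return the slot holding name, inserting it if absent.
   Returns -1 if no free slot is found within M probes. */
int qr_insert(const char *name) {
    unsigned h = hash(name);
    for (unsigned i = 0; i < M; i++) {
        unsigned k = quad_probe(h, i);
        if (table[k] == NULL) { table[k] = name; return (int)k; }
        if (strcmp(table[k], name) == 0) return (int)k;
    }
    return -1;
}
```

In ASCII, "a", "l" and "w" all hash to slot 9 of this table, so inserting them in that order fills slots 9, 10 (9 + 1²) and 2 ((9 + 2²) mod 11). A quadratic sequence does not necessarily visit every slot, so an insert can fail before the table is completely full; this is one reason to keep the table well below saturation.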
X record
C boolean