Compiler Construction
William M. Waite
Department of Electrical Engineering
University of Colorado
Boulder, Colorado 80309
USA
email: [email protected]
Gerhard Goos
Institut für Programmstrukturen und Datenorganisation
Fakultät für Informatik
Universität Karlsruhe
D-76128 Karlsruhe
Germany
email: [email protected]
All known errors from the first and second printing (1994 and 1995) have been fixed. While every
precaution has been taken in preparation of this book, the authors assume no responsibility for errors
or omissions, or damages resulting from the use of the information contained here.
© 1984-1994 by Springer-Verlag, Berlin, New York Inc. ISBN 0-387-90821-8 and ISBN 3-540-90821
© 1995 by William M. Waite and Gerhard Goos.
All rights reserved. No part of this book may be translated, reproduced, archived or sold in any form
without written permission from one of the authors.
The content of Compiler Construction is made available via the Web by permission of the authors
as a service to the community and only for educational purposes. The book may be accessed freely
via Web browsers. The URL is ftp://i44ftp.info.uni-karlsruhe.de/pub/papers/ggoos/CompilerConstruction.ps.gz.
Karlsruhe, 22nd February 1996
To all who know more than one language
Preface
Compilers and operating systems constitute the basic interfaces between a programmer and
the machine for which he is developing software. In this book we are concerned with the
construction of the former. Our intent is to provide the reader with a firm theoretical basis
for compiler construction and sound engineering principles for selecting alternate methods,
implementing them, and integrating them into a reliable, economically viable product. The
emphasis is upon a clean decomposition employing modules that can be re-used for many
compilers, separation of concerns to facilitate team programming, and flexibility to
accommodate hardware and system constraints. A reader should be able to understand the
questions he must ask when designing a compiler for language X on machine Y, what tradeoffs
are possible, and what performance might be obtained. He should not feel that any part of the
design rests on whim; each decision must be based upon specific, identifiable characteristics
of the source and target languages or upon design goals of the compiler.
The vast majority of computer professionals will never write a compiler. Nevertheless,
study of compiler technology provides important benefits for almost everyone in the field.
It focuses attention on the basic relationships between languages and machines.
Understanding of these relationships eases the inevitable transitions to new hardware and
programming languages and improves a person's ability to make appropriate tradeoffs
in design and implementation.
It illustrates application of software engineering techniques to the solution of a significant
problem. The problem is understandable to most users of computers, and involves both
combinatorial and data processing aspects.
Many of the techniques used to construct a compiler are useful in a wide variety of applications
involving symbolic data. In particular, every man-machine interface constitutes
a form of programming language and the handling of input involves these techniques.
We believe that software tools will be used increasingly to support many aspects of
compiler construction. Much of Chapters 7 and 8 is therefore devoted to parser generators
and analyzers for attribute grammars. The details of this discussion are only
interesting to those who must construct such tools; the general outlines must be known
to all who use them. We also realize that construction of compilers by hand will remain
an important alternative, and thus we have presented manual methods even for those
situations where tool use is recommended.
Virtually every problem in compiler construction has a vast number of possible solutions.
We have restricted our discussion to the methods that are most useful today, and make no
attempt to give a comprehensive survey. Thus, for example, we treat only the LL and LR
parsing techniques and provide references to the literature for other approaches. Because we
do not constantly remind the reader that alternative solutions are available, we may sometimes
appear overly dogmatic although that is not our intent.
Chapters 5 and 8, and Appendix B, state most theoretical results without proof. Although
this makes the book unsuitable for those whose primary interest is the theory underlying a
compiler, we felt that emphasis on proofs would be misplaced. Many excellent theoretical
texts already exist; our concern is reduction to practice.
A compiler design is carried out in the context of a particular language/machine pair.
Although the principles of compiler construction are largely independent of this context, the
detailed design decisions are not. In order to maintain a consistent context for our major
examples, we therefore need to choose a particular source language and target machine. The
source language that we shall use is defined in Appendix A. We chose not to use an existing
language for several reasons, the most important being that a new language enabled us to
control complexity: Features illustrating significant questions in compiler design could be
included while avoiding features that led to burdensome but obvious detail. It also allows
us to illustrate how a compiler writer derives information about a language, and provides an
example of an informal but relatively precise language definition.
We chose the machine language of the IBM 370 and its imitators as our target. This
architecture is widely used, and in many respects it is a difficult one to deal with. The
problems are representative of many computers, the important exceptions being those (such
as the Intel 8086) without a set of general registers. As we discuss code generation and
assembly strategies we shall point out simplifications for more uniform architectures like
those of the DEC PDP11 and Motorola 68000.
We assume that the reader has a minimum of one year of experience with a block-
structured language, and some familiarity with computer organization. Chapters 5 and 8
use notation from logic and set theory, but the material itself is straightforward. Several
important algorithms are based upon results from graph theory summarized in Appendix B.
This book is based upon many compiler projects and upon the lectures given by the
authors at the Universität Karlsruhe and the University of Colorado. For self-study, we
recommend that a reader with very little background begin with Section 1.1, Chapters 2
and 3, Section 12.1 and Appendix A. His objective should be to thoroughly understand the
relationships between typical programming languages and typical machines, relationships that
define the task of the compiler. It is useful to examine the machine code produced by existing
compilers while studying this material. The remainder of Chapter 1 and all of Chapter 4 give
an overview of the organization of a compiler and the properties of its major data structures,
while Chapter 14 shows how three production compilers have been structured. From this
material the reader should gain an appreciation for how the various subtasks relate to one
another, and the important characteristics of the interfaces between them.
Chapters 5, 6 and 7 deal with the task of determining the structure of the source program.
This is perhaps the best-understood of all compiler tasks, and the one for which the most
theoretical background is available. The theory is summarized in Chapter 5, and applied in
Chapters 6 and 7. Readers who are not theoretically inclined, and who are not concerned
with constructing parser generators, should skim Chapter 5. Their objectives should be to
understand the notation for describing grammars, to be able to deal with finite automata,
and to understand the concept of using a stack to resolve parenthesis nesting. These readers
should then concentrate on Chapter 6, Section 7.1 and the recursive descent parse algorithm
of Section 7.2.2.
The relationship between Chapter 8 and Chapter 9 is similar to that between Chapter 5
and Chapter 7, but the theory is less extensive and less formal. This theory also underlies
parts of Chapters 10 and 11. We suggest that the reader who is actually engaged in com-
piler construction devote more effort to Chapters 8-11 than to Chapters 5-7. The reason is
that parser generators can be obtained "off the shelf" and used to construct the lexical and
syntactic analysis modules quickly and reliably. A compiler designer must typically devote
most of his effort to specifying and implementing the remainder of the compiler, and hence
familiarity with Chapters 8-11 will have a greater effect on his productivity.
The lecturer in a one-semester, three-hour course that includes exercises is compelled to
restrict himself to the fundamental concepts. Details of programming languages (Chapter 2),
machines (Chapter 3) and formal languages and automata theory (Chapter 5) can only be
covered in a cursory fashion or must be assumed as background. The specific techniques
for parser development and attribute grammar analysis, as well as the whole of Chapter 13,
must be reserved for a separate course. It seems best to present theoretical concepts from
Chapter 5 in close conjunction with the specific methods of Chapters 6 and 7, rather than as
a single topic. A typical outline is:
1. The Nature of the Problem 4 hours
1.1. Overview of compilation (Chapter 1)
1.2. Languages and machines (Chapters 2 and 3)
2. Compiler Data Structures (Chapter 4) 4 hours
3. Structural Analysis 10 hours
3.1. Formal Systems (Chapter 5)
3.2. Lexical analysis (Chapter 6)
3.3. Parsing (Chapter 7)
Review and Examination 2 hours
4. Consistency Checking 10 hours
4.1. Attribute grammars (Chapter 8)
4.2. Semantic analysis (Chapter 9)
5. Code Generation (Chapter 10) 8 hours
6. Assembly (Chapter 11) 2 hours
7. Error Recovery (Chapter 12) 3 hours
Review 2 hours
The students do not write a compiler during this course. For several years it has been
run concurrently with a practicum in which the students implement the essential parts of a
LAX compiler. They are given the entire compiler, with stubs replacing the parts they are to
write. In contrast to project courses in which the students must write a complete compiler, this
approach has the advantage that they need not be concerned with unimportant organizational
tasks. Since only the central problems need be solved, one can deal with complex language
properties. At the same time, students are forced to read the environment programs and to
adhere to interface specifications. Finally, if a student cannot solve a particular problem it
does not cause his entire project to fail since he can take the solution given by the instructor
and proceed.
Acknowledgements
This book is the result of many years of collaboration. The necessary research projects and
travel were generously supported by our respective universities, the Deutsche Forschungsge-
meinschaft and the National Science Foundation.
It is impossible to list all of the colleagues and students who have influenced our work.
We would, however, like to specially thank four of our doctoral students, Lynn Carter, Bruce
Haddon, Uwe Kastens and Johannes Röhrich, for both their technical contributions and their
willingness to read the innumerable manuscripts generated during the book's gestation. Mae
Jean Ruehlman and Gabriele Sahr also have our gratitude for learning more than they ever
wanted to know about computers and word processing as they produced and edited those
manuscripts.
Contents
Preface

Contents

1 Introduction and Overview
1.1 Translation and Interpretation
1.2 The Tasks of a Compiler
1.3 Data Management in a Compiler
1.4 Compiler Structure
1.5 Notes and References

2 Properties of Programming Languages
2.1 Overview
2.1.1 Syntax, Semantics and Pragmatics
2.1.2 Syntactic Properties
2.1.3 Semantic Properties
2.2 Data Objects and Operations
2.2.1 Elementary Types
2.2.2 Composite Types
2.2.3 Strings
2.2.4 Pointers
2.2.5 Type Equivalence
2.3 Expressions
2.4 Control Structures
2.5 Program Environments and Abstract Machine States
2.5.1 Constants, Variables and Assignment
2.5.2 The Environment
2.5.3 Binding
2.6 Notes and References

3 Properties of Real and Abstract Machines
3.1 Basic Characteristics
3.1.1 Storage Classes
3.1.2 Access Paths
3.1.3 Operations
3.2 Representation of Language Elements
3.2.1 Elementary Objects
3.2.2 Composite Objects
3.2.3 Expressions
3.2.4 Control Structures
while i ≠ j do
if i > j then i := i - j else j := j - i;
a) An algorithm
Initial: i = 36 j = 24
i = 12 j = 24
Final: i = 12 j = 12
b) A particular sequence of states
Figure 1.1: Algorithms and States
For example, consider the language of a simple computer with a single accumulator and two
data locations called I and J respectively (Exercise 1.3). Suppose that M maps a particular
state of the algorithm given in Figure 1.1a to a set of machine states in which I contains the
value of the variable i, J contains the value of the variable j , and the accumulator contains
any arbitrary value. Figure 1.2a shows a translation of Figure 1.1a for this machine; a partial
state sequence is given in Figure 1.2b.
LOOP LOAD I
SUB J
JZERO EXIT
JNEG SUBI
STORE I
JUMP LOOP
SUBI LOAD J
SUB I
STORE J
JUMP LOOP
EXIT
a) An algorithm
Initial: I = 36 J = 24 ACC = ?
I = 36 J = 24 ACC = 36
I = 36 J = 24 ACC = 12
. . .
Final: I = 12 J = 12 ACC = 0
b) A sequence of states corresponding to Figure 1.1b
Figure 1.2: A Translation of Figure 1.1
In determining the state sequence of Figure 1.1b, we used only the concepts of Pascal as
specified by the language definition. For every programming language, PL, we can define
an abstract machine: The operations, data structures and control structures of PL become
the memory elements and instructions of the machine. A `Pascal machine' is therefore an
imaginary computer with Pascal operations as its machine instructions and the data objects
possible in Pascal as its memory elements. Execution of an algorithm written in PL on such
a machine is called interpretation; the abstract machine is an interpreter.
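To make the notion of an interpreter concrete, here is a small Pascal sketch of an interpreter for the accumulator machine of Exercise 1.3, loaded with the translation of Figure 1.2a. The instruction encoding and all identifiers below are our own assumptions; the text does not prescribe any particular implementation.

program AccMachine;
{ A sketch of an interpreter for the accumulator machine of
  Exercise 1.3, loaded with the program of Figure 1.2a.
  Data location I corresponds to data[0], J to data[1]. }
type
  OpCode = (LOAD, STORE, SUB, JUMP, JZERO, JNEG, HALT);
  Instr = record
    op : OpCode;
    arg : integer   { a data location or an instruction index }
  end;
var
  code : array [0 .. 10] of Instr;
  data : array [0 .. 1] of integer;
  acc, pc, cur : integer;
  done : boolean;

procedure Emit (at : integer; o : OpCode; a : integer);
begin
  code[at].op := o; code[at].arg := a
end;

begin
  Emit(0, LOAD, 0);  Emit(1, SUB, 1);   Emit(2, JZERO, 10);
  Emit(3, JNEG, 6);  Emit(4, STORE, 0); Emit(5, JUMP, 0);
  Emit(6, LOAD, 1);  Emit(7, SUB, 0);   Emit(8, STORE, 1);
  Emit(9, JUMP, 0);  Emit(10, HALT, 0);
  data[0] := 36; data[1] := 24;     { initial state of Figure 1.2b }
  acc := 0; pc := 0; done := false;
  while not done do begin
    cur := pc; pc := pc + 1;        { default successor }
    case code[cur].op of
      LOAD : acc := data[code[cur].arg];
      STORE: data[code[cur].arg] := acc;
      SUB  : acc := acc - data[code[cur].arg];
      JUMP : pc := code[cur].arg;
      JZERO: if acc = 0 then pc := code[cur].arg;
      JNEG : if acc < 0 then pc := code[cur].arg;
      HALT : done := true
    end
  end;
  writeln('I = ', data[0], '  J = ', data[1])  { prints I = 12  J = 12 }
end.

Running this program reproduces the final state of Figure 1.2b.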
A pure interpreter analyzes the character form of each source language instruction every
time that instruction is executed. If the given instruction is only to be executed once, pure
interpretation is the least expensive method of all. Hence it is often used for job control
languages and the `immediate commands' of interactive languages. When instructions are
to be executed repeatedly, a better approach is to analyze the character form of the source
program only once, replacing it with a sequence of symbols more amenable to interpretation.
This analysis is simply a translation of the source language into some target language, which
is then interpreted.
The translation from the source language to the target language can take place as each
instruction of the program is executed for the first time (interpretation with substitution).
Thus only that part of the program actually executed will be translated; during testing this
may be only a fraction of the entire program. Also, the character form of the source program
can often be stored more compactly than the equivalent target program. The disadvantage of
interpretation with substitution is that both the compiler and interpreter must be available
during execution. In practice, however, a system of this kind should not be significantly larger
than a pure interpreter for the same language.
Examples may be found of virtually all levels of interpretation. At one extreme are the
systems in which the compiler merely converts constants to internal form, fixes the meaning
of identifiers and perhaps transforms infix notation to postfix (APL and SNOBOL4 are
commonly implemented this way); at the other are the systems in which the hardware, assisted
by a small run-time system, forms the interpreter (FORTRAN and Pascal implementations
usually follow this strategy).
The analysis task can be broken
down into two parts, the structural analysis to determine the static structure of the source
program, and the semantic analysis to fix the additional information and check its consistency.
Chapter 5 summarizes some results from the theory of formal languages and shows how they
are used in the structural analysis of a program. Two subtasks of the structural analysis are
identified on the basis of the particular formalisms employed: Lexical analysis (Chapter 6)
deals with the basic symbols of the source program, and is described in terms of finite-state
automata; syntactic analysis, or parsing, (Chapter 7) deals with the static structure of the
program, and is described in terms of pushdown automata. Chapter 8 extends the theoretical
treatment of Chapter 5 to cover the additional information attached to the components of the
structure, and Chapter 9 applies the resulting formalism (attribute grammars) to semantic
analysis.
There is little in the way of formal models for the entire synthesis process, although al-
gorithms for various subtasks are known. We view synthesis as consisting of two distinct
subtasks, code generation and assembly. Code generation (Chapter 10) transforms the ab-
stract source program appearing at the analysis/synthesis interface into an equivalent target
machine program. This transformation is carried out in two steps: First we map the algo-
rithm from source concepts to target concepts, and then we select a specific sequence of target
machine instructions to implement that algorithm.
Assembly (Chapter 11) resolves all target addressing and converts the target machine
instructions into an appropriate output format. We should stress that by using the term
`assembly' we do not imply that the code generator will produce symbolic assembly code for
input to the assembly task. Instead, it delivers an internal representation of target instructions
in which most addresses remain unresolved. This representation is similar to that resulting
from analysis of symbolic instructions during the first pass of a normal symbolic assembler.
The output of the assembly task should be in the format accepted by the standard link editor
or loader on the target machine.
Errors may appear at any time during the compilation process. In order to detect as
many errors as possible in a single run, repairs must be made such that the program is
consistent, even though it may not reflect the programmer's intent. Violations of the rules of
the source language should be detected and reported during analysis. If the source algorithm
uses concepts of the source language for which no target equivalent has been defined in a
particular implementation, or if the target algorithm exceeds limitations of a specific target
language interpreter (e.g. requires more memory than a specific computer provides), this
should be reported during synthesis. Finally, errors must be reported if any storage limits of
the compiler itself are violated.
In addition to the actual error handling, it is useful for the compiler to provide extra
information for run-time error detection and debugging. This task is closely related to error
handling, and both are discussed in Chapter 12.
A number of strategies may be followed in an attempt to improve the target program
relative to some specied measure of cost. (Code size and execution speed are typical cost
measures.) These strategies may involve deeper analysis of the source program, more complex
mapping functions, and transformations of the target program. We shall treat the first two
in our discussions of analysis and code generation respectively; the third is the subject of
Chapter 13.
(Diagram: the compilation task divided into analysis and synthesis, communicating through a
local structure tree)
Figure 1.4: Decomposition of the Compiler
Our decomposition is based upon our understanding of the compilation problem and our
perception of the best techniques currently available for its solution. The choice of precise
boundaries is driven by control and data flow considerations, primarily minimization of flow
at interfaces. Specific criteria that influenced our decisions will be discussed throughout the
text.
The decomposition is virtually independent of the underlying implementation, and of
the specic characteristics of the source language and target machine. Clearly these factors
influence the complexity of the modules that we have identified, in some cases reducing them
to trivial stubs, but the overall structure remains unchanged.
Analysis and synthesis may proceed concurrently
and interact as coroutines: As soon as the analyzer has extracted an element of the structure
tree, the synthesizer is activated to process this element further. In this case the structure
tree will never be built as a concrete object, but is simply an abstract data structure; only
the element being processed exists in concrete form.
(Module specification: Code Generation - input: structure tree; output: target tree and error reports)
Exercises
1.1 Consider the Pascal algorithm of Figure 1.1a.
(a) What are the elementary objects and operations?
(b) What are the rules for chronological relations?
(c) What composition rules are used to construct the static program?
1.2 Determine the state transformation function, f , for the algorithm of Figure 1.1a. What
initial states guarantee termination? How do you characterize the corresponding nal
states?
1.3 Consider a simple computer with an accumulator and two data locations. The instruc-
tion set is:
LOAD d: Copy the contents of data location d to the accumulator.
STORE d: Copy the contents of the accumulator to data location d.
SUB d: Subtract the contents of data location d from the accumulator, leaving
the result in the accumulator. (Ignore any possibility of overflow.)
JUMP i: Execute instruction i next.
JZERO i: Execute instruction i next if the accumulator contents are zero.
JNEG i: Execute instruction i next if the accumulator contents are less than
zero.
(a) What are the elementary objects?
(b) What are the elementary actions?
(c) What composition rules are used?
(d) Complete the state sequence of Figure 1.2b.
Chapter 2
Properties of Programming Languages
Programming languages are often described by stating the meaning of the constructs (expressions,
statements, clauses, etc.) interpretively. This description implicitly defines an
interpreter for an abstract machine whose machine language is the programming language.
The output of the analysis task is a representation of the program to be compiled in
terms of the operations and data structures of this abstract machine. By means of code
generation and the run-time system, these elements are modeled by operation sequences and
data structures of the computer and its basic software (operating system, etc.).
In this chapter we explore the properties of programming languages that determine the
construction and possible forms of the associated abstract machines, and demonstrate the
correspondence between the elements of the programming language and the abstract machine.
On the basis of this discussion, we select the features of our example source language, LAX.
A complete definition of LAX is given in Appendix A.
2.1 Overview
The basis of every language implementation is a language definition. (See the Bibliography
for a list of the language definitions that we shall refer to in this book.) Users of the language
read the definition as a user manual: What is the practical meaning of the primitive elements?
How can they be meaningfully used? How can they be combined in a meaningful way? The
compiler writer, on the other hand, is interested in the question of which constructions are
permitted. Even if he cannot at the moment see any useful application of a construct, or if
the construct leads to serious implementation difficulties, he must implement it exactly as
specified by the language definition. Descriptions such as programming textbooks, which are
oriented towards the meaningful applications of the language elements, do not clearly define
the boundaries between what is permitted and what is prohibited. Thus it is difficult to make
use of such descriptions as bases for the construction of a compiler. (Programming textbooks
are also informal, and often cover only a part of the language.)
The semantics of a language relate the constructs
of the language to concepts outside the language (to concepts of mathematics or to the objects
and operations of a computer, for example).
Semantics include properties that can be deduced without executing the program as well
as those only recognizable during execution. Following Griffiths [1973], we denote these
properties static and dynamic semantics respectively. The assignment of a particular property
to one or the other of these classes is partially a design decision by the compiler writer. For
example, some implementations of ALGOL 60 assign the distinction between integer and real
to the dynamic semantics, although this distinction can normally be made at compile time
and thus could belong to the static semantics.
Pragmatic considerations appear in language definitions as unelaborated statements of
existence, as references to other areas of knowledge, as appeals to intuition, or as explicit
statements. Examples are the statements `[Boolean] values are the truth values denoted by the
identifiers true and false' (Pascal Report, Section 6.1.2), `their results are obtained in the sense
of numerical analysis' (ALGOL 68 Revised Report, Section 2.1.3.1.e) or `decimal numbers have
their conventional meaning' (ALGOL 60 Report, Section 2.5.3). Most pragmatic properties
are hinted at through a suggestive choice of words that are not further explained. Statements
that certain constructs only have a defined meaning under specified conditions also belong
to the pragmatics of a language. In such cases the compiler writer is usually free to fix the
meaning of the construct under other conditions. The richer the pragmatics of a language, the
more latitude a compiler writer has for efficient implementation and the heavier the burden
on the user to write his program to give the same answers regardless of the implementation.
We shall set the following goals for our analysis of a language definition:
Stipulation of the syntactic rules specifying construction of programs.
Stipulation of the static semantic rules. These, in conjunction with the syntactic rules,
determine the form into which the analysis portion of the compiler transforms the source
program.
Stipulation of the dynamic semantic rules and differentiation from pragmatics. These
determine the objects and operations of the language-oriented abstract machine, which
can be used to describe the interface between the analysis and synthesis portions of the
compiler: The analyzer translates the source program into an abstract target program
that could run on the abstract machine.
Stipulation of the mapping of the objects and operations of the abstract machine onto
the objects and operations of the hardware and operating system, taking the pragmatic
meanings of these primitives into account. This mapping will be carried out partly by
the code generator and partly by the run-time system; its specification is the basis for
the decisions regarding the partitioning of tasks between these two phases.
2.1.2 Syntactic Properties
The syntactic rules of a language belong to distinct levels according to their meaning. The
lowest level contains the `spelling rules' for basic symbols, which describe the construction
of keywords, identifiers and special symbols. These rules determine, for example, whether
keywords have the form of identifiers (begin) or are written with special delimiters ('BEGIN',
.BEGIN), whether lower case letters are permitted in addition to upper case, and which
spellings (<=, .LE., 'NOT' 'GREATER') are permitted for symbols such as ≤ that cannot be
reproduced on all I/O devices. A common property of these rules is that they do not affect
the meaning of the program being represented. (In this book we have distinguished keywords
by using boldface type. This convention is used only to enhance readability, and does not
imply anything about the actual representation of keywords in program text.)
The second level consists of the rules governing representation and interpretation of constants,
for example rules about the specification of exponents in floating point numbers or
the allowed forms of integers (decimal, hexadecimal, etc.) These rules affect the meanings of
programs insofar as they specify the possibilities for direct representation of constant values.
The treatment of both of these syntactic classes is the task of lexical analysis, discussed in
Chapter 6.
The third level of syntactic rules is termed the concrete syntax. Concrete syntax rules
describe the composition of language constructs such as expressions and statements from basic
symbols. Figure 2.1a shows the parse tree (a graphical representation of the application of
concrete syntax rules) of the Pascal statement `if a or b and c then ... else ...'. Because
the goal of the compiler's analysis task is to determine the meaning of the source program,
semantically irrelevant complications such as operator precedence and certain keywords can
be suppressed. The language constructs are described by an abstract syntax that specifies
the compositional structure of a program while leaving open some aspects of its concrete
representation as a string of basic symbols. Application of the abstract syntax rules can be
illustrated by a structure tree (Figure 2.1b).
(Figure 2.1: a) parse tree and b) structure tree of `if a or b and c then ... else ...')
Such an object's type must then be checked
upon subsequent use and the necessary dynamic type checking increases the computation
time.
A floating point representation with s significand digits may be unable to
represent all valid integers exactly as floating point numbers because s is not large enough to
hold all integer values.
The number of significant digits and the size of the exponent (and similar properties of
other types) vary from computer to computer and implementation to implementation. Since
an algorithm's behavior may depend upon the particular values of such parameters, the values
should be accessible. For this purpose many languages provide environment inquiries; some
languages, Ada for example, allow specifications for the range and precision of numbers in
the form of minimum requirements.
Restriction of the integer domain and similar specification of subranges of finite types is
often erroneously equated to the concept of a type. ALGOL 68, for example, distinguishes an
infinity of `sizes' for integer and real values. Although these sizes define different modes in the
ALGOL 68 sense, the Standard Environment provides identical operators for each; thus they
are indistinguishable according to the definition of type given at the beginning of Section 2.2.
The distinction can only be understood by examination of the internal coding.
The basic arithmetic operations are usually defined by recourse to the reader's mathematical
intuition. Only integer division involving negative operands requires a more exact
stipulation in a language definition. Number theorists recognize two kinds of integer division,
one truncating toward zero (-3 divided by 2 yields -1) and the other truncating toward
negative infinity (-3 divided by 2 yields -2). ALGOL 60 uses the first definition, which also
forms the basis for most hardware realizations.
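The difference is easily demonstrated. In the following Pascal sketch, div truncates toward zero (as ISO Pascal defines it), and floorDiv is our own helper implementing truncation toward negative infinity:

program IntegerDivision;

function floorDiv (a, b : integer) : integer;
{ Truncate toward negative infinity; this sketch assumes b > 0. }
begin
  if (a < 0) and (a <> (a div b) * b) then
    floorDiv := a div b - 1   { a div b rounded toward zero: step down }
  else
    floorDiv := a div b
end;

begin
  writeln((-3) div 2);       { -1: truncation toward zero }
  writeln(floorDiv(-3, 2))   { -2: truncation toward negative infinity }
end.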
We have already seen that a correspondence between the values of a finite type and a
subset of the natural numbers can be defined. This correspondence may be specified by the
language definition, or it may be described but its definition left to the implementor. As
a general principle, similar relationships are possible between the value sets of other types.
For example, the ALGOL 68 Revised Report asserts that for every integer of a given length
there is an equivalent real of that length; the FORTRAN Standard implies a relation between
integer and real values by its definition of assignment, but does not define it precisely.
Even if two values of different types (say 2 and 2.0) are logically equivalent, they must
be distinguished because different operations may be applied to them. If a programmer is to
make use of the equivalence, the abstract machine must provide appropriate transfer (conversion)
operations. This is often accomplished by overloading the assignment operator. For
example, Section 4.2.4 of the ALGOL 60 Report states that `if the type of the arithmetic
expression [in an assignment] differs from that associated with the variables and procedure
identifiers [making up the left part list], appropriate transfer functions are understood to be
automatically invoked'. Another way of achieving this effect is to say that the operator
indication `:=' stands for one of a number of assignment operations, just as `+' stands for
either integer or real addition.
The meaning of `:=' must be determined from the context in the above example. Another
approach to the conversion problem is to use the context to determine the type of value
directly, and allow the compiler to insert a transfer operation if necessary. We say that
the compiler coerces the value to a type appropriate for the context; the inserted transfer
operation is a coercion.
Coercions are most frequently used when the conversion is defined for all values of the type
being converted. If this is not the case, the programmer may be required to write an explicit
transfer function. In Pascal, for example, a coercion is provided from integer to real but not
from real to integer. The programmer must use one of the two explicit transfer functions
trunc or round in the latter case.
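In Pascal terms (a minimal sketch):

program Transfers;
var i : integer;
    x : real;
begin
  i := 2;
  x := i;           { legal: coercion from integer to real }
  x := 2.7;
  { i := x;  would be rejected: no coercion from real to integer }
  i := trunc(x);    { explicit transfer function: i = 2 }
  i := round(x);    { explicit transfer function: i = 3 }
  writeln(i)
end.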
Sometimes coercions are restricted to certain syntactic positions. ALGOL 68 has elaborate
rules of this kind, dividing the complete set of available coercions into four classes and allowing
different classes in different positions. The particular rules are chosen to avoid ambiguity in
the program. Ada provides a set of coercions, but does not restrict their use. Instead, the
language definition requires simply that each construct be unambiguously interpretable.
LAX provides Boolean, integer and real as elementary types. We omitted characters and
programmer-defined finite types because they do not raise any additional significant issues.
Integer division is defined to truncate towards zero to match the behavior of most hardware.
Coercion from integer to real is defined, but there is no way to convert in the opposite
direction. Again, the reason for this omission is that no new issues are raised by it.
The fields appearing in every record of the type are written first, followed by alternative sets
of fields; the c appearing in the case construct describes which alternative set is actually
present.
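A Pascal variant record of the form just described might look like the following sketch (the type and field names are our own); whether c may change during the object's lifetime is exactly the question raised below:

type
  FigureKind = (circle, rectangle);
  Figure = record
    x, y : real;               { fields appearing in every record }
    case c : FigureKind of     { c selects the alternative set }
      circle    : (radius : real);
      rectangle : (width, height : real)
  end;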
A union mode in ALGOL 68 is a special case of a variant record, in which every variant
consists of exactly one field and the fixed part consists only of the variant selector. Syntactically,
the construct is not described as a record and the variant selector is not given explicitly.
In languages such as APL or SNOBOL4, essentially all objects are specified in this manner.
An important question about such objects is whether the variant is fixed for the lifetime of a
particular object, or whether it forms a part of the state and may be changed.
Arrays differ from records in that their components may be selected via a computable,
one-to-one function whose domain is some finite set (such as any finite type or a subrange
p ≤ i ≤ q of the integers). In languages with manifest types, all elements of an array have the
same type. The operation a[e] (`select the component of a corresponding to e') is called
indexing. Most programming languages also permit multi-dimensional rectangular arrays, in
which the index set represents a Cartesian product I1 × I2 × ... × In over a collection of index
domains. Depending upon the time at which the number of elements is bound, we speak of
static (fixed at compile time), dynamic (fixed at the time the object is created) or flexible
(variable by assignment) arrays (cf. Section 2.5.3).
One-dimensional arrays of Boolean values (bit vectors) may also be regarded as tabular
encodings of characteristic functions over the index set I. Every value of an array c corresponds
to {i | c[i] = true}. In Pascal such arrays are introduced as `sets' with type set of
index set; in Ada they are described as here, as Boolean arrays. In both cases, the operations
union (represented by + or or), intersection (*, and), set difference (-), equality (= and
<>), inclusion (<, <=, >, >=) and membership (in) are defined on such sets. Difficulties
arise in specifying set constants: The element type can, of course, be determined by looking
at the elements of the constant. But if sets can be defined over a subrange of a type, it is not
usually possible to determine the appropriate subrange just by looking at the elements. In
Pascal the problem is avoided by regarding all sets made up of elements of a particular scalar
type to be of the same type, regardless of the subrange specified as the index set. (Sets of
integers are regarded as being over an implementation-defined subrange.) In Ada the index
set is determined by the context.
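In Pascal notation, for example (a small sketch; the names are ours):

program SetDemo;
type
  Day = (mon, tue, wed, thu, fri, sat, sun);
var
  work, rest, all : set of Day;
begin
  work := [mon .. fri];
  rest := [sat, sun];
  all  := work + rest;          { union }
  writeln(sat in work);         { membership: FALSE }
  writeln(work * all = work);   { intersection and equality: TRUE }
  writeln(rest <= all)          { inclusion: TRUE }
end.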
Only a few programming languages provide operations (other than set operations) that
may be applied to a composite object as a whole. (APL has the most comprehensive collection
of such operations.) Processing of composite objects is generally carried out componentwise,
with field selection, indexing and component assignment used as access operations on the
composite objects. It may also be possible to describe groups of array elements, for example
entire rows or columns or even arbitrary rectangular index domains (a[i1:i2, j1:j2]
in ALGOL 68); this process is called slicing.
2.2.3 Strings
Strings are exceptional cases in most programming languages. In ALGOL 60, strings are
permitted only as arguments to procedures and can thus ultimately be used only as data
for code procedures (normally I/O routines). ALGOL 68 considers strings as flexible arrays,
and in FORTRAN 77 or PL/1 the size can increase only to a maximum value fixed when
the object is created. In both languages, single characters may be extracted by indexing; in
addition, comparison and concatenation may be carried out on strings whose length is known.
These latter operations consider the entire string as a single unit. In SNOBOL4 strings are
always considered to be single units: Assignment, concatenation, conversion to a pattern,
pattern matching and replacement are elementary operations of the language.
We omitted strings from LAX because they do not lead to any unique problems in compiler
construction.
2.2.4 Pointers
Records, arrays and strings are composite objects constructed as contiguous sequences of
elements. Composition according to the model of a directed graph is possible using pointers,
with which one node can point to another. In all languages providing arrays, pointers can be
represented by indices in an array. Some languages (such as ALGOL 68, Pascal and PL/1)
define pointers as a new kind of type. In PL/1 the type of the object pointed to is not
specified, and hence one can place an arbitrary interpretation upon the target node of the
pointer. In the other languages mentioned, however, the pointer type carries the type of the
object pointed to.
Pointers have the advantage of security over indices in an array: Indices can be confused
with other uses of integers, pointers cannot. Above all, however, pointers can be used to
reference anonymous objects that are created dynamically. The number of objects thus
created need not be known ahead of time. With indices the array bounds fix the maximum
number of objects (except when the array is flexible).
Pascal pointers can reference only anonymous objects, whereas in ALGOL 68 either named
or anonymous objects may be referenced. When named objects have at most a bounded
lifetime, it is possible that a pointer to an object could outlive the object to which it points.
Such dangling references will be discussed in Section 2.5.2.
In addition to the technical questions of pointer implementation, the compiler writer
should be concerned with special testing aids (such as printing programs that can traverse a
structure, outputting links in some reasonable way). The reason is that programs containing
pointers are usually more difficult to debug than those not containing pointers.
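The declarations referred to in the following paragraph fall at a page break in the original; under name equivalence they must have had this general shape, two textually identical right-hand sides bound to distinct identifiers (a hypothetical reconstruction):

type
  m = array [1 .. 10] of integer;
  p = array [1 .. 10] of integer;  { same structure, distinct name }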
Under this definition m and p
are different types, since m and p are distinct identifiers. The right
hand sides of the declarations of m and p are automatically different, since they are not
type identifiers. Name equivalence is obviously easy to check, since it only involves fixing the
identity of type declarations.
Name equivalence seldom appears in pure form. On the one hand it leads to a flood of type
declarations, and on the other to problems in linking to library procedures that have array
parameters. However, name equivalence is the basis for the definition of abstract data types,
where type declarations that carry the details of the representation are not revealed outside the
declaration. This is exactly the effect of name equivalence, whereas structural equivalence
has the opposite result. Most programming languages that permit type declarations use
an intermediate strategy. Euclid uses structural equivalence locally; as soon as a type is
`exported', it is known only by a type identifier and hence name equivalence applies.
If the language allows subranges of the basic types (such as a subrange of integers in
Pascal) the question of whether or not this subrange is a distinct type arises. Ada allows
both: The subrange can be defined as a subtype or as a new type. In the second case, the
predefined operations of the base type will be taken over but later procedures requiring
parameters of the base type cannot be passed arguments of the new type.
The type equivalence rules of LAX embody a representative compromise. They require
textual equivalence as discussed above, but whenever a type is denoted by an identifier it is
considered elementary. (In other words, if the compiler is comparing two type specifications
for equality and an identifier appears in one then the same identifier must appear in the same
position in the other.) Implementation of these rules illustrates the compiler mechanisms
needed to handle both structure and name equivalence.
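A sketch of the comparison these rules call for, with type specifications represented as trees; all names here are our own invention, and the string type is that of modern Pascal dialects:

type
  TypeKind = (namedType, arrayType, recordType, refType);
  TypePtr = ^TypeNode;
  TypeNode = record
    kind : TypeKind;
    id : string;            { significant only when kind = namedType }
    left, right : TypePtr   { component types; nil if absent }
  end;

{ Textual equivalence with identifiers treated as elementary:
  an identifier in one specification must be matched by the same
  identifier at the same position in the other. }
function Equivalent (a, b : TypePtr) : boolean;
begin
  if (a = nil) or (b = nil) then
    Equivalent := a = b
  else if a^.kind <> b^.kind then
    Equivalent := false
  else if a^.kind = namedType then
    Equivalent := a^.id = b^.id
  else
    Equivalent := Equivalent(a^.left, b^.left)
                  and Equivalent(a^.right, b^.right)
end;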
2.3 Expressions
Expressions (or formulas) are examples of composite operations. Their structure resembles
that of composite objects: They consist of a simple operation with operands, which are either
ordinary data objects or further expressions. In other words, an expression is a tree with
operations as interior nodes and data objects as leaves.
An expression written in linear infix notation may lead to distinct trees when interpreted
according to different language definitions (Figure 2.2). In low-level languages modeled upon
PL/360, the operators are strictly left-associative with no operator precedence, and parentheses
are prohibited; APL uses right-associativity with no precedence, but permits grouping by
parentheses. Most higher-level languages employ the normal precedence rules of mathematics
and associate operators of the same precedence to the left. FORTRAN 77 (Section 6.6.4) is
an exception: `Once [a tree] has been established in accordance with [the precedence, association
and parenthesization] rules, the processor may evaluate any mathematically equivalent
expression, provided that the integrity of parentheses is not violated.' The phrase `mathematically
equivalent' implies that a FORTRAN compiler may assume that addition is associative,
even though this is not true for computer implementation of floating point arithmetic. (The
programmer can, however, always indicate the correct sequence by proper use of parentheses.)
The leaves of an expression tree represent activities that can be carried out indepen-
dently of all other nodes of the tree. Interior nodes, on the other hand, depend upon the
values returned by their descendants. The entire tree may thus be evaluated by the following
algorithm:
(Figure 2.2: trees for a*b+c*d - a) left-associative, e.g. PL/360: ((a*b)+c)*d;
b) right-associative, e.g. APL: a*(b+(c*d)); c) normal precedence: (a*b)+(c*d))
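The evaluation algorithm announced before Figure 2.2 falls at a page break in the original; what it describes is the usual postorder traversal, sketched here in Pascal (the node representation is our own assumption):

type
  NodePtr = ^Node;
  Node = record
    op : char;              { '+' or '*' at interior nodes }
    value : integer;        { significant only at leaves }
    left, right : NodePtr   { nil at leaves }
  end;

{ Evaluate the descendants of a node first, then apply its operation. }
function Eval (t : NodePtr) : integer;
begin
  if t^.left = nil then
    Eval := t^.value        { a leaf can be evaluated independently }
  else if t^.op = '+' then
    Eval := Eval(t^.left) + Eval(t^.right)
  else
    Eval := Eval(t^.left) * Eval(t^.right)
end;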
effects are present. In Euclid an attempt was made to restrict the possibilities to the point
where the compiler could perform such a check safely. These restrictions include prohibition
of assignments to result parameters and global variables in functions, and prohibition of I/O
operations in functions.
Some side effects do not destroy referential transparency, and are thus somewhat less
dangerous. Section 6.6 of the FORTRAN 77 Standard formulates the weakest useful restrictions:
`The execution of a function reference in a statement may not alter the value of any other
entity within the statement in which the function reference appears.'
In some expressions the value of a subexpression determines that of the entire expression.
Examples are:
a and (...) when a = false
b or (...) when b = true
c * (...) when c = 0
If the remainder of the expression has no side effect, only the subexpression determining the
value need be computed. The FORTRAN 77 Standard allows this short circuit evaluation
regardless of side effects; the description is such that the program is undefined if side effects
are present, and hence it is immaterial whether the remainder of the expression is evaluated or
not in that case. The wording (Section 6.6.1) is: `If a statement contains a function reference
in a part of an expression that need not be evaluated, all entities that would have become
defined in the execution of that reference become undefined at the completion of evaluation
of the expression containing the function reference.'
ALGOL 60, ALGOL 68 and many other languages require, in principle, the evaluation
of all operands and hence preclude such optimization unless the compiler can guarantee that
no side effects are possible. Pascal permits short circuit evaluation, but only in Boolean
expressions (User Manual, Section 4a): `The rules of Pascal neither require nor forbid the
evaluation of the second part [of a Boolean expression, when the first part fixes the value]'.
Ada provides two sets of Boolean operators, one (and, or) prohibiting short circuit evaluation
and the other (and then, or else) requiring it.
LAX requires complete evaluation of operands for all operators except and and or. The
order of evaluation is constrained only by data flow considerations, so the compiler may
assume referential transparency. This simplifies the treatment of optimization. By requiring
a specific short circuit evaluation for and and or, we illustrate other optimization techniques
and also show how the analysis of an expression is complicated by evaluation order rules.
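The practical consequences show up in searches like the following Pascal sketch, where the safety of the loop test depends on the evaluation rule:

program Search;
const n = 5;
var a : array [1 .. n] of integer;
    i : integer;
begin
  a[1] := 3; a[2] := 1; a[3] := 4; a[4] := 1; a[5] := 5;
  i := 1;
  { Under short circuit evaluation the test a[i] <> 4 is skipped as
    soon as i > n; under complete evaluation a[i] would be referenced
    out of range if 4 never occurred in the array.  Pascal permits,
    but does not require, the short circuit. }
  while (i <= n) and (a[i] <> 4) do
    i := i + 1;
  writeln('found at ', i)
end.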
2.4 Control Structures
The control structures commonly provided by programming languages include:
Conditional clause
Case clause
Iteration (with or without a count)
Jump, exit, etc.
Procedure call
Conditional clauses make the execution of a component S dependent upon fulfillment of
a Boolean condition. In many languages S may only take on one of a restricted number of
forms; in the extreme case, S may only be a jump.
The case clause is a generalization of the conditional clause in which the distinct values of
an expression are associated with distinct statements. The correspondence is either implicit
as in ALGOL 68 (the statements correspond successively to the values 1, 2, 3, ...), or explicit
as in Pascal (the value is used as a case label for the corresponding statement). The latter
construct allows one statement to correspond with more than one value and permits gaps in
the list of values. It also avoids counting errors and enhances program readability.
Several syntactically distinct iteration constructs appear in many programming languages:
with or without counters, test at the beginning or end, etc. The inefficient ALGOL 60 rules
requiring the (arbitrarily complex) step and limit expressions to be re-evaluated for each
iteration have been replaced in newer languages by the requirement that these expressions be
evaluated exactly once. Another interesting point is whether the value of the counter may
be altered by assignment within the body of the iteration (as in ALGOL 60), or whether it
must remain constant (as in ALGOL 68). This last is important for many optimizations of
iterations, as is the usual prohibition on jumps into an iteration.
Many programming languages allow jumps with variable targets. Examples are the use
of indexing in an array of labels (the ALGOL 60 switch) and the use of label variables (the
FORTRAN assigned GOTO). While COBOL or FORTRAN jumps control only the succession
of statements, jumps out of blocks or procedures in ALGOL-like languages influence the
program state (see Section 2.5). Procedure calls also influence the state.
The ALGOL 60 and ALGOL 68 definitions explain the operation of procedure calls by
substitution of the procedure body for the call (copy rule). This copying process could
form the basis for an implementation (open subroutines), if the procedure is not recursive.
Recursion requires that the procedure be implemented as a closed subroutine, a model on
which many other language definitions are based. Particular difficulties await the writer of
compilers for languages such as COBOL, which do not distinguish the beginning and end of
the procedure body in the code. This means that, in addition to the possibility of invoking
the procedure by means of a call (PERFORM in COBOL), the statements could be executed
sequentially as a part of the main program.
Parallel execution of two actions is required if both begin from the same initial state and
alter this state in incompatible ways. A typical example is the parallel assignment x, y :=
y, x, in which the values are exchanged. To represent this in a sequential program, the
compiler must first extend the state so that the condition `identical starting states for both
actions' can be preserved. This can be done here by introducing an auxiliary variable t, to
which x is assigned.
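In compiled form the exchange becomes the familiar three-statement sequence (t is the compiler-introduced auxiliary):

program Swap;
var x, y, t : integer;
begin
  x := 1; y := 2;
  { x, y := y, x realized sequentially: }
  t := x;   { auxiliary variable preserves the initial state of x }
  x := y;
  y := t;
  writeln(x, ' ', y)   { prints 2 1 }
end.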
Another case of parallel execution of two actions arises when explicit synchronization is
embedded in these actions to control concurrent execution. The compiler must fall back upon
coroutines or parallel processing facilities in the operating system in order to achieve such
synchronization; we shall not discuss this further.
Collateral execution of two actions means that the compiler need not fix their sequence
according to source language constraints. It can, for example, exchange actions if this will
lead to a more efficient program. If both actions contain identical sub-actions then it suffices
to carry out this sub-action only once; this has the same effect as the (theoretically possible)
perfectly-synchronized parallel execution of the two identical sub-actions. If a language
specifies collateral evaluation, the question of whether the evaluation of f(x) in the assignment
a[i + 1] := f(x) + a[i + 1] can influence the address calculation for a[i + 1] by
means of a side effect is irrelevant. The compiler need only compute the address of a[i + 1]
once, even if i were the following function procedure:
function i : integer; begin k := k + 1; i := k end;
In this case k will be incremented only once, a further illustration of side effects and the
meaning of the paragraph from the FORTRAN 77 Standard quoted at the end of Section 2.3.
Creation and destruction
of objects are generally associated with procedure call and return, and for this reason the
procedure call hierarchy forms a part of the environment. We shall now consider questions of
lifetime and visibility; the related topic of procedure parameter transmission will be deferred
to Section 2.5.3.
That part of the execution history of a program during which an object exists is called
the extent of the object. The extent rules of most programming languages classify objects as
follows:
Static: The extent of the object is the entire execution history of the program.
Automatic: The extent is the execution of a specified syntactic construct (usually a
procedure or block).
Unrestricted: The extent begins at a programmer-specied point and ends (at least
theoretically) at the end of the program's execution.
Controlled: The programmer specifies both the beginning and end of the extent by
explicit construction and destruction of objects.
Objects in COBOL and the blank common block of FORTRAN are examples of static
extent. Local variables in ALGOL 60 or Pascal, as well as local variables in FORTRAN sub-
programs, are examples of automatic extent. (Labeled common blocks in FORTRAN 66 also
have automatic extent, see Section 10.2.5 of the standard.) List elements in LISP and objects
created by the heap generator of ALGOL 68 have unrestricted extent, and the anonymous
variables of Pascal are controlled (created by new and discarded by dispose).
The possibility of a dangling reference arises whenever a reference can be created to an
object of restricted extent. To avoid errors, we must guarantee that the referenced object
exists at the times when references to it are actually attempted. A sufficient condition to make
this guarantee is the ALGOL 68 rule (also used in LAX) prohibiting assignment of references
or procedures in which the extent of the right-hand side is smaller than the reference to which
it is assigned. It has the advantage that it can be checked by the compiler in many cases,
and a dynamic run-time check can always be made in the absence of objects with controlled
extent. When a language provides objects with controlled extent, as do PL/1 and Pascal,
then the burden of avoiding dangling references falls exclusively upon the programmer.
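A Pascal sketch of the hazard (the variable names are ours):

program Dangle;
type IntPtr = ^integer;
var p, q : IntPtr;
begin
  new(p);        { controlled extent begins }
  p^ := 17;
  q := p;        { a second reference to the same object }
  dispose(p);    { extent ends: q is now a dangling reference }
  { q^ is undefined from here on, and neither the compiler nor
    the run-time system is obliged to detect a use of it }
end.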
LAX constants are the only objects having static extent. Variables are generally automatic,
although it is possible to generate unrestricted variables. The language has no objects with
controlled extent, because such objects do not result in any new problems for the compiler.
Static variables were omitted because the techniques used to deal with automatic variables
apply to them essentially without change.
By the scope of an identifier definition we understand the region of the program within
which we can use the identifier with the defined meaning. The scope of an identifier definition
is generally determined statically by the syntactic construct of the program in which it is
directly contained. A range is a syntactic construct that may have identifier definitions
associated with it. In a block-structured language, inner ranges are not part of outer ranges.
Usually any range may contain at most one definition of an identifier. Exceptions to this
rule may occur when a single identifier may be used for distinct purposes, for example as
an object and as the target of a jump. In ALGOL-like languages the scope of a definition
includes the range in which it occurs and all enclosed ranges not containing definitions of the
same identifier.
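These rules can be seen in a small Pascal sketch (the identifiers are our own):

program scopes;
var i : integer;     (* definition in the outer range *)

procedure inner;
var i : integer;     (* hides the outer definition throughout this range *)
begin
  i := 1             (* refers to the local i *)
end;

begin
  i := 0;
  inner;
  writeln(i : 2)     (* prints 0: the outer i was never touched *)
end.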
Consider the field selection p.f. The position immediately following the dot belongs
to the scope of the declaration of p's record type. In fact, only the field selectors of that
record type are permitted in this position. On the other hand, although the statement s
of the Pascal (or SIMULA) inspection with p do s also belongs to the scope of p's record
type declaration, the definitions from the inspection's environment remain valid in s unless
overridden by field selector definitions. In COBOL and PL/1, f can be written in place of
p.f (partial qualification) if there is no other definition of f in the surrounding range.
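Both effects appear in the following Pascal sketch (our own example):

program inspection;
type rec = record f : integer end;
var p : rec;
    g : integer;
begin
  g := 1;
  with p do begin
    f := 2;          (* f is a field selector of p's record type *)
    g := 3           (* g is no field of rec: the surrounding definition remains valid *)
  end;
  writeln(p.f : 2, g : 2)   (* prints 2 and 3 *)
end.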
The concept of static block structure has the consequence that items not declared in a
procedure are taken from the static surroundings of the procedure. A second possibility is that
used in APL and LISP: nonlocal items of functions are taken from the dynamic environment
of the procedure call.
In the case of recursive procedure calls, identically-declared objects with nested extents
may exist at the same time. Difficulties may arise if an object is introduced (say, by parameter
transmission) into a program fragment where its original declaration is hidden by another
declaration of the same identifier. Figure 2.3 illustrates the problem. This program makes
two nested calls of p, so that two incarnations, q1 and q2, of the procedure q and two variables
i1 and i2 exist at the same time. The program should print the values 1, 4 and 1 of i2, i1
and k. This behavior can be explained by using the contour model.
procedure outer ;
var n , k : integer ;
procedure p (procedure f; var j : integer );
label 1;
var i : integer ;
procedure q ;
label 2;
begin (* q *)
n := n + 1; if n = 4 then q;
n := n + 1; if n = 7 then 2 : j := j + 1;
i := i + 1;
end; (* q *)
begin (* p *)
i := 0;
n := n + 1; if n = 2 then p (q , i ) else j := j + 1;
if n = 3 then 1 : f ;
i := i + 1;
writeln (' i = ', i :1);
end; (* p *)
procedure empty ; begin end;
begin (* outer *)
n := 1; k := 0;
p (empty , k );
writeln (' k = ', k :1);
end; (* outer *)
The contour addressed by ep is called the local contour; only the objects lying in it and
in the contours enclosing it are accessible. The object identified by a given identifier is found
by scanning the contours from inner to outer, beginning at the local contour, until a
definition for the specified identifier is found.
The structure of the state is changed by the following actions:
Construction or removal of an object.
Procedure call or range entry.
Procedure return or range exit.
Jump out of a range.
When an object with automatic extent is created, it lies in a contour corresponding to
the program construct in which it was declared; static objects behave exactly like objects
declared in the main program with automatic extent. Objects with unrestricted extent and
controlled objects lie in their own contours, which do not correspond to program constructs.
Upon entry into a range, a new contour is established within the local contour and the
environment pointer ep is set to point to it. Upon range exit this procedure is reversed: the
local contour is removed and ep set to point to the immediately surrounding contour.
Upon procedure call, a new contour c is established and ep set to point to it. In contrast
to range entry, however, c is established within the contour c' addressed by ep at the time of
procedure declaration. We term c' the static predecessor of c to distinguish it from c'', the
dynamic predecessor, to which ep pointed immediately before the procedure call. The pointer
to c'' must be stored in c as a local object. Upon return from a procedure the local contour
of the procedure is discarded and the environment pointer reset to its dynamic predecessor.
To execute a jump into an enclosing range b , blocks and procedures are exited and the
corresponding contours discarded until a contour c corresponding to b is reached such that
c contained the contour of the jump. c becomes the new local contour, to which ep will
point, and ip is set to the jump target. If the jump target is determined dynamically as a
parameter or the content of a label variable, as is possible in ALGOL 60, then that parameter
or variable must specify both the target address and the contour that will become the new
local contour.
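The bookkeeping just described can be sketched in Pascal by representing a contour as
a record with explicit predecessor fields; the names are ours, and the sketch ignores the
local objects themselves:

program contours;
type pcontour = ^contour;
     contour = record
       staticpred  : pcontour;    (* c' : contour current at procedure declaration *)
       dynamicpred : pcontour;    (* c'': contour current at the point of call *)
       level : integer            (* stands in for the local objects *)
     end;
var ep, outer, inner : pcontour;  (* ep is the environment pointer *)
    c : pcontour;
begin
  new(outer);
  outer^.staticpred := nil; outer^.dynamicpred := nil; outer^.level := 0;
  ep := outer;
  (* procedure call: establish c within the contour of declaration *)
  new(inner);
  inner^.staticpred := outer;     (* static predecessor c' *)
  inner^.dynamicpred := ep;       (* dynamic predecessor c'', stored as a local object *)
  inner^.level := 1;
  ep := inner;
  (* identifier search scans the contours from inner to outer *)
  c := ep;
  while c <> nil do begin writeln(c^.level : 2); c := c^.staticpred end;
  (* procedure return: discard the local contour and reset ep *)
  ep := ep^.dynamicpred;
  dispose(inner)
end.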
Figures 2.4 and 2.5 show the contour model for the state existing at two points during
the execution of the program of Figure 2.3. Notice that several contours correspond to the
same range when a procedure is called recursively. Further, the values of actual parameters
of a procedure call should be computed before the environment pointer is altered. If this is
not done, the pointer for parameter computation must be restored (as is necessary for name
parameters in ALGOL 60).
In order to unify the state manipulation, procedures and blocks are often processed iden-
tically. A block is then a parameterless procedure called `on the spot'. The contour of a block
thus has a dynamic predecessor identical with its static predecessor. The lifetimes of local
objects in blocks can be determined by the compiler, and a static overlay structure for them
can be set up within the contour of the enclosing procedure. The main program is counted
as a procedure for this purpose. The scope rules are not altered by this transformation. Con-
tours for blocks can be dispensed with, and all objects placed in the contour of the enclosing
procedure. Arrays with dynamic bounds lead to difficulties with this optimization, since the
bounds can be determined only at the time of actual block entry.
The rules discussed so far do not permit description of either LISP or SIMULA. In LISP a
function f may have as its result a function g that accesses the local storage of f . Since this
storage must also exist during the call of g , the contour of f must be retained at least until
g becomes inaccessible. Analogously, a SIMULA class k (an object of unrestricted extent)
may have name parameters from the contour in which it was instantiated. This contour must
therefore be retained at least until k becomes inaccessible.
Figure 2.5: Contours Existing When Control Reaches Label 2 in Figure 2.3
We solve these problems by adopting a uniform retention strategy that discards an object
only when that object becomes inaccessible. Accessibility is defined relative to the current
contour. Whenever an object in a contour c references another object in a different contour,
c', we implement that reference by an explicit pointer from c to c'. (Such references include
the dynamic predecessors of the contour, all reference parameters, and any explicit pointers
established by the user.) A contour is accessible if it can be reached from the current contour
by following any sequence of pointers or by a downhill walk. The dangling reference problem
vanishes when this retention strategy is used.
2.5.3 Binding
An identifier b is termed bound (or local) in a range if this range contains a definition for b;
otherwise b is free (or global) in this range. As definitions we have:
Declarations of object identifiers (including procedure identifiers)
Definitions: Label definitions, type definitions, FORTRAN labeled common blocks, etc.
Formal parameter definitions
In the first and second cases the defined value along with all of its attributes is obvious
from the definition. In the third case only the identifier and type of the defined value are
available via the program text. The actual parameter, the argument, will be associated with
the identifier by parameter transmission at the time of the procedure call. We distinguish
five essentially different forms of parameter transmission (two of which are contrasted in the
Pascal sketch following this list):
1. Value (as in ALGOL 60, SIMULA, Pascal, Ada, for example): The formal parameter
identifies a local variable of the procedure, which will be initialized with the argument
value at the procedure call. Assignment to the parameter does not affect the caller.
2. Result (Ada): The formal parameter identifies a local variable of the procedure with
undefined initial value. Upon return from the procedure the content of this local variable
is assigned to the argument, which must be a variable.
3. Value/Result (FORTRAN, Ada): The formal parameter identifies a local variable of
the procedure, which will be initialized with the argument value at the procedure call.
Upon return from the procedure the content of this local variable is assigned to the
argument if the argument is a variable. The argument variable may be fixed prior to
the call or redetermined upon return.
4. Reference (FORTRAN, Pascal, Ada): A reference to the argument is transmitted to
the procedure. All operations on the formal parameter within the procedure are carried
out via this reference. (If the argument is an expression but not a variable, then the
result is placed in a temporary variable for which the reference is constructed. Some
languages, such as Pascal, do not permit use of an expression as an argument in this
case.)
5. Name (ALGOL 60): A parameterless procedure p, which computes a reference to the
argument, is transmitted to the procedure. (If the argument is an expression but not a
variable then p computes the value of the expression, stores it in a temporary variable
h, and yields a reference to h.) All operations on the formal parameter first invoke p
and then operate via the reference yielded by p.
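Mechanisms (1) and (4) can be contrasted directly in Pascal; a minimal sketch:

program transmission;
var g : integer;

procedure byvalue(j : integer);          (* mechanism 1: j is a local variable *)
begin
  j := j + 1                             (* the caller is unaffected *)
end;

procedure byreference(var j : integer);  (* mechanism 4: j denotes a reference *)
begin
  j := j + 1                             (* operates on the argument via the reference *)
end;

begin
  g := 0; byvalue(g);     writeln(g : 2);   (* prints 0 *)
  g := 0; byreference(g); writeln(g : 2)    (* prints 1 *)
end.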
Call by value is occasionally restricted to a strict value transmission in which the formal
parameter identifies not a local variable, but rather a local constant. Call by name is explained
in many language definitions by textual substitution of the argument for the parameter.
ALGOL 60 provides for argument evaluation in the environment of the caller through a
consistent renaming.
The different parameter mechanisms can all be implemented in terms of (strict) call by
value, if the necessary kinds of data are available. For cases (2)-(4), the language must
provide the concept of arbitrary references as values. Call by name also requires the concept
of procedures as values (of procedure variables). Only when these concepts are unavailable are
the transmission mechanisms (2)-(5) important. This is clear in the language SIMULA, which
(in addition to the value and name calls inherited from ALGOL 60) provides call by reference
for classes and strings. A more careful study shows that in truth this could be handled by
an ordinary value call for references. In ALGOL 68 the call by reference is stated in terms of
the strict call by value, by using an identity declaration to make the formal parameter fp an
alias of the argument ap :
ref int fp = ap
Expressions that do not yield references are not permitted as arguments if this explanation
of call by reference is used, since the right hand side of the identity declaration must yield a
reference.
LAX follows the style of ALGOL 68, explaining its argument bindings in terms of identity
declarations. This provides a uniform treatment of all parameter mechanisms, and also elim-
inates the parameter mechanism as a distinct means of creating new access paths. Finally,
the identity declaration gives a simple implementation model.
Many language definitions do not specify parameter transmission mechanisms explicitly.
The compiler writer must therefore attempt to delineate the possibilities by a careful con-
sideration of their effects. For example, both case (3) and case (4) satisfy the conditions of
the FORTRAN 66 Standard, but none of the others do. Ada generally requires case (1), (2)
or (3). For composite objects, however, case (4) is permitted as an alternative. Use of this
alternative is at the discretion of the implementor, and the programmer is warned that any
assumption about the particular transmission mechanism invalidates the program.
Programs whose results depend upon the parameter transmission mechanism are generally
difficult to understand. The dependencies arise when an object has two access paths, say via
two formal parameters or via a global variable and a formal parameter. This can be seen in
the program of Figure 2.6a, which yields the results of Figure 2.6b for the indicated parameter
mechanisms.
In addition to knowing what value an identifier is bound to, it is important to know
when the binding takes place. The parameter transmission differences discussed above can,
to a large extent, be explained in terms of binding times. In general, we can distinguish the
following binding times (explained in terms of the identity declaration ref real x = a[i, j+3]):
1. Binding at each access (corresponding to call by name): Upon each access to x the identity
of a[i, j+3] is re-determined.
2. Binding at first access: Upon the first access to x the identity of a[i, j+3] will be
determined. All assignments to i and j up to that point will have an effect.
3. Binding upon declaration (corresponding to call by reference): After elaboration of the
identity declaration the identity of a[i, j+3] is fixed. In several languages the
identifiers on the right-hand side must not be declared in the same range, to avoid
circular definitions.
4. Static binding: The identity of a[i, j+3] is fixed throughout the entire program. In
this case a must have static extent and statically-determined size. The values of i and
j must be defined prior to program execution and be independent of it (hence they
must be constants).
begin
int m :=1, n ;
proc p = (??? int j , ??? int k ) int:
begin j := j + 1 ; m := m + k; j + k end;
n := p (m , m + 3)
end
Note: `???' depends upon the parameter mechanism.
a) An ALGOL 68 program
Mechanism m n j k Comment
Value 5 6 2 4 Strict value is not possible due to the assignment to j .
Value/Result 2 6 2 4 Pure result is unreasonable in this example.
Reference 6 10 6 4 Only j is a reference parameter because an expression
is illegal as a reference parameter in ALGOL 68. Hence
k is a value parameter.
Name 7 17 7 10
Note: m and n were evaluated at the end of the main program, j and k at the end of p .
b) The eect of dierent parameter mechanisms
Figure 2.6: Parameter Transmission
In this spectrum call by result would be classified as binding after access. Call by value
is a binding of the value, not of the reference.
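Binding at each access can be simulated in Pascal by transmitting a parameterless
function in place of the variable, in the spirit of mechanism (5); a sketch using standard
Pascal functional parameters:

program bindingtimes;
var a : array[1..2] of integer;
    i : integer;

function thunk : integer;        (* re-determines the identity of a[i] *)
begin
  thunk := a[i]
end;

procedure usename(function x : integer);
begin
  i := 1; writeln(x : 3);        (* bound at this access: yields a[1] = 10 *)
  i := 2; writeln(x : 3)         (* re-bound: yields a[2] = 20 *)
end;

begin
  a[1] := 10; a[2] := 20;
  usename(thunk)
end.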
Determination of identity is least costly at run time for static binding and most costly for
binding at access. During the analysis of the program, the compiler writer is most concerned
with gathering as much information as possible, to bind as early as he can. For this reason
static binding breaks into two subcases, which in general depend not upon the language but
upon other considerations:
4a. Binding at compilation time: The identity of the bound values is determined during
compilation.
4b. Binding at program initialization: The identity of files or of external procedures will be
determined in a step preceding program execution.
In case 4a the knowledge of the bound values can be used in optimization. 4b permits
repeated execution of the program with different bindings without re-compilation.
Free identifiers, which are not defined in a procedure, must be explained in the context of
the procedure so that their meaning can be determined. The definitions of standard identifiers,
which may be used in any program without further declaration, are fitted into this scheme
by assuming that the program is embedded in a standard environment containing definitions
for them.
By an external entity we mean an entity identified by a free identifier with no definition
in either the program or the standard environment. A program with external entities cannot
be compiled and then directly executed. Another step, which obtains the objects associated
with external entities from a program library, must be introduced. We shall discuss this step,
the binding of programs, in Chapter 11. In the simplest case the binding can be separated
from the compilation as an independent terminal step. This separation is normally chosen
for FORTRAN implementations. One consequence is that the compiler has no complete
overview of the properties of external entities and hence cannot verify that they are used
consistently. Thus in FORTRAN it is not usually possible for the compiler to determine
whether external subprograms and functions are called with the correct number and type
of parameters. For such checking, but also to develop the correct accesses, the compiler
must have specifications like those for formal parameters for every external entity. Many
implementations of ALGOL 60, Pascal, etc. provide that such specifications precede or be
included in independently compiled procedures. Since in these languages, as in many others,
separate compilation of language units is not specified by the language definition, the compiler
writer himself must design the handling of external values in conjunction with introduction
of these possibilities. Ada contains a far-reaching specification scheme for external entities.
Exercises
2.1 [Housden, 1975; Morrison, 1982] Consider the manipulation of character string data
in a general purpose programming language.
type q = record
       x : real ;
       y : ^record
             x : real ;
             y : ^q
           end
     end;
2.3 Why is the Boolean expression (x >= -1) and (sqrt (1 + x) > y) meaningless in
Pascal, FORTRAN or ALGOL 60? Consider only structurally equivalent expressions
in the various languages, making any necessary syntactic changes. Give a similar
expression in Ada that is meaningful.
2.4 Give the rules for contour creation and destruction necessary to support the module
concept in Ada.
2.5 Consider a block-structured language such as SIMULA, in which coroutines are allowed.
Generalize the contour model with a retention strategy to handle the following situation:
If n coroutines are started in block b , all have contour c as dynamic predecessor.
By means of call-by-name parameters, a coroutine can obtain access to an object o
belonging to c ; on the other hand, contour c can disappear (because execution of b
has terminated) long before termination of the coroutine. o is then nonexistent, but
the access path via the name parameter remains. What possible solutions do you see
for this problem?
2.6 The retention strategy discussed in connection with SIMULA in Exercise 2.5 could be
used to support parallel processing in ALGOL 68. Quote sections of the ALGOL 68
Report to show that a simpler strategy can be used.
2.7 What problems arise from result parameters in a language that permits jumps out of
procedures?
2.8 Consider a program in which several procedures execute on different processors in
a network. Each processor has its own memory. What parameter mechanisms are
appropriate in such a program?
Chapter 3
Properties of Real and Abstract
Machines
In this chapter we shall discuss the target machine properties relevant for code generation,
and the mapping of the language-oriented objects and operations onto objects and operations
of the target machine. Systematic code generation must, of course, take account of the pecu-
liarities and weaknesses of the target computer's instruction set. It cannot, however, become
bogged down in exploitation of these special idiosyncrasies; the payoff in code efficiency will
not cover the implementation cost. Thus the compiler writer endeavors to derive a model of
the target machine that is not distorted by exceptions, but is as uniform as possible, to serve
as a base for code generator construction. To this end some properties of the hardware may
be ignored, or gaps in the instruction set may be filled by subroutine invocations or inline
sequences treated as elementary operations. In particular, the instruction set is extended by
the operations of a run-time system that interfaces input/output and similar actions to the
operating system, and attends to storage management.
Further extension of this idea leads to construction of abstract target machines imple-
mented on a real machine either interpretively or by means of a further translation. (Inter-
pretive abstract machines are common targets of code generation for microprocessors due to
the need for space efficiency.) We shall not attempt a systematic treatment of the goals, meth-
ods and criteria for the design of abstract target machines here; see the Notes and References
for further guidance.
Main storage
Data registers D0,...,D7 serving as integer accumulators or index registers.
Address registers A0,...,A7 serving as base or index registers.
Program counter PC
Condition code
Stack pointer A7
b) Motorola 68000
Figure 3.1: Storage Classes
(Such a decision can be made differently for the generated code and the run-time system,
implying that the memory belongs to one class as far as the generated code is concerned and
another for the run-time system.) Also, since the properties of a storage class depend to a
certain extent upon the available access paths, a Motorola 68000 stack will differ from that
of a Burroughs 6700/7700.
Most storage classes consist of a sequence of numbered elements, the storage cells. (The
numbering may have gaps.) The number of a storage cell is called its address. Every access
path yields an algorithm, the effective address of the access path, for computing the address
of the storage cell being accessed. We speak of byte-oriented computers if the cells in the main
storage class have a size of 8 bits, otherwise (e.g. 16, 24, 32, 48 or 60 bits per cell) we term
the computer word-oriented . For a word-oriented computer the cell sizes in the main storage
and register classes are usually identical, whereas the registers of a byte-oriented computer
(except for some microprocessors) are 2, 4 or possibly 8 bytes long. In this case the storage
cell of the integer accumulator class is usually termed a word.
All storage is ultimately composed of bits. Some early computers (such as the IBM 1400
series) used decimal arithmetic and addressing, and many current computers provide a packed
decimal (4 bits per digit) encoding. None of these architectures, however, consider decimal
digits to be atoms of storage that cannot be further decomposed; all have facilities for accessing
the individual bits of the digit in some manner.
Single bits and bit sequences such as the decimal digits discussed above cannot be accessed
directly on most machines. Instead, the bit sequence is characterized by a partial-word access
path specifying the address of a storage cell containing the sequence, the position of the
sequence from the left or right boundary of this unit, and the size of the sequence. Often this
partial word access path must be simulated by means of shifts and logical operations.
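In a high-level language the same simulation can be written with division and remainder
standing in for the shift and mask; a Pascal sketch (the function name is ours):

program partialword;

function extract(cell, position, size : integer) : integer;
  (* yields the size-bit sequence beginning `position' bits from the
     right boundary of cell; div replaces the shift, mod the mask *)
var scale, i : integer;
begin
  scale := 1;
  for i := 1 to position do scale := scale * 2;
  cell := cell div scale;        (* shift right by `position' bits *)
  scale := 1;
  for i := 1 to size do scale := scale * 2;
  extract := cell mod scale      (* mask off the high-order bits *)
end;

begin
  writeln(extract(300, 2, 4) : 3)   (* 300 = 100101100B; bits 5..2 are 1011B = 11 *)
end.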
Aggregates hold objects too large for a single storage cell. An aggregate will usually be
specified by the address of its first storage cell, and the cells making up the aggregate by their
addresses relative to that point. Often the address of the aggregate must be divisible by a
given integer, called the alignment. Figure 3.2 lists main storage operand sizes and alignments
for typical machines.
Operand Size (bits) Alignment
Byte 8 1
Halfword 16 2
Word 32 4
Doubleword 64 8
String up to 256 8 1
a) IBM 370 - Storage cell is an 8-bit byte
Operand Size (bits) Alignment
Bit 1 -
Digit 4 -
Byte 8 1
Word 16 2
Doubleword 32 2
b) Motorola 68000 - Storage cell is an 8-bit byte
Figure 3.2: Operand Sizes
Aggregates also appear in classes other than main storage. For example, the 16 general
purpose registers of the IBM 370 form a storage class of 4-byte cells addressed by the numbers
0 through 15. Every register whose address is even forms the first element of a larger entity
(a register pair) used in multiplication, division and shift operations. When a single-length
operand for such an operation is supplied, it should be placed in the proper register of a
pair rather than in an arbitrary register. The other register of the pair is then automatically
reserved for the operation, and cannot be used for other purposes.
The entities of a particular level in a hierarchy of aggregates may overlap. This occurs,
for example, for the segments in the main storage class of the Intel 8086 (65536-byte blocks
whose addresses are divisible by 16) or the 4096-byte blocks addressable via a base or index
register in the IBM 370.
Operations on registers usually involve the full register contents. When an object whose
size is smaller than that of a register is moved between a register and storage of some other
class, a change of representation may occur. The value of the object must, however, remain
invariant. Depending upon the type of the object, it may be lengthened by inserting leading
or trailing zeros, or by inserting leading or trailing copies of the sign. When it is shortened,
we must guarantee that no significant information is lost. Thus the working length of an
object must be distinguished from the storage length.
3.1.2 Access Paths
An access path describes the value or location of an operand, result or jump target. We
classify an instruction as a 0-, 1-, 2- or 3-address instruction according to the number of
access paths it specifies. Very seldom are there more than three access paths per instruction,
and if more do exist then they are usually implicit. (For example, in the MVCL instruction
of the IBM 370 the two register specifications R1 and R2 actually define four operands in
registers R1, R1+1, R2 and R2+1 respectively.)
Each access path specifies the initial element of an operand or result in a storage class.
Access paths to some of the storage classes (such as the stack, program counter, condition
code and special registers) are not normally explicit in the instruction. They will appear only
when there is some degree of freedom associated with their use, as in the PDP11 where any
register can be used as a stack pointer.
The most common explicit access paths involve one of the following computations:
Constant. The value appears explicitly in the instruction.
Register. The content of the register is taken as the value.
Register+constant. The sum of the content of the register and a constant appearing
explicitly in the instruction is taken as the value.
Register+register. The sum of the contents of two registers is taken as the value.
Register+register+constant. The sum of the contents of two registers and a constant
appearing in the instruction is taken as the value.
The computed value may itself be used as the operand (immediate), it may be used as the
effective address of the operand in main storage (direct), or it may be used as the address of
an address (indirect). On some machines the object fetched from main storage in the third
case may specify another computation and further indirection, but this feature is rarely used
in practice. Figure 3.3 illustrates these concepts for typical machines.
The addresses of registers must almost always appear explicitly as constants in the instruc-
tion. In special cases they may be supplied implicitly, as when the content of the (unspecied)
program counter is added to a constant given in the instruction (relative addressing ). If the
computed value is used as an address then the registers must belong to the base register
or index register class; the sum of the (unsigned) base address and (signed) index is often
interpreted modulo the address size. The values of constants in instructions are frequently re-
stricted to nonnegative values, and often their maximum values are far less than the maximum
address. (An example is the restriction to the range [0,4095] of the IBM 370.)
Not all computers allow every one of the access paths discussed above; restrictions in the
combination (operation, access path) can also occur. Many of these restrictions arise from
the properties of the machine's registers. We distinguish five architectural categories based
upon register structure:
Storage-to-storage. All operands of a computational operation are taken from main
storage, and the result is placed into main storage (IBM 1400 series, IBM 1620). Storage-
to-storage operations appear as a supplementary concept in many processors.
Stack. All operands of a computational operator are removed from the top of the stack,
and the result is placed onto the top of the stack (Burroughs 5000, 6000 and 7000 series,
ICL 2900 family). The stack appears as a supplementary concept in many processors.
Single Accumulator. One operand of a computational operator is taken from the accu-
mulator, and the result is placed into the accumulator; all other registers, including any
accumulator extension, have special tasks or cannot participate in all operations (IBM
7040/7090, Control Data 3000 series, many process-control computers, Intel 8080 and
microprocessors derived from it).
Multiple Accumulator. One operand of a computational operator is taken from one of
the accumulators, and the result is returned to that accumulator; long operands and
results are accommodated by pairing the accumulators (DEC PDP11, Motorola 68000,
IBM 370, Univac 1100).
Storage Hierarchy. All operands of a computational operator are taken from accumula-
tors, and the result is returned to an accumulator (Control Data 6000, 7000 and Cyber
series). This architecture is identical to the storage-to-storage architecture if we view
the accumulators as primary storage and the main storage as auxiliary storage.
3.1.3 Operations
Usually the instruction set of a computer provides four general classes of operation:
Computation: Implements a function from n-tuples of values to m-tuples of values. The
function may affect the state. Example: A divide instruction whose arguments are a
single-length integer divisor and a double-length integer dividend, whose results are a
single-length integer quotient and a single-length integer remainder, and which may
produce a divide check interrupt.
Data transfer: Copies information, either within one storage class or from one storage
class to another. Examples: A move instruction that copies the contents of one register
to another; a read instruction that copies information from a disc to main storage.
Sequencing: Alters the normal execution sequence, either conditionally or uncondition-
ally. Examples: A halt instruction that causes execution to terminate; a conditional
jump instruction that causes the next instruction to be taken from a given address if a
given register contains zero.
Environment control: Alters the environment in which execution is carried out. The
alteration may involve a transfer of control. Examples: An interrupt disable instruc-
tion that prohibits certain interrupts from occurring; a procedure call instruction that
updates addressing registers, thus changing the program's addressing environment.
It is not useful to attempt to assign each instruction unambiguously to one of these classes.
Rather the classes should be used as templates to evaluate the properties of an instruction
when deciding how to implement language operations (Section 3.2.3).
It must be possible for the control unit of a computer to determine the operation and
all of the access paths from the encoding of an instruction. Older computer designs usually
had a single instruction size of, say, 24 or 36 bits. Fixed subfields were used to specify the
operation and the various access paths. Since not all instructions require the same access
paths, some of these subfields were unused in some cases. In an information-theoretic sense,
this approach led to an inefficient encoding.
Coding efficiency is increased in more modern computers by using several different instruc-
tion sizes. Thus the IBM 370 has 16, 32 and 48 bit (2, 4 and 6 byte) instructions. The first
byte is the operation code, which determines the length and layout of the instruction as well
as the operation to be carried out. Nearly all microprocessors have variable-size operation
codes as well. In this case the encoding process carried out by the assembly task may require
larger tables, but otherwise the compiler is not affected. Variable-length instructions may
also lead to more complex criteria of optimality.
On some machines one or more operation codes remain unallocated to hardware functions.
Execution of an instruction specifying one of these operation codes results in an interrupt,
which can be used to activate a subprogram. Thus these undened operations can be given
meaning by software, allowing the compiler writer to extend the instruction set of the target
machine. Such programmable extension of the instruction set is sometimes systematically
supported by the hardware, in that the access paths to operands at specific positions are
placed at the disposal of the subprogram as parameters. The XOP instruction of the Texas
Instruments 990 has this property. (TRAP allows programmable instruction set extension on
the PDP11, but does not make special access path provisions.)
to over- and underflow. For example, consider a machine with integers in the range
[-32767, 32767]. If a > b is implemented as (a - b) > 0 then an overflow will occur when
comparing the values a = 16384 and b = -16384. The comparison code must either antici-
pate and avoid this case, or handle the overflow and interpret the result properly. In either
case, a long instruction sequence may be required. Underflow may occur in floating point
comparisons implemented by a subtraction when the operand difference is small. Since many
machines deliver 0 as a result, without indicating that an underflow has occurred, anticipation
and avoidance are required.
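One way to anticipate the overflow is to compare the signs first and subtract only when
the operands have the same sign, in which case the difference cannot overflow. A Pascal
sketch of the idea:

program comparison;
var a, b : integer;
    greater : boolean;
begin
  a := 16384; b := -16384;     (* a - b would overflow a 16-bit machine *)
  if (a >= 0) <> (b >= 0) then
    greater := a >= 0          (* signs differ: the nonnegative operand is larger *)
  else
    greater := a - b > 0;      (* signs agree: the difference cannot overflow *)
  writeln(greater)
end.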
Actually, the symptom of the floating point underflow problem is that a comparison asserts
the equality of two numbers when they are really different. We could argue that the inherent
inaccuracy of floating point operations makes equality testing a risky business anyway. The
programmer must thoroughly understand the algorithm and its interaction with the machine
representation before using equality tests, and hence we can inform him of the problem and
then forget about it. This position is defensible provided that we can guarantee that a
comparison will never yield an incorrect relative magnitude (i.e. it will never report a > b
when a is less than b, or vice-versa).
If, as in Pascal, subranges m..n of integers can be specified as types, the compiler writer
must decide what use to make of this information. When the usual integer range can be
exceeded (not possible in Pascal) this forces the introduction of higher-precision arithmetic
(in the extreme case, of variable-length arithmetic). For small subranges the size of the
range can be used to reduce the number of bits required in the representation, if necessary
by replacing the integer i by (i - lower bound), although this last is not recommended. The
important question is whether arithmetic operations exist for the shorter operands, or at least
whether the conversion between working length and storage length can easily be carried out.
(Recall that no significant bits may be discarded when shortening the representation.)
The possibilities for mapping real numbers are constrained by the floating point operations
of the hardware or the given subroutine package. (If neither is available on the target machine
then implementation should follow the IEEE standard.) The only real choice to be made
involves the precision of the significand. This decision must be based upon the milieu in
which the compiler will be used and upon numeric problems whose discussion is beyond the
scope of this book.
For characters and character strings the choice of mapping is restricted to the specification
of the character code. Assuming that this is not fixed by the source language, there are two
choices: either a standard code such as the ISO 7-bit code (ASCII), or the code accepted
by the target computer's operating system for input/output of character strings (EBCDIC
or other 6- or 8-bit code; note that EBCDIC varies from one manufacturer to another).
Since most computers provide quite efficient instructions for character translation, use of the
standard code is often preferable.
The representation of other finite types reduces to the question of suitably representing
the integers 0..n-1, which we have already discussed. One exception is the Boolean values
false and true. Only a few machines are provided with instructions that access single bits.
If these instructions are absent, bit operations must be implemented by long sequences of
code (Figure 3.4). In such cases it is appropriate to implement Boolean variables and values
as bytes or words. Provided that the source language has not constrained their coding, the
choice of representation depends upon the realization of operations with Boolean operands or
Boolean results. In making this decision, note that comparison and relational operations occur
an order of magnitude more frequently than all other Boolean operations. Also, the operands
of and and or are much more frequently relations than Boolean variables. In particular, the
implementation of and and or by jump cascades (Section 3.2.3) introduces the possibilities
(false = 0, true ≠ 0) and (false ≥ 0, true < 0) or their inverses in addition to the classical
(false = 0, true = 1). These possibilities underscore the use of more than one bit to represent
a Boolean value.
1 Bit: The bit position is specified by two masks, M0 = B'0...010...0' and
M1 = B'1...101...1'.
1 Byte: Let 0 represent false, K represent true.
a) Possible representations for Boolean values
Construct        Byte representation     Bit representation
q := p           MVC q,p                 TM M0,p
                                         BO L1
                                         NI M1,q
                                         B L2
                                     L1  OI M0,q
                                     L2  continuation
p := not p       XI K,p                  XI M0,p
q := q or p      OC q,p                  TM M0,p
                                         BZ L1
                                         OI M0,q
                                     L1  continuation
q := q and p     NC q,p                  TM M0,p
                                         BO L1
                                         NI M0,q
                                     L1  continuation
(The masks M0 and M1 are those appropriate to the second operand of the instruction in which they appear.)
b) Code using the masks from (a)
Figure 3.4: Boolean Operations on the IBM 370
Here |M| is the size of an element in address units and address(a[0]) is the `fictitious
starting address' of the array. The address of a[0] is computed from the location of the
array in storage; such an element need not actually exist. In fact, address(a[0]) could be
an invalid address lying outside of the address space.
The usual representation of an object b : array[m1..n1, ..., mr..nr] of M occupies
k1 * k2 * ... * kr * |M| contiguous memory cells, where kj = nj - mj + 1, j = 1, ..., r.
The address of element b[i1, ..., ir] is given by the following storage mapping function when
the array is stored in row-major order:

address(b[m1, ..., mr]) + (i1 - m1) * k2 * ... * kr * |M| + ... + (ir - mr) * |M|
= address(b[0, ..., 0]) + i1 * k2 * ... * kr * |M| + ... + ir * |M|

By appropriate factoring, this last expression can be rewritten as:

address(b[0, ..., 0]) + (...(i1 * k2 + i2) * k3 + ... + ir) * |M|
If the array is stored in column-major order then the order of the indices in the polynomial
is reversed:

address(b[0, ..., 0]) + (...(ir * kr-1 + ir-1) * kr-2 + ... + i1) * |M|
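The factored form translates directly into a loop over the subscripts. A Pascal sketch
for fixed r (the names are ours; the fictitious starting address and the kj are assumed
given):

program mapping;
const r = 3;
type dims = array[1..r] of integer;
var i, k : dims;

function rowmajor(base : integer; var i, k : dims; size : integer) : integer;
  (* evaluates address(b[0,...,0]) + (...(i1*k2 + i2)*k3 + ... + ir)*|M| *)
var a, j : integer;
begin
  a := i[1];
  for j := 2 to r do a := a * k[j] + i[j];
  rowmajor := base + a * size
end;

begin
  k[1] := 2; k[2] := 3; k[3] := 4;      (* kj = nj - mj + 1 *)
  i[1] := 1; i[2] := 2; i[3] := 3;
  writeln(rowmajor(1000, i, k, 4) : 6)  (* 1000 + ((1*3 + 2)*4 + 3)*4 = 1092 *)
end.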
The choice of row-major or column-major order is a significant one. ALGOL 60 does not
specify any particular choice, but many ALGOL 60 compilers have used row-major order.
Pascal implicitly requires row-major order, and FORTRAN explicitly specifies column-major
order. This means that Pascal arrays must be transposed in order to be used as parameters
to FORTRAN library routines. In the absence of language constraints, make the choice that
corresponds to the most extensive library software on the target machine.
Access to b[i1, ..., ir] is undefined if the relationship mj <= ij <= nj is not satisfied for some
j = 1, ..., r. To increase reliability, this relationship should be checked at run time if the
compiler cannot verify it in other ways (for example, by showing that ij is the controlled
variable of a loop whose starting and ending values satisfy the condition). To make the check,
we need to evaluate a storage mapping function with the following fixed parameters (or its
product with the size of the single element):

r, address(b[0, ..., 0]), m1, ..., mr, n1, ..., nr

Together, these parameters constitute the array descriptor. The array descriptor must be
stored explicitly for dynamic and flexible arrays, even in the trivial case r = 1. For static
arrays the parameters may appear directly as immediate operands in the instructions for
computing the mapping function. Several array descriptors may correspond to a single array,
so that in addition to questions of equality of array components we have questions of equality
or identity of array descriptors.
An r-dimensional array b can also be thought of as an array of (r-1)-dimensional arrays.
We might apply this perception to an object c : array[1..m, 1..n] of integer, representing it as
m one-dimensional arrays of type t = array[1..n] of integer. The fictitious starting addresses
of these arrays are then stored in an object a : array[1..m] of ^t. To be sure, this descriptor
technique raises the storage requirements of c from m * n to m * n + m locations for integers
or addresses; in return it speeds up access on many machines by replacing the multiplication
by n in the mapping function address(c[0,0]) + (i * n + j) * |integer| with an indexed memory
reference. The saving may be particularly significant on computers that have no hardware
multiply instruction, but even then there are contraindications: Multiplications occurring in
array accesses are particularly amenable to elimination via simple optimizations.
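In Pascal the technique amounts to an explicit array of row pointers; a sketch:

program rowvectors;
const m = 3; n = 4;
type row = array[1..n] of integer;
     prow = ^row;
var a : array[1..m] of prow;   (* the fictitious starting addresses of the rows *)
    i, j : integer;
begin
  for i := 1 to m do new(a[i]);
  for i := 1 to m do
    for j := 1 to n do
      a[i]^[j] := i * j;       (* an indexed fetch replaces the multiplication by n *)
  writeln(a[2]^[3] : 3)        (* prints 6 *)
end.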
The descriptor technique is supported by hardware on Burroughs 6700/7700 machines.
There, the rows of a two-dimensional array are stored in segments addressed by special seg-
ment descriptors. The segment descriptors, which the hardware can identify, are used to
access these rows. Actual allocation of storage to the rows is handled by the operating sys-
tem and occurs at the rst reference rather than at the declaration. The allocation process,
which is identical to the technique for handling page faults, is also applied to one-dimensional
arrays. Each array or array row is divided into pages of up to 256 words. Huge arrays can
be declared if the actual storage requirements are unknown, and only that portion actually
referenced is ever allocated.
Character strings and sets are usually implemented as arrays of character and Boolean
values respectively. In both cases it pays to pack the arrays. In principle, character string
variables have variable length. Linked lists provide an appropriate implementation; each list
element contains a segment of the string. List elements can be introduced or removed at will.
Character strings with fixed maximum length can be represented by arrays of this length.
When an array of Boolean values is packed, each component is represented by a single bit,
even when simple Boolean variables are represented by larger storage units as discussed above.
A record is represented by a succession of fields. If the fields of a record have alignment
constraints, the alignment of the entire record must be constrained also in order to guarantee
that the alignment constraints of the fields are met. An appropriate choice for the alignment
constraint of the record is the most stringent of the alignment constraints of its fields. Thus
a record containing fields with alignments of 2, 4 and 8 bytes would itself have an alignment
of 8 bytes. Whenever storage for an object with this record type is allocated, its starting
address must satisfy the alignment constraint. Note that this applies to anonymous objects
as well as objects declared explicitly.
The amount of storage occupied by the record may depend strongly upon the order of
the fields, due to their sizes and alignment constraints. For example, consider a byte-oriented
machine on which a character variable is represented by one byte with no alignment constraint
and an integer variable occupies four bytes and is constrained to begin at an address divisible
by 4. If a record contained an integer field followed by a character field followed by a second
integer field then it would occupy 12 bytes: There would be a 3-byte gap following the
character field, due to the alignment constraint on integer variables. By reordering the fields,
this gap could be eliminated. Most programming languages permit the compiler to do such
reordering.
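The example can be written down directly; whether the gap actually appears is
implementation-dependent, and sizeof is a common extension rather than standard Pascal:

program layout;
type unordered = record
       i1 : integer;        (* bytes 0..3 *)
       c  : char;           (* byte 4, followed by a 3-byte gap *)
       i2 : integer         (* bytes 8..11: 12 bytes in all *)
     end;
     reordered = record
       i1, i2 : integer;    (* bytes 0..7 *)
       c : char             (* byte 8: no gap *)
     end;
begin
  writeln(sizeof(unordered) : 3, sizeof(reordered) : 3)
end.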
Records with variants can be implemented with the variants sharing storage. If it is
known from the beginning that only one variant will be used and that the value of the variant
selector will never change, then the storage requirement may be reduced to exactly that
for the specified variant. This requirement is often satisfied by anonymous records; Pascal
distinguishes the calls new(p) and new(p, variant selector) as constructors for anonymous
records. In the latter case the value of the variant selector may not change, whereas in the
former all variants are permitted.
The gaps arising from the alignment constraints on the fields of a record can be eliminated
by simply ignoring those constraints and placing the fields one after another in memory. This
packing of the components generally increases the cost in time and instructions for field
access considerably. The cost almost always outweighs the savings gained from packing a
single record; packing pays only when many identical records are allocated simultaneously.
Packing is often restricted to partial words, leaving objects of word length (register length)
or longer aligned. On byte-oriented machines it may pay to pack only the representation of
sets to the bit level.
Packing alters the access function of the components of a composite object: The selector
must now specify not only the relative address of the component, but also its position within
the storage cell. On some computers extraction of a partial word can be specified as part of an
operand address, but usually extra instructions are required. This has the result that packed
components of arrays, records and sets may not be accessible via normal machine addresses.
They cannot, therefore, appear as reference parameters.
Machine-dependent programs sometimes use records as templates for hardware objects.
For example, the assembly phase of a compiler might use a record to describe the encoding of
a machine instruction. The need for a fixed layout in such cases violates the abstract nature
of the record, and some additional mechanism (such as the representation specification of
Ada) is necessary to specify this. If the language does not provide any special mechanism,
the compiler writer can overload the concept of packing by guaranteeing that the fields of a
packed record will be allocated in the order given by the programmer.
Addresses are normally used to represent pointer values. Addresses relative to the be-
ginning of the storage area containing the objects are often sufficient, and may require less
storage than full addresses. If, as in ALGOL 68, pointers have bounded lifetime, and the
correctness of assignments to reference variables must be checked at run time, we must add
information to the pointer from which its lifetime may be determined. In general the starting
address of the activation record (Section 3.3) containing the reference object serves this pur-
pose; reference objects of unbounded extent are denoted by the starting address of the stack.
A comparison of these addresses for relative magnitude then represents inclusion of lifetimes.
3.2.3 Expressions
Because of the diversity of machine instruction sets, we can only give the general principles
behind the mapping of expressions here. An important point to remember throughout the
discussion, both here and in Section 3.2.4, is that the quality of the generated code is deter-
mined by the way it treats cases normally occurring in practice rather than by its handling
of the general case. Moreover, local code characteristics have a greater impact than any op-
timizations on the overall quality. Figure 3.5 shows the static frequencies of operations in
a large body of Pascal text. Note the preponderance of memory accesses over computation,
but remember that indexing generally involves both multiplication and addition. Remember
also that these are static frequencies; dynamic frequencies might be quite different because
a program usually spends about 90% of its time in heavily-used regions accounting for less
than 10% of the overall code.
Structure Tree Operator Percent of All Operators
Access a variable 27
Assign 13
Select a field of a record 9.7
Access a value parameter 8.1
Call a procedure 7.8
Index an array (each subscript) 6.4
Access an array 6.1
Compare for equality (any operands) 2.7
Access a variable parameter 2.6
Add integers 2.3
Write a text line 1.9
Dereference a pointer variable 1.9
Compare for inequality (any operands) 1.3
Write a single value 1.2
Construct a set 1.0
not 0.7
and 0.7
Compare for greater (any operands) 0.5
Test for an element in a set 0.5
or 0.4
All other operators 3.8
Figure 3.5: Static Frequencies of Pascal Operators [Carter, 1982]
Single target machine instructions directly implement operations appearing in the struc-
ture tree only in the simplest cases (such as integer arithmetic). A node of the structure
tree generally corresponds to a sequence of machine instructions, which may appear either
directly in the generated code or as a subroutine call. If subroutines are used then they may
be gathered together into an interpreter consisting of a control loop containing a large case
statement. The operations are then simply selectors used to choose the proper case, and
may be regarded as instructions of a new (abstract) machine. This approach does not really
answer the question of realizing language elements on a target machine; it merely changes the
target machine, hopefully simplifying the problem.
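A skeletal interpreter of this kind can be sketched in Pascal; the tiny stack machine shown
here is our own illustration, not a proposal for an actual abstract target machine:

program interpreter;
type opcode = (pushconst, add, mul, stop);
     instruction = record
       code  : opcode;
       value : integer              (* operand of pushconst; unused otherwise *)
     end;
var prog : array[1..6] of instruction;
    stack : array[1..10] of integer;
    ip, sp : integer;
begin
  (* abstract machine code for (2 + 3) * 4 *)
  prog[1].code := pushconst; prog[1].value := 2;
  prog[2].code := pushconst; prog[2].value := 3;
  prog[3].code := add;
  prog[4].code := pushconst; prog[4].value := 4;
  prog[5].code := mul;
  prog[6].code := stop;
  ip := 1; sp := 0;
  while prog[ip].code <> stop do begin   (* the control loop *)
    case prog[ip].code of                (* one case per abstract instruction *)
      pushconst : begin sp := sp + 1; stack[sp] := prog[ip].value end;
      add : begin stack[sp-1] := stack[sp-1] + stack[sp]; sp := sp - 1 end;
      mul : begin stack[sp-1] := stack[sp-1] * stack[sp]; sp := sp - 1 end
    end;
    ip := ip + 1
  end;
  writeln(stack[sp] : 3)                 (* prints 20 *)
end.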
A closed sequence is invariably slower than the corresponding open sequence because of
the cost of the transfers in and out. It would therefore be used only if commensurate savings
in space were possible. Some care must be taken in evaluating the tradeoffs, because both
open and closed sequences usually involve setup code for the operands. It is easy to overlook
this code, making erroneous assumptions about the operand locations, and thereby arrive at
the wrong decision. Recall from Section 3.1.3 that it is sometimes possible to take advantage
of unused operation codes to access closed instruction sequences. Depending upon the details
of the hardware, the time overhead for this method may be either higher or lower than that
of a conventional call. It is probably most useful for implementing facilities that might be
provided by hardware. The typical example is floating point arithmetic on a microprocessor
with integer operations only. A floating point operation usually involves a long sequence of
instructions on such a machine (which may not even be capable of integer multiplication or
division), and thus the entry/exit overhead is negligible. If the user later adds a floating-
point chip, and controls it with the previously unused operation codes, no changes to the
code generator are required. Even when different operation codes are used the changes are
minimal.
An object, label or procedure is addressable if its effective address can be expressed by
the relevant access path of an instruction. For entities that are not addressable, additional
operations and temporary storage are required to compute the effective address. The allow-
able combinations of operation and access function exert a very strong influence upon the
code generation process because of this. On the Motorola 68000, for example, specification
of the operation can be largely separated from selection of the access path, and operand ad-
dressability is almost independent of the operator. Many IBM 370 instructions, on the other
hand, work only when the second operand is in a register. In other cases memory access is
possible, but only via a base register without indexing. This leads to the problem that an
operand may be addressable in the context of one operation but not in the context of another.
When an instruction set contains such asymmetries, the simplest solution is to define the
abstract machine for the source-to-target mapping with a uniform access function, reserving
the resources (usually one or two registers) needed to implement the uniform access function
for any instruction. Many code sequences require additional resources internally in any event.
These can often be standardized across the code sequences and used to provide the uniform
access function in addition. The only constraint on resources reserved for the uniform access
function is that they have no inter-sequence meaning; they can be used arbitrarily within a
sequence.
Consider the tree for an expression. The addressability of entities described by leaves
is determined by the way in which the environment is encoded in the machine state. (We
shall discuss possibilities for environment encoding in Section 3.3.) For entities described by
interior nodes, however, the addressability depends upon the code sequence that implements
the node. It is often possible to vary a code sequence, without changing its cost, to meet
the addressability requirements of another node. Figure 3.6 shows a typical example. Here
the constraints of the IBM 370 instruction set require that a multiplicand be in the odd-
numbered register of a pair, and that the even-numbered register of that pair be free. Similarly,
the optimum mechanism for converting a single-length value to double-length requires its
argument to be in the even register of the pair used to hold its result. An important part of
the source-to-target mapping design is the determination of the information made available
by a node to its neighbors in the tree, and how this information aects the individual code
sequences.
Interior nodes whose operations yield addresses, such as indexing and field selection nodes,
may or may not result in code sequences. Addressability is the key factor in this decision:
No code is required if an access function describing the node's result can be built, and if
that access function is acceptable to the instruction using the result. The richer the set of
L R1,I
A R1,J Result in R1
M R0,K Multiplicand from R1, product to (R0,R1)
D R0,L Dividend from (R0,R1)
a) Code for the expression ((i + j) * k / l)
L R0,I
A R0,J
A R0,K Result in R0
SRDA R0,32 Extend to double, result in (R0,R1)
D R0,L Dividend from (R0,R1)
b) Code for the expression ((i + j + k) / l)
Figure 3.6: Optimum Instruction Sequences for the IBM 370
access functions, the more nodes can be implemented simply by access function restructuring.
In fact, it is often possible to absorb nodes describing normal value operations into access
functions that use their result. Figure 3.7 is a tree for b[i + 12]. As we shall see in Section 3.3,
the local byte array b might have access function 36(13) on an IBM 370 (here register 13 gives
the base address of the local contour, and 36 is the relative byte location of b within that
contour). After loading the value of i into register 1, the effects of the index and addition
nodes can be combined into the access function 48(13,1). This access function (Figure 3.3a)
can be used to obtain the second argument in any RX-format instruction on the IBM 370.
Figure 3.7: Tree for b[i + 12] (an INDEX node whose operands are b and the sum i + 12)
b. But fetching a simple variable has no side effect, and hence the short-circuit evaluation
is not detectable. If c were a parameterless function with a side effect then it should be
invoked prior to the start of the code sequence of Figure 3.8b, and the c in that code sequence
would represent temporary storage holding the function result. Thus we see that questions
of short-circuit evaluation affect only the relative placement of code belonging to the jump
cascade and code for evaluating the operands of the relations.
if (a < b ) and (c = d ) or (e > f ) then statement ;
a) A conditional
L R1,a
C R1,b
BNL L10 Note condition reversal here
L R1,c
C R1,d
BEQ L1 Condition is not reversed here
L10 L R1,e
C R1,f
BNH L2 Reversed
L1 ... Code for statement
L2 ... Code following the conditional
generator for each case clause. The source-to-target mapping must specify the parameters to
be used in making this choice.
condition(e, L1, L2)
L1: clause
L2:
a) if e then clause ;
GOTO L
L1: clause
L: condition(e, L1, L2)
L2:
d) while e do clause ;
L1: clause
condition(e, L2, L1)
L2:
e) repeat clause until e
forbegin(i, e1 , e2 , e3 )
clause
forend(i, e2 , e3 )
f) for i := e1 by e2 to e3 do clause ;
Figure 3.9: Implementation Schemata for Control Statements
required. Further, if the target machine can execute independent instructions in parallel, this
schema provides more opportunity for such parallelism than one in which the test is at the
beginning.
`Forbegin' and `forend' can be quite complex, depending upon what the compiler can
deduce about the bounds and step, and how the language definition treats the controlled
variable. As an example, suppose that the step and bounds are constants less than 2^12, the
step is positive, and the language definition states that the value of the controlled variable is
undefined on exit from the loop. Figure 3.10b shows the best IBM 370 implementation for
this case, which is probably one of the most common. (We assume that the body of the loop
is too complex to permit retention of values in registers.) Note that the label LOOP is defined
within the `forbegin' operation, unlike the labels used by the other iterations in Figure 3.9.
If we permit the bounds to be general expressions, but specify the step to be 1, the general
schema of Figure 3.10c holds. This schema works even if the value of the upper bound is the
largest representable integer, since it does not attempt to increment the controlled variable
after reaching the upper bound. More complex cases are certainly possible, but they occur
only infrequently. It is probably best to implement the abstract operations by subroutine
calls in those cases (Exercise 3.9).
target : array [kmin .. kmax] of address;
k : integer;

k := e;
if (k >= kmin) and (k <= kmax) then goto target[k] else goto L0;
a) General schema for `select' (Figure 3.9c)
     LA   1,e1       e1 = constant < 2^12
LOOP ST   1,i
     ...             Body of the clause
     L    1,i
     LA   2,e2       e2 = constant < 2^12
     LA   3,e3       e3 = constant < 2^12
     BXLE 1,2,LOOP
b) IBM 370 code for special-case forbegin ... forend
i := e1 ; t := e3 ;
if i > t then goto l3 else goto l2;
l1 : i := i + 1;
l2 : : : : (* Body of the clause *)
if i < t then goto l1;
l3 :
c) Schema for forbegin...forend when the step is 1
Figure 3.10: Implementing Abstract Operations for Control Structures
Procedure and function invocations are control structures that also manipulate the state.
Development of the instruction sequences making up these invocations involves decisions
about the form of parameter transmission, and the construction of the activation record, the
area of memory containing the parameters and local variables.
A normal procedure invocation, in its most general form, involves three abstract opera-
tions:
Callbegin: Obtain access to the activation record of the procedure.
always treat the argument as a variable. If the programmer uses a constant, the compiler
must either flag it as an error or move the constant value to a temporary storage location
and transmit the address of that temporary.)
For function results, the compiler generally produces temporaries of suitable type at the
call site and in the function. Within the function, the result is assigned to the local temporary.
Upon return, as in the case of a result parameter, the local temporary is copied into the global
temporary. The global temporary is only needed if the result cannot be used immediately.
(An example of this case is the value of cos(x) in cos(x) + sin(y).)
Results delivered by function procedures can, in simple cases, be returned in registers. (For
compatibility with jump cascades, it may be useful for a Boolean function to encode its result
by returning to two different points.) Transmission of composite values as function results
can be difficult, especially when these are arrays whose sizes are not known to the caller. This
means that the caller cannot reserve storage for the result in his own environment a priori;
as a last resort such objects may be left on the heap (Section 3.3.3).
The only advantage of static allocation then consists of the fact that no operations for storage
reservation or release need be generated at block or procedure entry and exit.
3.3.2 Dynamic Storage Management Using a Stack
As we have already noted in Section 2.5.2, all declared values in languages such as Pascal and
SIMULA have restricted lifetimes. Further, the environments in these languages are nested:
The extent of all objects belonging to the contour of a block or procedure ends before that of
objects from the dynamically enclosing contour. Thus we can use a stack discipline to manage
these objects: Upon procedure call or block entry, the activation record containing storage for
the local objects of the procedure or block is pushed onto the stack. At block end, procedure
return or a jump out of these constructs, the activation record is popped off the stack. (The
entire activation record is stacked; we do not deal with single objects individually!)
An object of automatic extent occupies storage in the activation record of the syntactic
construct with which it is associated. The position of the object is characterized by the base
address, b, of the activation record and the relative location (offset), R, of its storage within
the activation record. R must be known at compile time but b cannot be known (otherwise
we would have static storage allocation). To access the object, b must be determined at run
time and placed in a register. R is then either added to the register and the result used
as an indirect address, or R appears as the constant in a direct access function of the form
`register+constant'.
Every object of automatic extent must be decomposable into two parts, one of which has
a size that can be determined statically. (The second part may be empty.) Storage for the
static parts is allocated by the compiler, and makes up the static portion of the activation
record. (This part is often called the first order storage of the activation record.) When a
block or procedure is activated, the static part of its activation record is pushed onto the
stack. If the activation record contains objects whose sizes must be determined at run time,
this determination is carried out and the activation record extended. The extension, which
may vary in size from activation to activation, is often called the second order storage of the
activation record. Storage within the extension is always accessed indirectly via information
held in the static part; in fact, the static part of an object may consist solely of a pointer to
the dynamic part.
An array with dynamic bounds is an example of an object that has both static and
dynamic parts. In most languages, the number of dimensions of an array is fixed, so the size
of the array descriptor is known at compile time. Storage for the descriptor is allocated by the
compiler in the static part of the activation record. On encountering the declaration during
execution, the bounds are evaluated and the amount of storage needed for the array elements
is determined. The activation record is extended by this amount and the array descriptor is
initialized appropriately. All accesses to elements of the array are carried out via the array
descriptor.
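As a concrete illustration (our sketch, with assumed field names), the static part of a one-dimensional array with dynamic bounds might be declared as follows; only this descriptor occupies first order storage, while elements points into the second order extension:

type
  array_descriptor = record
    lower, upper : integer;      (* bounds, evaluated on activation *)
    element_size : integer;      (* in storage units *)
    elements : address           (* start of the element storage *)
  end;

Access to a[i] then validates lower <= i <= upper and computes the address elements + (i - lower) * element_size.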
We have already noted that at compile time we do not know the base address of an
activation record; we know only the range to which it belongs. From this we must determine
the base address, even in the case where recursion leads to a number of activation records
belonging to the same range. The range itself can be specied by its block nesting depth, bnd,
defined according to the following rules based on the static structure of the program:
The main program has bnd = 1.
A range is given bnd = t + 1 if and only if the immediately enclosing range has bnd = t.
Bnd = t indicates that during execution of the range the state consists of a total of t
nested contours.
If, as in all ALGOL-like languages, the scopes of identifiers are statically nested then
at every point in the execution history of a program there is at most one activation record
accessible at a given nesting depth. The base address of a particular activation record can
then be found by noting the corresponding nesting depth at compile time and setting up a
mapping s : nestingdepth ! baseaddress during execution. The position of an object in the
xed part of the activation record is fully specied by the pair (bnd; R); we shall therefore
speak of `the object (bnd; R)'.
The mapping s changes upon range entry and exit, procedure call and return, and jumps
out of blocks or procedures. Updating s is thus one of the tasks (along with stack pointer
updating and parameter or result transmission) of the state-altering operations that we met
in Section 2.5.2. We shall describe them semi-formally below, assuming that the stack is
described by:
k : array[0 .. upper limit ] of storage cell ; k top : 0 .. upper limit ;
We assume further that a storage cell can hold exactly one address, and we shall treat address
variables as integer variables with which we can index k.
The contour nesting and pointer to dynamic predecessor required by the contour model
are represented by address values stored in each activation record. Together with the re-
turn address, and possibly additional information depending upon the implementation, they
constitute the `administrative overhead' of the activation record. A typical activation record
layout is shown in Figure 3.11; the corresponding state change operations are given in Figure
3.12. We have omitted range entry/exit operations. As noted in Section 2.5.2, procedures and
blocks can be treated identically by regarding a block as a parameterless procedure called `on
the spot', or contours corresponding to blocks can be eliminated and objects lying upon them
can be placed on the contour of the enclosing procedure. If blocks are to be given separate
activation records, the block entry/exit operations are identical to those for procedures except
that no return address is saved on entry and ip is not set on exit. Jumps out of blocks are
treated exactly as shown in Figure 3.12c in any case.
        +----------------------------------+
        |       Second-order storage       |
        +----------------------------------+
        |               ...                |
   2    |          Return address          |
   1    |  Pointer to dynamic predecessor  |   First-order storage
   0    |  Pointer to static predecessor   |
        +----------------------------------+
Figure 3.11: Typical Activation Record Layout
The procedure and jump addresses indicated by the comments in Figures 3.12a and c
are supplied by the compiler; the environment pointers must be determined at run time. If
a procedure is invoked directly, by stating its identifier, then it must lie within the current
environment and its static predecessor can be obtained from the stack by following the chain
of static predecessors until the proper block nesting depth is reached:
environment := ep;
for i := bndcaller downto bndprocedure do
  environment := k[environment];
The value (bndcaller - bndprocedure) is known at compile time and is usually small, so the loop is
sometimes `unrolled' to a fixed sequence of environment := k[environment] operations.
k[k top] := (* static predecessor of the procedure *);
k[k top + 1] := ep; (* dynamic predecessor *)
k[k top + 2] := ip; (* return address *)
ep := k top; (* current environment *)
k top := k top + "size"; (* first free location *)
ip := (* procedure code address *)
a) Procedure entry
k top := ep;
ep := k[k top + 1]; (* back to the dynamic predecessor *)
ip := k[k top + 2];
b) Procedure exit
k top := ep;
ep := (* target environment of the jump *);
while k[k top + 1] <> ep do
k top := k[k top + 1]; (* leave all intermediate environments *)
ip := (* target address of the jump *);
c) Jump out of a procedure
Figure 3.12: Environment Change Operations
When a procedure is passed as a parameter and then the parameter is called, the static
predecessor cannot be obtained from the stack because the called procedure may not be in
the environment of the caller. (Figures 2.3 and 2.5 illustrate this problem.) Thus a procedure
parameter must be represented by a pair of addresses: the procedure entry point and the
activation record address for the environment statically enclosing the procedure declaration.
This pair is called a closure. When a procedure parameter is invoked, the address of the
static predecessor is obtained from the closure that represents the parameter. Figure 3.13
shows the stack representing the contours of Figure 2.5; note the closures appearing in the
activation records for procedure p.
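Expressed as a record, a closure is simply the pair just described; a minimal sketch (ours, with assumed names):

type
  closure = record
    code : address;          (* procedure entry point *)
    environment : address    (* activation record of the contour statically
                                enclosing the procedure declaration *)
  end;

When the parameter is invoked, the environment field, rather than a pointer found by chaining down the stack, becomes the static predecessor of the new activation record.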
Jumps out of a procedure also involve changing the state (Figure 3.12c). The mechanism
is essentially the same as that discussed above: If the label is referenced directly then it lies in
the current environment and its environment pointer can be obtained from the stack. A label
variable or label parameter, however, must be represented by a closure and the environment
pointer obtained from that closure.
Access to any object in the environment potentially involves a search down the chain of
static predecessors for the pointer to the activation record containing that object. In order to
avoid the multiple memory accesses required, a copy of the addresses can be kept in an array,
called a display, indexed by the block nesting depth. Access to the object (bnd; R) is therefore
provided by display[bnd] + R; we need only a single memory access, loading display[bnd] into
a base register, to set up the access function.
The Burroughs 6000/7000 series computers have a 32-register display built into the hard-
ware. This limits the maximum block nesting depth to 32, which is no limitation in practice.
Even a restriction to 16 is usually no problem, but 8 is annoying. Thus the implementation
of a display within the register set of a multiple-register machine is generally not possible,
because it leads to unnatural restrictions on the block nesting depth. The display can be
k top = 22    ep = 19    ip = address of label 2

22:                                      (first free location)
21:  location after 1: f
20:  12  (dynamic predecessor)            activation record for procedure q
19:  5   (static predecessor)
18:  i = 0
17:  11  (reference to i)
16:  5   (q's environment)
15:  entry point address for q            activation record for procedure p
14:  location after p(q, i)
13:  5   (dynamic predecessor)
12:  0   (static predecessor)
11:  i = 2
10:  4   (reference to k)
 9:  0   (empty's environment)
 8:  entry point address for empty        activation record for procedure p
 7:  location after p(empty, k)
 6:  0   (dynamic predecessor)
 5:  0   (static predecessor)
 4:  k = 0
 3:  n = 7
 2:  0   (return address)                 activation record for procedure outer
 1:  0   (dynamic predecessor)
 0:  0   (static predecessor)

Figure 3.13: Stack Configuration Corresponding to Figure 2.5
allocated to a fixed memory location, or we might keep only a partial display (made up of the
addresses of the most-frequently accessed activation records) in registers. Which activation
record addresses should be kept is, of course, program-dependent. The current activation
record address and that of the outermost activation record are good choices in Pascal; the
latter should probably be replaced with that of the current module in an implementation of
any language providing modules.
If any sort of display, partial or complete, is used then it must be kept up to date as the
state changes. Figure 3.14 shows a general procedure for bringing the display into synchronism
with the static chain. It will alter only those elements that need alteration, halting when the
remainder is guaranteed to be correct. In many cases the test for termination takes more
time than it saves, however, and a more appropriate strategy may be simply to reload the
entire display from the static chain.
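Figure 3.14 is not reproduced here, but a procedure with the behavior just described can be sketched as follows, assuming the conventions of Figures 3.11 and 3.12 (in particular that k[a] holds the static predecessor of the activation record at address a):

procedure update_display (bnd : integer; a : address);
  (* make display[1..bnd] agree with the static chain rooted at a *)
begin
  if display[bnd] <> a then
    begin
      display[bnd] := a;
      if bnd > 1 then update_display (bnd - 1, k[a])
    end
  (* otherwise the remaining entries are already correct *)
end;

The test against display[bnd] is the termination check discussed above; reloading the entire display from the static chain corresponds to deleting that test.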
Note that the full generality of update display is needed only when returning from a pro-
cedure or invoking a procedure whose identity is unknown. If a procedure at level bndnew in
the current addressing environment is invoked, the single assignment display[bndnew] := a
suffices. (Here a is the address of the new activation record.) Display manipulation can
become a significant overhead for short procedures operating at large nesting depths. Recognition
of special cases in which this manipulation can be avoided or reduced is therefore an
released only if the program fragment (block, procedure, class) to which it belongs has been
left and no pointers to objects within this activation record exist.
Heap allocation is particularly simple if all objects required during execution can fit into
the designated area at the same time. In most cases, however, this is not possible. Either
the area is not large enough or, in the case of virtual storage, the working set becomes too
large. A detailed discussion of heap storage management policies is beyond the scope of this
book (see Section 3.5 for references to the relevant literature). We shall only sketch three
possible recycling strategies for storage and indicate the support requirements placed upon
the compiler by these strategies.
If a language provides an explicit `release' operation, such as Pascal's dispose or PL/1's
free, then heap storage may be recycled by the user. This strategy is simple for the compiler
and the run-time system, but it is unsafe because access paths to the released storage may
still exist and be used eventually to access recycled storage with its earlier interpretation.
The release operation, like the allocation operation, is almost invariably implemented as a
call on a support routine. Arguments that describe the size and alignment of the storage area
must be supplied to these calls by the compiler on the basis of the source type of the object.
Automatic reclamation of heap storage is possible only if the designers of a language
have considered this and made appropriate decisions. The key is that it must be possible
to determine whether or not a variable contains an address. For example, only a variable
of pointer type may contain an address in a Pascal program. A special value, nil, indicates
the absence of a pointer. When a pointer variable is created, it could be initialized to nil.
Unfortunately, Pascal also provides variant records and does not require such records to have
a tag field indicating which variant is in force. If one variant contains a pointer and another
does not, it is impossible to determine whether or not the corresponding variable contains a
pointer. Detailed discussion of the tradeoffs involved in such a decision by a language designer
is beyond the scope of this text.
Storage can be recycled automatically by a process known as garbage collection, which
operates in two steps:
Mark. All accessible objects on the heap are marked as being accessible.
Collect. All heap storage is scanned. The storage for unmarked objects is recycled, and
all marks are erased.
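The mark step is essentially a graph traversal from the externally-held pointers. A minimal recursive sketch (ours), assuming the run-time system can test and set a mark bit and enumerate an object's pointer fields, which is exactly the support discussed below:

procedure mark (p : heap_pointer);
  var i : integer;
begin
  if p <> nil then
    if not marked (p) then
      begin
        set_mark (p);
        for i := 1 to pointer_count (p) do
          mark (pointer_field (p, i))    (* follow each contained pointer *)
      end
end;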
This has the advantage that no access paths can exist to recycled storage, but it requires
considerable support from the compiler and leads to periodic pauses in program execution. In
order to carry out the mark and collect steps, it must be possible for the run-time system to
find all pointers into the heap from outside, find all heap pointers held within a given object
on the heap, mark an object without destroying information, and find all heap objects on a
linear sweep through the heap. Only the questions of finding pointers affect the compiler;
there are three principal possibilities for doing this:
1. The locations of all pointers are known beforehand and coded into the marking algo-
rithm.
2. Pointers are discovered by a dynamic type check. (In other words, by examining a
storage location we can discover whether or not it contains a pointer.)
3. The compiler creates a template for each activation record and for the type of every
object that can appear on the heap. Pointer locations and (if necessary) the object
length can be determined from the template.
62 Properties of Real and Abstract Machines
Pointers in the stack can also be indicated by linking them together into a chain, but this
would certainly take too much storage on the heap.
Most LISP systems use a combination of (1) and (2). For (3) we must know the target type
of every pointer in order to be able to select the proper template for the object referenced.
This could be indicated in the object itself, but storage would be saved if the template carried
the number or address of the proper template as well as the location of the pointer. In this
manner we also solve the problem of distinguishing a pointer to a record from the pointer to
its first component. Thus the template for an ALGOL 68 structure could have the following
structure:
Length of the structure (in storage units)
For each storage unit, a Boolean value `reference'
For each reference, the address of the template of the referenced type.
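In Pascal-like terms such a template might be represented as in the following sketch; the parallel-array layout and the max_units bound are our assumptions:

type
  template_ptr = ^template;
  template = record
    length : integer;              (* of the structure, in storage units *)
    reference : array [1 .. max_units] of boolean;
                                   (* does this unit hold a pointer? *)
    referenced : array [1 .. max_units] of template_ptr
                                   (* template of the referenced type *)
  end;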
If dynamic arrays or variants are allowed in records then single Boolean values indicating
the presence of pointers are no longer adequate. In the first case, the size and number of
components are no longer known statically. The template must therefore indicate the location
of descriptors, so that they can be interpreted by the run-time system. In the second case the
position of the variant selector and the different interpretations based upon its value must be
known. If, as in Pascal, variant records without explicit tag fields are allowed, then garbage
collection is no longer possible.
Garbage collection also requires that all internal temporaries and registers that can contain
references must be identified. Because this is very difficult in general it is best to arrange the
generated code so that, whenever a garbage collection might occur, no references remain in
temporaries or registers.
The third recycling strategy requires us to attach a counter to every object in the heap.
This counter is incremented whenever a reference to the object is created, and decremented
whenever a reference is destroyed. When the counter is decremented to its initial value of 0,
storage for the object can be recycled because the object is obviously inaccessible. Mainte-
nance of the counters results in higher administrative and storage costs, but the overheads are
distributed. The program simply runs slower overall; it does not periodically cease normal
operation to reclaim storage. Unfortunately, the reference counter method does not solve all
problems:
Reference counts in a cyclic structure will not become 0 even after the structure as a
whole becomes inaccessible.
If a counter overflows, the number of references to the object is lost.
A complete solution requires that the reference counters be backed up by a garbage col-
lector.
To support storage management by reference counting, the compiler must be able to iden-
tify all assignments that create or destroy references to heap objects. The code generated for
such assignments must include appropriate updating of the reference counts. Difficulties arise
when variant records may contain references, and assignments to the tag field identifying the
variant are allowed: When such an assignment alters the variant, it destroys the reference
even though no direct manipulation of the reference has taken place. Similar hidden destruction
occurs when there is a jump out of a procedure that leads to deletion of a number of
activation records containing references to heap objects. Creation of references is generally
easier to keep track of, the most difficult situation probably being assignment of a composite
value containing references as minor components.
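For the simplest case, a pointer assignment, the generated code must bracket the store with count updates. A sketch (ours; the count field and the recycle routine are assumed):

procedure assign_reference (var dest : heap_pointer; source : heap_pointer);
begin
  if source <> nil then
    source^.count := source^.count + 1;     (* a reference is created *)
  if dest <> nil then
    begin
      dest^.count := dest^.count - 1;       (* a reference is destroyed *)
      if dest^.count = 0 then recycle (dest)
    end;
  dest := source
end;

Incrementing before decrementing makes the sequence safe when dest and source already refer to the same object.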
Section 1 of the mapping specification relies heavily on the manufacturer's manual for
the target machine. It describes the machine as it will be seen by the code generator, with
anomalies smoothed out and omitted operations (to be implemented by code sequences or
subroutines) in place. The actual details of realizing the abstraction might be included, or this
information might be the subject of a separate specification. We favor the latter approach,
because the abstraction should be almost entirely language-independent. It is clear that
the designer must decide which facilities to include in the abstract machine and which to
implement as part of the operation mapping. We cannot give precise criteria for making this
choice. (The problem is one of modular decomposition, with the abstraction constituting a
module and the operation encoding using the facilities of that module.)
The most difficult part of Section 2 of the mapping specification is Section 2.3, which
is tightly coupled to Section 3.1. Procedure mechanisms advocated by the manufacturer
are often ill-suited to the requirements of a given language. Several alternative mechanisms
should be explored, and detailed cost estimates prepared on the basis of some assumptions
about the relative numbers of calls at various static nesting depths and accesses to variables.
It is imperative that these assumptions be carefully stated, even though there is only tenuous
justification for them; unstated assumptions lead to conflicting judgements and usually to
a suboptimal design. Also, if measurements later indicate that the assumptions should be
changed, the dependence of the design upon them is clearly stated.
Control structure implementation can be described adequately using notation similar to
that of Figure 3.9. When a variety of information is exchanged among nodes of an expres-
sion, however, description of the encoding for each node is complicated. The best notation
available seems to be the extended-entry decision table, which we discuss in this context in
Section 10.3.2.
A mapping specification is arrived at by an iterative process, one that should be allotted
sufficient time in scheduling a compiler development project. The cost is dependent upon
the complexities of both the source language and the target machine. In one specific case,
involving a Pascal implementation for the Motorola 68000, two man-months of effort was
required over a six-month period. One person should be responsible for the specification, but
at least one other (and preferably several) should be involved in frequent critical reviews. The
objective of these reviews should be to test the reasoning based upon the stated assumptions,
making certain that it has no flaws. Challenging the assumptions is less important unless
specific evidence against them is available.
Sections 2.1 and 2.2 of the mapping specification should probably be written first. They
are usually straightforward, and give a basis on which to build. Sections 2.3 and 3.1 should be
next. As indicated earlier, these sections interact strongly and involve difficult decisions. The
remainder of Section 3 is tedious, but should be carried out in full detail. It is only by being
very explicit here that one learns the quirks and problems of the machine, and discovers the
flaws in earlier reasoning about storage mapping. Section 1 should be done last, not because
it is the least important, but because it is basically a modification of the machine manual in
the light of the needs generated by Section 3.
has been defined by a committee of IEEE [Stevenson, 1981]. McLaren [1970] provides a
comprehensive discussion of data structure packing and alignment. Randell and Russell
[1964] detail the implementation of activation record stacks and displays in the context of
ALGOL 60; Hill [1976] updates this treatment to handle the problems of ALGOL 68.
Static storage management is not the only possible strategy for FORTRAN implementa-
tions. Both the 1966 and 1978 FORTRAN standards restrict the extent of objects, and thus
permit dynamic storage management via a stack. We have not pursued the special storage al-
location problems of COMMON blocks and EQUIVALENCE statements here; the interested
reader is referred to Chapter 10 of the book by Aho and Ullman [1977] and the original
literature cited there.
Our statements about the probability of access to objects at various nesting depths are
debatable because no really good statistics exist. These probabilities are dependent upon the
hierarchical organization of the program, and may vary considerably between applications
and system programs.
The fact that a procedure used as a parameter must carry its environment with it ap-
pears in the original treatment of LISP [McCarthy, 1960]. Landin [1964] introduced the
term `closure' in connection with his mechanization of Lambda expressions. More detailed
discussions are given by Moses [1970] and Waite [1973a]. Hill [1976] applied the same
mechanism to the problem of dynamic scope checking in ALGOL 68.
An overall treatment of storage management is beyond the scope of this book. Knuth
[1968a] provides an analysis of the various general strategies, and a full discussion of most
algorithms known at the time. A general storage management package that permits a wide
range of adaptation was presented by Ross [1967]. The most important aspect of this package
is the interface conventions, which are suitable for most storage management modules.
Both general principles of and algorithms for garbage collection and compaction (the
process of moving blocks under the user's control to consolidate the free space into a single
block) are covered by Waite [1973a]. Wegbreit [1972] discusses a specic algorithm with
an improved worst-case running time.
Several authors [Deutsch and Bobrow, 1976; Barth, 1977; Morris, 1978] have shown
how to reduce the cost of reference count systems by taking special cases into account. Clark
and Green [1977] demonstrated empirically that over 90% of the objects in typical LISP
programs never have reference counts greater than 1, a situation in which the technique
operates quite efficiently.
Exercises
3.1 List the storage classes and access paths available on some machine with which you are
familiar. Did you have difficulty in classifying any of the machine's resources? Why?
3.2 Consider access to data occupying a part of a word on some machine with which you
are familiar. Does the best code depend upon the bit position within the word? Upon
the size of the accessed field? Try to characterize the set of `best' code sequences. What
information would you need to choose the proper sequence?
3.3 [Steele, 1977] Consider the best code for implementing multiplication and division of
an integer by a power of 2 on some machine with which you are familiar.
(a) Would multiplication by 2 best be implemented by an add, a multiply or a shift?
Give a detailed analysis, taking into account the location and possible values of
the multiplicand.
(b) If you chose to use a shift for division, would the proper result be obtained when
the dividend was negative? Explain.
(c) If your machine has a condition code that is set as a side effect of arithmetic
operations, would it be set correctly in all of the cases discussed above?
3.4 For some computer with which you are familiar, design encodings for the elementary
types boolean , integer , real of Pascal. Carefully defend your choice.
3.5 Consider the representation of a multi-dimensional array.
(a) In what manner can a user of ALGOL, FORTRAN or Pascal determine whether
the elements are stored in row- or column-major order?
(b) Write optimum code for some computer with which you are familiar that imple-
ments the following doubly-nested loop over an object of type array [1..m, 1..n]
of integer stored in row-major order. Do not alter the sequence of assignments
to array elements. Compare the result with the same code for an array stored in
column-major order.
for i := 1 to m do
  for j := 1 to n do
    a[i, j] := 0;
(c) Explain why a test that the effective address of an array element falls within
the storage allocated to the array is not sufficient to guarantee that the access is
defined.
3.6 Carefully describe the implementation of the access function for an array element (Sec-
tion 3.2.2) in each of the following cases:
(a) The fictitious starting address lies outside of the address space of the computer.
(b) The computer provides only base registers (i.e. the registers involved in the access
computation of Section 3.1.3 cannot hold signed values).
3.7 Consider a computer requiring certain data items to be stored with alignment 2, while
others have no alignment constraints. Give an algorithm that will rearrange any arbi-
trary record to occupy minimum storage. Can this algorithm be extended to a machine
whose alignment constraints require addresses divisible by 2, 4 and 8?
3.8 Give a mapping of a Pascal while statement that places the condition at the begin-
ning and has the same number of instructions as Figure 3.9d. Explain why there is
less opportunity for parallel execution in your mapping than in Figure 3.9d. Under
what circumstances would you expect your expansion to execute in less time than Fig-
ure 3.9d? What information would the compiler need in order to decide between these
schemata on the basis of execution time?
3.9 Consider the mapping of a BASIC FOR statement with the general form:
FOR I= e1 TO e2 STEP e3
...
NEXT I
Give implementations of forbegin and forend under each of the following conditions:
(a) e1 =1, e2 =10, e3 =1
(b) e1 =1, e2 =10, e3 =7
type
  tokens = (                          (* classification of LAX tokens *)
    identifier,                       (* A.1.0.2 *)
    integer denotation,               (* A.1.0.6 *)
    floating point denotation,        (* A.1.0.7 *)
    plus, ..., equivalent,            (* specials: A.1.0.10 *)
    and kw, ..., while kw);           (* keywords: A.1.0.11 *)

  abstract token = record
    location : coordinates;           (* for error reports *)
    case classification : tokens of
      identifier : (sym : symbol);
      integer denotation : (intv : integer value);
      floating point denotation : (fptv : real value);
    end;
A LAX identifier has no intrinsic meaning that can be determined from the character string
constituting that identifier. As a basic symbol, therefore, the only property distinguishing
one identifier from another is its external representation. This property is embodied in the
sym field of the token. Section 4.2.1 will consider the type symbol, and explain how the
external representation is encoded.
The field intv or fptv is a representation of the value denoted by the source language
denotation that the token abstracts. There are several possibilities, depending upon the goals
of the particular compiler; Section 4.2.2 considers them in detail.
abstract syntax. This means that any node corresponding to a rule dening any of these will
have the attributes of an expression attached to it. Figure 4.2b indicates which of the names
defined by rules used in Figure 4.2a are associated with the same abstract syntax construct.
(tree: an A.4.0.2 node whose subtrees are rooted in A.4.0.16a and A.4.0.9b nodes,
each with A.1.0.2 leaves)
a) Structure
expression , assignment , disjunction , conjunction ,
comparison , relation , sum , term , factor , primary :
primode , postmode : entity
name :
mode : entity
identifier :
sym : symbol
ent : entity
b) Attributes
Figure 4.2: Structure Tree for x := y + z
The sym attribute of an identifier is just the value of the sym field of the corresponding
token (Figure 4.1). This attribute is known as soon as the node to which it is attached is
created. We call such attributes intrinsic. All of the other attributes in the tree must be
computed. The details of the computations will be covered in Chapters 8 and 9; here we
merely sketch the process.
Ent characterizes the object (for example, a particular integer variable) corresponding to
the identifier sym. It is determined by the declarations valid at the point where the identifier
is used, and gives access to all of the declarative information. Section 4.2.3 discusses possible
representations for an entity .
The mode attribute of a name is the type of the object named. In our example it can
be obtained directly from the declarative information made accessible by the ent attribute
of the descendant node. In any case, it is computed on the basis of attributes appearing
in the `A.4.0.16a' node and its descendants. The term synthesized is used to describe such
attributes.
Two types are associated with each expression node in the tree. The rst, primode , is the
type determined without regard to the context in which the expression is embedded. This
is a synthesized attribute, and in our example the primode of an expression defined by an
`A.4.0.15b' node is simply the mode of the name below it. The second type, postmode, is the
type demanded by the context in which the expression is embedded. It is computed on the
basis of attributes of the expression node, its siblings, and its ancestors. Such attributes are
called inherited.
If primode ≠ postmode then either a semantic error has occurred or a coercion is neces-
sary. For example, if y and z in Figure 4.2 were declared to be of types boolean and real
respectively then there is an error, whereas if they were declared to be integer and real
then a coercion would be necessary.
Three classes of operation (creation, access and assignment) are necessary to manipulate
the structure tree. A creation operation establishes a new node of a specied type. Assignment
operations are used to interconnect nodes and to set attribute values, while access operations
are used to extract this information. With these operations we can build trees, traverse them
computing attribute values, and alter their structure. Structure tree operations are invoked
as the source program is parsed, constructing the tree and setting intrinsic attribute values.
One or more additional traversals of the completed tree may be necessary to establish all
attribute values. In some cases the structure of the tree may be altered during attribute
computation. Chapter 8 explains how the necessary traversals of the structure tree can be
derived from the dependence relations among the attributes. (Figure 4.3 shows some basic
traversal strategies.)
process node A;
if node A is not a leaf then
process all subtrees of A from left to right;
a) Prefix traversal
if node A is not a leaf then
process all subtrees of A from left to right;
process node A;
b) Postfix traversal
process node A;
while subtrees of A remain do
begin
process next (to the right) subtree of A;
process node A;
end;
c) Hybrid traversal
Figure 4.3: Traversal Strategies
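The three operation classes described above might be collected into a module interface along the following lines; this is a sketch, and all names are hypothetical:

function new_node (kind : node_kind) : node_ptr;                     (* creation *)
procedure set_son (father : node_ptr; i : integer; son : node_ptr);  (* assignment *)
procedure set_attribute (n : node_ptr; a : attribute_id; v : attribute_value);
function son (father : node_ptr; i : integer) : node_ptr;            (* access *)
function attribute (n : node_ptr; a : attribute_id) : attribute_value;

The traversals of Figure 4.3 are then expressed entirely in terms of son and the attribute operations.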
The result of processing a structure tree is a collection of related information. It may
be possible to produce this result without ever actually constructing the tree. In that case,
the structure and attributes of the tree were effectively embedded in the processing code.
Another possibility is to have an explicit data structure representing the tree. Implementation
constraints often prevent the compiler from retaining the entire data structure in primary
memory, and secondary storage must be used. If the secondary storage device is randomly-
addressable, only the implementation of the structure tree operations need be changed. If
it is sequential, however, constraints must be placed upon the sequences of invocations that
are permitted. An appropriate set of constraints can usually be derived rather easily from a
consideration of the structure tree traversals required to compute the attributes.
Any of the traversal strategies described by Figure 4.3 could be used with a sequential
storage device: In each case, the operation `process node A' implies that A is the currently-
accessible element of the device. It may be read, altered, and written to another device.
The remaining operations advance the device's `window', making another element accessible.
Figure 4.4 illustrates the correspondence between the tree and the sequential file. The letters
in the nodes of Figure 4.4a stand for the attribute information. In Figures 4.4b and 4.4c, the
letters show the position of this information on the file. Figure 4.4d differs from the others
in that each interior node is associated with several elements of the file. These elements
correspond to the prefix encounter of the node during the traversal (flagged with `('), some
number of infix encounters (flagged with `,'), and the postfix encounter (flagged with `)').
Information from the node could be duplicated in several of these elements, or divided among
them.
      a
     / \
    b   c
   / \   \
  d   e   f
         / \
        g   h
a) A tree
d e b g h f c a
b) Postfix linearization
a b d e c f g h
c) Prefix linearization
a  b  d  b  e  b  a  c  f  g  f  h  f  c  a
(  (     ,     )  ,  (  (     ,     )  )  )
d) Hybrid linearization
Figure 4.4: Linearization by Tree Traversal
The most appropriate linearization of the tree on the basis of tree traversals and tree
transformations is heavily dependent upon the semantic analysis, optimization and code gen-
eration tasks. We shall return to these questions in Chapter 14. Until then, however, we shall
assume that the structure tree may be expressed as a linked data structure.
stract nature of the computation graph: It uses target operations, but not target instructions,
separating operations from access paths. Moreover, the concept of a value has been separated
from that of a variable. As we shall see in Chapter 13, this is a crucial point for common
subexpression recognition.
(Figure residue: a computation graph whose tuples include SUB(i, j) feeding the
conditional jumps JZERO exit and JNEG, branches that recompute SUB(i, j) and
STORE the result into j or i, and two address computations ADR a, VAL 4, MUL,
PA ending in STI and LDI respectively.)

t1: i↑              t4: j↑
t2: t1 × 4          t5: t4 × 4
t3: a + t2          t6: a + t5
        t7: t3 := t6
Figure 4.7: Human-Readable Representation of Figure 4.1.3
computations. These characteristics are largely independent of both the source language and
the target computer.
The operations necessary to manipulate the target tree fall into the same classes as those
necessary to manipulate the structure tree. As with the structure tree, memory constraints
may require that the target tree be placed in secondary memory. The most reasonable lin-
earization to use in this case is one corresponding closely to the structure of a normal symbolic
assembly language.
Figure 4.8 gives a typical layout for a target tree node. Machine op would be a variant
record that could completely describe any target computer instruction. This record might
have fields specifying the operation, one or more registers, addresses and addressing modes.
Similarly, constant specification must be capable of describing any constant representable
on the target computer. For example, the specification of a literal constant would be similar
to that appearing in a token (Figure 4.1 and Section 4.2.2); an address constant would be
specified by a pointer to an expression node defining the address. In general, the amount of
space to be occupied by the constant must also be given.
type
  instructions = (                    (* classification of target abstractions *)
    operation,                        (* machine instruction *)
    constant,                         (* constant value *)
    label,                            (* address definition *)
    sequence,                         (* code sequence *)
    expression);                      (* address expression *)

  target node = ↑t node block;
  t node block = record
    link : target node;
    case classification : instructions of
      operation : (instr : machine op);
      constant : (value : constant specification);
      label : (addr : address);
      sequence : (seq, origin : target node);
      expression : (rator : expr op; rand 2 : target node);
    end;
operand. It is important to stress that this attribute is not set by the code generator; the
code generator is responsible only for establishing the label node and any linkages to it.
A target program may consist of an arbitrary number of code sequences, each of which
consists of instructions and/or data placed contiguously in the target computer memory. Each
sequence appears in the target tree as a list of operation, constant and label nodes rooted
in a sequence node. If the origin eld of the sequence node species an address expression
then the sequence begins at the address which is the value of that expression. Thus the
placement of a sequence can be specied relative to another sequence or absolutely in the
target computer memory. In the absence of an origin expression, a sequence will be placed
in an arbitrary position that guarantees no overlap between it and any other sequence not
based upon it. (A sequence s1 is based upon a sequence s2 when the origin expression of s1
depends upon a label node in s2 or in some sequence based upon s2 .) Related code sequences
whose origin expressions result in gaps between them serve to reserve uninitialized storage,
while overlapping sequences indicate run-time overlays.
Address expressions may contain integers and machine addresses, combined by the four
basic integer operations with the normal restrictions for subexpressions having machine ad-
dresses as operands. The code generator must guarantee that the result of an address expression
will actually fit into the field in which it is being used. For some machines, this
guarantee cannot be made in general. As a result, either restrictions must be placed upon the
expressions used by the code generator or the assembler must take over some aspects of the
code generation task. Examples of the latter are the final selection of an instruction from a
set whose members differ only in address field size (e.g. short vs. long jumps), and selection
of a base register from a set used to access a block of memory. Chapter 11 will consider such
problems in detail.
Although the symbol table is used primarily for identifiers, we advocate inclusion of keywords
as well. No separate recognition procedure is then required for them. With this
understanding, we shall continue to speak of the symbol table as though its only contents
were identifiers.
The symbol is used later as a key to access the identifier's attributes, so it is often encoded
as a pointer to a table containing those attributes. A pointer is satisfactory when only one such
table exists and remains in main storage. Positive integers provide a better encoding when
several tables must be combined (as for separate compilation in Ada) or moved to secondary
storage. In the simplest case the integers chosen would be 1, 2, ...
Identifiers may be character strings of any length. Since it may be awkward to store
a table of strings of various lengths, many compilers either fix the maximum length of an
identifier or check only a part of the identifier when computing the mapping. We regard
either of these strategies as unacceptable. Clearly the finite size of computer memory will
result in limitations, but these should be placed on the total number of characters rather
than the length of an individual identifier. Failure to check the entire identifier may result in
incorrect analysis of the source program with no indication to the programmer.
The solution is to implement the symbol table as two distinct components: a string table
and a lookup mechanism. The string table is simply a very large, packed array of characters,
capable of holding all of the distinct identifiers appearing in a program. It is implemented
using a conventional virtual storage scheme (Exercise 4.4), which provides for allocation of
storage only as it is needed. The string forms of the identifiers are stored contiguously in this
array, and are specified by initial index and length.
In view of the large number of entries in the symbol table (often resulting mainly from
standard identifiers), hash techniques are preferable to search trees for implementing the
lookup mechanism. The length of the hash table must be specified statically, before the
number of identifiers is known, so we choose the scheme known as `open hashing' or `hash
with chaining': A computation is performed on the string to select one of M lists, which is
then searched sequentially. If the computation distributes the strings uniformly over the lists,
then the length of each will be approximately (number of distinct identifiers)/M. By making
M large enough the lengths of the lists can be reduced to one or two items.
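A sketch of this organization (ours; the entry layout and the string table comparison string_eq are assumptions):

const
  M = 401;                        (* number of lists; see below *)
type
  entry_ptr = ^entry;
  entry = record
    start, length : integer;      (* position in the string table *)
    sym : integer;                (* symbol assigned to this identifier *)
    next : entry_ptr
  end;
var
  hash_list : array [0 .. M - 1] of entry_ptr;

function find (h, start, length : integer) : entry_ptr;
  (* search list h sequentially; nil if the string is not present *)
  var p : entry_ptr;
      found : boolean;
begin
  p := hash_list[h];
  found := false;
  while (p <> nil) and not found do
    if string_eq (p^.start, p^.length, start, length) then
      found := true
    else
      p := p^.next;
  find := p
end;

A failing search is followed by insertion at the head of list h, so each distinct identifier is entered exactly once.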
The first decision to be made is the choice of hash function. It should yield a relatively
smooth distribution of the strings across the M lists, evaluation should be rapid, and it must
be expressible in the implementation language. One computation that gives good results is
to express the string as an integer and take the residue modulo M . M should be a prime
number not close to a power of the number of characters in the character set. For example,
M = 127 would not be a good choice if we were dealing with a 128-character set; M = 401,
on the other hand, should prove quite satisfactory.
There are two problems with the division method: It is time-consuming for strings whose
integer representations exceed the single-length integer range of the implementation language,
and it cannot be expressed at all if the implementation language is strongly typed. To solve
the former, we generally select some substring for the hash computation. Heads or tails of
the string are poor choices because they tend to show regularities (SUM1, SUM2, SUM3
or REDBALL, BLUEBALL, BLACKBALL) that cause the computation to map too many
strings into the same list. A better selection is the center substring:
if |s| <= n then s else substr (s; (|s| - n) div 2; n);
(Here s is the string, |s| is the length of s and n is the length of the longest string representable
as a single-length integer. The function substr (s; f; l) yields the l-character substring of s
beginning at the f th character.)
The constraints of a strongly-typed implementation language could be avoided by provid-
ing a primitive transfer function to convert a sufficiently short string into an integer for type
checking purposes. It is important that this transfer function not involve computation. For
example, if the language provides a transfer function from characters to integers, a transfer
function from strings to integers could be synthesized by a loop. This approach defeats the
whole purpose of the hashing function, however, by introducing a time-consuming computa-
tion. It would probably be preferable to use a single character to select the list in this case
and accept a longer search!
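To fix ideas, the following sketch (ours) combines the center-substring selection with the division method for M = 401. The character-by-character conversion loop merely stands in for a primitive transfer function; applied to a whole identifier it would be exactly the kind of computation warned against above.

const
  M = 401;
  n = 4;            (* characters that fit a single-length integer *)
  max_id = 64;
type
  id_string = packed array [1 .. max_id] of char;

function hash (var s : id_string; length : integer) : integer;
  (* residue modulo M of the center substring of s *)
  var first, last, i, v : integer;
begin
  if length <= n then
    begin first := 1; last := length end
  else
    begin first := (length - n) div 2 + 1; last := first + n - 1 end;
  v := 0;
  for i := first to last do
    v := (v * 256 + ord (s[i])) mod M;   (* reduce as we go to avoid overflow *)
  hash := v
end;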
Comparison of the input identier with the symbols already present in the table can be
speeded up by a variety of quick checks, the simplest of which is comparison of string lengths.
Whether or not such checks are useful depends upon the precise costs of string comparison
and string table access.
In a multi-pass compiler, the lookup mechanism may be discarded after the lexical analysis
has converted identifiers to symbols. The string table must, however, be retained for later
tasks such as module linking.
follow the target machine collating sequence. The range of integer values, however, must
normally be larger than that of the target machine. Suppose that we compile a program
containing the type constructor of the previous paragraph for the PDP11 (maxint = 32767).
Suppose further that l = -5000, u = 5000 and m is real. This is a perfectly legal declaration
of an array that will easily fit into the 65536-byte memory of the PDP11, but computation
of its size in bytes (40004) overflows the PDP11's integer range.
If the compiler is being executed on the target machine, this requirement for increased
range implies that the computational and comparison operations of the constant table must
use a multiple-precision representation. Knuth [1969] describes in detail how to implement
such a package.
Although, as shown above, overflow of the target machine's arithmetic range is legitimate
in some cases, it is often forbidden. When the user writes an expression consisting only of
constants, and that expression overflows the range of the target machine, the overflow must
be detected if the expression is evaluated by the compiler. This leads to a requirement that
the constant table module provide an overflow indicator that is set appropriately by each
computational operator to indicate whether or not the computation would overflow on the
target machine. Regardless of the state of the overflow indicator, however, the constant table
should yield the (mathematically) correct result.
In most programming languages, a particular numeric value can be expressed in many
different ways. For example, each of the following LAX floating point numbers expresses the
value `one thousand':
1000000E-3 1.0E3 .001E6 1000.0
The source-to-internal conversion operators of the constant module should accept only
a standardized input format. Nonzero integers are normally represented by a sequence of
digits, the first of which is nonzero. A suitable representation for nonzero floating point
numbers is the pair (significand, exponent), in which the significand is a sequence of digits
without leading or trailing zeros and the exponent is suitably adjusted. The significand can be
interpreted either as an integer or a normalized decimal fraction. `One thousand' would then
be represented either as ('1',3) or as ('1',4) respectively. A fractional significand is preferable
because it can be truncated or rounded without changing the exponent. Zero is represented
by ('0',0). In Section 6.2 we shall show how the standardized format is obtained by the lexical
analyzer.
If no floating point arithmetic is provided by the constant table then the significand can
be stored in a string table. The internal representation is the triple (string table index,
significand length, adjusted exponent). When compile-time floating point operations are
available, floating point numbers are converted to an internal representation of appropriate
accuracy for which the arithmetic of the target machine can be simulated exactly. (Note that
decimal arithmetic is satisfactory only if the target machine also uses decimal arithmetic.)
choice of information to be included in a definition table.) Thus each form of the structure
tree has, at least conceptually, an associated definition table. Transformations of the structure
tree imply corresponding transformations of the definition table. Whether the definition table
is actually transformed, or a new definition table is built from the transformed tree, is an
implementation decision that depends upon two factors:
The relative costs of transformation and reconstruction.
The relationship between the traversal needed to reconstruct the information and the
traversal using that information.
When assessing the relative costs, we must be certain to consider the extra storage required
during the transformation as well as the code involved.
The second factor mentioned above may require some elaboration: Consider the definition
table used during semantic analysis and that used during code generation. Although the
structure tree may be almost the same for these two processes, the interesting attributes of
defined objects are usually quite different. During semantic analysis we are concerned with
source properties; during code generation with target properties. Thus the definition tables
for the two processes will differ. Suppose further that our code generation strategy requires
a single depth-first, left-to-right traversal of the structure tree given that the definition table
is available.
If the definition table can be rebuilt during a single depth-first, left-to-right traversal of the
structure tree, and every attribute becomes available before it is needed for code generation,
then rebuilding can be combined with code generation and the second factor noted above
does not lead to increased costs. When this condition is not satisfied, the second factor does
increase the rebuilding cost and this must be taken into account. It may then be cheaper to
transform the definition table between the last semantic analysis traversal and the first code
generation traversal. (The attribute dependency analysis presented in Section 8.2 is used to
decide whether the condition is satisfied.)
A definition table is generally an unstructured collection of entries. Any arbitrary entry
can be accessed via a pointer in order to read an attribute or assign a new value. In a one-pass
compiler, a stack strategy could also be used: At every definition a new entry is pushed onto
the top of the stack, and at the end of a range all definitions found in the range are popped.
This organization has the advantage that only relevant entries must be held in storage.
Copies of some of the more-frequently accessed attributes of an entity may be included in
each leaf representing a use of that entity. The choice of such attributes depends upon the
particular compiler design; we shall return to this question several times, in Chapters 9, 10
and 14. It may be that these considerations lead to including all attributes in the leaf. The
definition table then ceases to exist as a separate data structure.
Ullman, 1977]. The concept of separate tables seems to be restricted to descriptions of multi-
pass compilers, as a mechanism for reducing main storage requirements [Naur, 1964]. This
is not invariably true, however, especially when one considers the literature on ALGOL 68
[Peck, 1971]. In his description of a multi-pass Pascal compiler, Hartmann [1977] uses sep-
arate tables both to reduce core requirements and to provide better compiler structure.
Lookup mechanisms have concerned a large number of authors; the most comprehensive
treatment is that of Knuth [1973], who gives details of a variety of mechanisms,
including hashing, and shows how they compare for different applications. It appears that
hashing is the method of choice for symbol table implementation, but there may be some
circumstances in which binary trees are superior [Palmer et al., 1974]. For symbol tables
with a fixed number of known entries (e.g. keywords) Cichelli [1980] and Cercone et al.
[1982] describe a way of obtaining a hash function that does not have any collisions and hence
requires no collision resolution.
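The idea can be illustrated by brute force, without Cichelli's letter-weighting method: search for a table size at which an ordinary hash function happens to be collision-free on the fixed keyword set. A Python sketch (the keyword list is illustrative only, and Python's built-in hash stands in for the machine-independent function a real generator would compute):

def perfect_table(keywords):
    """Find a table size m for which hash(w) % m is collision-free
    on the given fixed set of distinct keywords."""
    m = len(keywords)
    while True:
        slots = {hash(w) % m for w in keywords}
        if len(slots) == len(keywords):   # injective: no collisions
            table = [None] * m
            for w in keywords:
                table[hash(w) % m] = w
            return m, table
        m += 1

m, table = perfect_table(["begin", "end", "if", "then", "else", "while"])

def is_keyword(s):
    # a single probe suffices; no collision chain to follow
    return table[hash(s) % m] == s

Since the table is rebuilt at startup, the per-process randomization of Python's string hash is harmless here; a production keyword table would instead fix the hash function once, as Cichelli does.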
Exercises
4.1 [Sale, 1971; McIlroy, 1974] Specify abstract tokens for FORTRAN 66.
4.2 Specify a target node (Figure 4.1.3) suitable for some machine with which you are
familiar.
4.3 Is a symbol table needed to map identifiers in a compiler for Minimal Standard BASIC?
Explain.
4.4 Implement a string table module, using a software paging scheme: Statically allocate an
array of pointers (a `page table') to blocks of fixed size (`pages'). Initially no additional
blocks are allocated. When a string must be stored, try to fit it into a currently-
allocated page. If this cannot be done, dynamically allocate a new page and place a
pointer to it in the page table. Carefully define the interface to your module.
4.5 Implement a symbol table module that provides a lookup mechanism, and uses the
module of Exercise 4.4 to store the identifier string.
4.6 Identifier strings are specified in the module of Exercise 4.5 by the pair (string table
index, length). On a computer like the DEC PDP11, this specification occupies 8 bytes.
Comment on the relative merits of this scheme versus one in which identifier strings
are stored directly if they are no longer than k bytes, and a string table is used for
those whose length exceeds k. What should the value of k be for the PDP11? Would
this scheme be appropriate for a multipass compiler?
4.7 Consider the FORTRAN expression `X * 3.1415926535897932385 * Y'. Assume that
no explicit type has been given for X, and that Y has been declared DOUBLE PRE-
CISION.
(a) Should the constant be interpreted as a single or double precision value? Explain.
(b) For some machine with which you are familiar, estimate the relative errors in the
single and double precision representations of the constant.
(c) Explain the relevance of this example to the problem of selecting the internal
representation to be provided by the constant table for floating point numbers.
Chapter 5
Elements of Formal Systems
Formal grammars, in particular context-free grammars, are the tools most frequently used
to describe the structure of programs. They permit a lucid representation of that structure
in the form of parse trees, and one can (for the most part mechanically) specify automata
that will accept all correctly-structured programs (and only these). The automata are easy
to modify so that they output any convenient encoding of the parse tree.
We limit our discussion to the definitions and theorems necessary to understand and use
techniques explained in Chapters 6 and 7, and many theorems are cited without proof. In the
cases where we do sketch proofs, we restrict ourselves to the constructive portions upon which
practical algorithms are based. (We reference such constructions by giving the number of the
associated theorem.) A formally complete treatment would exceed both the objectives of and
size constraints on this book. Readers who wish to delve more deeply into the theoretical
aspects of the subject should consult the notes and references at the end of this chapter.
5.1 Definition
Let φ = αω; α, ω ∈ V*. The string α is called a head, and the string ω a tail, of φ. If α ≠ φ
(ω ≠ φ) then it is a proper head (tail) of φ.
Each subset of V* is called a language over vocabulary V. The elements of a language are
called sentences. Interesting languages generally contain infinitely many sentences, and hence
cannot be defined by enumeration. We therefore define each such language, L, by specifying
a process that generates all of its sentences, and no other elements of V*. This process may
be characterized by a binary, transitive relation ⇒⁺ over V*, such that L = {ω | σ ⇒⁺ ω}
for a distinguished string σ ∈ V*. We term the relation ⇒⁺ a derivative relation.
5.2 Definition
A pair (V, ⇒⁺) consisting of a vocabulary V and a derivative relation ⇒⁺ is called a formal
system.
A derivative relation usually cannot be defined by enumeration either. We shall concern
ourselves only with relations that can be described by a finite set of pairs (μ, ν) of strings
from V*. We call such pairs productions, and write them as μ → ν. The transitive closure of
the finite relation described by these productions yields a derivative relation. More precisely:
5.3 Definition
A pair (V, P), consisting of a vocabulary V and a finite set, P, of productions μ → ν (μ, ν ∈
V*) is called a general rewriting (or Semi-Thue) system.
5.4 Definition
A string χ is directly derivable from a string φ (symbolically φ ⇒ χ) by a general rewriting
system (V, P) if there exist strings σ, τ, μ, ν ∈ V* such that φ = σμτ, χ = σντ and μ → ν
is an element of P.
5.5 Definition
A string χ is derivable from a string φ (symbolically φ ⇒⁺ χ) by a general rewriting system
(V, P) if there exist strings χ0, …, χn ∈ V* (n ≥ 1) such that φ = χ0, χn = χ and χi-1 ⇒ χi,
i = 1, …, n. The sequence χ0, …, χn is called a derivation of length n.
We write φ ⇒* χ to indicate that either φ = χ or φ ⇒⁺ χ. If χ is (directly) derivable from
φ, we also say that χ is (directly) reducible to φ. Without loss of generality, we shall assume
that derivations φ ⇒⁺ φ of a string from itself are impossible.
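Definition 5.4 translates directly into code. The following Python sketch (strings are represented with one character per vocabulary symbol, a choice made for this illustration) enumerates all strings directly derivable from a given string:

def directly_derivable(phi, productions):
    """All chi with phi => chi: replace one occurrence of a
    left-hand side mu by the corresponding right-hand side nu."""
    results = set()
    for mu, nu in productions:
        start = 0
        while True:
            i = phi.find(mu, start)      # an occurrence phi = sigma mu tau
            if i < 0:
                break
            results.add(phi[:i] + nu + phi[i + len(mu):])
            start = i + 1
    return results

# The rewriting system of Figure 5.1, one character per symbol:
P = [("E", "T"), ("E", "E+T"), ("T", "F"), ("T", "T*F"),
     ("F", "i"), ("F", "(E)")]
print(directly_derivable("E", P))        # {'T', 'E+T'}

Iterating this function computes ⇒⁺ step by step; the derivations of Figure 5.2 are exactly paths through the resulting relation.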
5.1.2 Grammars
Using the general rewriting system defined by Figure 5.1, it is possible to derive from E
every correct algebraic expression consisting of the operators + and *, the variable i, and
the parentheses ( ). Many other strings can be derived also, as shown in Figure 5.2. In the
remainder of this chapter we shall concentrate on rewriting systems in which the vocabulary is
made up of two disjoint subsets: T, a set of terminals, and N, a set of nonterminals (syntactic
variables). We will ultimately be interested only in those strings derivable from a distinguished
nonterminal (the axiom or start symbol) and consisting entirely of terminals. (Thus we speak
of generative systems. One could instead consider analytic systems in which the axiom is
derived from a string of terminals. We shall return to this concept with Definitions 5.12
and 5.20.)
{E, T, F, +, *, (, ), i}
a) The vocabulary V
{E → T, E → E + T,
T → F, T → T * F,
F → i, F → (E)}
b) The productions P
Figure 5.1: A General Rewriting System (V, P)
E ⇒ T
T ⇒ T * F
T * F ⇒ T * i
a) Some immediate derivations
E ⇒⁺ T * i (length 3)
E ⇒⁺ i + i * i (length 8)
T i E ⇒⁺ i i i (length 5)
T i E ⇒* T i E (length 0)
E ⇒⁺ T (length 1)
b) Additional derivations
Figure 5.2: Derivations
5.6 Definition
A quadruple G = (T, N, P, Z) is called a grammar for the language L(G) = {ω ∈ T* | Z ⇒⁺ ω}
if T and N are disjoint, (T ∪ N, P) is a general rewriting system, and Z is an element of N.
We say that two grammars G and G′ are equivalent if L(G) = L(G′).
Figure 5.3 illustrates these concepts with two grammars that generate algebraic expressions
in the variable i. These grammars are equivalent according to Definition 5.6.
Grammars may be classified by the complexity of their productions:
5.7 Definition (Chomsky Hierarchy)
The grammar G = (T, N, P, Z) is a
type 0 grammar if each production has the form φ → ψ, φ ∈ V⁺ and ψ ∈ V*.
type 1 (context-sensitive) grammar if each production has the form αAβ → αχβ, α, β ∈ V*,
A ∈ N and χ ∈ V⁺.
T = {+, *, (, ), i}
N = {E, T, F}
P = {E → T, E → E + T,
T → F, T → T * F,
F → i, F → (E)}
Z = E
a) A grammar incorporating (V, P) from Figure 5.1
T = {+, *, (, ), i}
N = {E, E′, T, T′, F}
P = {E → T, E → TE′,
E′ → +T, E′ → +TE′,
T → F, T → FT′,
T′ → *F, T′ → *FT′,
F → i, F → (E)}
Z = E
b) A grammar incorporating another general rewriting system
Figure 5.3: Equivalent Grammars
empty string. Such languages can always be described by ε-free grammars (grammars without
ε-productions). Therefore ε-productions will only be used when they result in more convenient
descriptions.
We assume further that every symbol in the vocabulary will appear in the derivation of
at least one sentence. Thus the grammar will not contain any useless symbols. (This is
not always true for actual descriptions of programming languages, as illustrated by the LAX
definition of Appendix A.)
5.1.3 Derivations and Parse Trees
Each production in a regular grammar can have at most one nonterminal on the right-hand
side. This property guarantees (in contrast to the context-free grammars) that each sen-
tence of the language has exactly one derivation when the grammar is unambiguous (Defini-
tion 5.11).
Figure 5.4a is a regular grammar that generates the non-negative integers and real numbers
if n represents an arbitrary sequence of digits. Three derivations according to this grammar
are shown in Figure 5.4b. Each string except the last in a derivation contains exactly one
nonterminal, from which a new string must be derived in the next step. The last string consists
only of terminals. The sequence of steps in each derivation of this example is determined by
the derived sentence.
The situation is different for context-free grammars, which may have any number of non-
terminals on the right-hand side of each production. Figure 5.5 shows that several derivations,
differing only in the sequence of application of the productions, are possible for a given sen-
tence. (These derivations are constructed according to the grammar of Figure 5.3a.)
In the left-hand column, a leftmost derivation was used: At each step a new string was
derived from the leftmost nonterminal. Similarly, a rightmost derivation was used in the
right-hand column. A nonterminal was chosen arbitrarily at each step to produce the center
derivation.
A grammar ascribes structure to a string not by giving a particular sequence of derivation
steps but by showing that a particular substring is derived from a particular nonterminal.
T = {n, ., +, -, E}
N = {C, F, I, X, S, U}
P = {C → n, C → nF, C → .I,
F → .I, F → ES,
I → n, I → nX,
X → ES,
S → n, S → +U, S → -U,
U → n}
Z = C
a) A grammar for real constants
C          C          C
n          .I         nF
           .n         n.I
                      n.nX
                      n.nES
                      n.nE+U
                      n.nE+n
b) Three derivations according to the grammar of (a)
Figure 5.4: Derivations According to a Regular Grammar
For example, in Figure 5.5 the substring i * i is derived from the single nonterminal T. We
interpret this property of the derivation to mean that i * i forms a single semantic unit: an
instance of the operator * applied to the i's as operands. It is important to realize that the
grammar was constructed in a particular way specifically to ascribe a semantically relevant
structure to each sentence in the language. We cannot be satisfied with any grammar that
defines a particular language; we must choose one reflecting the semantic structure of each
sentence. For example, suppose that the rules E → E + T and T → T * F of Figure 5.3a
had been replaced by E → E * T and T → T + F respectively. The modified grammar would
describe the same language, but would ascribe a different structure to its sentences: It would
imply that additions should take precedence over multiplications.
E           E           E
E + T       E + T       E + T
T + T       E + T * F   E + T * F
F + T       T + T * F   E + T * i
i + T       T + F * F   E + F * i
i + T * F   T + F * i   E + i * i
i + F * F   F + F * i   T + i * i
i + i * F   i + F * i   F + i * i
i + i * i   i + i * i   i + i * i
Figure 5.5: Derivations According to a Context-Free Grammar
Substrings derived from single nonterminals are called phrases:
5.8 Definition
Consider a grammar G = (T, N, P, Z). The string φ ∈ V⁺ is a phrase (for X) of μφν if and
only if Z ⇒* μXν ⇒⁺ μφν (μ, ν ∈ V*, X ∈ N). It is a simple phrase of μφν if and only if
Z ⇒* μXν ⇒ μφν.
Notice that a phrase need not consist solely of terminals.
Each of the three derivations of Figure 5.5 identifies the same set of simple phrases. They
are therefore equivalent in the sense that they ascribe identical phrase structure to the string
i + i * i. In order to have a single representation for the entire set of equivalent derivations,
one that makes the structure of the sentence obvious, we introduce the notion of a parse tree
(see Appendix B for the definition of an ordered tree):
5.9 Definition
Consider an ordered tree (K, D) with root k0 and label function f: K → M. Let k1, …, kn
(n > 0) be the immediate successors of k0. (K, D) is a parse tree according to the grammar
(T, N, P, Z) if the following conditions hold:
(a) M ⊆ V ∪ {ε}
(b) f(k0) = Z
(c) Z → f(k1) … f(kn) ∈ P
[Parse tree diagram for i + i * i omitted.]
T = {+, *, i}
N = {E}
P = {E → E + E, E → E * E, E → i}
Z = E
a) An ambiguous grammar
[Two distinct parse trees for i + i * i omitted.]
Derive the empty set of productions for any subtree with h(k0) = ε, and
derive {h(k0) → h(k1)} for any subtree not yet mentioned.
The grammar derived from Figure 5.8c by this process will have more productions than
Figure 5.8a. The extra productions can be removed by a simple substitution: If B ∈ N
occurs exactly twice in a grammar, once in a production of the form A → σBτ and once
in a production of the form B → χ (σ, τ, χ ∈ V*), then B can be eliminated and the two
productions replaced by A → σχτ. After all such substitutions have been made, the resulting
grammar will differ from Figure 5.8a only in the representation of vocabulary symbols.
T = {n, ., +, -, E}
Q = {C, F, I, X, S, U, q}
R = {Cn → q, Cn → F, C. → I,
F. → I, FE → S,
In → q, In → X,
XE → S,
Sn → q, S+ → U, S- → U,
Un → q}
q0 = C
F = {q}
Figure 5.9: An Automaton Corresponding to Figure 5.4a
5.15 Theorem
For every regular grammar, G, there exists a deterministic finite automaton, A, such that
L(A) = L(G).
Following construction 5.13, we can derive an automaton from a regular grammar G =
(T, N, P, Z) such that, during acceptance of a sentence in L(G), the state at each point
specifies the element of N used to derive the remainder of the string. Suppose that the pro-
ductions X → tU and X → tV belong to P. When t is the next input symbol, the remainder
of the string could have been derived either from U or from V. If A is to be deterministic,
however, R must contain exactly one production of the form Xt → q′. Thus the state q′
must specify a set of nonterminals, any one of which could have been used to derive the
remainder of the string. This interpretation of the states leads to the following inductive
algorithm for determining Q, R and F of a deterministic automaton A = (T, Q, R, q0, F). (In
this algorithm, q represents a subset Nq of N ∪ {f}, f ∉ N):
1. Initially let Q = {q0} and R = ∅, with Nq0 = {Z}.
2. Let q be an element of Q that has not yet been considered. Perform steps (3)-(5) for each
t ∈ T.
3. Let next(q, t) = {U | ∃X ∈ Nq such that X → tU ∈ P}.
4. If there is an X ∈ Nq such that X → t ∈ P then add f to next(q, t) if it is not already
present; if there is an X ∈ Nq such that X → ε ∈ P then add f to Nq if it is not already
present.
5. If next(q, t) ≠ ∅ then let q′ be the state representing Nq′ = next(q, t). Add q′ to Q and
qt → q′ to R if they are not already present.
     n    .    +    -    E    Nq
q0   q1   q2                  {C}
q1        q2             q3   {f, F}
q2   q4                       {I}
q3   q5        q6   q6        {S}
q4                       q3   {f, X}
q5                            {f}
q6   q5                       {U}
a) The state table
T = {n, ., +, -, E}
Q = {q0, q1, q2, q3, q4, q5, q6}
R = {q0n → q1, q0. → q2,
q1. → q2, q1E → q3,
q2n → q4,
q3n → q5, q3+ → q6, q3- → q6,
q4E → q3,
q6n → q5}
F = {q1, q4, q5}
b) The complete automaton
Figure 5.10: A Deterministic Automaton Corresponding to Figure 5.4a
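The inductive algorithm above can be rendered as follows; a Python sketch under the assumptions that productions are given as triples (X, t, U), with U = None for productions of the form X → t, that ε-productions are absent, and that the marker "f" is not itself a nonterminal name:

def regular_grammar_to_dfa(T, productions, Z):
    """Subset construction for a regular grammar.
    Each state is the set Nq, represented as a frozenset."""
    start = frozenset({Z})
    states, transitions = {start}, {}
    work = [start]
    while work:
        Nq = work.pop()
        for t in T:
            # step 3: nonterminals reachable by reading t
            nxt = {U for (X, u, U) in productions
                   if X in Nq and u == t and U is not None}
            # step 4: a production X -> t contributes the marker f
            if any(X in Nq and u == t and U is None
                   for (X, u, U) in productions):
                nxt.add("f")
            if nxt:                      # step 5: record the transition
                nq2 = frozenset(nxt)
                transitions[(Nq, t)] = nq2
                if nq2 not in states:
                    states.add(nq2)
                    work.append(nq2)
    final = {q for q in states if "f" in q}
    return states, transitions, start, final

Applied to the grammar of Figure 5.4a this yields exactly the seven states of Figure 5.10, with the final states being those whose set contains f.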
of which avoid the need for irrelevant structuring, are available for regular languages. The
first is the representation of a finite automaton by a directed graph:
5.17 Definition
Let A = (T, Q, R, q0, F) be a finite automaton, D = {(q, q′) | ∃t, qt → q′ ∈ R}, and f:
(q, q′) → {t | qt → q′ ∈ R} be a mapping from D into the powerset of T. The directed graph
(Q, D) with edge labels f((q, q′)) is called the state diagram of the automaton A.
Figure 5.11a is the state diagram of the automaton described in Figure 5.10b. The nodes
corresponding to elements of F have been represented as squares, while the remaining nodes
are represented as circles. Only the state numbers appear in the nodes: 0 stands for q0 , 1 for
q1 , and so forth.
In a state diagram, the sequence of edge labels along a path beginning at q0 and ending at
a state in F is a sentence of L(A). Figure 5.11a has exactly 12 such paths. The corresponding
sentences are given in Figure 5.11b.
A state diagram specifies a regular language. Another characterization is the regular
expression:
5.18 Definition
Given a vocabulary V, and the symbols E, ε, +, *, ( and ) not in V. A string σ over
V ∪ {E, ε, +, *, (, )} is a regular expression over V if
1. σ is a single symbol of V or one of the symbols E or ε, or if
2. σ has the form (X + Y), (XY) or (X)* where X and Y are regular expressions.
[State diagram omitted: the states and transitions of Figure 5.10b, with final states drawn as squares.]
a) State diagram
n       .n      n.n
nEn     nE+n    nE-n
.nEn    .nE+n   .nE-n
n.nEn   n.nE+n  n.nE-n
b) Paths
Figure 5.11: Another Description of Figure 5.10b
Every regular expression results from a finite number of applications of rules (1) and (2). It
describes a language over V: The symbol E describes the empty language, ε describes the
language consisting only of the empty string, v ∈ V describes the language {v}, (X + Y) =
{ω | ω ∈ X or ω ∈ Y}, (XY) = {χω | χ ∈ X, ω ∈ Y}. The closure operator * is defined by
the following infinite sum:
X* = ε + X + XX + XXX + …
As illustrated in this definition, we shall usually omit parentheses. Star is unary, and takes
priority over either binary operator; plus has a lower priority than concatenation. Thus
W + XY* is equivalent to the fully-parenthesized expression (W + (X(Y)*)).
Figure 5.12 summarizes the algebraic properties of regular expressions. The distinct rep-
resentations for X* show that several regular expressions can be given for one language.
X + Y = Y + X                 (commutative)
(X + Y) + Z = X + (Y + Z)     (associative)
(XY)Z = X(YZ)
X(Y + Z) = XY + XZ            (distributive)
(X + Y)Z = XZ + YZ
X + E = E + X = X             (identity)
Xε = εX = X
XE = EX = E                   (zero)
X + X = X                     (idempotent)
(X*)* = X*
X* = ε + XX*
X* = ε + X*X
ε* = ε
E* = ε
Figure 5.12: Algebraic Properties of Regular Expressions
The main advantage in using a regular expression to describe a set of strings is that it
gives a precise specification, closely related to the `natural language' description, which can
be written in text form suitable for input to a computer. For example, let l denote any single
letter and d any single digit. The expression l(l + d)* is then a direct representation of the
natural language description `a letter followed by any sequence of letters and digits'.
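In the notation of a present-day pattern library the same description might be written as follows (a Python sketch; the ASCII character classes chosen for l and d are assumptions):

import re

# l(l + d)* : a letter, then any sequence of letters and digits
identifier = re.compile(r"[A-Za-z][A-Za-z0-9]*")

print(identifier.fullmatch("x1"))    # matches
print(identifier.fullmatch("1x"))    # None: must begin with a letter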
number 1, n is the total number of productions and the i-th production has the form Xi → χi,
χi = xi,1 … xi,m. The length, m, of the right-hand side is also called the length of the
production. We shall denote a leftmost derivation X ⇒* Y by X ⇒*L Y and a rightmost
derivation by X ⇒*R Y.
We find the following notation convenient for describing the properties of strings: The
k-head k : ω of ω gives the first min(k, |ω| + 1) symbols of ω#. FIRSTk(ω) is the set of
all terminal k-heads of strings derivable from ω. The set EFFk(ω) (`ε-free first') contains all
strings from FIRSTk(ω) for which no ε-production A → ε was applied at the last step in
the rightmost derivation. The set FOLLOWk(ω) comprises all terminal k-heads that could
follow ω. By definition FOLLOWk(Z) = {#} for any k. Formally:
FIRSTk(ω) = {k : τ | ω ⇒* τ, τ ∈ T*}
EFFk(ω) = {k : τ | ω ⇒*R τ, τ ∈ T*, with no ε-production applied at the last step}
FOLLOWk(ω) = {τ | Z ⇒* μων and τ ∈ FIRSTk(ν)}
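For k = 1 these sets can be computed by a straightforward fixed-point iteration. A Python sketch, applied to an ε-production variant of the grammar of Figure 5.3b (the representation of productions as (left side, body) pairs and of ε as the empty string are choices made for this illustration):

def first1(grammar, nonterminals):
    """FIRST1 sets by fixed-point iteration.
    grammar: list of (X, body) with body a list of symbols;
    a symbol not in nonterminals is a terminal; [] is an epsilon body."""
    first = {X: set() for X in nonterminals}
    changed = True
    while changed:
        changed = False
        for X, body in grammar:
            derives_empty = True
            for s in body:
                add = first[s] - {""} if s in nonterminals else {s}
                if not add <= first[X]:
                    first[X] |= add
                    changed = True
                if s in nonterminals and "" in first[s]:
                    continue          # s can vanish: look at the next symbol
                derives_empty = False
                break
            if derives_empty and "" not in first[X]:
                first[X].add("")      # X derives the empty string
                changed = True
    return first

# Variant of Figure 5.3b with epsilon-productions for E' and T':
G = [("E", ["T", "E'"]), ("E'", []), ("E'", ["+", "T", "E'"]),
     ("T", ["F", "T'"]), ("T'", []), ("T'", ["*", "F", "T'"]),
     ("F", ["i"]), ("F", ["(", "E", ")"])]
print(first1(G, {"E", "E'", "T", "T'", "F"}))
# e.g. FIRST1(E) = {'i', '('}, FIRST1(E') = {'+', ''}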
T = {+, *, (, ), i}
Q = {q}
R = {Eq → Tq, Eq → T+Eq,
Tq → Fq, Tq → F*Tq,
Fq → iq, Fq → )E(q,
+q+ → q, *q* → q, (q( → q, )q) → q, iqi → q}
q0 = q
F = {q}
S = {+, *, (, ), i, E, T, F}
s0 = E
Figure 5.14: A Pushdown Automaton Constructed from Figure 5.3a
Stack    Input    Leftmost derivation
E q      i+i*i    E
T+E q    i+i*i    E+T
T+T q    i+i*i    T+T
T+F q    i+i*i    F+T
T+i q    i+i*i    i+T
T+ q     +i*i
T q      i*i
F*T q    i*i      i+T*F
F*F q    i*i      i+F*F
F*i q    i*i      i+i*F
F* q     *i
F q      i
i q      i        i+i*i
q
Figure 5.15: Top-Down Analysis
This automaton accepts a string in L(G) by constructing a leftmost derivation of that string
and comparing the symbols generated (from left to right) with the symbols actually appearing
in the string.
Figure 5.14 is a pushdown automaton constructed in this manner from the grammar of
Figure 5.3a. In the left-hand column of Figure 5.15 we show the derivation by which this
automaton accepts the string i + i * i. The right-hand column is the leftmost derivation of
this string, copied from Figure 5.5. Note that the automaton's derivation has more steps due
to the rules that compare a terminal symbol on the stack with the head of the input string
and delete both. Figure 5.16 shows a reduced set of productions combining some of these
steps with those that precede them.
The analysis performed by this automaton is called a top-down (or predictive) analysis
because it traces the derivation from the axiom (top) to the sentence (bottom), predicting
the symbols that should be present. For each configuration of the automaton, the stack
specifies a string from V* used to derive the remainder of the input string. This corresponds
to construction 5.13 for finite automata, with the stack content playing the role of the state
and the state merely serving to mark the point reached in the input scan.
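In a recursive descent parser the stack of the top-down automaton becomes the procedure-call stack. A minimal Python sketch for the equivalent grammar of Figure 5.3b, using one symbol of lookahead and # as terminator (the class and method organization is invented for this sketch):

class Parser:
    """Recursive descent recognizer for the grammar of Figure 5.3b."""

    def __init__(self, text):
        self.tokens = list(text) + ["#"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, t):
        if self.peek() != t:
            raise SyntaxError(f"expected {t!r} at position {self.pos}")
        self.pos += 1

    def E(self):                     # E ::= T ('+' T)*
        self.T()
        while self.peek() == "+":
            self.expect("+")
            self.T()

    def T(self):                     # T ::= F ('*' F)*
        self.F()
        while self.peek() == "*":
            self.expect("*")
            self.F()

    def F(self):                     # F ::= i | ( E )
        if self.peek() == "i":
            self.expect("i")
        else:
            self.expect("(")
            self.E()
            self.expect(")")

p = Parser("i+i*i")                  # the sentence analyzed in Figure 5.15
p.E()
p.expect("#")                        # the whole input must be consumed

The while-loops correspond to the EBNF form of the productions; writing E′ and T′ as explicit procedures would mirror Figure 5.3b literally.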
We now specify the construction of deterministic, top-down pushdown automata by means
of the LL(k) grammars introduced by Lewis and Stearns [1969]:
R′ = {Eq → Tq, Eq → T+Eq,
Tq → Fq, Tq → F*Tq,
Fqi → q, Fq( → )Eq,
+q+ → q, *q* → q, )q) → q}
Figure 5.16: Reduced Productions for Figure 5.14
5.22 Definition
A context-free grammar G = (T, N, P, Z) is LL(k) for given k ≥ 0 if, for arbitrary derivations
Z ⇒*L μAχ ⇒ μνχ ⇒* μτ      (μ, τ ∈ T*; ν, χ ∈ V*; A ∈ N)
Z ⇒*L μAχ ⇒ μωχ ⇒* μτ′     (τ′ ∈ T*; ω ∈ V*)
(k : τ = k : τ′) implies ν = ω.
5.23 Theorem
For every LL(k) grammar, G, there exists a deterministic pushdown automaton, A, such that
L(A) = L(G).
A reads each sentence of the language L(G) from left to right, tracing a leftmost derivation
and examining no more than k input symbols at each step. (Hence the term `LL(k)'.)
In our discussion of Theorem 5.13, we noted that each state of the finite automaton
corresponding to a given grammar specified the nonterminal of the grammar that must have
been used to derive the string being analyzed. Thus the state of the automaton characterized a
step in the grammar's derivation of a sentence. We can provide an analogous characterization
of a step in a context-free derivation by giving information about the production being applied
and the possible right context: Each state of a pushdown automaton could specify a triple
(p, j, Ω), where 0 ≤ j ≤ np gives the number of symbols from the right-hand side of production
Xp → xp,1 … xp,np already analyzed and Ω is the set of k-heads of strings that could follow
the string derived from Xp. This triple is called a situation, and is written in the following
descriptive form:
[Xp → μ·ν; Ω]      μ = xp,1 … xp,j, ν = xp,j+1 … xp,np
The dot (which is assumed to be outside of the vocabulary) marks the position of the analysis
within the right-hand side. (In most cases Ω contains a single string. We shall then write it
without set brackets.)
Given a grammar (T, N, P, Z), we specify the states Q and transitions R of the automaton
inductively as follows:
1. Initially let Q = {q0} and R = ∅, with q0 = [Z → ·S; #]. (Note that FOLLOWk(Z) =
{#}.) The initial state is q0, which is also the initial stack content of A. (We could
have chosen an arbitrary state as the initial stack content.) The automaton halts if this
state is reached again, the stack is empty, and the next input symbol is the terminator
#.
2. Let q = [X → μ·ν; Ω] be an element of Q that has not yet been considered.
3. If ν = ε then add q → ε to R if it is not already present. (The notation q → ε is
shorthand for the set of spontaneous unstacking transitions q′q → q′ with arbitrary
q′.)
4. If ν = tχ for some t ∈ T and χ ∈ V*, let q′ = [X → μt·χ; Ω]. Add q′ to Q and qt → q′
to R if they are not already present.
is executed, the reading of terminal symbols and the decision to terminate the production
with an unstacking transition proceeds without further lookahead.
There exist grammars that do not have the LL(k) property for any k. Among the possible
reasons is the occurrence of left recursive nonterminals: nonterminals A for which a derivation
A ⇒⁺ Aω, ω ≠ ε, is possible. In a predictive automaton, left recursive nonterminals lead to
cycles that can be broken only by examining a right context of arbitrary length. They can,
however, be eliminated through a transformation of the grammar.
5.24 Theorem
An LL(k) grammar can have no left recursive nonterminal symbols.
5.25 Theorem
For every context-free grammar G = (T, N, P, Z) with left recursive nonterminals, there exists
an equivalent grammar G′ = (T, N′, P′, Z) with no left recursive nonterminals.
Let the elements of N be numbered consecutively: N = {X1, …, Xn}. If we choose the
indices such that the condition i < j holds for all productions Xi → Xjω then G has no left
recursive nonterminals. If such a numbering is not possible for G, we can guarantee it for G′
through the following construction:
1. Let N′ = N, P′ = P. Perform steps (2) and (3) for i = 1, …, n.
2. For j = 1, …, i-1 replace all productions Xi → Xjω ∈ P′ by {Xi → χjω | Xj → χj ∈ P′}.
(After this step, Xi ⇒⁺ Xjχ implies i ≤ j.)
3. Replace the entire set of productions of the form Xi → Xiω ∈ P′ (if any exist) by the
productions {Bi → ωBi | Xi → Xiω ∈ P′} ∪ {Bi → ε}, adding a new symbol Bi to N′.
At the same time, replace the entire set of productions Xi → χ, χ ≠ Xiω, by Xi → χBi.
The symbols added during this step will be given numbers n+1, n+2, …
If the string ω in the production Xi → Xiω does not begin with Xj, j ≤ i, then we can
replace Xi → Xiω by {Bi → ω, Bi → ωBi} and Xi → χ by {Xi → χ, Xi → χBi} in step (3).
This approach avoids the introduction of ε-productions; it was used to obtain the grammar
of Figure 5.3b from that of Figure 5.3a.
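For the common special case of immediate left recursion, step (3) in its ε-free variant can be sketched as follows (Python; the naming convention X′ for the new symbol Bi is a choice made here):

def remove_immediate_left_recursion(X, productions):
    """Replace the productions for X (bodies given as symbol lists) by an
    equivalent set without immediate left recursion, using the epsilon-free
    variant: X -> chi | chi B,  B -> omega | omega B."""
    B = X + "'"
    recursive = [p[1:] for p in productions if p and p[0] == X]   # X -> X omega
    others = [p for p in productions if not p or p[0] != X]       # X -> chi
    if not recursive:
        return {X: productions}
    return {
        X: others + [chi + [B] for chi in others],
        B: recursive + [omega + [B] for omega in recursive],
    }

# E -> T | E + T  becomes  E -> T | T E',  E' -> + T | + T E'
print(remove_immediate_left_recursion("E", [["T"], ["E", "+", "T"]]))

Applied to each production set of Figure 5.3a in turn, this reproduces the grammar of Figure 5.3b.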
Note that left recursion such as E → T, E → E + T is used in the syntax of arithmetic
expressions to reflect the left-association of the operators. This semantic property can also be
seen in the transformed productions E → TE′, E′ → +TE′, E′ → ε, but not in E → T, E →
T + E. In EBNF the left associativity of an expression can be conveniently represented by
E ::= T ('+' T)*.
One of the constructions discussed above results in ε-productions, while the other does
not. We can always eliminate ε-productions from an LL(k) grammar, but by doing this we
may increase the value of k:
5.26 Theorem
Given an LL(k) grammar G with ε-productions. There exists an LL(k+1) grammar without
ε-productions that generates the language L(G) - {ε}.
Conversely, k can be reduced by introducing ε-productions:
5.27 Theorem
For every ε-free LL(k+1) grammar G, k > 0, there exists an equivalent LL(k) grammar with
ε-productions.
The proof of Theorem 5.27 rests upon a grammar transformation known as left-factoring,
illustrated in Figure 5.18. In Figure 5.18a, we cannot distinguish the productions X → Yc
and X → Yd by examining any fixed number of symbols from the input text: No matter
what number of symbols we choose, it is possible for Y to derive a string of that length in
either production.
P = {Z → X,
X → Yc, X → Yd,
Y → a, Y → bY}
a) A grammar that is not LL(k) for any k
P = {Z → X,
X → YX′,
X′ → c, X′ → d,
Y → a, Y → bY}
b) An equivalent LL(1) grammar
Figure 5.18: Left Factoring
We avoid the problem by deferring the decision. Since both productions begin with Y,
it is really not necessary to distinguish them until after the string derived from Y has been
scanned. The productions can be combined by `factoring out' the common portion, as shown
in Figure 5.18b. Now the decision is made at exactly the position where the productions
begin to differ, and consequently it is only necessary to examine a single symbol of the input
string.
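One step of left factoring can be sketched as follows (Python; repeated application factors out common prefixes longer than one symbol, and the fresh-name generator is an assumption of this sketch):

from collections import defaultdict

def left_factor(X, bodies, fresh):
    """One step of left factoring: group the alternatives for X by their
    first symbol and split off a new nonterminal for each proper group.
    bodies are lists of symbols; fresh() yields unused nonterminal names."""
    groups = defaultdict(list)
    for body in bodies:
        groups[body[0] if body else None].append(body)
    result = {X: []}
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[X].extend(group)            # nothing to factor here
        else:
            Xn = fresh()                       # X -> head Xn
            result[X].append([head, Xn])
            result[Xn] = [body[1:] for body in group]
    return result

names = iter(["X'", "X''"])
# X -> Y c | Y d  becomes  X -> Y X',  X' -> c | d   (Figure 5.18)
print(left_factor("X", [["Y", "c"], ["Y", "d"]], lambda: next(names)))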
In general, by deferring a decision we obtain more information about the input text we
are analyzing. The top-down analysis technique requires us to decide which production to
apply before analyzing the string derived from that production. In the next section we shall
present the opposite technique, which does not require a decision until after analyzing the
string derived from a production. Intuitively, this technique should handle a larger class of
grammars because more information is available on which to base a decision; this intuition can
be proven correct. The price is an increase in the complexity of both the analysis procedure
and the resulting automaton, but in practice the technique remains competitive.
5.3.3 Bottom-Up Analysis and LR(k) Grammars
Again let G = (T, N, P, Z) be a context-free grammar, and consider the pushdown automaton
A = (T, {q}, R, q, {q}, V, ε) with V = T ∪ N, and R defined as follows:
R = {x1 … xnq → Xq | X → x1 … xn ∈ P} ∪ {qt → tq | t ∈ T} ∪ {Zq → q}
This automaton accepts a string in L(G) by working backward through a rightmost derivation
of the string.
Figure 5.19 is a pushdown automaton constructed in this manner from the grammar of
Figure 5.3a. In the left-hand column of Figure 5.20, we show the derivation by which this
automaton accepts the string i + i * i. The right-hand column is the reverse of the rightmost
derivation of this string, copied from Figure 5.5. The number of steps required for the
automaton's derivation can be decreased by combining productions as shown in Figure 5.21.
(This reduction is analogous to that of Figure 5.16.)
The analysis performed by this automaton is called a bottom-up analysis because of the
fact that it traces the derivation from the sentence (bottom) to the axiom (top). In each
T = {+, *, (, ), i}
R = {Tq → Eq, E+Tq → Eq,
Fq → Tq, T*Fq → Tq,
iq → Fq, (E)q → Fq,
q+ → +q, q* → *q, q( → (q, q) → )q, qi → iq,
Eq → q}
S = {+, *, (, ), i, E, T, F}
Figure 5.19: A Pushdown Automaton Constructed from Figure 5.3a
Stack      Input    Reverse rightmost derivation
q          i+i*i    i+i*i
i q        +i*i
F q        +i*i     F+i*i
T q        +i*i     T+i*i
E q        +i*i     E+i*i
E+ q       i*i
E+i q      *i
E+F q      *i       E+F*i
E+T q      *i       E+T*i
E+T* q     i
E+T*i q
E+T*F q             E+T*F
E+T q               E+T
E q                 E
q
Figure 5.20: Bottom-Up Analysis
configuration of the automaton the stack contains a string from S*, from which the portion
of the input text already read can be derived. The state merely serves to mark the point
reached in the input scan. The meaningful information is therefore the pair (σ, τ), where
σ ∈ S* denotes the stack contents and τ ∈ T* denotes the remainder of the input text.
The pairs (σ, τ) that describe the configurations of an automaton tracing such a derivation
may be partitioned into equivalence classes as follows:
5.28 Definition
For p = 1, …, n let Xp → χp be the pth production of a context-free grammar G =
(T, N, P, Z). The reduction classes, Rj, j = 0, …, n, are defined by:
R0 = {(σ, τ) | σ = μν, τ = χω such that Z ⇒*R μAω, A ⇒R′ νχ, χ ≠ ε}
Rp = {(σ, τ) | σ = μχp, τ = ω such that Z ⇒*R μXpω ⇒ μχpω}, p = 1, …, n
`A ⇒R′ νχ' denotes the relation `A ⇒*R νχ and the last step in the derivation does not take
the form νB ⇒ νχ'.
The reduction classes contain all pairs of strings that could appear during the bottom-up
parse of a sentence in L(G) by the automaton described above. Further, the reduction class
to which a pair belongs characterizes the transition carried out by the automaton when that
pair appears as a configuration. There are three possibilities:
1. (σ, τ) ∈ R0. The simple phrase is not yet completely in the stack; the transition qt → tq
with t = 1 : τ is applied (shift transition).
2. (σ, τ) ∈ Rp, 1 ≤ p ≤ n. The simple phrase is complete in the stack and the reduce
transition χpq → Xpq is applied. (For p = 1 the transition Zq → q occurs and the
automaton halts.)
3. (σ, τ) ∉ Rj, 0 ≤ j ≤ n. No further transitions are possible; the input text does not belong
to L(G).
A pushdown automaton that bases its decisions upon the reduction classes is obviously
deterministic if and only if the grammar is unambiguous.
Unfortunately the definition of the sets Rj uses the entire remainder of the input string
in order to determine the reduction class to which a pair (σ, τ) belongs. That means that our
bottom-up automaton must inspect an arbitrarily long lookahead string to make a decision
about the next transition, if it is to be deterministic. If we restrict the number of lookahead
symbols to k, we arrive at the following definition:
5.29 Definition
For some k ≥ 0, the sets Rj,k, j = 0, …, n, are called k-stack classes of a grammar G if:
Rj,k = {(σ, κ) | ∃(σ, τ) ∈ Rj such that κ = k : τ}
If the k-stack classes are pairwise-disjoint, then the pushdown automaton is deterministic
even when the lookahead is restricted to k symbols. This property characterizes a class of
grammars introduced by Knuth [1965]:
5.30 Definition
A context-free grammar G = (T, N, P, Z) is LR(k) for given k ≥ 0 if, for arbitrary derivations
Z ⇒*R μAω ⇒ μνω        (μ, ν ∈ V*; ω ∈ T*; A → ν ∈ P)
Z ⇒*R μ′Bω′ ⇒ μ′ν′ω′   (μ′, ν′ ∈ V*; ω′ ∈ T*; B → ν′ ∈ P)
(|μν| + k) : μνω = (|μν| + k) : μ′ν′ω′ implies μ = μ′, A = B and ν = ν′.
The automaton given at the beginning of this section scans the input text from left to right,
tracing the reverse of a rightmost derivation and examining no more than k input symbols
at each step. (Hence the term `LR(k)'.)
5.31 Theorem
A context-free grammar is LR(k) if and only if its k-stack classes are pairwise-disjoint.
On the basis of this theorem, we can test the LR(k) property by determining the intersection
of the k-stack classes. Unfortunately the k-stack classes can contain infinitely many pairs
(σ, κ): The length restriction permits only a finite number of strings κ, but the lengths of the
stack contents are unrestricted. However, we can give a regular grammar Gj for each k-stack
class Rj,k such that L(Gj) = {σ&κ | (σ, κ) ∈ Rj,k}. Since algorithms exist for determining
whether two regular languages are disjoint, this construction leads to a procedure for testing
the LR(k) property.
5.32 Theorem
Let G = (T, N, P, Z) be a context-free grammar, and let k ≥ 0. Assume that & is not an
element of the vocabulary V = T ∪ N. There exists a set of regular grammars Gj, j = 0, …, n
such that L(Gj) = {σ&κ | (σ, κ) ∈ Rj,k}.
The regular grammars that generate the k-stack classes are based upon the situations intro-
duced in connection with Theorem 5.23:
W = {[X → μ·ν; ω] | X → μν ∈ P, ω ∈ FOLLOWk(X)}
These situations are the nonterminal symbols of the regular grammars. To define the gram-
mars themselves, we first specify a set of grammars that generate the k-stack classes, but are
not regular:
G′j = (V ∪ {&, #}, W, P′ ∪ P′′ ∪ Pj, [Z → ·S; #])
The productions in P′ ∪ P′′ build the σ components of the k-stack class. They provide the finite
description of the infinite strings. Productions in Pj attach the κ component, terminating the
k-stack class:
P′ = {[X → μ·ην; ω] → η[X → μη·ν; ω] | η ∈ V}
5.33 Theorem
For every LR(k) grammar G there exists a deterministic pushdown automaton A such that
L(A) = L(G).
Let G = (T, N, P, Z). We base the construction of the automaton on the grammars Gj, effec-
tively building a machine that simultaneously generates the k-stack classes and checks them
against the reverse of a rightmost derivation of the string. Depending upon the particular
k-stack class, the automaton pushes the input symbol onto the stack or reduces some number
of stacked symbols to a nonterminal. The construction algorithm generates the necessary
situations as it goes, and uses the closure operation discussed above `on the fly' to avoid
considering productions from P′′. As in the construction associated with Theorem 5.15, a
state of the automaton must specify a set of situations, any one of which might have been
used in deriving the current k-stack class. It is convenient to restate the definition of a closure
directly in terms of a set of situations M:
H(M) = M ∪ {[B → ·χ; σ] | ∃[X → μ·Bν; ω] ∈ H(M), B → χ ∈ P, σ ∈ FIRSTk(νω)}
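For k = 1 the closure can be computed by the obvious worklist algorithm. A Python sketch, assuming FIRST1 sets for the nonterminals are already available (for instance as sketched after the FIRSTk definitions); the representation of a situation as a (left side, body, dot position, lookahead) tuple is a choice made here:

def first_of_string(symbols, first1, nonterminals):
    """FIRST1 of a symbol string, given FIRST1 sets for the nonterminals;
    the empty string is represented as ''."""
    result = set()
    for s in symbols:
        f = first1[s] if s in nonterminals else {s}
        result |= f - {""}
        if "" not in f:
            return result
    result.add("")
    return result

def closure(items, grammar, nonterminals, first1):
    """H(M) for k = 1: for each item [X -> mu . B nu; omega] add
    [B -> . chi; sigma] for every production B -> chi and every
    sigma in FIRST1(nu omega)."""
    H = set(items)
    work = list(items)
    while work:
        lhs, body, dot, look = work.pop()
        if dot < len(body) and body[dot] in nonterminals:
            B = body[dot]
            tail = list(body[dot + 1:]) + [look]   # nu omega
            for sigma in first_of_string(tail, first1, nonterminals):
                for X, chi in grammar:
                    if X == B:
                        item = (B, tuple(chi), 0, sigma)
                        if item not in H:
                            H.add(item)
                            work.append(item)
    return H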
The elements of Q and R are determined inductively as follows:
1. Initially let Q = {q0} and R = ∅, with q0 = H({[Z → ·S; #]}).
2. Let q be an element of Q that has not yet been considered. Perform steps (3)-(5) for each
η ∈ V.
3. Let basis(q, η) = {[X → μη·ν; ω] | [X → μ·ην; ω] ∈ q}.
4. If basis(q, η) ≠ ∅, then let next(q, η) = H(basis(q, η)). Add q′ = next(q, η) to Q if it is
not already present.
5. If basis(q, η) ≠ ∅ and η ∈ T then set
R := R ∪ {qη → qq′}                                            if k ≤ 1
R := R ∪ {qητ → qq′τ | [X → μη·ν; ω] ∈ q′, τ ∈ FIRSTk-1(νω)}   otherwise
6. If all elements of Q have been considered, perform step (7) for each q ∈ Q and then stop.
Otherwise return to step (2).
7. For each [X → χ·; ω] ∈ q, where χ = x1 … xn, set R := R ∪ {q1 … qnqω → q1q′ω |
[X → ·χ; ω] ∈ q1, qi+1 = next(qi, xi) (i = 1, …, n-1), q = next(qn, xn), q′ =
next(q1, X)}
The construction terminates in all cases, since only a finite number of situations [X →
μ·ν; ω] exist.
Figure 5.22 illustrates the algorithm by applying it to the grammar of Figure 5.17a with
k = 2. In this example k = 1 would yield the same set of states. (For k = 0, q4 and q6 would
be coalesced, as would q7 and q9.) Nevertheless, a single lookahead symbol is not sufficient to
distinguish between the shift and reduce transitions in state 6. The grammar is thus LR(2),
but not LR(1).
We shall conclude this section by quoting the following theoretical results:
5.34 Theorem
For every LR(k) grammar with k > 1 there exists an equivalent LR(1) grammar.
5.35 Theorem
Every LL(k) grammar is also an LR(k) grammar.
q0: [Z → ·X; #]        q4: [Y → c·; #]
    [X → ·Y; #]            [Y → c·a; #]
    [X → ·bYa; #]      q5: [X → bY·a; #]
    [Y → ·c; #]        q6: [Y → c·; a#]
    [Y → ·ca; #]           [Y → c·a; a#]
q1: [Z → X·; #]        q7: [Y → ca·; #]
q2: [X → Y·; #]        q8: [X → bYa·; #]
q3: [X → b·Ya; #]      q9: [Y → ca·; a#]
    [Y → ·c; a#]
    [Y → ·ca; a#]
a) States
R = {q0bc → q0q3c,
q0c# → q0q4#,
q0ca → q0q4a,
q3ca → q3q6a,
q4a# → q4q7#,
q5a# → q5q8#,
q6aa → q6q9a,
q0q2# → q0q1#,
q0q4# → q0q2#,
q3q6a# → q3q5a#,
q0q4q7# → q0q2#,
q0q3q5q8# → q0q1#,
q3q6q9a# → q3q5a#}
b) Transitions
Figure 5.22: A Deterministic Bottom-Up Automaton for Figure 5.17a
5.36 Theorem
There exist LR(k) grammars that are not LL(k′) for any k′.
5.37 Theorem
There exists an algorithm that, when given an LR(k) grammar G, will decide in a finite
number of steps whether there exists a k′ such that G is LL(k′).
As a result of Theorem 5.34 we see that it might possibly be sufficient to concern ourselves
only with LR(1) grammars. (As a matter of fact, the transformation underlying the proof
of this theorem is unsuitable for practical purposes.) The remaining theorems support our
intuitive thoughts at the end of Section 5.3.2.
BNF notation was first used to describe ALGOL 60 [Naur, 1963]. Many authors have
proposed extensions similar to our EBNF, using quoted terminals rather than bracketed
nonterminals and having a regular expression capability. EBNF definitions are usually shorter
than their BNF equivalents, but the important point is that they are textual representations
of syntax charts [Jensen and Wirth, 1974; ANSI, 1978a]. This means that the context-free
grammar can actually be developed and described to the user by means of pictures.
Pushdown automata were first examined by Samelson and Bauer [1960] and applied
to the compilation of a forerunner of ALGOL 60. Theoretical mastery of the concepts and
the proofs of equivalence to general context-free grammars followed later. Our introduction
of LR(k) grammars via reduction classes follows the work of Langmaack [1971]. Aho and
Ullman [1972] (and many other books dealing with formal languages) cover essentially the
same material as this chapter, but in much greater detail. The proofs that are either outlined
here or omitted entirely can be found in those texts.
Exercises
5.1 Prove that there is no loss of generality by prohibiting formal systems in which a
derivation φ ⇒⁺ φ of a string from itself is possible.
5.2 Choose some useless nonterminal from the LAX definition and briefly justify its inclu-
sion in Appendix A.
5.3 Give an intuitive justification of Theorem 5.10.
5.4 Write a program to examine a finite automaton A and return the accepted language
L(A) in closed form as a regular expression.
5.5 Regular expressions X1, …, Xn can also be defined implicitly via systems of regular
equations of the form:
5.7 Prove that the algorithm for rewriting G to remove productions of the form A → B,
A, B ∈ N results in a grammar G′ such that L(G) = L(G′).
Chapter 6
Lexical Analysis
Lexical analysis converts the source program from a character string to a sequence of
semantically-relevant symbols. The symbols and their encoding form the intermediate lan-
guage output from the lexical analyzer.
In principle, lexical analysis is a subtask of parsing that could be carried out by the normal
parser mechanisms. To separate these functions, the source language grammar G must be
partitioned into subgrammars G0, G1, G2, … such that G1, G2, … describe the structure
of the basic symbols and G0 describes the structure of the language in terms of the basic
symbols. L(G) is then obtained by replacing the terminal symbols of G0 by strings from
L(G1), L(G2), …
The separation of lexical analysis from parsing gives rise to higher organizational costs
that can be justified only by realizing greater savings in other areas. Such savings are possible
in table-driven parsers through reduction in table size. Further, basic symbols usually have
such a simple structure that faster procedures can be used for the lexical analysis than for
the general parsing.
We shall first discuss the partitioning of the grammar and the desired results of lexical
analysis, and then consider implementation with the help of finite automata.
In many languages the grammar for basic symbols (symbol grammar) is not so easily de-
termined from the language definition, or it results in additional difficulties. For example,
the ALGOL 60 Report defines keywords, letters, digits, special characters and special char-
acter combinations as basic symbols; it does not include identifiers, numbers and strings in
this category. This description must be transformed to meet the requirements of compiler
construction. In PL/1, as in other languages in which keywords are lexically indistinguish-
able from identifiers, context determines whether an identifier (e.g. IF) is to be treated as a
keyword or a freely-chosen identifier. Two symbol grammars must therefore be distinguished
on the basis of context; one accepts identifiers and not keywords, the other does the converse.
An example of similar context-dependence in FORTRAN is the first identifier of a statement:
In an assignment it is interpreted as the identifier of a data object, while in most other cases
it is interpreted as a keyword. (Statement classification in FORTRAN is not an easy task;
see the discussion by Sale [1971] for details.)
Even if it is necessary to consult context in order to determine which symbols are possible
at the given point in the input text, a finite automaton often suffices. The automaton in this
case has several starting states corresponding to the distinct symbol grammars. We shall not
pursue this point further.
of the same basic symbol are resolved at this point. For example, if we were to allow the
symbol `<' to be written `LESS' or `LT' also, all three would lead to creation of the same
token. The operation identify symbol is used during token creation to perform the mapping
discussed in Section 4.2.1. If the basic symbol is a literal constant, rather than an identifier,
the enter constant operation is used instead of identify symbol (Section 4.2.2).
6.2 Construction
We assume that the basic symbols are described by some set of regular grammars or regular
expressions as discussed in Section 6.1.1. According to Theorem 5.15 or Theorem 5.19 we
can construct a set of finite automata that accept the basic symbols. Unfortunately, these
automata assume the end of the string to be known a priori; the task of the lexical analyzer is
to extract the next basic symbol from the input text, determining the end of the symbol in the
process. Thus the automaton only partially solves the lexical analysis problem. To enhance
the efficiency of the lexical analyzer we should use the automaton with the fewest states from
the set of automata that accept the given language. Finally, we consider implementation
questions.
In order to obtain the classification for the basic symbol (Figure 4.1) we partition the
final states of the automaton into classes. Each class either provides the classification directly
or indicates that it must be found by using the operation identify symbol. The textual
representation of constants, and the strings used to interrogate the symbol table, are obtained
from the input stream. The automaton is extended for this purpose to a finite-state transducer
that emits a character on each state transition. (In the terminology of switching theory, this
transducer is a special case of a Mealy machine.) The output characters are collected together
into a character string, which is then used to derive the necessary information.
d.E1. Since .EQ. is also a basic symbol, the automaton must look ahead three characters (in
certain cases) before it can determine the end of the symbol string.
By applying the tests of Section 5.3.3 to the original grammar G, we could determine (for
fixed k) whether a k-character lookahead is sufficient to resolve ambiguity. Because of the
effort involved, this is usually not done. Instead, we apply the principle of the longest match:
The automaton continues to read until it reaches a state with no transition corresponding to
the current input character. If that state is a final state, then it accepts the symbol scanned
to that point; otherwise it signals a lexical error. The feasibility of the principle of the longest
match is determined by the representation of the symbols (the grammars G1, G2, …) and by
the sequences of symbols permitted (the grammar G0).
The principle of the longest match in its basic form as stated above is unsuitable for a large
number of grammars. For example, an attempt to extract the next token from `3.EQ.4' using
the rules of FORTRAN would result in a lexical error when `Q' was encountered. The solution
is to retain information about the most-recently encountered final state, thus providing a `fall-
back' position. If the automaton halts in a final state, then it accepts the symbol; otherwise
it restores the input stream pointer to that at the most-recently encountered final state. A
lexical error is signaled only if no final state had been encountered during the scan.
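A sketch of this longest-match scan with fall-back (Python; the transition table is assumed to be a mapping from (state, character) pairs to states, and final maps each final state to its token class):

def next_token(table, final, state0, text, pos):
    """Run the automaton as far as possible, remembering the last final
    state passed; on failure, back up to that fall-back position."""
    state, last = state0, None
    i = pos
    while i < len(text) and (state, text[i]) in table:
        state = table[(state, text[i])]
        i += 1
        if state in final:
            last = (final[state], i)     # remember the fall-back position
    if last is None:
        raise SyntaxError(f"lexical error at position {pos}")
    kind, end = last
    return kind, text[pos:end], end      # class, symbol text, new position

Scanning `3.EQ.4' with a FORTRAN-like table, the automaton would read past `3.E' hoping for an exponent, fail at `Q', and fall back to the final state reached after `3', delivering the integer constant.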
We have tacitly assumed that the initial state of the automaton is independent of the
final state reached by the previous invocation of next token. If this assumption is relaxed,
permitting the state to be retained from the last invocation, then it is sometimes possible to
avoid even the limited backtracking discussed above (Exercise 6.3). Whether this technique
solves all problems is still an open question.
The choice of a representation for the keywords of a language plays a central role in de-
termining the representations of other basic symbols. This choice is largely a question of
language design: The definitions of COBOL, FORTRAN and PL/1 (for example) prescribe
the representations and their relationship to freely-chosen identifiers. In the case of AL-
GOL 60 and its descendants, however, these characteristics are not discussed in the language
definitions. Here we shall briefly review the possibilities and their consequences.
The simplest possibility is the representation of keywords by reserved words: ordinary
identifiers that the programmer is not permitted to use for any other purpose. This approach
requires that identifiers be written without gaps, so that spaces and newlines can serve as
separators between identifiers and between an identifier and a number. Letters may appear
within numbers, and hence they must not be separated from the preceding part of the number
by spaces. The main advantage of this representation is its lucidity and low susceptibility to
typographical errors. Its main disadvantage is that the programmer often does not remember
all of the reserved words and hence incorrectly uses one as a freely-chosen identifier. Further,
it is virtually impossible to modify the language by adding a new keyword because too many
existing programs might have used this keyword as a freely-chosen identifier.
If keywords are distinguished lexically then it is possible to relax the restrictions on place-
ment of spaces and newlines. There is no need for the programmer to remember all of the
keywords, and new ones may be introduced without affecting existing programs. The rules
for distinguishing keywords are known as stropping conventions; the most common ones are:
Underlining the keyword.
Bracketing the keyword by special delimiters (such as the apostrophes used in the DIN
66006 standard for ALGOL 60).
Prefixing the keyword with a special character and terminating it at the first space,
newline or character other than a letter or digit.
Using upper case letters for keywords and lower case for identifiers (or vice-versa).
All of these conventions increase the susceptibility of the input text to typographical errors.
Some also require larger character sets than others or relatively complex line-imaging routines.
6.2.2 State Minimization
Consider a completely-specified finite automaton A = (T, Q, R, q0, F) in which a production
qt → q′ exists for every pair (q, t), q ∈ Q, t ∈ T. Such an automaton is termed reduced when
there exists no equivalent automaton with fewer states.
6.1 Theorem
For every completely-specified finite automaton A = (T, Q, R, q0, F) there exists a
reduced finite automaton A′ = (T, Q′, R′, q0′, F′) with L(A′) = L(A).
To construct A′ we first delete all states q for which there exists no string ω such that
q0ω ⇒* q. (These states are termed unreachable.) We then apply the refinement algorithm
of Section B.3.2 to the state diagram of A, with the initial partition {q | q ∈ F}, {q | q ∉ F}.
Let Q′ be the set of all blocks in the resulting partition, and let [q] denote the block to which
q ∈ Q belongs. The definition of A′ can now be completed as follows:
[State diagram omitted: an l-edge leads from state 0 to final state 1, which loops on l and d.]
Figure 6.2: Reduced Automaton Accepting l(l + d)*
In order to apply the algorithm of Section B.3.2 to this example we must complete the
original automaton, which permits only l as an input character in state q0. To do this we
introduce an `error state', qe, and transitions qt → qe for all pairs (q, t), q ∈ Q, t ∈ T, not
corresponding to transitions of the given automaton. (In the example, q0d → qe suffices.) In
practice, however, it is easier to modify the algorithm so that it does not require explicit error
transitions.
If c denotes any character other than the quote, then the regular expression '' + '(c +
'')(c + '')*' describes the characters and strings of Pascal. Figure 6.3a shows the automaton
constructed from this expression according to the procedure of Theorem 5.19, and the reduced
automaton is shown in Figure 6.3b.
In our application we must modify the equivalence relation still further, and only treat
final states as equivalent when they lead to identical subsequent processing. For an automaton
recognizing the symbol grammar of LAX, we divide the final states into the following classes:
Identifiers or keywords
Special characters
Combinations of special characters
[State diagrams omitted.]
a) Unreduced
b) Reduced
Figure 6.3: Finite Automata Accepting '' + '(c + '')(c + '')*'
Integers
Floating point numbers
Floating point numbers with exponents
This results in the reduced automaton of Figure 6.4. Letters denote the following character
classes:
a = all characters other than '*'
a′ = all characters other than '*' or ')'
d = digits
l = letters
s = '+' '-' '*' '<' '>' '"' ';' ',' ')' '[' ']'
Figure 6.4 illustrates several methods of obtaining the code corresponding to a basic
symbol. States 1, 6, 7, 9, and 12-18 all provide the code directly. Identify symbol must
be used in state 4 to distinguish identifiers from keywords. In state 19 we might also use
identify symbol, or we might use some other direct computation based on the character
codes.
The state reduction in these examples could be performed by hand with no display of
theory, but the theory is required if we wish to mechanically implement a lexical analyzer
based upon regular expressions.
6.2.3 Programming the Lexical Analyzer
In order to extract the basic symbol that follows a given position p in the input stream we must
recognize and delete irrelevant characters such as spaces and newlines, use the automaton to
read the symbol, and fix the terminal position p′.
Superfluous spaces can be deleted by adding transitions q' ' → q to all states q in which
such spaces are permitted. Since newlines (card boundaries or carriage returns) are input
characters if they are significant, we can handle them in the same way as superfluous spaces
in many languages.
[State diagram omitted: states 0-19, with transitions labeled by the character classes defined above.]
Figure 6.4: Finite Automaton Accepting LAX Basic Symbols
There are two possibilities from which to choose when programming the automaton:
Representing the transition table as a matrix, so that the program for the automaton
has the general form:
while basic_symbol_not_yet_complete do
  state := table[state, next_character];
Implementing the transitions directly as program code, with the current state encoded
by the position reached in the program.
The simplest way to provide output from the automaton is to add the input character
to a string { empty at the start of the basic symbol { during each state transition. This
strategy is generally inadequate. For example, the quotes bounding a Pascal character or
string denotation should be omitted and any doubled internal quote should be replaced by
a single quote. Thus more general actions may need to be taken at each state transition. It
usually suces, however, to provide the following four options:
Add (some mapping of) the input character to the output string.
Add a given character to the output string.
Set a pointer or index to the output string.
Do nothing.
Figure 6.5 illustrates three of these actions applied to produce output from the automaton
of Figure 6.3b. A slash separates the output action from the input character; the absence of
a slash indicates the `do nothing' action.
[State diagram omitted: the reduced automaton of Figure 6.3b, with output actions c/c and ''/'' attached to its transitions.]
Figure 6.5: Finite Transducer for Pascal Strings
In order to produce the standard representation of floating point numbers (see Sec-
tion 4.2.2), we require three indices to the characters of the significand:
beg: Initially indexes the first character of the significand, finally indexes the first nonzero
digit.
pnt: Indexes the first position to the right of the decimal point.
lim: Initially indexes the first position to the right of the significand, finally indexes the first
position to the right of the last nonzero digit.
By moving the indices beg and lim, the leading and trailing zeros are removed so that the
significand is left over in standard form. If e is the value of the explicit exponent, then the
adjusted exponent e′ is given by:
e′ := e + (pnt - beg)   significand interpreted as a fraction
e′ := e + (pnt - lim)   significand interpreted as an integer
The standard representation of a floating point zero is the pair ('0', 0). This representation
is obtained by taking a special exit from the standardization algorithm if beg becomes equal
to lim during the zero-removal process.
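A Python sketch of the standardization step, with the significand given as a digit string and the result interpreted as a fraction (function and parameter names are invented here):

def standardize(digits, pnt, e):
    """Standardize a significand given as a digit string.
    pnt: index of the first position right of the decimal point;
    e: explicit exponent. Returns (digits', e') with the significand
    interpreted as a fraction."""
    beg, lim = 0, len(digits)
    while beg < lim and digits[beg] == "0":
        beg += 1                        # strip leading zeros
    while lim > beg and digits[lim - 1] == "0":
        lim -= 1                        # strip trailing zeros
    if beg == lim:
        return "0", 0                   # special exit: the value is zero
    return digits[beg:lim], e + (pnt - beg)

# 00123.4500E2: digits "001234500", pnt = 5, e = 2
print(standardize("001234500", 5, 2))   # ('12345', 5): 0.12345 * 10**5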
Many authors suggest that the next character operation be implemented by a proce-
dure. We have already pointed out that the implementation of next character strongly
influences the overall speed of the compiler; in many cases simple use of a procedure leads to
significant inefficiency. For example, Figure 6.6 shows the results of measuring lexical analysis
times for three translators running on a Control Data 6400 under KRONOS 2.0. RUN 2.3 is a
FORTRAN compiler that reads one line at a time, storing it in an array; the next character
operation is implemented as a fetch and index increment in-line. The COMPASS 2.0 assem-
bler implements some instances of next character by procedure calls and others by in-line
references, while the Pascal compiler uses a procedure call to fetch each character. The two
6.3 Notes and References 119
test programs for the FORTRAN compiler had similar characteristics: Each was about 5000
lines long, composed of 30-40 heavily-commented subprograms. The test program for COM-
PASS contained 900 lines, about one-third of which were comments, and that for Pascal (the
compiler itself) had 5000 lines with very few comments.
Lexical Analysis Time
Translator    Program               Microseconds    Fraction of
                                    per character   total compile time
RUN 2.3       Page Formatter        3.56            14%
                without comments    3.44            9%
              Flowchart Generator   3.3             11.5%
COMPASS 2.0   I/O Package           5.1             21%
Pascal 3.4    Pascal Compiler       35.6            39.6%
Figure 6.6: Lexical Analysis on a Control Data 6400 [Dunn, 1974]
Further measurements on existing compilers for a number of languages indicate that the
major subtasks of lexical analysis can be rank-ordered by amount of time spent as follows:
1. Skipping spaces and comments.
2. Collecting identifiers and keywords.
3. Collecting digits.
4. All other tasks.
In many cases there are large (factor of at least 2) differences in the amount of time spent
between adjacent elements in this hierarchy. Of course the precise breakdown depends upon
the language, compiler, operating system and coding technique of the user. For example, skip-
ping a comment is trivial in FORTRAN; on the other hand, an average non-comment card
in FORTRAN has 48 blank columns out of the 66 allocated to code [Knuth, 1971a]. Taken
together, the measurements discussed in the two paragraphs above lead to the conclusion that
the lexical analyzer should be partitioned further: Tasks 1-3 should be incorporated into a
scanner module that implements the next character operation, and the finite automaton
and its underlying regular grammar (or regular expression) should be defined in terms of
the characters digit string , identifier , keyword , etc. This decomposition drastically
reduces the number of invocations of next character , and also the influence of the automaton
implementation upon the speed of the lexical analyzer.
Tasks 1-3 are trivial, and can be implemented `by hand' using all of the coding tricks and
special instructions available on the target computer. They can be carefully integrated with
the I/O facilities provided by the operating system to minimize overhead. In this way, serious
inefficiencies in the lexical analyzer can be avoided while retaining systematic construction
techniques for most of the implementation.
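The following sketch shows such a hand-coded scanner skeleton (all names are ours): tasks
1-3 are coded directly, and the caller sees whole identifiers and digit strings as single
`characters'. A keyword would be recognized by a table lookup on the collected identifier,
which is omitted here; short-circuit boolean evaluation is assumed:

program scanner_sketch;
type
  scan_class = (identifier_ch, digit_string_ch, delimiter_ch, end_of_line);
var
  line : string;
  pos : integer;
  lexeme : string;

function next_scanner_symbol : scan_class;
begin
  while (pos <= length(line)) and (line[pos] = ' ') do
    pos := pos + 1;                                 (* task 1: skip spaces *)
  if pos > length(line) then
    next_scanner_symbol := end_of_line
  else if line[pos] in ['a'..'z', 'A'..'Z'] then
    begin                                           (* task 2: collect an identifier *)
      lexeme := '';
      while (pos <= length(line)) and
            (line[pos] in ['a'..'z', 'A'..'Z', '0'..'9']) do
        begin lexeme := lexeme + line[pos]; pos := pos + 1 end;
      next_scanner_symbol := identifier_ch
    end
  else if line[pos] in ['0'..'9'] then
    begin                                           (* task 3: collect a digit string *)
      lexeme := '';
      while (pos <= length(line)) and (line[pos] in ['0'..'9']) do
        begin lexeme := lexeme + line[pos]; pos := pos + 1 end;
      next_scanner_symbol := digit_string_ch
    end
  else
    begin                                           (* any other character *)
      lexeme := line[pos]; pos := pos + 1;
      next_scanner_symbol := delimiter_ch
    end
end;

begin
  line := 'count := count + 1'; pos := 1;
  while next_scanner_symbol <> end_of_line do
    writeln(lexeme)
end.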
6.3 Notes and References

There are many indications that the hand-coded product provides significant savings in execution time
over the products of existing generators. Many of the coding details (table formats, output
actions, limited backtrack and character class tradeoffs) are discussed by Waite [1973a] in
his treatment of string-directed pattern matching.
Two additional features, macros and compiler control commands (compiler options,
compile-time facilities) complicate the lexical analyzer and its interface to the parser. Macro
processing can usually be done in a separate pre-pass. If, however, it is integrated into the
language (as in PL/M or Burroughs Extended ALGOL) then it is a task of the lexical
analyzer. This requires additional information from the parser regarding the scope of macro
definitions.
We recommend that control commands always be written on a separate line, and be easily
recognizable by the lexical analyzer. They should also be syntactically valid, so that the parser
can process them if they are not relevant to lexical analysis. Finally, it is important that there
be only one form of control command, since the user should not be forced to learn several
conventions because the compiler writer decides to process commands in several places.
Exercises
6.1 Derive a regular grammar from the LAX symbol grammar of Appendix A.1. Derive a
regular expression.
6.2 [Sale, 1971; McIlroy, 1974] Consider the definition of FORTRAN 66.
(a) Partition the grammar as discussed in Section 6.1.1. Explain why you distinguished
each of the symbol subgrammars Gi.
(b) Carefully specify the lexical analyzer interface. How do you invoke different symbol
subgrammars?
6.3 Consider the following set of tokens, which are possible in a FORTRAN assignment
statement [McIlroy, 1974] (identifier is constructed as usual, d denotes a nonempty
sequence of digits, and s denotes either `+' or `-'):
+ - * / ** ( ) , =
.TRUE. .FALSE.
.AND. .OR. .NOT.
.LT. .LE. .EQ. .NE. .GE. .GT.
identifier
d d. d.d .d
dEd d.Ed d.dEd .dEd
dEsd d.Esd d.dEsd .dEsd
Assume that any token sequence is permissible, and that the ambiguity of `***' may
be resolved in any convenient manner.
(a) Derive an analysis automaton using the methods of Section 5.2, and minimize the
number of states by the method of Section B.3.3.
(b) Derive an analysis automaton using the methods given by Aho and Corasick
[1975], and minimize the number of states.
(c) Describe in detail the interaction between the parser and the automaton derived
in (b). What information must be retained? What form should that information
take?
(d) Can you generalize the construction algorithms of Aho and Corasick to arbitrary
regular expression inputs?
6.4 Write a line-imaging routine to accept an arbitrary sequence of printable characters,
spaces and backspace characters and create an image of the input line. You should
recognize an extended character set which includes arbitrary underlining, plus the fol-
lowing overstruck characters:
c overstruck by / interpreted as `cents'
= overstruck by / interpreted as `not equal'
(Note: Overstrikes may occur in any order.) Your image should be an integer array,
with one element per character position. This integer should encode the character (e.g.
`cents') resulting in that position from the arbitrary input sequence.
6.5 Write a program to implement the automaton of Figure 6.4 as a collection of case
clauses. Compile the program and compare its size to the requirements for the
transition table.
6.6 Attach output specifications to the transitions of Figure 6.4. How will the inclusion of
these specifications affect the program you wrote for Exercise 6.5? Will their inclusion
change the relationship between the program size and transition table size significantly?
6.7 Consider the partition of a lexical analyzer for LAX into a scanner and an automaton.
(a) Restate the symbol grammar in terms of identifier , digit string , etc. to
reflect the partition. Show how this change affects Figure 6.4.
(b) Carefully specify the interface between scanner and automaton.
(c) Rewrite the routine of Exercise 6.5, using the interface defined in (b). Has the
overall size of the lexical analyzer changed? (Don't forget to include the scan-
ner size!) Has the relationship between the two possible implementations of the
automaton (case clauses or transition tables) changed?
(d) Measure the time required for lexical analysis, comparing the implementation of
(c) with that of Exercise 6.5. If they differ, can you attribute the difference to any
specific feature of your environment (e.g. an expensive procedure mechanism)? If
they do not differ, can you explain why?
6.8 Suppose that LAX is being implemented on a machine that supports both upper and
lower case letters. How would your lexical analyzer change under each of the following
assumptions:
(a) Upper and lower case letters are indistinguishable.
(b) Upper and lower case may be mixed arbitrarily in identifiers, but all occurrences of
a given identifier must use the same characters. (In other words, if an identifier is
introduced as ArraySize then no identifier such as arraysize can be introduced
in the same range.) Keywords must always be lower case.
(c) As (b), except that upper and lower case may be mixed arbitrarily in keywords,
and need not always be the same.
(d) Choose one of the schemes (a)-(c) and argue in favor of it on grounds of program
portability, ease of use, documentation value, etc.
Chapter 7
Parsing
The parsing of a source program determines the semantically-relevant phrases and, at the
same time, verifies syntactic correctness. As a result we obtain the parse tree of the program,
at first represented implicitly by the sequence of productions employed during the derivation
from (or reduction to) the axiom according to the underlying grammar.
In this chapter we concern ourselves with the practical implementation of parsers. We
begin with the parser interface and the appropriate choice of parsing technique, and then
go into the construction of deterministic parsers from a given grammar. We shall consider
both the top-down and bottom-up parsing techniques introduced in Sections 5.3.2 and 5.3.3.
Methods for coding parsers by hand and for generating them mechanically will be discussed.
7.1 Design
To design a parser we must define the grammar to be processed, augment it with connection
points (points at which information will be extracted) and choose the parsing algorithm.
Finally, the augmented grammar must be converted into a form suited to the chosen parsing
technique. After this preparation the actual construction of the parser can be carried out
mechanically. Thus the process of parser design is really one of grammar design, in which we
derive a grammar satisfying the restrictions of a particular parsing algorithm and containing
the connection points necessary to determine the semantics of the source program.
Even if we are given a grammar for the language, modifications may be necessary to obtain
a useful parser. We must, of course, guarantee that the modified grammar actually describes
the same language as the original, and that the semantic structure is unchanged. Structural
syntactic ambiguity leading to different semantic interpretations can only be corrected by
altering the language. Other ambiguities can frequently be removed by deleting productions
or restricting their applicability depending upon the parser state.
Figure 7.1: Parser Interfaces (diagram not reproduced: it shows the parser exchanging error
reports and synthesized tokens with the error handler)
In the simplest design the parser module provides the operation parse program . It invokes the
lexical analyzer's next symbol operation for each basic symbol, and reports each connection
point by invoking an appropriate operation of some other module. (We term this invocation
a parser action.) Control of the entire transduction process resides within the parser in this
design. By moving the control out of the parser module, we obtain the two alternative
designs: The parser module provides either an operation parse symbol that is invoked with
a token as an argument, or an operation next connection that is invoked to obtain a
connection point specification.
It is also possible to divide the parsing over more than one pass. Properties of the language
and demands of the parsing algorithm can lead to a situation where we need to know the
semantics of certain symbols before we can parse the context of the definitions of these
symbols. ALGOL 68, for example, permits constructs whose syntactic structure can be
recognized by deterministic left-to-right analysis only if the complete set of type identifiers is
known beforehand. When the parsing is carried out in several passes, the sequence of symbols
produced by the lexical analyzer will be augmented by other information collected by parser
actions during previous passes. The details depend upon the source language.
We have already considered the interface between the parser and the lexical analyzer, and
the representation of symbols. The parser looks ahead some number of symbols in order to
control the parsing. As soon as it has accepted one of the lookahead symbols as a component
of the sentence being analyzed, it reads the next symbol to maintain the supply of lookahead
symbols. Through the use of LL or LR techniques, we can be certain that the program is
syntactically correct up to and including the accepted symbol. The parser thus need not
retain accepted symbols. If the code for these symbols, or their values, must be passed on
to other compiler modules via parser actions, these actions must be connected directly to
the acceptance of the symbol. We shall term connection points serving this purpose symbol
connections.
We can distinguish a second class of connection point, the structure connection. It is
used to connect parser actions to the attainment of certain sets of situations (in the sense
of Section 5.3.2) and permits us to trace the phrases recognized by the parser in the source
program. Note carefully that symbol and structure connections provide the only information
that a compiler extracts from the input text.
In order to produce the parse tree as an explicit data structure, it suffices to provide
one structure connection at each reduction of a simple phrase and one symbol connection at
acceptance of each symbol having a symbol value; at the structure connections we must know
which production was applied. We can fix the connection points for this process mechanically
from the grammar. This process has proved useful, particularly with bottom-up parsing.
Parser actions that enter declarations into tables or generate code directly cannot be fixed
mechanically, but must be introduced by the programmer. Moreover, we often know which
production is to be applied well before the reduction actually takes place, and we can make
good use of this knowledge. In these cases we must explicitly mark the connection points
and parser actions in the grammar from which the parser is produced. We add the symbol
encoding (code and value) taken from the lexical analyzer as a parameter to the symbol
connections, whereas parser actions at structure connections extract all of their information
from the state of the parser.
Expression ::= Term ('+' Term %Addop)* .
Term ::= Factor ('*' Factor %Mulop)* .
Factor ::= Identifier &Ident | '(' Expression ')' .
a) A grammar for expressions

Addop: Output "+"
Mulop: Output "*"
Ident: Output the identifier returned by the lexical analyzer
b) The parser actions
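Anticipating the recursive descent technique of Section 7.2, the following sketch (our own
code, with invented names) realizes these parser actions as output statements, so that parsing
a*(b+c) yields its postfix form abc+*. Identifiers are single letters here and # marks the end
of the input:

program postfix_output;
var
  line : string;
  pos : integer;
  ch : char;

procedure next_symbol;
begin
  pos := pos + 1; ch := line[pos]
end;

procedure syntax_error;
begin
  writeln; writeln('syntax error'); halt
end;

procedure expression; forward;

procedure factor;
begin
  if ch in ['a'..'z'] then
    begin write(ch); next_symbol end             (* &Ident: output the identifier *)
  else if ch = '(' then
    begin
      next_symbol; expression;
      if ch = ')' then next_symbol else syntax_error
    end
  else syntax_error
end;

procedure term;
begin
  factor;
  while ch = '*' do
    begin next_symbol; factor; write('*') end    (* %Mulop: output "*" *)
end;

procedure expression;
begin
  term;
  while ch = '+' do
    begin next_symbol; term; write('+') end      (* %Addop: output "+" *)
end;

begin
  line := 'a*(b+c)#'; pos := 0; next_symbol;
  expression;
  if ch <> '#' then syntax_error;
  writeln
end.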
7.1.2 Selection of the Parsing Algorithm

In this book we consider only parsing algorithms whose time and storage requirements depend linearly
upon program length, avoiding backtrack and the need to unravel parser actions. We have
already pointed out the LL and LR algorithms as special cases of deterministic techniques
that recognize a syntactic error at the first symbol, t, that cannot be the continuation of a
correct program; other algorithms may not discover the error until attempting to reduce the
simple phrase in which t occurs. Moreover, LR(k) grammars comprise the largest class whose
sentences can be parsed using deterministic pushdown automata. In view of these properties
we restrict ourselves to the discussion of LL and LR parsing algorithms. Other techniques
can be found in the literature cited in Section 7.4.
Usually the availability of a parser generator is the strongest motive for the choice between
LL and LR algorithms: If one has such a generator at one's disposal, then the technique it
implements is given preference. If no parser generator is available, then an LL algorithm
should be selected because the LL conditions are substantially easier to verify by hand. Also
a transparent method for obtaining the parser from the grammar exists for LL but not for
LR algorithms. By using this approach, recognizers for large grammars can be programmed
relatively easily by hand.
LR algorithms apply to a larger class of grammars than LL algorithms, because they
postpone the decision about the applicable production until the reduction takes place. The
main advantage of LR algorithms is that they permit more latitude in the representation of
the grammar. As the example at the end of Section 7.1.1 shows, however, this advantage may
be neutralized if distinct structure connections that frustrate deferment of a parsing decision
must be introduced. (Note that LL and LR algorithms behave identically for all language
constructs that begin with a special keyword.)
We restrict our discussion to parsers with only one-symbol lookahead, and thus to LL(1)
and LR(1) grammars. Experience shows that this is not a substantial restriction; program-
ming languages are usually so simply constructed that it is easy to satisfy the necessary
conditions. In fact, to a large extent one can manage with no lookahead at all. The main
reason for the restriction is the considerable increase in cost (both time and space) that must
be invested to obtain more lookahead symbols in the parser generator and in the generated
parser.
When dealing with LR grammars, not even the restriction to the LR(1) case is sufficient
to obtain practical tables. Thus we use an LR(1) parse algorithm, but control it with tables
obtained through a modification of the LR(0) analyzer.
7.1.3 Parser Construction

The presence of don't-cares leads to a possible reduction in table size by combining rows or columns
that differ only in those elements.
The transition function may be stored as program fragments rather than as a matrix.
This is especially useful in an LL parser, where there are simple rules relating the program
fragments to the original grammar.
Parser generation is actually compilation: The source program is a grammar with
embedded connection points, and the target program is some representation of the transition
function. Like all compilers, the parser generator must first analyze its input text. This
analysis phase tests the grammar to ensure that it satisfies the conditions (LL(1), LR(1),
etc.) assumed by the parser. Some generators, like `error correcting' compilers, will attempt
to transform a grammar that does not meet the required conditions. Other transformations
designed to optimize the generated parser may also be undertaken. In Sections 7.2 and 7.3
we shall consider some aspects of the `semantic analysis' (condition testing) and optimization
phases of parser generators.
Grammar type   Test          Parser generation
LL(1)          n^2           n^2
Strong LL(k)   n^(k+1)       n^(k+1)
LL(k)          n^(2k)        2^(n^k + (k+1) log n)
SLR(1)         n^2           2^(n + log n)
SLR(k)         n^(k+2)       2^(n + k log n)
LR(k)          n^(2(k+1))    2^(n^(k+1) + k log n)
Z → E
E → F E1
E1 → ε | + F E1
F → i | ( E )
a) The grammar

q0: [Z → ·E]        q8:  [E1 → ·+FE1]
q1: [Z → E·]        q9:  [F → i·]
q2: [E → ·FE1]      q10: [F → (·E)]
q3: [E → F·E1]      q11: [E1 → +·FE1]
q4: [F → ·i]        q12: [F → (E·)]
q5: [F → ·(E)]      q13: [E1 → +F·E1]
q6: [E → FE1·]      q14: [F → (E)·]
q7: [E1 → ·]        q15: [E1 → +FE1·]
b) The states of the parsing automaton

q0 i → q1 q2 i,   q0 ( → q1 q2 (,
q1 → ε,
q2 i → q3 q4 i,   q2 ( → q3 q5 (,
q3 # → q6 q7 #,   q3 ) → q6 q7 ),   q3 + → q6 q8 +,
q4 i → q9,
q5 ( → q10,
q6 → ε,
q7 → ε,
q8 + → q11,
q9 → ε,
q10 i → q12 q2 i,   q10 ( → q12 q2 (,
q11 i → q13 q4 i,   q11 ( → q13 q5 (,
q12 ) → q14,
q13 # → q15 q7 #,   q13 ) → q15 q7 ),   q13 + → q15 q8 +,
q14 → ε,
q15 → ε
c) The transitions of the parsing automaton
Figure 7.4: A Sample Grammar and its Parsing Automaton
If we already know that a grammar satisfies the LL(1) condition, we can easily use these
transformations to write a parser (either by mechanical means or by hand). With additional
transformation rules we can generalize the technique sufficiently to handle our extended BNF
(Section 5.1.3) and connection points. Some of the additional rules appear in Figure 7.7.
Figure 7.8 illustrates the use of these rules.
procedure parser ;

  procedure E ; forward;

  procedure F ;
  begin (* F *)
    case symbol of
      'i':
        begin
          (* q4: *) if symbol = 'i' then next_symbol else error ;
          (* q9: *)
        end;
      '(':
        begin
          (* q5: *) if symbol = '(' then next_symbol else error ;
          (* q10: *) E ;
          (* q12: *) if symbol = ')' then next_symbol else error ;
          (* q14: *)
        end
      otherwise error
    end;
  end; (* F *)

  procedure E1 ;
  begin (* E1 *)
    case symbol of
      '#', ')': (* q7: *);
      '+':
        begin
          (* q8: *) if symbol = '+' then next_symbol else error;
          (* q11: *) F ;
          (* q13: *) E1 ;
          (* q15: *)
        end
      otherwise error
    end;
  end; (* E1 *)

  procedure E ;
  begin (* E *)
    (* q2: *) F ;
    (* q3: *) E1 ;
    (* q6: *)
  end; (* E *)

begin (* parser *)
  (* q0: *) E ;
  (* q1: *) if symbol <> '#' then error ;
end; (* parser *)
Figure 7.5: A Recursive Descent Parser for the Grammar of Figure 7.4
procedure parser;
procedure E ; forward;
procedure F ;
begin (* F *)
case symbol of
'i' : next_symbol ;
'(':
begin
next_symbol;
E;
if symbol = ')' then next_symbol else error ;
end
otherwise error
end;
end; (* F *)
procedure E1 ;
begin (* E1 *)
case symbol of
'#', ')': ;
'+': begin next_symbol ; F ; E1 end
otherwise error
end;
end; (* E1 *)
procedure E ;
begin F ; E1 end;
begin (* parser *)
E;
if symbol <> '#' then error ;
end; (* parser *)
a) Errors detected within E1
procedure E1 ;
begin (* E1 *)
if symbol = '+' then begin next_symbol ; F ; E1 end;
end; (* E1 *)
b) Errors detected after exit from E1
Figure 7.6: Figure 7.5 Simplied
These procedures are simpler than
arbitrary recursive procedures because they have no parameters or local variables, and there
is only a single global variable. Thus the alteration of the environment pointer on procedure
entry and exit can be omitted.
An interpretive implementation of a recursive descent parser is also possible: The control
program interprets tables generated from the grammar. Every table entry specifies a basic
operation of the parser and the associated data. For example, a table entry might be described
as follows:
type
  parse_table_entry = record
    operation : integer ;             (* Transition *)
    lookahead : set of symbol_code ;  (* Input or lookahead symbol *)
    next : integer                    (* Parse table index *)
  end;
States corresponding to situations that follow one another in a single production follow
one another in the table. Figure 7.9 specifies a recursive descent interpreter assuming that
parse table is an array of parse table entry .
Alternatives (1)-(5) of the case clause in Figure 7.9 supply the program schemata for
the transitions qt → q′, q → ε and qti → q′qi ti introduced in Figure 7.3. As before, the transition
qti → q′qi ti is accomplished in two steps (alternative 3 followed by either 4 or 5). The situations
represented by the alternatives are given as comments. Alternative 6 shows one of the possible
optimizations, namely the combination of selecting a production X → tαi (alternative 4)
with acceptance of its first symbol t (alternative 1). Further optimization is possible
(Exercise 7.6).
7.2.3 Computation of FIRST and FOLLOW Sets
The first step in the generation of an LL(1) parser is to ensure that the grammar G =
(T, N, P, Z) satisfies the LL(1) condition. To do this we compute the FIRST and FOLLOW
sets for all X ∈ N. For each production X → α ∈ P we can then determine the director
set W = FIRST(α FOLLOW(X)). The director sets are used to verify the LL(1) condition,
and also become the lookahead sets used by the parser. With the computation of these sets,
the task of generating the parser is essentially complete. If the grammar does not satisfy the
LL(1) condition, the generator may attempt transformations automatically (for example, left
recursion removal and simple left factoring) or it may report the cause of failure to the user
for correction.
The following algorithm can be used to compute FIRST(X) and initial values for the
director set W of each production X → α.
1. Set FIRST(X) empty and repeat steps (2)-(5) for each production X → α.
2. Let α = x1 ... xn, i = 0 and W = {#}. If n = 0, go to step 5.
3. Set i := i + 1 and W := W ∪ FIRST(xi). (If xi is an element of T, FIRST(xi) = {xi};
if FIRST(xi) is not available, invoke this algorithm recursively to compute it.) Repeat
step 3 until either i = n or # is not an element of FIRST(xi).
4. If # is not an element of FIRST(xi), set W := W − {#}.
5. Set FIRST(X) := FIRST(X) ∪ W.
Note that if the grammar is left recursive, step (3) will lead to an endless recursion and
the algorithm will fail. This failure can be avoided by marking each X when the computation
of FIRST (X ) begins, and clearing the mark when that computation is complete. If step (3)
attempts to invoke the algorithm with a marked nonterminal, then a left recursion has been
detected.
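A sketch of this algorithm, applied to the grammar of Figure 7.4 (the representation and all
names are our own; the busy flag implements the marking scheme just described):

program first_sets;
type
  nonterm = (Z_, E_, E1_, F_);
  term = (plus_, i_, lparen_, rparen_, empty_);   (* empty_ plays the role of # *)
  tset = set of term;
  sym = record
    is_term : boolean;
    t : term;
    n : nonterm
  end;
const
  maxprod = 6; maxlen = 3;
var
  lhs : array [1..maxprod] of nonterm;
  len : array [1..maxprod] of integer;
  rhs : array [1..maxprod, 1..maxlen] of sym;
  first : array [nonterm] of tset;
  busy, done : array [nonterm] of boolean;
  x : nonterm;
  k : integer;

procedure addt(k : integer; tt : term);
begin len[k] := len[k] + 1; rhs[k, len[k]].is_term := true; rhs[k, len[k]].t := tt end;

procedure addn(k : integer; nn : nonterm);
begin len[k] := len[k] + 1; rhs[k, len[k]].is_term := false; rhs[k, len[k]].n := nn end;

procedure compute_first(x : nonterm);
var k, idx : integer; w : tset; no_empty : boolean;
begin
  if not done[x] then
    begin
      if busy[x] then begin writeln('grammar is left recursive'); halt end;
      busy[x] := true; first[x] := [];
      for k := 1 to maxprod do
        if lhs[k] = x then
          begin
            w := [empty_]; idx := 0; no_empty := false;       (* step 2 *)
            while (idx < len[k]) and not no_empty do
              begin
                idx := idx + 1;                                (* step 3 *)
                if rhs[k, idx].is_term then
                  begin w := w + [rhs[k, idx].t]; no_empty := true end
                else
                  begin
                    compute_first(rhs[k, idx].n);
                    w := w + first[rhs[k, idx].n];
                    no_empty := not (empty_ in first[rhs[k, idx].n])
                  end
              end;
            if no_empty then w := w - [empty_];                (* step 4 *)
            first[x] := first[x] + w                           (* step 5 *)
          end;
      busy[x] := false; done[x] := true
    end
end;

begin
  for x := Z_ to F_ do begin busy[x] := false; done[x] := false end;
  for k := 1 to maxprod do len[k] := 0;
  (* 1: Z -> E    2: E -> F E1    3: E1 ->    4: E1 -> + F E1
     5: F -> i    6: F -> ( E )                                *)
  lhs[1] := Z_;  addn(1, E_);
  lhs[2] := E_;  addn(2, F_); addn(2, E1_);
  lhs[3] := E1_;
  lhs[4] := E1_; addt(4, plus_); addn(4, F_); addn(4, E1_);
  lhs[5] := F_;  addt(5, i_);
  lhs[6] := F_;  addt(6, lparen_); addn(6, E_); addt(6, rparen_);
  for x := Z_ to F_ do compute_first(x);
  for x := Z_ to F_ do
    begin
      case x of
        Z_ : write('FIRST(Z)  = ');
        E_ : write('FIRST(E)  = ');
        E1_: write('FIRST(E1) = ');
        F_ : write('FIRST(F)  = ')
      end;
      if i_ in first[x] then write('i ');
      if plus_ in first[x] then write('+ ');
      if lparen_ in first[x] then write('( ');
      if rparen_ in first[x] then write(') ');
      if empty_ in first[x] then write('# ');
      writeln
    end
end.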
procedure parser ;
var
  current : integer ;
  stack : array [1 .. max_stack ] of integer ;
  stack_pointer : 0 .. max_stack ;
begin (* parser *)
  current := 1; stack_pointer := 0;
  repeat
    with parse_table [current ] do
      case operation of
        1: (* X → α·tβ *)
          if symbol in lookahead then
            begin next_symbol ; current := current + 1 end
          else error;
        2: (* X → α· *)
          begin
            current := stack [stack_pointer ];
            stack_pointer := stack_pointer - 1;
          end ;
        3: (* X → α·Bβ *)
          begin
            if stack_pointer = max_stack then abort ;
            stack_pointer := stack_pointer + 1;
            stack [stack_pointer ] := current + 1;
            current := next ;
          end ;
        4: (* X → ·αi (not the last alternative) *)
          if symbol in lookahead then
            current := current + 1
          else current := next ;
        5: (* X → ·αm (last alternative) *)
          if symbol in lookahead then
            current := current + 1
          else error ;
        6: (* X → ·tαi (not the last alternative) *)
          if symbol in lookahead then
            begin next_symbol ; current := current + 1 end
          else current := next
      end ;
  until current = 1;
  if symbol <> '#' then error ;
end ; (* parser *)
Figure 7.9: An Interpretive LL(1) Parser
7.3 LR Parsers
Using construction 5.33, we can both test whether a grammar is LR(1) and construct a parser
for it. Unfortunately, the number of states of such a parser is too large for practical use.
Exactly as in the case of strong LL(k) grammars, many of the transitions in an LR(1) parser
are independent of the lookahead symbol. We can utilize this fact to arrive at a parser with
fewer states, which implements the LR(1) analysis algorithm but in which reduce transitions
depend upon the lookahead symbol only if it is absolutely necessary.
We begin the construction with an LR(0) parser, which does not examine lookahead
symbols at all, and introduce lookahead symbols only as required. The grammars that we
can process with these techniques are the simple LR(1) (SLR(1)) grammars of DeRemer
[1969]. (This class can also be defined for arbitrary k ≥ 1.) Not all LR(1) grammars are also
SLR(1) (there is no equivalence similar to that between ordinary and strong LL(1) grammars),
but the distinction is unimportant in practice except for one class of problems. This class
of problems will be solved by sharpening the definition of SLR(1) to obtain lookahead LR(1)
(LALR(1)) grammars.
The verifications of the LR(1), SLR(1) and LALR(1) conditions are more laborious than
verification of the LL(1) condition. Also, there exists no simple relationship between the
grammar and the corresponding LR pushdown automaton. LR parsers are therefore employed
only if one has a parser generator. We shall first discuss the workings of the parser and in
that way derive the SLR(1) and LALR(1) grammars from the LR(0) grammars. Next we
shall show how parse tables are constructed. Since these tables are still too large in practice,
we investigate the question of compressing them and show examples in which the final tables
are of feasible size. The treatment of error handling will be deferred to Section 12.2.2.
7.3.1 The Parse Algorithm
Consider an LR(k) grammar G = (T, N, P, Z) and the pushdown automaton A = (T, Q, R, q0,
{q0}, Q, q0) of construction 5.33. The operation of the automaton is most easily explained
using the matrix form of the transition function:

f(q, σ) =
    q′        if σ ∈ T and qσ → qq′ ∈ R, or if σ ∈ N and q′ = next(q, σ)   (shift transition)
    X → α     if [X → α·; ω] ∈ q                                           (reduce transition)
    HALT      if σ = # and [Z → S·; #] ∈ q
    ERROR     otherwise

This transition function is easily obtained from construction 5.33: All of the transitions
defined in step (5) deliver shift transitions with one terminal symbol, which will be accepted;
the remaining transitions result from step (7) of the construction. We divide the transition
p1 ... pm q ω → p1 q′ ω referred to in step (7) into two steps: Because [X → α·; ω] is in q we
know that we must reduce according to the production X → α and remove m = |α| states
from the stack. Further we define f(p1, X) = next(p1, X) = q′ to be the new state. If ω = #
and [Z → S·; #] ∈ q then the pushdown automaton halts.
Figure 7.10 gives an example of the construction of a transition function for k = 0. We
have numbered the states and rules consecutively. `+2' indicates that a reduction will be
made according to rule 2; `*' marks the halting of the pushdown automaton. Because k = 0,
the reductions are independent of the following symbols.
Figure 7.10c shows the transition function as the transition diagram of a finite automaton
for the grammars of Theorem 5.32. The distinct grammars correspond to distinct final states.
As an LR parser, the automaton operates as follows: Beginning at the start state 0, we make
a transition to the successor state corresponding to the symbol read. The states through
which we pass are stored on the stack; this continues until a final state is reached. In the final
state we reduce by means of the given production X → α, delete |α| states from the stack
and proceed as though X had been `read'.
(1) Z → E
(2) E → E + F    (3) E → F
(4) F → i        (5) F → ( E )
a) The grammar

     i   (   )   +   #   E   F
0    3   4   .   .   .   1   2
1    .   .   .   5   *
2   +3  +3  +3  +3  +3
3   +4  +4  +4  +4  +4
4    3   4   .   .   .   6   2
5    3   4   .   .   .       7
6    .   .   8   5   .
7   +2  +2  +2  +2  +2
8   +5  +5  +5  +5  +5
b) The transition table
c) The transition diagram (not reproduced in this text rendering)
Figure 7.10: Construction of a Transition Function for k = 0
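The following self-contained sketch (our own encoding, not the book's tables module) drives
the automaton of Figure 7.10b over the input i+(i)#. Shift entries are stored as nonnegative
state numbers and a reduction by production p as -p:

program lr0_parser;
const
  err = -99; hlt = 99; max_stack = 50;
  (* columns: 1=i 2=( 3=) 4=+ 5=# 6=E 7=F; err also covers don't-cares *)
  f : array [0..8, 1..7] of integer =
    ((  3,   4, err, err, err,   1,   2),
     (err, err, err,   5, hlt, err, err),
     ( -3,  -3,  -3,  -3,  -3, err, err),
     ( -4,  -4,  -4,  -4,  -4, err, err),
     (  3,   4, err, err, err,   6,   2),
     (  3,   4, err, err, err, err,   7),
     (err, err,   8,   5, err, err, err),
     ( -2,  -2,  -2,  -2,  -2, err, err),
     ( -5,  -5,  -5,  -5,  -5, err, err));
  (* productions: (2) E -> E+F  (3) E -> F  (4) F -> i  (5) F -> (E) *)
  prod_len : array [2..5] of integer = (3, 1, 1, 3);
  prod_lhs : array [2..5] of char = ('E', 'E', 'F', 'F');
var
  stack : array [0..max_stack] of integer;
  sp, ip, a, p : integer;
  input : string;
  running : boolean;

function col(c : char) : integer;
begin
  case c of
    'i' : col := 1; '(' : col := 2; ')' : col := 3;
    '+' : col := 4; '#' : col := 5; 'E' : col := 6; 'F' : col := 7
  end
end;

begin
  input := 'i+(i)#';
  sp := 0; stack[0] := 0; ip := 1; running := true;
  repeat
    a := f[stack[sp], col(input[ip])];
    if a = hlt then begin writeln('accepted'); running := false end
    else if a = err then begin writeln('error'); running := false end
    else if a >= 0 then
      begin                          (* shift: stack the new state, accept the symbol *)
        sp := sp + 1; stack[sp] := a; ip := ip + 1
      end
    else
      begin                          (* reduce by production p *)
        p := -a;
        writeln('reduce by production ', p);
        sp := sp - prod_len[p];      (* delete |right-hand side| states *)
        sp := sp + 1;                (* proceed as though the left-hand side had been read *)
        stack[sp] := f[stack[sp - 1], col(prod_lhs[p])]
      end
  until not running
end.

For the given input the program announces reductions 4, 3, 4, 3, 5, 2 and then accepts, which
is the reverse of the rightmost derivation of i+(i).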
The only distinction between the mode of operation of an LR(k) parser for k > 0 and the
LR(0) parser of the example is that the reductions may depend upon lookahead symbols. In
the final states of the automaton, reductions will take place only if the context allows them.
Don't-care entries with f(q, σ) = ERROR, i.e. entries (q, σ) such that no input word can
lead to the configuration ωq with σ as the next input symbol (ω being suitable stack contents),
may occur in the matrix representation of the transition function. Note that all entries (q, X),
X ∈ N, with f(q, X) = ERROR are don't-cares. By the considerations in step (3) of
construction 5.33, no error can occur in a transition on a nonterminal; it would have been
recognized at the latest at the preceding reduction. (The true error entries are denoted by `.',
while don't-cares are empty entries in the matrix representation of f(q, σ).)
(1) Z → E
(2) E → E + T    (3) E → T
(4) T → T * F    (5) T → F
(6) F → i        (7) F → ( E )
a) The grammar

     i   (   )   +   *   #   E   T   F
0    4   5   .   .   .   .   1   2   3
1    .   .   .   6   .   *
2    .   .  +3  +3   7  +3
3   +5  +5  +5  +5  +5  +5
4   +6  +6  +6  +6  +6  +6
5    4   5   .   .   .   .   8   2   3
6    4   5   .   .   .   .       9   3
7    4   5   .   .   .   .          10
8    .   .  11   6   .   .
9    .   .  +2  +2   7  +2
10  +4  +4  +4  +4  +4  +4
11  +7  +7  +7  +7  +7  +7
b) The transition table
Figure 7.11: A Non-LR(0) Grammar
7.3.2 SLR(1) and LALR(1) Grammars

By taking account of the lookahead symbol, we can also employ an LR(0) parser for a grammar that
does not satisfy the LR(0) condition. States in which a lookahead symbol must be considered are called
inadequate. They are characterized by having a situation [X → α·] that leads to a reduction,
and also a second situation. This second situation leads either to a reduction with another
production or to a shift transition.
DeRemer [1971] investigated the class of grammars for which these modifications lead to
a parser:
7.5 Definition
A context free grammar G = (T, N, P, Z) is SLR(1) iff the following algorithm leads to a
deterministic pushdown automaton.

The pushdown automaton A = (T, Q, R, q0, {q0}, Q, q0) will be defined by its transition
function f(q, σ) rather than the production set R. The construction follows that of construc-
tion 5.33. We use the following as the closure of a set of situations:

H(M) = M ∪ {[Y → ·γ] | ∃[X → α·Yβ] ∈ H(M)}
1. Initially let Q = {q0}, with q0 = H({[Z → ·S]}).
2. Let q be an element of Q that has not yet been considered. Perform steps (3)-(4) for
each σ ∈ V.
3. Let basis(q, σ) = {[X → ασ·β] | [X → α·σβ] ∈ q}.
4. If basis(q, σ) ≠ ∅, then let next(q, σ) = H(basis(q, σ)). Add q′ = next(q, σ) to Q if it is
not already present.
5. If all elements of Q have been considered, perform step (6) for each q ∈ Q and then
stop. Otherwise return to step (2).
6. For all σ ∈ V, define f(q, σ) by:
f(q, σ) =
    next(q, σ)   if [X → α·σβ] ∈ q
    X → α        if [X → α·] ∈ q and σ ∈ FOLLOW(X)
    HALT         if σ = # and [Z → S·] ∈ q
    ERROR        otherwise
This construction is almost identical to construction 5.33 with k = 0. The only difference is
the additional restriction σ ∈ FOLLOW(X) for the reduction (second case).
SLR(1) grammars cover many practically important language constructs not expressible
by LR(0) grammars. Compared to the LR(1) construction, the given algorithm leads to
substantially fewer states in the automaton. (For the grammar of Figure 7.11a the ratio is
22:12). Unfortunately, even SLR(1) grammars do not suffice for all practical requirements.
State   Situation                  f(q, σ)
0     * [Z → ·E]                   E 1
        [E → ·E+T]
        [E → ·T]                   T 2
        [T → ·T*F]
        [T → ·F]                   F 3
        [F → ·i]                   i 4
        [F → ·(E)]                 ( 5
1     * [Z → E·]                   # HALT
      * [E → E·+T]                 + 6
2     * [E → T·]                   #, ), + reduce 3
      * [T → T·*F]                 * 7
3     * [T → F·]                   reduce 5
4     * [F → i·]                   reduce 6
5     * [F → (·E)]                 E 8
        [E → ·E+T]
        [E → ·T]                   T 2
        [T → ·T*F]
        [T → ·F]                   F 3
        [F → ·i]                   i 4
        [F → ·(E)]                 ( 5
6     * [E → E+·T]                 T 9
        [T → ·T*F]
        [T → ·F]                   F 3
        [F → ·i]                   i 4
        [F → ·(E)]                 ( 5
7     * [T → T*·F]                 F 10
        [F → ·i]                   i 4
        [F → ·(E)]                 ( 5
8     * [F → (E·)]                 ) 11
      * [E → E·+T]                 + 6
9     * [E → E+T·]                 #, ), + reduce 2
      * [T → T·*F]                 * 7
10    * [T → T*F·]                 reduce 4
11    * [F → (E)·]                 reduce 7
Figure 7.12: Derivation of the Automaton of Figure 7.11b
The problem arises whenever there is a particular sequence of tokens that plays different roles
in different places. In LAX, for example, an identifier followed by a colon may be either a label
(A.2.0.6) or a variable serving as a lower bound (A.3.0.4). For this reason the LAX grammar is
not SLR(1), because the lookahead symbol `:' does not determine whether identifier should
be reduced to name (A.4.0.16), or a shift transition building a label definition should take
place.
If the set of lookahead symbols for a reduction could be partitioned according to the
state then we could solve the problem, as can be seen from the example of Figure 7.14. The
productions of Figure 7.14a do not fulfill the SLR(1) condition, as we see in the transition
diagram of Figure 7.14b. In the critical state 5, however, a reduction with lookahead symbol
c need not be considered! If c is to follow B then b must have been read before, and we would
therefore have had the state sequence 0, 3, 7 and not 0, 2, 5. The misjudgement arises through
states in which all of the symbols that could possibly follow B are examined to determine
whether to reduce B → d, without regard to the symbols preceding B. We thus refine the
construction so that we do not admit all lookahead symbols in FOLLOW(X) when deciding
upon a reduction X → α, but distinguish on the basis of predecessor states lookahead symbols
that can actually appear.
We begin by defining the kernel of an LR(1) state to be its LR(0) situations:

kernel(q) = {[X → α·β] | [X → α·β; Γ] ∈ q}
(1) Z → A
(2) A → aBb    (3) A → adc    (4) A → bBc    (5) A → bdd
(6) B → d
a) The grammar
b) The SLR(1) transition diagram (not reproduced; the inadequate states 5 and 7 both call
for the reduction +6 on the lookahead symbols b and c, although in state 5 the lookahead c
cannot actually occur)
     a   b   c   d   #   A   B
0    2   3   .   .   .   1
1    .   .   .   .   *
2    .   .   .   5   .       4
3    .   .   .   7   .       6
4        8   .
5    .  +6   9   .   .
6    .      10
7    .  +6  +6  11   .
8   +2  +2  +2  +2  +2
9   +3  +3  +3  +3  +3
10  +4  +4  +4  +4  +4
11  +5  +5  +5  +5  +5
c) The LALR(1) transition table
Figure 7.14: A Non-SLR(1) Grammar
Construction 7.5 above effectively merges states of the LR(1) parser that have the same kernel,
and hence any lookahead symbol that could have appeared in any of the LR(1) states can
appear in the LR(0) state. The set of all such symbols forms the exact right context upon
which we must base our decisions.
7.6 Definition
Let G = (T, N, P, Z) be a context free grammar, Q be the state set of the pushdown automaton
formed by construction 7.5, and Q′ be the state set of the pushdown automaton formed by
construction 5.33 with k = 1. The exact right context of an LR(0) situation [X → α·β] in a
state q ∈ Q is defined by:

ERC(q, [X → α·β]) = {t ∈ T | ∃q′ ∈ Q′ such that q = kernel(q′) and [X → α·β; t] ∈ q′}
Theorem 5.31 related the LR(k) property to non-overlapping k-stack classes, so it is not
surprising that the definition of LALR(1) grammars involves an analogous condition:
7.7 Definition
Let G = (T, N, P, Z) be a context free grammar and Q be the state set of the pushdown
automaton formed by construction 7.5. G is LALR(1) iff the following sets are pairwise
disjoint for all q ∈ Q, p ∈ P:

Sq,0 = {t | [X → α·β] ∈ q, β ≠ ε, t ∈ EFF(β ERC(q, [X → α·β]))}
Sq,p = ERC(q, [Xp → αp·])
Although Definition 7.6 implies that we need to carry out construction 5.33 to determine the
exact right context, this is not the case. The following algorithm generates only the LR(0)
states, but may consider each of those states several times in order to build the exact right
context. Each time a shift transition into a given state is discovered, we propagate the right
context. If the propagation changes the third element of any triple in the state then the entire
state is reconsidered, possibly propagating the change further. Formally, we define a merge
operation on sets of situations as follows:

merge(A, B) = {[X → α·β; Γ ∪ Δ] | [X → α·β; Γ] ∈ A, [X → α·β; Δ] ∈ B}
The LALR(1) construction algorithm is then:
1. Initially let Q = {q0}, with q0 = H({[Z → ·S; {#}]}).
2. Let q be an element of Q that has not yet been considered. Perform steps (3)-(5) for
each σ ∈ V.
3. Let basis(q, σ) = {[X → ασ·β; Γ] | [X → α·σβ; Γ] ∈ q}.
4. If basis(q, σ) ≠ ∅ and there is a q′ ∈ Q such that kernel(q′) = kernel(H(basis(q, σ)))
then let next(q, σ) = merge(H(basis(q, σ)), q′). If next(q, σ) ≠ q′ then replace q′ by
next(q, σ) and mark q′ as not yet considered.
5. If basis(q, σ) ≠ ∅ and there is no q′ ∈ Q such that kernel(q′) = kernel(H(basis(q, σ)))
then let next(q, σ) = H(basis(q, σ)). Add q′ = next(q, σ) to Q.
6. If all elements of Q have been considered, perform step (7) for each q ∈ Q and then
stop. Otherwise return to step (2).
7. For all σ ∈ V define f(q, σ) as follows:

f(q, σ) =
    next(q, σ)   if basis(q, σ) ≠ ∅
    X → α        if [X → α·; Γ] ∈ q and σ ∈ Γ
    HALT         if σ = # and [Z → S·; {#}] ∈ q
    ERROR        otherwise
Figure 7.14c shows the LALR(1) automaton derived from Figure 7.14a. Note that we
can only recognize a B by reducing production 6, and this can be done only with b or c as
the lookahead symbol (see rows 5 and 7 of Figure 7.14c). States 4 and 6 are entered only
after recognizing a B , and hence the current symbol must be b or c in these states. Thus
Figure 7.14c has don't-care entries for all symbols other than b and c in states 4 and 6.
7.3.3 Shift-Reduce Transitions
For most programming languages 30-50% of the states of an LR parser are LR(0) reduce
states, in which reduction by a specic production is determined without examining the
144 Parsing
context. In Figure 7.12 these states are 3, 4, 10 and 11. We can combine these reductions
with the stacking of the previous symbol to obtain a new kind of transition { the shift-reduce
transition { specifying both the stacking of the last symbol of the right-hand side and the
production by which the next reduction symbol is to be made. Formally:
If f (q0; ) = X ! (or f (q0 ; ) = HALT ) is the only possible action (other than ERROR)
in state q0 then redene f (q; ) to be `shift reduce X ! ' for all states q with f (q; ) = q0
and for all 2 V . Then delete state q0 .
With this simplification the transition function of Figure 7.11 can be written as shown in
Figure 7.15.
     i   (   )   +   *   #   E   T   F
0   -6   5   .   .   .   .   1   2  -5
1    .   .   .   6   .   *
2    .   .  +3  +3   7  +3
5   -6   5   .   .   .   .   8   2  -5
6   -6   5   .   .   .   .       9  -5
7   -6   5   .   .   .   .          -4
8    .   .  -7   6   .   .
9    .   .  +2  +2   7  +2
Figure 7.15: The Automaton of Figure 7.11 Recast for Shift-Reduce Transitions
(The notation remains the same, with the addition of -p to indicate a shift-reduce tran-
sition that reduces according to the p-th production.)
Introduction of shift-reduce transitions into a parsing automaton for LAX reduces the
number of states from 131 to 70.
7.3.4 Chain Production Elimination

Figure 7.16: A Simple Case of Chain Production Elimination (the grammar of part (a), the
state diagram of part (b), and the state diagram of part (c), obtained after elimination of
the chain production (3) E → T, are not reproduced here)

Applying the construction of Section 7.3.2 to the grammar of Figure 7.16a
(a simplified version of Figure 7.11a) yields a parser with the state diagram given in
Figure 7.16b. If we reach state 2, we can reduce to E given the lookahead symbol #, but we
could also reduce to Z immediately. We may therefore take either the actions of state 1 or
those of state 2. Figure 7.16c shows the parser that results from merging these two states.
Note that in Figure 7.16b the actions for states 1 and 2 do not conflict (with the exception
of the reduction E → T being eliminated). This property is crucial to the reduction;
fortunately it follows automatically from the LR(1) property of the grammar: Suppose that
for A ≠ B, A ⇒c C and B ⇒c C. Suppose further that some state q contains situations
[X → α·Aγ; Δ] and [Y → β·Bη; Θ]. The follower condition `FIRST(γΔ) and FIRST(ηΘ)
disjoint' must then hold, since otherwise it would be impossible to decide whether to reduce C
to A or to B in state f(q, C). Consideration of state 0 in Figure 7.16b with A = E, B = C = T
illustrates that the follower condition is identical to the absence of conflict required above.
Situations involving chain productions are always introduced by a closure operation. In-
stead of using these chain production situations when establishing a new state, we use the
situations that introduced them. This is equivalent to saying that reduction to the right-hand
side of the chain production should be interpreted as reduction to the left-hand side. Thus
the only change in construction 7.7 comes in computation of basis(q, σ):

3′. Let basis(q, σ) = {[Y → αa·β; Γ] | [X → μ·σν; Δ], [Y → α·aβ; Γ] ∈ q, a ⇒c σ}
    − {[A → B·; Γ] | A →c B}

Here a ⇒c σ means that a derives σ by a (possibly empty) sequence of chain productions,
and A →c B denotes a chain production.
7.3.5 Implementation

In practice the transition matrix is partitioned to ease
the storage management problems. Because of cost we store
the transition function as a packed data structure and employ an access routine that locates
the value f(q, σ) given (q, σ). Some systems work with a list representation of the (sparse)
transition matrix; the access may be time consuming if such a scheme is used, because lists
must be searched.
The access time is reduced if the matrix form of the transition function is retained, and the
storage requirements are comparable to those of the list method if as many rows and columns
as possible are combined. In performing this combination we take advantage of the fact that
two rows can be combined not only when they agree, but also when they are compatible
according to the following definition:
7.8 Definition
Consider a transition matrix f(q, σ). Two rows q, q′ ∈ Q are compatible if, for each column
σ, either f(q, σ) = f(q′, σ) or one of the two entries is a don't-care entry.

Compatibility is defined analogously for two columns σ, σ′ ∈ V. We shall only discuss the
combination of rows here.
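A small sketch of this compatibility test (our own encoding), applied to the terminal
transition matrix of Figure 7.11b. The flag errors_filtered anticipates the failure matrix
technique described below, under which true error entries become don't-cares:

program row_compatibility;
const
  err = -99; dc = -100; hlt = 99;
  ft : array [0..11, 1..6] of integer =          (* columns i ( ) + * # *)
    ((  4,   5, err, err, err, err),
     (err, err, err,   6, err, hlt),
     (err, err,  -3,  -3,   7,  -3),
     ( -5,  -5,  -5,  -5,  -5,  -5),
     ( -6,  -6,  -6,  -6,  -6,  -6),
     (  4,   5, err, err, err, err),
     (  4,   5, err, err, err, err),
     (  4,   5, err, err, err, err),
     (err, err,  11,   6, err, err),
     (err, err,  -2,  -2,   7,  -2),
     ( -4,  -4,  -4,  -4,  -4,  -4),
     ( -7,  -7,  -7,  -7,  -7,  -7));

function compatible(q1, q2 : integer; errors_filtered : boolean) : boolean;
(* Definition 7.8: rows agree in every column, or one entry is a don't-care *)
var s, e1, e2 : integer; ok : boolean;
begin
  ok := true;
  for s := 1 to 6 do
    begin
      e1 := ft[q1, s]; e2 := ft[q2, s];
      if errors_filtered then
        begin
          if e1 = err then e1 := dc;             (* error entries become don't-cares *)
          if e2 = err then e2 := dc
        end;
      if (e1 <> e2) and (e1 <> dc) and (e2 <> dc) then ok := false
    end;
  compatible := ok
end;

begin
  writeln(compatible(0, 5, false));   (* TRUE:  rows 0 and 5 agree everywhere *)
  writeln(compatible(0, 1, false));   (* FALSE: they differ at i and ( *)
  writeln(compatible(0, 1, true))     (* TRUE:  all differences involve error entries *)
end.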
We inspect the terminal transition matrix, the submatrix of f(q, σ) with σ ∈ T, separately
from the nonterminal transition matrix. Often different combinations are possible for the two
submatrices, and by exploiting them separately we can achieve a greater storage reduction.
This can be seen in the case of Figure 7.18a, which is an implementation of the transition
matrix of Figure 7.17. In the terminal transition matrix rows 0, 4, 5 and 6 are compatible,
but none of these rows are compatible in the nonterminal transition matrix.
In order to increase the number of compatible rows, we introduce a Boolean failure matrix,
F[q, t], q ∈ Q, t ∈ T. This matrix is used to filter the access to the terminal transition matrix:
f(q, t) = if F[q, t] then error else entry in the transition matrix;
For this purpose we define F[q, t] as follows:
F[q, t] = true if f(q, t) = ERROR, false otherwise
(don't-care entries of the transition matrix give don't-care entries in F)

With the true errors caught by the failure matrix, the error entries of the terminal transition
matrix become don't-cares, and more rows become compatible. Combining compatible rows
can be treated as a graph problem: take the rows as nodes, connect each incompatible pair
by a branch, and combine rows by
coloring the nodes with a minimum number of colors such that any pair of nodes connected
by a branch are of different colors. (Graph coloring is discussed in Section B.3.3.) Further
compression may be possible as indicated in Exercises 7.12 and 7.13.
     i   (   )   +   *   #   E   T   F
0   -6   4   .   .   .   .   1   2   2
1    .           5       *
2    .   .   .   5   6   *
4   -6   4   .   .   .   .   7   8   8
5   -6   4   .   .   .   .       9   9
6   -6   4   .   .   .   .          -4
7            -7   5       .
8    .   .  -7   5   6   .
9    .   .  +2  +2   6  +2
a) Transition matrix for Figure 7.17 with shift-reduce transitions

     i      (      )      +      *      #
0   false  false  true   true   true   true
1   true                 false         false
2   true   true   true   false  false  false
4   false  false  true   true   true   true
5   false  false  true   true   true   true
6   false  false  true   true   true   true
7                 false  false         true
8   true   true   false  false  false  true
9   true   true   false  false  false  false
b) Uncompressed failure matrix for (a)

                   i   (   )   +   *   #
0,1,2,4,5,6,7,8   -6   4  -7   5   6   *
9                         +2  +2   6  +2
c) Compressed terminal transition matrix

           E   T,F
0,1,2      1    2
4          7    8
5               9
6,7,8,9        -4
d) Compressed nonterminal transition matrix
Figure 7.18: Table Compression
7.4 Notes and References

Other parsing techniques (precedence grammars or Floyd-Evans Productions, for example) either
apply to smaller language classes or do
not attain the same computational efficiency or error recovery properties as the techniques
treated here. Operator precedence grammars have also achieved significant usage because one
can easily construct parsers by hand for expressions with infix operators. Aho and Ullman
[1972] give quite a complete overview of the available parsing techniques and their optimal
implementation.
Instead of obtaining the LALR(1) parser from the LR(1) parser by merging states, one
could begin with the SLR(1) parser and determine the exact right context only for those states
in which the transition function is ambiguous. This technique reduces the computation time,
but unfortunately does not generalize to an algorithm that eliminates all chain productions.
Construction 7.7 requires a redundant effort that can be avoided in practice. For example,
the closure of a situation [X → α·Bβ; Γ] depends only upon the nonterminal B if the
lookahead set is ignored. The closure can thus be computed ahead of time for each B ∈ N,
and only the lookahead sets must be supplied during parser construction. Also, the repeated
construction of the follower state of an LALR(1) state that develops from the combination
of two LR(1) states with distinct lookahead sets can be simplified. This repetition, which
results from the marking of states as not yet examined, leaves the follower state (specified
as a set of situations) unaltered. It can at most add lookahead symbols to single situations.
This addition can also be accomplished without computing the entire state anew.
Our technique for chain production elimination is based upon an idea of Pager [1974].
Use of the failure matrix to increase the number of don't-care entries in the transition matrix
was first proposed by Joliat [1973, 1974].
Exercises
7.1 Consider a grammar with embedded connection points. Explain why transformations
of the grammar can be guaranteed to leave the invocation sequence of the associated
parser actions invariant.
7.2 State the LL(1) condition in terms of the extended BNF notation of Section 5.1.3.
Prove that your statement is equivalent to Theorem 7.2.
7.3 Give an example of a grammar in which the graph of LAST contains a cycle. Prove
that FOLLOW (A) = FOLLOW (B ) for arbitrary nodes A and B in the same strongly
connected subgraph.
7.4 Design a suitable internal representation of a grammar and program the generation
algorithm of Section 7.2.3 in terms of it.
7.5 Devise an LL(1) parser generation algorithm that accepts the extended BNF notation
of Section 5.1.3. Will you be able to achieve a more efficient parser by operating upon
this form directly, or by converting it to productions? Explain.
7.6 Consider the interpretive parser of Figure 7.9.
(a) Define additional operation codes to implement connection points, and add the
appropriate alternatives to the case statement. Carefully explain the interface
conventions for the parser actions. Would you prefer a different kind of parse
table entry? Explain.
(b) Some authors provide special operations for the situations [X → α·B] and [X →
α·tB]. Explain how some recursion can be avoided in this manner, and write
appropriate alternatives for the case statement.
(c) Once the special cases of (b) are recognized, it may be advantageous to provide
extra operations identical to 4 and 5 of Figure 7.9, except that the conditions are
reversed. Why? Explain.
(d) Recognize the situation [X → α·t] and alter the code of case 4 to absorb the
processing of the 2 operation following it.
(e) What is your opinion of the value of these optimizations? Test your predictions
on some language with which you are familiar.
7.7 Show that the following grammar is LR(1) but not LALR(1):
Z → A,
A → aBcB,  A → B,  A → D,
B → b,  B → Ff,
D → dE,
E → FcA,  E → FcE,
F → b
7.8 Repeat Exercise 7.5 for the LR case. Use the algorithm of Section 7.3.4.
7.9 Show that FIRST (A) can be computed by any marking algorithm for directed graphs
that obtains a `spanning tree', B , for the graph. B has the same node set as the original
graph, G, and its branch set is a subset of that of G.
7.10 Consider the grammar with the following productions:
Z → AXd,  Z → BX,  Z → C,
A → B,  A → C,
B → CXb,
C → c,
X → ε
(a) Derive an LALR(1) parser for this grammar.
(b) Delete the reductions by the chain productions A → B and A → C.
7.11 Use the techniques discussed in Section 7.3.5 to compress the transition matrix
produced for Exercise 7.8.
7.12 [Anderson et al., 1973] Consider a transition matrix for an LR parser constructed by
one of the algorithms of Section 7.3.2.
(a) Show that for every state q there is exactly one symbol z(q) such that f(q′, a) = q
implies a = z(q).
(b) Show that, in the case of shift-reduce transitions introduced by the algorithms
of Sections 7.3.3 and 7.3.4, an unambiguous symbol z(A → α) exists such that
f(q, a) = `shift and reduce A → α' implies a = z(A → α).
(c) Show that the states (and shift-reduce transitions) can be numbered in such a
way that all states in column c have sequential numbers c0 + i, i = 0, 1, ... Thus
it suffices to store only the relative number i in the transition matrix; the base
c0 is only given once for each column. In exactly the same manner, a list of the
reductions in a row can be assigned to this row and retain only the appropriate
index to this list in the transition matrix.
(d) Make these alterations in the transition matrix produced for Exercise 7.8 before
beginning the compression of Exercise 7.11, and compare the result with that
obtained previously.
7.13 [Bell, 1974] Consider an m × n transition matrix, t, in which all unspecified entries
are don't-cares. Show that the matrix can be compressed into a p × q matrix c, two
length-m arrays f and u, and two length-n arrays g and v by the following algorithm:
Initially fi = gj = ∞, 1 ≤ i ≤ m, 1 ≤ j ≤ n, and k = 1. If all occupied columns of
the ith row of t uniformly contain the value r, then set fi := k, k := k + 1, ui := r
and delete the ith row of t. If the jth column is uniformly occupied, delete it also and
set gj := k, k := k + 1, vj := r. Repeat this process until no uniformly-occupied row
or column remains. The remaining matrix is the matrix c. We then enter the row
(column) number in c of the former ith row (jth column) into ui (vj). The following
relation then holds:

ti,j = if fi < gj then ui
       else if gj < fi then vj
       else (* fi = gj = ∞ *) c[ui, vj]

(Hint: Show that the size of c is independent of the sequence in which the rows and
columns are deleted.)
Chapter 8
Attribute Grammars
Semantic analysis and code generation are based upon the structure tree. Each node of the
tree is `decorated' with attributes describing properties of that node, and hence the tree
is often called an attributed structure tree for emphasis. The information collected in the
attributes of a node is derived from the environment of that node; it is the task of semantic
analysis to compute these attributes and check their consistency. Optimization and code
generation can be also described in similar terms, using attributes to guide the transformation
of the tree and ultimately the selection of machine instructions.
Attribute grammars have proven to be a useful aid in representing the attribution of the
structure tree because they constitute a formal definition of all context-free and context-
sensitive language properties on the one hand, and a formal specification of the semantic
analysis on the other. When deriving the specification, we need not be overly concerned with
the sequence in which the attributes are computed because this can (with some restrictions) be
derived mechanically. Storage for the attribute values is also not reflected in the specification.
We begin by assuming that all attributes belonging to a node are stored within that node in
the structure tree; optimization of the attribute storage is considered later.
Most examples in this chapter are included to show constraints and pathological cases;
practical examples can be found in Chapter 9.
8.1 Basic Concepts of Attribute Grammars

A condition B(p) attached to a production p can be used to decide whether a structure tree built
with that production is semantically correct, and hence whether the program it represents is
translatable. We could also regard this condition as the computation of a Boolean attribute
consistent , which we associate with the left-hand side of the production.
As an example, Figure 8.1 gives a simplified attribute grammar for LAX assignments.
Each p ∈ P is marked by the keyword rule and written using EBNF notation (restricted to
express only productions). The elements of R(p) follow the keyword attribution. We use a
conventional expression-oriented programming language notation for the functions f, and ter-
minate each element with a semicolon. Particular instances of an attribute are distinguished
by numbering multiple occurrences of symbols in the production (e.g. name[1] , name[2] )
from left to right. Any condition is also marked by a keyword and terminated by a semicolon.
In order to check the consistency of the assignment and to further identify the + operator,
we must take the operand types into account. For this purpose we define two attributes,
primode and postmode , for the symbols expression and name , and one attribute, mode ,
for the symbol addop . Primode describes the type determined directly from the node and its
descendants; postmode describes the type expected when the result is used as an operand by
other nodes. Any difference between primode and postmode must be resolved by coercions.
The Boolean function coercible(t1, t2) tests whether type t1 can be coerced to t2.
Figure 8.2 shows the analysis of x := y + z according to the grammar of Figure 8.1.
(Assignment.environment would be computed from the declarations of x, y and z , but here
we show it as given in order to make the example self-contained.) Attributes on the same
line of Figure 8.2c can be computed collaterally; every attribute is dependent upon at least
one attribute from the previous line. These dependency relations can be expressed as a graph
(Figure 8.3). Each large box represents the production whose application corresponds to the
node of the structure tree contained within it. The small boxes making up the node itself
represent the attributes of the symbol on the left-hand side of the production, and the arrows
represent the dependency relations arising from the attribution rules of the production. The
node set of the dependency graph is just the set of small boxes representing attributes; its
edge set is the set of arrows representing dependencies.
Figure 8.3: Attribute dependencies in the analysis of x := y + z (the dependency graph itself
is not reproduced; each node carries the attributes environment, primode and postmode of
its symbol, and the arrows show the dependencies)
We must know all of the values upon which an attribute depends before we can compute
the value of that attribute. Clearly this is only possible if the dependency graph is acyclic.
Figure 8.3 is acyclic, but consider the following LAX type definition, which we shall discuss
in more detail in Sections 9.1.2 and 9.1.3:
type t = record (real x , ref t p );
We must compute a type attribute for each of the identifiers t, x and p so that the
associated type is known at each use of the identifier. The type attribute of t consists of
the keyword record plus the types and identifiers of the fields. Now, however, the type of p
contains an application of t, implying that the type identified by t depends upon which type
a use of t identifies. Thus the type t depends cyclically upon itself. (We shall show how to
eliminate the cycle from this example in Section 9.1.3.)
Let us now make the intuition gained from these examples more precise. We begin with
the grammar G, a set of attributes A(X ) for each X in the vocabulary of G, and a set of
attribution rules R(p) (and possibly a condition B (p)) for each p in the production set of G.
8.1 Definition
An attribute grammarSis a 4-tuple, AG = (G; A; R; B ). G = (T; N; P; Z )Sis a reduced context
free grammar, A = X 2T [N A(X )Sis a nite set of attributes, R = p2P R(p) is a nite
set of attribution rules, and B = p2P B (p) is a nite set of conditions. A(X ) \ A(Y ) 6=
; implies X = Y . For each occurrence of X in the structure tree corresponding to a sentence
of L(G), at most one rule is applicable for the computation of each attribute a 2 A(X ).
8.2 Definition
For each p : X0 → X1 ... Xn ∈ P the set of defining occurrences of attributes is AF(p) =
{Xi.a | Xi.a ← f(...) ∈ R(p)}. An attribute X.a is called derived or synthesized if there
exists a production p : X → χ and X.a is in AF(p); it is called inherited if there exists a
production q : Y → χXω and X.a ∈ AF(q).
Synthesized attributes of a symbol represent properties resulting from consideration of the
subtree derived from the symbol in the structure tree. Inherited attributes result from con-
sideration of the environment. In Figure 8.1, the name.primode and addop.operation at-
tributes were synthesized; name.environment and addop.mode were inherited.
Attributes such as the value of a constant or the symbol of an identifier, which arise in
conjunction with structure tree construction, are called intrinsic. Intrinsic attributes reflect
our division of the original context-free grammar into a parsing grammar and a symbol gram-
mar. If we were to use the entire grammar of Appendix A as the parsing grammar, we could
easily compute the symbol attribute of an identifier node from the subtree rooted in that
node. No intrinsic attributes would be needed because constant values could be assigned
to left-hand side attributes in rules such as letter ::= 'a'. Thus our omission of intrinsic
attributes in Definition 8.2 results in no loss of generality.
8.3 Theorem
The following sets are disjoint for all X in the vocabulary of G:
AS(X) = {X.a | ∃p : X → α ∈ P and X.a ∈ AF(p)}
AI(X) = {X.a | ∃q : Y → αXβ ∈ P and X.a ∈ AF(q)}
Further, there exists at most one rule X.a ← f(…) in R(p) for each p ∈ P and a ∈ A(X).
Evaluate name.environment
Move to name
Evaluate expression.environment
Move to expression
Evaluate name.postmode
Move to name
Evaluate expression.postmode
Move to expression
Move to parent
a) Procedure for assignment ::= name ':=' expression
Evaluate name[1].environment
Move to name[1]
Evaluate name[2].environment
Move to name[2]
Evaluate expression.primode
Move to parent
Evaluate name[1].postmode
Move to name[1]
Evaluate addop.mode
Move to addop
Evaluate name[2].postmode
Move to name[2]
Evaluate condition
Move to parent
b) Procedure for expression ::= name addop name
Evaluate name.primode
Move to parent
Evaluate condition
Move to parent
c) Procedure for name ::= identifier
Figure 8.4: Evaluation Procedures for Figure 8.1
the algorithms for p and q. In Figure 8.3, for example, the algorithms for expression ::= name
addop name and assignment ::= name ':=' expression are both involved in computation
of attributes for the expression node. Because all computation begins and ends at the
root, the general pattern of the (coroutine) interaction would be the following: The algorithm
for q computes values for some subset of AI (X ) using a sequence of evaluation instructions.
It then passes control to the algorithm for p by executing `move to child i'. After using a
sequence of evaluation operations to compute some subset of AS (X ), the algorithm for p
returns by executing `move to parent'. (Of course both algorithms could have other attribute
evaluations and moves interspersed with these; here we are considering only computation of
X 's attributes.) This process continues, alternating computation of subsets of AI (X ) and
AS (X ) until all attribute values are available. The last action of each algorithm is `move to
parent'.
Figure 8.4 gives possible algorithms for the grammar of Figure 8.1. Because a symbol like
expression can appear in several productions on the left or right sides, we always identify
the production for the child node by giving only the left-hand-side symbol. We do not answer
the question of which production is really used because in general we cannot know. For the
same reason we do not specify the parent production more exactly.
The attributes of X constitute the only interface between the algorithms for p and q.
When the algorithm for q passes control to the algorithm for p by executing `move to child i',
it expects that a particular subset of AS (X ) will be evaluated before control returns. Since the
algorithms must work for all structure trees, this subset must be evaluated by every algorithm
corresponding to a production of the form X → α. The same reasoning holds for subsets of
AI(X) evaluated by algorithms corresponding to productions of the form Y → αXβ.
8.9 Definition
Given a partition of A(X) into disjoint subsets Ai(X), i = 1, …, m(X) for each X in the
vocabulary of G, the resulting partition of the entire attribute set A is admissible if, for all
X, Ai(X) is a subset of AS(X) for i = m, m-2, … and Ai(X) is a subset of AI(X) for
i = m-1, m-3, … Ai(X) may be empty for any i.
8.10 Definition
An attribute grammar is partitionable if it is locally acyclic and an admissible partition exists
such that for each X in the vocabulary of G the attributes of X can be evaluated in the
order A1(X), …, Am(X). An attribute grammar together with such a partition is termed
partitioned.
Since all attributes can be evaluated, a partitionable grammar must be well-defined.
A set of attribution algorithms satisfying our constraints can be constructed if and only
if the grammar is partitioned. The admissible partition defines a partial ordering on A(X)
that must be observed by every algorithm. Attributes belonging to a subset Ai (X ) may be
evaluated in any order permitted by DDP (p), and this order may vary from one production
to another. No context switch across the X interface occurs while these attributes are being
evaluated, although context switches may occur at other interfaces. A move instruction
crossing the X interface follows evaluation of each subset.
The grammar of Figure 8.1 is partitioned, and the admissible partition used to construct
Figure 8.4 was:
A1(expression) = {environment}    A1(name) = {environment}
A2(expression) = {primode}        A2(name) = {primode}
A3(expression) = {postmode}       A3(name) = {postmode}
A4(expression) = {}               A4(name) = {}
A1(addop) = {mode}
A2(addop) = {operation}
A4 is empty in the cases of both expression and name because the last nonempty subset
in the partition consists of inherited attributes, while Definition 8.9 requires synthesized
attributes. At this point the algorithm actually contains a test of the condition, which we
have already noted can be regarded as a synthesized attribute of the left-hand-side symbol.
With this interpretation, it would constitute the single element of A4 for each symbol.
tree S, because then X.b could not be calculated before X.a as required by the fact that
i > j. DDP(p) gives direct dependencies for all attributes, but the graph of DT(S) includes
indirect dependencies resulting from the interaction of direct dependencies. These indirect
dependencies may lead to a cycle in the graph of DT(S) as shown in Figure 8.5. We need a
way of characterizing these dependencies that is independent of the structure tree.
[Figure 8.6 a): attribution rules and dependency graphs for the productions p, r and s, omitted here.]
IDS(X) = {a → b}
IDS(Y) = {c → e, d → f, e → d, f → c}
b) Induced dependencies for symbols
Figure 8.6: A Well-Defined Grammar
IDP (p) and IDS (X ) are pessimistic approximations to the desired dependency relations.
Any essential dependency that could be present in any structure tree is included in IDP (p)
and IDS (X ), and all are assumed to be present simultaneously. The importance of this point
is illustrated by the grammar of Figure 8.6, which is well-defined but not partitioned. Both
c → e and d → f are included in IDS(Y) even though it is clear from Figure 8.7 that only
one of these dependencies could occur in any structure tree. A similar situation occurs for
e → d and f → c. The result is that IDS(Y) indicates a cycle that will never be present in
any DT .
The pessimism of the indirect dependencies is crucial for the existence of a partitioned
grammar. Remember that it must always be possible to evaluate the attributes of X in
the order specified by the admissible partition. Thus the order must satisfy all dependency
relations simultaneously.
[Figure 8.7: The four structure trees obtainable from the grammar of Figure 8.6 by combining X ::= sY or X ::= tY with Y ::= u or Y ::= v; each tree shows the attributes a, b of X and c, d, e, f of Y.]
rule Z ::= s X X.
attribution
   X[1].a ← X[2].d ;
   X[1].c ← 1;
   X[2].a ← X[1].d ;
   X[2].c ← 2;
rule Z ::= t X X.
attribution
   X[1].a ← 3;
   X[1].c ← X[2].b ;
   X[2].a ← 4;
   X[2].c ← X[1].b ;
rule X ::= u.
attribution
   X.b ← X.a ;
   X.d ← X.c ;
[Figure: dependency graphs for the structure trees rooted in Z ::= sXX and Z ::= tXX, showing the attributes a, b, c, d at each X ::= u node.]
Here m is the smallest n such that Tn-1(X) ∪ Tn(X) = A(X), T-1(X) = T0(X) = ∅, and for
k > 0
T2k-1(X) = {a ∈ AS(X) | a → b ∈ IDS(X) implies b ∈ Tj(X), j ≤ 2k-1}
T2k(X) = {a ∈ AI(X) | a → b ∈ IDS(X) implies b ∈ Tj(X), j ≤ 2k}
This definition requires that all Tj(X) actually exist. Some attributes remain unassigned to
any Tj(X) if (and only if) the grammar is locally acyclic and some IDS contains a cycle.
For the grammar of Figure 8.10, construction 8.16 leads to the `obvious' partition discussed
above, which fails. Thus the grammar is not ordered, and we must conclude that the ordered
grammars form a proper subclass of the partitionable grammars.
Suppose that a partitioned attribute grammar is given, with partitions A1(X), …, Am(X)
for each X in the vocabulary. In order to construct an attribution algorithm for a production
p : X0 → X1 … Xn, we begin by defining a new attribute ci,j corresponding to each subset
Ai(Xj) of attributes not computed in the context of p. (These are the inherited attributes
Ai(X0), i = m-1, m-3, … of the left-hand side and the synthesized attributes Ai(Xj), j ≠
0, i = m, m-2, … of the right-hand side symbols.) For example, the grammar of Figure 8.1
is partitioned as shown at the end of Section 8.2.1. In order to construct the attribution
algorithm of Figure 8.4b, we must define new attributes as shown in Figure 8.11a.
Every occurrence of an attribute from Ai(Xj) is then replaced by ci,j in DP(p) ∪ DDP(p),
as illustrated by Figure 8.11b. DP(p) alone does not suffice in this step because it was derived
(via IDP(p)) from NDDP(p), and thus does not reflect all dependencies of DDP(p). In
Figure 8.11b, for example, the dependencies expression.primode → name[i].postmode
(i = 1, 2) are in DDP but not DP.
Figure 8.11b has a single node for each ci,j because each partition contains a single at-
tribute. In general, however, partitions will contain more than one attribute. The resulting
graph still has only one node for each ci,j. This node represents all of the attributes in Ai(Xj),
and hence any relation involving an attribute in Ai(Xj) is represented by an edge incident
upon this node.
The graph of Figure 8.11b describes a partial order. To obtain an attribution algorithm,
we augment the partial order with additional dependencies, consistent with each other and
with the original partial order, until the nodes are totally ordered. Figure 8.11c shows such
additional dependencies for Figure 8.11b. The total order defines the algorithm: Each element
that is an attribute in AF(p) corresponds to a computation of that attribute, each element
ci,0 corresponds to a move to the parent, and each element ci,j (j > 0) corresponds to a move
to the jth child. Finally, a `move to parent' operation is added to the end of the algorithm.
Figure 8.4b is the algorithm resulting from the analysis of Figure 8.11.
The construction sketched above is correct if we can show that all attribute dependencies
from IDP (p) and DDP (p) are accounted for and that the interaction with the moves between
nodes is proper. Since IDP (p) is a subset of DP (p), problems can only arise from the merging
of attributes that are not elements of AF(p). We distinguish five cases:
Xi.a → Xi.b ∈ IDP(p), a ∉ AF(p), b ∉ AF(p)
Xi.a → Xi.b ∈ IDP(p), a ∈ AF(p), b ∉ AF(p)
Xi.a → Xi.b ∈ IDP(p), a ∉ AF(p), b ∈ AF(p)
Xi.a → Xj.b ∈ IDP(p), i ≠ j, a ∉ AF(p)
Xi.a → Xj.b ∈ IDP(p), i ≠ j, b ∉ AF(p)
In the first case the dependency is accounted for in all productions q for which a and b
are elements of AF(q). In the second and third cases Xi.a and Xi.b must belong to different
subsets Ar(Xi) and As(Xi). The dependency manifests itself in the ordering condition r < s
or s < r, and will not be disturbed by collapsing either subset. In the fourth case we compute
Xj.b only after all of the attributes in the subset to which Xi.a belongs have been computed;
this is simply an additional restriction. The fifth case is excluded by Definition 8.11: Xi.a →
Xj.b cannot be an element of DDP(p) because Xj.b is not in AF(p); it cannot be an element
of any IDS because i ≠ j.
c1,0 = {expression.environment}
c3,0 = {expression.postmode}
c2,1 = {name[1].primode}
c4,1 = {}
c2,2 = {addop.operation}
c2,3 = {name[2].primode}
c4,3 = {}
a) New attributes
[Figure 8.11 b) and c): the dependency graph over these nodes and the attributes computed in the production (e.g. addop.mode and expression.primode), and the additional dependencies completing the total order, omitted here.]
When an algorithm begins with a visit ci,j, this visit may or may not actually be carried
out. Suppose that the structure tree has been completed before the attribution is attempted.
The traversal then begins at the root, and every algorithm will be initiated by a `move to child
i'. Now if the first action of the algorithm is c1,0, i.e. a move to the parent to compute inherited
attributes, this move is superfluous because the child is only invoked if these attributes are
available. Hence the initial c1,0 should be omitted. The situation is reversed if the tree is
being processed bottom-up, as when attribution is merged with a bottom-up parse: An initial
ci,j that causes a move to the leftmost subtree should be omitted.
Semantic conditions are taken care of in this schema by treating them as synthesized
attributes of the left-hand side of the production. They can be introduced into an algorithm
at any arbitrary point following computation of the attributes upon which they depend.
In practice, conditions should be evaluated as early as possible to enhance semantic error
recovery and reduce the lifetime of attributes.
8.20 Theorem
An attribute grammar is LAG(k) if and only if it is locally acyclic and a partition A =
A1 ∪ … ∪ Ak exists such that for all p : X0 → X1 … Xn ∈ P, Xi.a → Xj.b ∈ DDP(p),
a ∈ Au(Xi), b ∈ Av(Xj) implies one of the following conditions:
u < v
u = v and j = 0
u = v and i = 0 and a ∈ AI(X0)
u = v and 1 ≤ i < j
u = v and 1 ≤ i = j and a ∈ AI(Xi)
Theorem 8.20 leads directly to a procedure for determining the partition and the value
of k from a locally acyclic grammar (Figure 8.13; see the sketch after the list below). For
k = 1, 2, … this procedure assumes that all remaining attributes belong to Ak and then
deletes those for which this assumption violates the theorem. There are two distinct stopping
conditions:
No attribute is deleted. The number of traversals is k and the partition is A1, …, Ak.
All attributes are deleted. The conditions of Theorem 8.20 cannot be met and hence
the attribute grammar is not LAG(k) for any k.
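The following Pascal skeleton sketches one way such a procedure might be organized; it is our own reconstruction, not Figure 8.13 itself. The bound max_attr, the numbering of attributes, and the test violates_theorem (which must check the five conditions of Theorem 8.20 against the current candidate set) are assumptions made purely for illustration.

const max_attr = 100;                  (* assumed bound on attribute count *)
type attr_set = set of 1..max_attr;
var traversal : array [1..max_attr] of integer;  (* resulting partition *)

function violates_theorem (a : integer; candidate : attr_set) : boolean;
begin
   (* hypothetical: true if placing a in the current traversal would
      violate one of the five conditions of Theorem 8.20 *)
   violates_theorem := false
end;

function determine_traversals (all_attributes : attr_set;
                               var k : integer) : boolean;
var remaining, candidate : attr_set;
    a : integer;
    deleted : boolean;
begin
   remaining := all_attributes; k := 0;
   repeat
      k := k + 1;
      candidate := remaining;          (* assume all remaining are in Ak *)
      repeat                           (* delete violators to a fixpoint *)
         deleted := false;
         for a := 1 to max_attr do
            if (a in candidate) and violates_theorem (a, candidate) then
               begin candidate := candidate - [a]; deleted := true end
      until not deleted;
      for a := 1 to max_attr do        (* survivors form partition Ak *)
         if a in candidate then traversal[a] := k;
      remaining := remaining - candidate
   until (remaining = []) or (candidate = []);
   determine_traversals := remaining = []  (* false: not LAG(k) for any k *)
end;

If nothing is deleted in some round, all remaining attributes fit into Ak and the function succeeds with k traversals; if everything is deleted, no progress is possible and the grammar is not LAG(k) for any k, exactly the two stopping conditions above.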
Analogous constructions are possible for RAG(k) grammars and for the alternating evalu-
able attribute grammars (AAG(k)). With the latter class, structure tree attributes are evalu-
ated by traversals that alternate in direction: The first is left-to-right, the second right-to-left,
and so forth. We leave the derivation of these definitions and theorems, plus the necessary
processing routines, to the reader.
It is important to note that the algorithm of Figure 8.13 and its analogs for RAG(k) and
AAG(k) assign attributes to the first traversal in which they might be computed. These
algorithms give no indication that it might also be possible to evaluate an attribute in a later
traversal without delaying evaluation of other attributes or increasing the total number of
traversals.
rule Z ::= X .
attribution
   X.b ← 1;
rule X ::= W X.
attribution
   X[1].a ← W.c ;
   X[2].b ← X[1].b ;
   W.d ← X[2].a ;
rule Z ::= X .
attribution
   X.b ← 1;
rule X ::= W X Y.
attribution
   X[1].a ← W.c ;
   X[1].e ← Y.g ;
   X[2].b ← X[1].b ;
   W.d ← X[2].a ;
   Y.f ← X[2].e ;
class expression ;
begin comment Declarations of primode , postmode and environment end;
class name ;
begin comment Declarations of primode , postmode and environment end;
class addop ;
begin comment Declarations of mode and operation end;
expression class p2 ;
begin ref (name) X1 ; ref (addop) X2 ; ref (name) X3 ;
comment Initialization of X1 , X2 and X3 needed here;
detach;
X1.environment := environment ;
resume (X1) ;
X3.environment := environment ;
resume (X3) ;
primode := if ... ;
detach ;
X1.postmode := primode ;
resume (X1) ;
X2.mode := primode ;
resume (X2) ;
X3.postmode := primode ;
resume (X3) ;
if ... ; comment Evaluate the condition;
detach ;
end;
Figure 8.16: SIMULA Implementation of Figure 8.4b
transformed to a collection of recursive procedures, or embodied in a set of tables to be
interpreted. We shall discuss each of these possibilities in turn.
The coroutines can be coded directly in SIMULA as classes, one per symbol and one
per production. Each symbol class defines the attributes of the symbol and serves as a
prefix for classes representing productions with that symbol on the left side. This allows us
to obtain access to a subtree having a particular symbol as its root without knowing the
production by which it was constructed. Terminal nodes t are represented only by the class
t. Each production class contains pointer declarations for all of its descendants X1 … Xn. A
structure tree is built using statements of the form node :- new p (or node :- new t )
to create nodes and assignments of the form node.xi :- subnode to link them. Since a side
effect of new is execution of the class body, the first statement of each class body is detach
(return to caller). (Intrinsic attributes could be initialized by statements preceding this first
detach.) Figure 8.16 gives the SIMULA coding of the procedure from Figure 8.4b.
Figure 8.17 gives an implementation using recursive procedures. The tree is held in a
data structure made up of the nodes defined in Figure 8.17a. When a node corresponding to
application of p : X0 → X1 … Xn is created, its fields are initialized as follows:
s_p = p
x_p[i] = pointer to node representing Xi, i = 1, …, n
The body of a coroutine is broken at the detach statements, with each segment forming
one branch of the case statement in the corresponding procedure. Then detach is imple-
mented by simply returning; resume (Xi) is implemented by sproc_s (x_p[i], k), where sproc_s
type
   tree_pointer = ^tree_node ;
   tree_node = record
      case symbols of
         s : (* one per symbol in the vocabulary *)
            ( ... (* storage for attributes of s *)
              case s_p : integer of
                 p : (* one per production p : s → X1 ... Xn *)
                    (x_p : array [1..n] of tree_pointer ) )
   end ;
a) General structure of a node
procedure pproc_p (t : tree_pointer; k : integer) ;
   (* one procedure per production *)
begin (* pproc_p *)
   case k of
      0 : ... (* actions up to the first detach *)
      ...     (* successive segments *)
   end ;
end; (* pproc_p *)
b) General structure of a production procedure
procedure sproc_s (t : tree_pointer; k : integer) ;
   (* one procedure per symbol *)
begin (* sproc_s *)
   case t^.s_p of
      p : pproc_p (t, k) ; (* one case element per production *)
      ...
   end;
end; (* sproc_s *)
Figure 8.17: Transformation of Coroutines to Procedures
is the procedure corresponding to symbol Xi and k is the segment of that procedure to be ex-
ecuted. Figure 8.18 shows the result of applying the transformation to Figure 8.16. We have
followed the schema closely in constructing this example, but in practice the implementation
can be greatly simplified.
A tabular implementation, in which the stack is explicit, can be derived from Figure 8.17.
It involves a pushdown automaton that walks the structure tree, invoking evaluate in much
the same way that the parsing automata of Chapter 7 invoke parser actions to report connec-
tion points. In each case the automaton communicates with another processor via a sequence
of simple data items. Thus the implementations of the automaton and the communicating
processor are quite distinct, and different techniques may be used to carry them out. The
number of actions is usually very large, and when deciding how to handle them one must take
account of any restrictions imposed by the implementation language and its compiler.
Figure 8.19 shows how the pushdown automaton is implemented. Each entry in the table
corresponds to an element of some algorithm and there is an auxiliary function, segment ,
such that segment (k, p) is the index of the first entry for the kth segment of the algorithm
for production p. If the element corresponds to Xi.a then it specifies the computation in
some appropriate manner (perhaps as a case index or procedure address); otherwise it simply
contains the pair of integers defining the visit. Because the selectors for a visit must be
extracted from the table, rather than being built into the procedure, the tree node must be
represented as shown in Figure 8.19b.
type
   tree_pointer = ^tree_node ;
   tree_node = record
      case symbols of
         expression :
            (expression_environment : environment ;
             expression_primode, expression_postmode : type_specification ;
             case expression_2 : integer of
                2 : (x_2 : array [1..3] of tree_pointer );
                ... );
         name :
            (name_environment : environment ;
             name_primode, name_postmode : type_specification ; ... );
         addop :
            (addop_mode : type_specification ; ... )
   end ;
procedure sproc_expression (t : tree_pointer; k : integer );
begin (* sproc_expression *)
case t^.expression_2 of
2 : pproc_2 (t , k );
end ;
end; (* sproc_expression *)
procedure pproc_2 (t : tree_pointer; k : integer );
begin (* pproc_2 *)
   case k of
      0 : (* construction of subtrees *);
      1 : begin
             t^.x_2[1]^.name_environment := t^.expression_environment ;
             sproc_name (t^.x_2[1], 1 );
             t^.x_2[3]^.name_environment := t^.expression_environment ;
             sproc_name (t^.x_2[3], 1 );
             t^.expression_primode := if ... ;
          end ;
      2 : begin
             t^.x_2[1]^.name_postmode := t^.expression_primode ;
             sproc_name (t^.x_2[1], 2 );
             t^.x_2[2]^.addop_mode := t^.expression_primode ;
             sproc_addop (t^.x_2[2], 1 );
             t^.x_2[3]^.name_postmode := t^.expression_primode ;
             sproc_name (t^.x_2[3], 2 );
             if ... ; (* evaluate the condition *)
          end ;
   end ;
end; (* pproc_2 *)
Simplifications in the general coding procedure are possible for LAG(k), RAG(k) and
AAG(k) grammars. When k = 1 the partition for each X is A1(X) = AI(X), A2(X) =
AS(X), so no intermediate detach operations occur in the coroutines. This, in turn, means
that no case statement is required in the production procedures or in the interpretive model.
For k > 1 there are k + 1 segments in each procedure pproc_p, corresponding to the ini-
tialization and k traversals. It is best to gather together the procedures for each traversal
as though dealing with a grammar for which k = 1, and then run them sequentially. When
parsing by recursive descent, the tree construction, the calculation of intrinsic attributes and
the first tree traversal can be combined with the parsing.
type
   table_entry = record
      case is_computation : boolean of
         true :  (* an attribute computation Xi.a from Rp *)
            (rule : attribute_computation );
         false : (* a visit: segment number and child *)
            (segment_number, child : integer )
   end ;
a) Structure of a table entry
type
tree_pointer = ^tree_node ;
tree_node = record
   production : integer ;
   X : array [1..max_right_hand_side] of tree_pointer
end ;
b) Structure of a tree node
procedure interpret ;
label 1 ;
var t : tree_pointer ;
state , next : integer ;
begin (* interpret *)
t := root_of_the_tree ;
state := segment (0, t^.production );
repeat
next := state + 1;
with table[state] do
if is_computation then evaluate (t , rule )
else if segment_number <> 0 then
begin
stack_push (t , next );
t := t^.X[child] ;
next := segment (segment_number, t^.production );
end
else if stack_empty then goto 1
else stack_pop (t , next );
state := next ;
until false ; (* forever *)
1 : end; (* interpret *)
c) Table interpreter
Figure 8.19: Tabular Implementation of Attribution Algorithms
block or procedure node is given during the processing of the nodes in the subtree, then we
obtain the same environment: First we reach the local definitions in the innermost enclosing
block and, in the same manner, the next outermost, etc. The search of the environment for
a suitable definition thus becomes a search of the local definition lists from inner to outer.
Attributes should often be completely removed from the corresponding nodes and repre-
sented by global variables or linked structures in global storage. We have already noted that
it is usually impossible to retain the entire structure tree in memory. Global storage is used
to guarantee that an attribute accessible by a pointer is not moved to secondary storage with
the corresponding node. Global storage is also useful if the exact size of an attribute cannot
be determined a priori. Finally, global storage has the advantage that it is directly accessible,
without the need to pass pointers as parameters to the evaluation procedures.
If the environment is kept as a global attribute then it is represented by a list of local
definitions belonging to the nested blocks or procedures. In order to be certain that the
`correct' environment is visible at each node we alter the global attribute during the traversal
of the structure tree: When we move to a block or procedure node from its parent, we copy
the local definition set to this environment variable; when we return to the parent we delete
it.
The description in the previous paragraph shows that in reality we are using a global
data structure to describe several related attribute values. This situation usually occurs with
recursive language elements such as blocks. The environment attribute shows the typical
situation for inherited attributes: Upon descent in the tree we alter the attribute value, for
example increasing its size; the corresponding ascent in the tree requires that the previous
state be restored. Sometimes, as in the case of the nesting depth attribute of a LAX block,
restoration is a simple inverse of the computation done on entry to the substructure. Often
there is no inverse, however, and the old value of the attribute must be saved explicitly. (The
environment represents an intermediate situation that we shall consider in Section 9.3.)
By replacing the global variable with a global stack, we can handle such cases directly.
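The following sketch illustrates this save/restore discipline. It is only an illustration under assumed names: env stands for whatever representation the attribute has, and the caller is assumed to compute the new value (for the environment, from the node's local definition set) before descending.

const max_depth = 100;              (* assumed limit on nesting depth *)
type env = ^env_element;            (* stand-in attribute representation *)
     env_element = record next : env end;
var current_env : env;              (* the global attribute *)
    saved : array [1..max_depth] of env;
    top : integer;

procedure enter (new_env : env);    (* called on descent into a block *)
begin
   top := top + 1;
   saved[top] := current_env;       (* save: there may be no inverse *)
   current_env := new_env
end;

procedure leave;                    (* called on the corresponding ascent *)
begin
   current_env := saved[top];       (* restore the previous value *)
   top := top - 1
end;

When the change made on descent does have a simple inverse (as with a nesting depth counter), the stack is unnecessary and the inverse computation replaces the restore.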
Global variables and stacks are also useful for synthesized attributes, and the analysis par-
allels that given above. Here we usually find that attribute values replace each other at suc-
cessive ascents in the tree. An example is the primode computation in a LAX case clause:
Ordered attribute grammars were originated by Kastens [1976, 1980], who used the term
`arranged orderly' to denote a partitioned grammar. OAG is a subclass of ANCAG for which
no decisions about evaluation order are made dynamically; all have been shifted to evaluator
construction time. This means that attribute lifetimes can be determined easily, and the
optimizations discussed in Section 8.3.2 can be applied automatically: In a semantic analyzer
for Pascal, constructed automatically from an ALADIN description by the GAG [Kastens
et al., 1982] system, attributes occupied only about 20% of the total structure tree storage.
Lewis et al. [1974] studied the problem of evaluating all attributes during a single depth-
first, left-to-right traversal of the structure tree. Making no use of the local acyclicity of
DDP(p), they derived the first three conditions we stated in Theorem 8.18. The same con-
ditions were deduced independently by Bochmann [1976], who went on to point out that
dependencies satisfying the fourth condition of Theorem 8.18 are allowed if the relationship
NDDP(p) is used in place of DDP(p). There is no real need for this substitution, however,
because if DDP(p) is locally acyclic then the dependency Xi.a → Xj.b immediately rules out
Xj.b → Xi.a. Thus dependencies satisfying the fourth condition of Theorem 8.18 cannot lead
to any problem in left-to-right evaluation. Since local acyclicity is a necessary condition for
well-definedness, this assumption does not result in any loss of generality.
LAG(k) conditions similar to those of Theorem 8.20 were also stated by Bochmann [1976].
Again, he did not make use of local acyclicity to obtain the last condition of our result.
Systems based upon LAG(k) grammars have been developed at the Université de Montréal
[Bochmann and Lecarme, 1974] and the Technische Universität München [Giegerich,
1979].
The theoretical underpinnings of the latter system are described by Ripken [1977], Wil-
helm [1977] and Ganzinger [1978]. Wilhelm's work combines tree transformation with
attribution.
Alternating-evaluable grammars were introduced by Jazayeri and Walter [1975] as a
generalization of Bochmann's work. Their algorithm for testing the AAG(k) condition does
not provide precise criteria analogous to those of Theorem 8.18, but rather uses specifications
such as `occur before [the current candidate] in the present pass' to convey the basic idea. A
group at the University of Helsinki developed a compiler generator based upon this form of
grammar [Räihä and Saarinen, 1977; Räihä et al., 1978].
Asbrock [1979] and Pozefsky [1979] consider the question of attribute overlap mini-
mization in more detail.
Jazayeri and Pozefsky [1977] and Pozefsky [1979] give a completely different method
of representing a structure tree and evaluating a multi-pass attribute grammar. They propose
that the parser create k sequential files Di such that Di contains the sequence of attribution
rules with parameters for pass i of the evaluation. Thus Di contains, in sequential form,
the entire structure of the tree; only the attribute values, arbitrarily arranged and without
pointers to subnodes, are retained in memory. Pozefsky [1979] also considers the question
of whether the evaluation of a multi-pass grammar can be arranged to permit overlaying of
the attributes in memory.
Exercises
8.1 Write an attribute grammar describing a LAX basic symbol as an identifier , integer
or floating point . (Section A.1 describes these basic symbols.) Your grammar should
compute the intrinsic attributes discussed in Section 4.1.1 for each basic symbol (with
the exception of location) as synthesized attributes. Use no intrinsic attributes in your
grammar. Be sure to invoke the appropriate symbol and constant table operations
during your computation.
8.2 [Banatre et al., 1979] Write a module for a given well-defined attribute grammar
(G, A, R, B) that will build the attributed structure tree of a sentence of L(G). The
interface for the module must provide creation, access and assignment operations as
discussed in Section 4.1.2. The creation and assignment operations will be invoked by
parser actions to build the structure tree and set intrinsic attribute values; the access
operation will be invoked by other modules to examine the structure of the tree and
attribute values of the nodes. Within the module, access and assignment operations are
used to implement attribution rules. You may assume that all invocations of creation
and assignment operations from outside the module will precede any invocation of an
access operation from outside. Invocations from within the module must, of course, be
scheduled according to the dependencies of the attribute grammar. You may provide
an additional operation to be invoked from outside the module to indicate the end of
the sequence of external creation and assignment invocations.
8.3 Consider the following attribute grammar:
rule Z ::= s X.
attribution
   X.a ← X.c ;
   X.b ← X.a ;
rule Z ::= t X.
attribution
   X.b ← X.d ;
   X.a ← X.b ;
rule X ::= u.
attribution
   X.d ← 1;
   X.c ← X.d ;
rule X ::= v.
attribution
   X.c ← 2;
   X.d ← X.c ;
(a) Show that this grammar is partitionable using the admissible partition A1(X) =
{c, d}, A2(X) = {a, b}, A3(X) = {}.
(b) Compute IDP(p) and IDS(X) replacing NDDP(p) by DDP(p) in Defini-
tion 8.12. Explain why the results are cyclic.
(c) Modify the grammar to make IDP(p) and IDS(X) acyclic under the modification
of Definition 8.12 postulated in (b).
(d) Justify the use of NDDP(p) in Definition 8.12 in terms of the modification of (c).
8.4 Compute IDP and IDS for all p and X in the grammar of Figure 8.1. Apply construc-
tion 8.16, obtaining a partition (different from that given at the end of Section 8.2.1),
and verify that Theorem 8.13 is satisfied. Compute DP for all p, and verify that
Theorem 8.15 is satisfied.
8.5 Show that a partitionable grammar that is not ordered can be made into an ordered
grammar by adding suitable `artificial dependencies' X.a → X.b to some IDS(X).
(In other words, the gap between partitionable and ordered grammars can always be
bridged by hand.)
8.6 Define a procedure Evaluate_P for each production of an LAG(1) grammar such that
all attributes of a structure tree can be evaluated by applying Evaluate_Z (where Z
is the production defining the axiom) to the root.
8.7 A right-to-left attribute grammar may have both inherited and synthesized attributes.
All of the attribute values can be obtained in some number of depth-first, right-to-left
traversals of the structure tree. State a formal definition for RAG(k) analogous to
Definition 8.19 and prove a theorem analogous to Theorem 8.20.
8.8 [Jazayeri and Walter, 1975] Define the class of alternating evaluable attribute gram-
mars AAG(k) formally, state the condition they must satisfy, and give an analysis pro-
cedure for verifying this condition. (Hint: Proceed as for LAG(2k), but make some of
the conditions dependent upon whether the traversal number is odd or even.)
8.9 Extend the basic definitions for multi-pass attribute grammars to follow the hybrid
linearization strategy of Figure 4.4d: Synthesized attributes can be evaluated not only
at the last visit to a node but also after the visit to the ith subnode, 1 ≤ i ≤ n, or
even prior to the first subnode visit (i = 0). How does this change the procedure
determine_traversals?
8.10 Show that the LAG(k), RAG(k) or AAG(k) condition can be violated by a partitionable
attribute grammar only when a syntactic rule leads to recursion.
8.11 Complete the class definitions of Figure 8.16 and fill in the remaining details to obtain
a complete program that parses an assignment statement by recursive descent and then
computes the attributes. If you do not have access to SIMULA, convert the schema
into MODULA2, Ada or some other language providing coroutines or processes.
8.12 Under what conditions will the tabular implementation of an evaluator for a partitioned
attribute grammar require less space than the coroutine implementation?
8.13 Give detailed schemata similar to Figure 8.17 for LAG(k) and AAG(k) evaluators,
along the lines sketched at the end of Section 8.3.1.
8.14 Consider the implementation strategies for attribution algorithms exemplified by Fig-
ures 8.17 and 8.19.
(a) Explain why the tree node of Figure 8.19b is less space-efficient than that of
Figure 8.17a.
(b) Show that, by coding the interpreter of Figure 8.19c in assembly language and
assigning appropriate values to the child field of Figure 8.19a, it is possible to use
the tree node of Figure 8.17a and also avoid the need for the sproc_s procedures
of Figure 8.17c.
8.15 Modify Figure 8.1 by replacing name with expression everywhere, and changing the
second rule to expression ::= '(' expression addop expression ')'. Consider an
interpretive implementation of the attribution algorithms that follows the model of
Exercise 8.16.
(a) Show the memory layout of every possible node.
(b) Define another rule, addop ::= '-', with a suitable attribution procedure. What
nodes are affected by this change, and how?
(c) Show that the addop node can be incorporated into the expression node without
changing the attribution procedures for addop . What is the minimum change
necessary to the interpreter and the attribution procedure for expression ? (Hint:
Introduce a second interpretation for ci,j.)
Chapter 9
Semantic Analysis
Semantic analysis determines the properties of a program that are classed as static semantics
(Section 2.1.1), and verifies the corresponding context conditions, the consistency of these
properties.
We have already alluded to all of the tasks of semantic analysis. The first is name anal-
ysis, finding the definition valid at each use of an identifier. Based upon this information,
operator identification and type checking determine the operand types and verify that they
are allowable for the given operator. The terms `operator' and `operand' are used here in
their broadest sense: Assignment is an operator whether the language definition treats it as
such or not; we also speak of procedure parameter transmission and block end (end of extent)
as operations.
Section 9.1 is devoted to developing a formal specification of the source language from
which analysis algorithms can be mechanically generated by the techniques of Chapters 5-8.
Our goal for the specification is clarity, so that we can convince ourselves of its correctness.
This is an important point, because the correspondence between the specification and the
given source language cannot be checked formally. In the interest of clarity, we often use
impractically inefficient descriptions that give the effect of auxiliary functions, but do not
reflect their actual implementation. Section 9.2 discusses the practical implementation of
these auxiliary functions by modules.
advantage of using attribute grammars (or some other formal description tool such as denota-
tional semantics) lies in the fact that one has a comprehensive and unified specification. This
ensures that the parsing grammar, structure tree and semantic analysis `fit together' without
interface problems.
Development of an attribute grammar consists of the following interdependent steps:
Development of the context-free syntax.
Determination of the attributes and specication of their types.
Development of the attribution rules.
Formulation of the auxiliary functions.
Three major aspects of semantic analysis described via attribution are scope and name
analysis, types and type checking, and operator identification in expressions. With a few
exceptions, such as the requirement for distinct case labels in a case clause (Section A.4.6),
all of the static semantic rules of LAX fall into these classes. Sections 9.1.1 to 9.1.4 examine
the relevant attribution rules in detail.
Many of the attribution rules in a typical attribute grammar are simple assignments. To
reduce the number of such assignments that must be written explicitly, we use the following
conventions: A simple assignment to a synthesized attribute of the left-hand side of a pro-
duction may be omitted when there is exactly one symbol on the right-hand side that has
a synthesized attribute with the same name. Similarly, simple assignments of inherited at-
tributes of the left-hand side to same-named inherited attributes of any number of right-hand
side symbols may be omitted. In important cases we shall write these (semantic) transfers
for emphasis. (Attribute grammar specification languages such as ALADIN [Kastens et al.,
1982] contain even more far-reaching conventions.)
We assume for every record type R used to describe attributes the existence of a function
N_R whose parameters correspond to the fields of the record. This function creates a new
record of type R and sets its fields to the parameter values. Further, we may define a list of
objects by records of the form:
type
   t_list = ^t_list_element ;
   t_list_element = record
      first : t ; rest : t_list end;
If e is an object of type t then we shall also regard e as a single element of type
t_list wherever the context requires this interpretation. We write l1 & l2 to indicate
concatenation of two lists, and hence e & l describes addition of the single element e to the
front of the list l. `Value semantics' are assumed for list assignment: A copy of the entire
list is made and this copy becomes the value of the attribute on the left of the arrow.
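As an illustration only, these conventions might be realized by operations like the following Pascal sketch; the stand-in element type t and the names single, concat and copy_list are our inventions, not part of the specification.

type t = integer;                   (* stand-in element type *)
     t_list = ^t_list_element;
     t_list_element = record first : t; rest : t_list end;

function single (e : t) : t_list;   (* e regarded as a one-element list *)
var l : t_list;
begin
   new (l); l^.first := e; l^.rest := nil; single := l
end;

function concat (l1, l2 : t_list) : t_list;   (* l1 & l2 *)
var h : t_list;
begin
   if l1 = nil then concat := l2
   else begin
      new (h);                       (* copy the elements of l1 *)
      h^.first := l1^.first;
      h^.rest := concat (l1^.rest, l2);
      concat := h
   end
end;

function copy_list (l : t_list) : t_list;
begin
   copy_list := concat (l, nil)      (* a complete copy of l *)
end;

With these operations, e & l corresponds to concat (single (e), l), and the value semantics of an attribute assignment corresponds to storing copy_list (...) rather than the pointer itself.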
9.1.1 Scope and Name Analysis
The scope of identifiers is specified in most languages by the hierarchical structure of the
program. In block structured languages the scopes are nested. Languages like FORTRAN
have only a restricted number of levels in the hierarchy (level 1 contains the subprogram and
COMMON names, level 2 the local identifiers of a subprogram including statement numbers).
Further considerations are the use of implicit definition (FORTRAN), the admissibility (AL-
GOL 60) or inadmissibility (LIS) of new definitions in inner blocks for identifiers declared in
outer blocks, and the restriction of scope to the portion of the block following the definition
(C). We shall consider the special properties of field selectors in Section 9.1.3.
Every definition of an identifier is represented in the compiler by a variant record. The
types of Figure 9.1a suffice for LAX; different variants would be required for other languages.
type
   definition_class = (
      object_definition ,   (* Section A.3.1 *)
      type_definition ,     (* Section A.3.1 *)
      label_definition ,    (* Section A.2.6 *)
      unknown_definition ); (* Undefined identifier *)
   definition = record
      uid : integer ;       (* Discussed in Section 9.1.3 *)
      ident : symbol ;      (* Identifier being defined *)
      case k : definition_class of
         object_definition : (object_type : mode );  (* mode is discussed *)
         type_definition : (defined_type : mode );   (* in Section 9.1.2 *)
         label_definition ,
         unknown_definition : ()
   end ;
a) The attributes of an identifier
definition_table = ^dt_element ;
dt_element = record first : definition ; rest : definition_table end;
b) Type of the environment attribute
rule name ::= identifier_use .
condition
identifier_use.corresponding_definition.k = object_definition ;
const i = 17;
type t = ... ; (* First declaration of t *)
procedure p ;
type
   j = i; (* Use of i illegal here *)
   i = 1; (* This makes the previous line illegal *)
type
   tt = ^t ; (* Refers to second declaration of t *)
   t = ... ; (* Second declaration of t *)
declaration of tt is correct and identifies the type whose declaration appears on the next
line. This problem can be solved by a variant of the standard technique for dealing with
declarations in a one-pass ALGOL 60 compiler (Exercise 9.5).
9.1.2 Types
A type specifies the possible operations on an entity and the coercions that can be applied
to it. During semantic analysis this information is used to identify operators and verify
the compatibility of constructs with their environment. We shall concentrate on languages
with manifest types. Languages with latent types, in which type checking and operator
identification occur during execution, are treated in the same manner except that these tasks
are deferred.
In order to perform the tasks outlined in the previous paragraph, every structure tree
node that represents a value must have an attribute describing its type. These attributes
are usually tree-valued, and are built of linked records. For uniformity, the compiler writer
should define a single record format to be used in building all of them. The record format
must therefore be capable of representing the type of any value that could appear in a source
program, regardless of whether the language definition explicitly describes that value as being
typed. For example, the record format used in a LAX compiler must be capable of representing
the type of nil because nil can appear as a value. Section A.3.1 does not describe nil as
having a specific type, but says that it `denotes a value of type ref t, for arbitrary t'.
Figure 9.7 defines a record that can be used to build attributes describing LAX types.
Type class bad_type is used to indicate that errors have made it impossible to determine the
proper type. The type itself must be retained, however, since all attributes must be assigned
values during semantic analysis. Nil_type is the type of the predefined identifier nil. We
also need a special mechanism for describing the result type of a proper procedure. Void_type
specifies this case, and in fact is used whenever a result is to be discarded.
For languages like ALGOL 60 and FORTRAN, which have only a fixed number of types,
an enumeration similar to type_class serves to represent all types. Array types must also
specify the number of dimensions, but the element type can be subsumed into the enumeration
(e.g. integer_array_type or real_array_type). Pascal requires additional specifications for
the index bounds; in LAX the bounds are expressions whose values do not belong to the static
semantics, as illustrated by the rules of Figure 9.8.
Figure 9.9 shows how procedure types are constructed in LAX. (Bad_symbol represents a
nonexistent identifier.) Because parameter transmission is always by value (reference param-
eters are implemented by passing a ref value as discussed in Section 2.5.3) it is not necessary
to give a parameter transmission mechanism. In Pascal or ALGOL 60, however, the trans-
mission mechanism must be included for each parameter. For a language like Ada, in which
type
   type_class = (
      bad_type , nil_type , void_type , bool_type , int_type , real_type ,
      ref_type ,
      arr_type ,
      rec_type ,
      proc_type ,
      unidentified_type , (* See Section 9.1.3 *)
      identified_type );  (* See Section 9.1.3 *)
   mode = record
      case k : type_class of
         bad_type , nil_type , void_type , bool_type ,
         int_type , real_type : ();
         ref_type : (target : ^mode );
         arr_type : (dimensions : integer ; element : ^mode );
         rec_type : (fields : definition_table );
         proc_type : (parameters : definition_table ; result : ^mode );
         unidentified_type : (identifier : symbol );
         identified_type : (definition : integer )
   end ;
Figure 9.7: Representation of LAX Types
keyword association of arguments and parameters is possible, the identifiers must be retained
also. We retain the parameter identifiers, even though this is not required in LAX, to reduce
the number of attributes for the common case of a procedure declaration (A.3.0.8). Here we
can use the procedure type attribute both to validate the type compatibility and to provide
the parameter definitions. If we were to remove the parameter identifiers from the procedure
type this would not be possible.
When types and definitions are represented by attributes, the complete set of declarations
(other than procedure declarations) can, in principle, be deleted from the structure tree
to avoid duplicating information both as attributes and as subtrees of the structure tree.
Actually, however, this compression of the representation should only be carried out under
extreme storage constraints; normally both representations should be retained. The main
reason is that expressions (like dynamic array bounds) appearing within declarations cannot
be abstracted as attributes because they are not evaluated until the program is executed.
Context-sensitive properties of types lead to several relations that can be expressed as
recursive functions over types (objects of type mode ). These basic relations are:
Equivalent : Two types t and t' are semantically equivalent.
Compatible : Usually an asymmetric relation, in which an object of type t can be used in
place of an object of type t' .
Coercible : A type t is coercible to a type t' if it is either compatible with t' or can be
converted to t' by a sequence of coercions.
Type equivalence is defined in Section A.3.1 for LAX; this definition is embodied in the
procedure type_equivalent of Figure 9.10. Type_equivalent must be used in all cases
where two types should be compared. The direct comparison t1 = t2 may not yield true for
equivalent composite types because the pointers contained in the type records may address
equivalent types represented by different records.
The test for equivalence of type identifiers is for the identity of the type declarations
rather than for the equivalence of types they declare. This reflects the name equivalence
rule of Section A.3.1. If structural equivalence is required, as in ALGOL 68, then we must
compare the declared types instead. A simple implementation of this comparison leads to
infinite recursion for types containing pointers to themselves. The recursion can, however, be
stopped as soon as we attempt to compare two types whose comparison has been begun but
has not yet terminated. During comparison we therefore hold such pairs in a stack. Since the
only types that can participate in infinite recursion are those of class identified_type, we
enter pairs of identified_type types into the stack when we begin to compare them. The
next pair is checked against the stack before beginning their comparison; if the pair is found
then they are considered to be equivalent and no further comparison of them is required. (If
they are not equivalent, this will be detected by the first comparison, the one already on the
stack.)
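A sketch of this technique, using the mode record of Figure 9.7: the pair stack operations (push_pair, pair_on_stack, pop_pair), declared_type (which fetches the type declared under a definition's unique number) and components_equiv are hypothetical helpers, not part of Figure 9.10, and the asymmetric case where only one operand is an identified_type is omitted.

function struct_equivalent (t1, t2 : mode) : boolean;
begin
   if (t1.k = identified_type) and (t2.k = identified_type) then begin
      if pair_on_stack (t1.definition, t2.definition) then
         (* comparison of this pair already in progress: assume equal;
            a genuine difference is found by the comparison on the stack *)
         struct_equivalent := true
      else begin
         push_pair (t1.definition, t2.definition);
         struct_equivalent :=
            struct_equivalent (declared_type (t1.definition),
                               declared_type (t2.definition));
         pop_pair
      end
   end
   else if t1.k <> t2.k then struct_equivalent := false
   else if t1.k = ref_type then
      struct_equivalent := struct_equivalent (t1.target^, t2.target^)
   else if t1.k = arr_type then
      struct_equivalent := (t1.dimensions = t2.dimensions)
         and struct_equivalent (t1.element^, t2.element^)
   else if (t1.k = rec_type) or (t1.k = proc_type) then
      (* compare the field or parameter definition lists elementwise *)
      struct_equivalent := components_equiv (t1, t2)
   else
      struct_equivalent := true   (* basic types: equal class suffices *)
end;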
rule type_specification ::= 'ref' type_specification .
attribution
   type_specification[1].repr ← N_mode (ref_type,
      type_specification[2].repr );
The treatment of array variables in Figure 9.12 reflects the requirements of Section A.3.2.
We construct the array type based only on the dimensionality and element type. The bounds
must be integer expressions, but they are to be evaluated at execution time.
Type declarations introduce apparent circularities into the declaration process: The def-
inition of an identifier must be known in order to define that identifier. One obvious example,
the declaration type t = record x : real ; p : ref t end, was mentioned in Section 8.1.
Another is the fact that the analysis process discussed in Section 9.1.1 assumes we can con-
struct definitions for all identifiers in a range and then form an environment for that range.
Unfortunately the definition of a variable identifier includes its type, which might be specified
by a type identifier declared in the same range. Hence the environment must be available to
obtain the type.
We solve the problem in three steps, as shown in Figure 9.13, using the unidenti-
fied_type and identified_type variants of mode:
1. Collect all of the type declarations of a range into one attribute, of type defini-
tion_table. Any type identifiers occurring in the corresponding types are not yet
identified, but are given by the unidentified_type variant.
2. As soon as step (1) has been completed, transform the entire attribute to an-
other definition_table in which each unidentified_type has been replaced by an
identified_type that identifies the proper definition. This transformation uses the
environment inherited by the range as well as the information present in the type dec-
larations.
3. Incorporate the newly-created definition_table into the range's environment, and
then process all of the remaining declarations (none of which are type declarations).
Complete_env is a recursive function that traverses the definitions seeking unidentified
types. Whenever one is found, identify_type (Figure 9.14) is used to obtain the current
definition of the type identifier. Note that identify_type must use a unique representation
of the definition, not the definition itself, corresponding to the type identifier. The reason
is that, if types involve recursive references, we cannot construct any of the definitions until
we have constructed all of them! (Remember that attributes are not variables, so it is not
possible to construct an `empty' definition and then fill it in later.)
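Since Figure 9.14 itself is not reproduced here, the following sketch indicates one possible shape for the transformation. It follows the N_mode constructor convention introduced earlier; uid_of (an environment lookup yielding the unique number of a type definition) and identify_fields (which applies the transformation inside component definitions) are hypothetical helpers of ours.

function identify_type (t : mode; env : definition_table) : mode;
begin
   case t.k of
      unidentified_type :
         (* store only the unique number of the definition; with recursive
            type declarations the definitions themselves cannot be built
            one at a time *)
         identify_type :=
            N_mode (identified_type, uid_of (t.identifier, env));
      ref_type :
         identify_type := N_mode (ref_type, identify_type (t.target^, env));
      arr_type :
         identify_type := N_mode (arr_type, t.dimensions,
                                  identify_type (t.element^, env));
      rec_type, proc_type :
         (* transform the types inside the component definitions *)
         identify_type := identify_fields (t, env);
      bad_type, nil_type, void_type, bool_type, int_type, real_type,
      identified_type :
         identify_type := t   (* nothing to identify *)
   end
end;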
If not voided, the result has the same base type (type after all references and procedures
have been removed) as one of the operands.
If t1 is coercible to the base type of t2 but not to t2 itself, the result type is a dereferencing
and/or deproceduring of t2 .
If LAX types t1 and t2 are coerced to an a posteriori type t0, then the type balance(t1, t2)
always appears as an intermediate step. This may not be true in other languages, however. In
ALGOL 68, for example, balance(integer, real) = real but both types can be coerced
to union(integer, real) and in this case integer is not coerced to real first.
Figure 9.16 illustrates the use of balancing. In addition to the attributes primode and
postmode, this example uses label_values (synthesized, type case_selectors). Postmode
is simply passed through from top to bottom, so we follow our convention of not writing these
transfers explicitly. Label_values collects the values of all case labels into a list so we can
check that no label has occurred more than once (Section A.4.6).
Note that there is no condition checking coercibility of the resulting a priori type of the
case clause to the a posteriori type. Similarly, the a priori type of the selecting expression is
not checked against its a posteriori type in these rules. Such tests appear only in those rules
where the a priori type is determined by considerations other than balancing or transfer from
adjacent nodes.
Figure 9.17 illustrates some typical attribution rules for primode and postmode in ex-
pressions. Table A.2 requires that the left operand of an assignment be a reference, and
Section A.4.2 permits only dereferencing coercions of the left operand. Thus the assignment
rule invokes deproc (Figure 9.18) to obtain an a posteriori type for the name. Note that there
is no guarantee
that the type obtained actually is a reference, so additional checks are needed. Coercible
(Figure 9.11) is invoked to verify that the a priori type of the assignment itself can be coerced
to the a posteriori type required by the context in which the assignment appears. As can be
seen from the remainder of Figure 9.17, this check is made every time an object is created.
Assignment is the only dyadic operator in Table A.2 whose left and right operands have
different types. In all other cases, the types of the operands must be the same. The attribution
type
   case_selectors = ^cs_element ;
   cs_element = record first : integer ; rest : case_selectors end;
a) Type of label_values
rule case_clause ::=
   'case' expression 'of' cases 'else' statement_list 'end'.
attribution
   case_clause.primode ← balance (cases.primode ,
      statement_list.primode );
   expression.postmode ← N_mode (int_type );
condition
   values_unambiguous (cases.label_values );
rules for comparison show how balance can be used in this case to obtain a candidate
operand type. The two rules for eqop illustrate placement of additional requirements upon
this candidate.
The attribution for a simple name sets the a priori type to the type specified by the
identifier's definition, and must also verify (via coercible) that the a priori type satisfies
the requirements of the context as specified by the a posteriori type. Field selection is a bit
trickier. Section A.4.4 states that the name preceding the dot may yield either an object or
a reference to an object. This requirement, which also holds for index selection, is embodied
in one_ref (Figure 9.18). Note that the environment in which the field identifier is sought is
that of the record type definition, not the one in which the field selection appears. We must
therefore write the transfer of the environment attribute explicitly. Finally, the type yielded
by the field selection is a reference if and only if the object yielded by the name to the left of
the dot was a reference (Section A.4.4).
Figure 9.19 shows how the field definitions of the record are obtained. Section A.3 requires
that every record type be given a name. The declaration process described in Figures 9.13
and 9.14 guarantees that if this name is associated with an identified_type, the type
definition will actually be in the current environment. Moreover, the type definition cannot
specify anything but a record. Thus record_env need not verify these conditions.
In most programming languages the specification of the operator and the a posteriori types
of the operands uniquely determines the operation to be carried out, but usually no operation
attribute appears in the language description itself. The reason is that semantic analysis does
Operator identification for Ada depends not only upon the a priori types of the operands,
but also upon the a posteriori type of the result. There is no coercion, so the a priori and
a posteriori types must be compatible, but on the other hand the constant 2 (for example)
could have any of the types `short integer', `integer' and `long integer'. Thus both the
operand types and the result types must be determined by analysis of the tree.
Each operand and result is given one inherited and one synthesized attribute, each of
which is a set of types. We begin at the leaves of the tree and compute the possible (a priori)
types of each operand. Moving up the tree, we specify the possible operations and result types
based upon the possible combinations of operand types and the operator indication. Upon
arriving at the root of the tree for the expression we have a synthesized attribute for every
node giving the possible types for the value of this node. Moving down the tree, these type
sets are now further restricted: An inherited attribute, a subset of the previous synthesized
attribute, is computed for each node. It species the set of types permitted by the use of this
value as an operand in operations further up the tree. At the beginning of the descent, the
previously-computed set of possible result types at the root is used as the inherited attribute
of the root. If this process leads to a unique type for every node of the tree, i.e. if the inherited
attribute is always a singleton set, then the operations are all specified; otherwise at least one
operator (and hence the program) is semantically ambiguous and hence illegal.
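To make the process concrete, the two traversals might be sketched in Pascal as follows. This is only a sketch under assumed names: type_key, type_set, op_indication, a_priori_types, possible_results and operand_types are illustrative inventions standing for the attribution functions, and the node representation is simplified to a binary tree.

type
  type_key = (t_short_int , t_int , t_long_int );  (* assumed universe of types *)
  type_set = set of type_key ;
  node_ptr = ^node ;
  node = record
    is_leaf : boolean ;
    op : op_indication ;
    left , right : node_ptr ;
    up , down : type_set  (* synthesized and inherited type sets *)
  end;

procedure compute_up (n : node_ptr );
  (* bottom-up pass: possible result types from possible operand types *)
begin
  if n^.is_leaf then n^.up := a_priori_types (n )
  else begin
    compute_up (n^.left ); compute_up (n^.right );
    n^.up := possible_results (n^.op , n^.left^.up , n^.right^.up )
  end
end;

procedure compute_down (n : node_ptr ; context : type_set );
  (* top-down pass: restrict each set by its context; a singleton
     identifies the operation, anything else marks an error *)
begin
  n^.down := n^.up * context ;  (* set intersection *)
  if not n^.is_leaf then begin
    compute_down (n^.left , operand_types (n^.op , 1, n^.down ));
    compute_down (n^.right , operand_types (n^.op , 2, n^.down ))
  end
end;

At the root, compute_down is started with the set of result types that compute_up delivered for the root itself.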
Because LAX is an expression-oriented language, statements and statement-like constructs
(statement list , iteration , loop , etc.) also have primode and postmode attributes.
Most rules involving these constructs simply transfer those attributes. Figure 9.20 shows
rules that embody the conditions given in Sections A.2.4 through A.2.6.
rule iteration ::=
    'for' identifier 'from' expression 'to' expression loop .
  attribution
    iteration.primode ← N_mode (void_type );
    expression[1].postmode ← N_mode (int_type );
    expression[2].postmode ← N_mode (int_type );
    loop.environment ←
      N_definition (gennum , identifier.sym , object_definition ,
        N_mode (int_type )) &
      iteration.environment ;
    loop.postmode ← N_mode (void_type );
  condition
    iteration.postmode.k = void_type ;
One of the most common attributes in the structure tree is the environment, which allows
us to determine the meaning of an identifier at a given point in the program. In the simplest
case, for example in several machine-oriented languages, each identifier has exactly one defi-
nition in the program. The definition entry can then be reached directly via a pointer in the
symbol table. In fact, the symbol and definition table can be integrated into a single table in
this case.
[Figure 9.21 (diagram): a range header heads the linear list of possession relations for the range; each relation names a symbol and an entity, and each symbol's stack points at its current possession relation.]
type
  one = record f : integer ; g : ^two end;
  two = record f : boolean ; h : ^one end;
var
  j : ^one ;
...
with j^ do
  begin
  ...
  with g^ do
    begin
    ...
    with h^ do
      begin
      ...
      end
    end
  end;
Figure 9.22: Self-Nesting Ranges
When a range is entered, the stack for each identifier defined in the range must be pushed
down and an entry describing the definition valid in this range placed on top. Conversely,
the stack for each identifier defined in a range must be popped when leaving that range. To
simplify the updating, we represent the range by a linear list of elements specifying a symbol
table entry and a corresponding definition as shown at the top of Figure 9.21. This gives
constant-time access to the stacks to be pushed or popped, and means that the amount of
time required to enter or leave a range is linear in the number of identifiers having definitions
in it.
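In Pascal the updating might be sketched as follows; the list structure mirrors Figure 9.21, but the field names first_possession and stack, and the helpers push and pop, are assumptions made for illustration (each symbol table entry is taken here to carry its own stack).

procedure enter_range (r : range_header_ptr );
var p : possession_ptr ;
begin
  p := r^.first_possession ;
  while p <> nil do begin
    (* stack a pointer to the range list entry, not the entry itself *)
    push (p^.possessing_symbol^.stack , p );
    p := p^.next
  end
end;

procedure leave_range (r : range_header_ptr );
var p : possession_ptr ;
begin
  p := r^.first_possession ;
  while p <> nil do begin
    pop (p^.possessing_symbol^.stack );
    p := p^.next
  end
end;

Both operations clearly take time linear in the number of possession relations of the range.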
We use a pointer to the definition rather than the definition itself in the range list because
many identifiers in different ranges may refer to the same definition. (For example, in Pascal
many type identifiers might refer to the same complex record type.) By using a pointer we
avoid having to store multiple copies of the definition itself, and also we simplify equality
tests on definitions.
We stack a pointer to the appropriate range list entry instead of stacking the range list
entry itself because it is possible to enter a range and then enter it again before leaving it.
(Figure 9.22 is a Pascal fragment that has this property. The statement with j^ enters the
range of the record type one ; the range will be left at the end of that statement. However,
the nested statement with h^ also enters the same range!) When a range is entered twice
without being left, its definitions are stacked twice. If the (single) range list entry were placed
on the stack twice, a cycle would be created and the compiler would fail.
Finally, we stack a pointer to the range list entry rather than a pointer to the definition
to cater for languages (such as COBOL and PL/1) that allow partial qualification: In a field
selection the specification of the containing record may be omitted if it can be determined
unambiguously. (This assumes that, in contrast to LAX, exactly one object exists for each
record type. In other words, the concepts of record and record type merge.)
Figure 9.23 illustrates the problem of partial qualification, using an example from PL/1.
Each qualified name must include sufficient identifiers to resolve any ambiguity within a single
block; the reference is unambiguous if either or both of the following conditions hold:
The reference gives a valid qualification for exactly one declaration.
The reference gives the complete qualification for exactly one declaration.
A: PROCEDURE;
   DECLARE
     1 W,
     ...;
   B: PROCEDURE;
      DECLARE
        P,
        1 Q,
          2 R,
            3 Z,
          2 X,
            3 Y,
            3 Z,
            3 Q;
      Y = R.Z;        /* Q.X.Y from B, Q.R.Z from B */
      W = Q, BY NAME; /* W from A, major Q from B */
      C: PROCEDURE;
         DECLARE Y,
           1 R,
             2 Z;
         Z = Q.Y;         /* R.Z from C, Q.X.Y from B */
         X = R, BY NAME;  /* Q.X from B, R from C */
      END C;
   END B;
END A;
Figure 9.23: Partial Qualification
Most of the references in Figure 9.23 are unambiguous because the first of these conditions
holds. The Q in W = Q, however, gives a valid qualification for either the major structure or
the field Q.X.Q; it is unambiguous because it gives the complete qualification of the major
structure. References Z and Q.Z in procedure B would be ambiguous.
In order to properly analyze Figure 9.23, we must add three items of structural in-
formation to each possession relation in Figure 9.21: The level is the number of identi-
fiers in a fully-qualified reference to the entity possessed. If the level is greater than 1,
containing structure points to the possession relation for the containing structure. In any
case, the range to which the possession belongs must be specified. Figure 9.24 shows the
possession relations for procedure B of Figure 9.23. Note that this range contains two valid
possession relations for Q and two for Z . The symbol stack entries for Z have been included
to show that this results in two stack entries for the same range.
A reference is represented by an array of symbols. The stack corresponding to the last
of these is scanned, and the test of Figure 9.25 applied to each possession relation. When a
relation satisfying the test is found, no further ranges are tested; any other relations for the
same symbol within that range must be tested, however. If more than one relation in a range
satisfies the test, then the reference is ambiguous unless the level of one of the relations is
equal to the number of symbols in the reference.
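Although Figure 9.25 is not reproduced here, the heart of its test can be sketched in Pascal using the level and containing_structure fields just introduced; the function name and the right-to-left matching strategy are our own illustrative choices.

function valid_qualification (ref : symbol_array ; ref_count : integer ;
                              p : possession_ptr ) : boolean ;
  (* ref[ref_count] selected the symbol stack through which p was found;
     match the remaining symbols, right to left, against the chain of
     containing structures, allowing intermediate levels to be omitted *)
var i : integer ;
begin
  i := ref_count - 1;
  while (i >= 1) and (p^.level > 1) do begin
    p := p^.containing_structure ;
    if p^.possessing_symbol = ref [i ] then i := i - 1
  end;
  valid_qualification := (i = 0)
end;

The complete-qualification condition is then simply the additional test that the level of the relation originally found equals ref_count.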
A definition table module might provide the following operations:
New range () → range : Establish a new range.
Add possession (symbol, definition, range) : Add a possession relation to a given
range.
Enter range (range) : Enter a given range.
[Figure 9.24 (diagram): the range header for procedure B heads possession relations for P, Q, R, Z, X, Y, Z and Q, with levels 1, 1, 2, 3, 2, 3, 3 and 3 respectively; the symbol stack headers are shown for Z, whose stack holds two entries for this range.]
Figure 9.24: Range Specification Including Structure
Leave range : Leave the current range.
Current definition (symbol) → definition : Identify the definition corresponding to
a given identifier at the current point in the program.
Definition in range (symbol, range) → definition : Identify the definition corre-
sponding to a given identifier in a given range.
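Rendered as a collection of Pascal headings, the interface might read as follows; the underscored names and the opaque types range, symbol and definition are illustrative, since the text leaves the representation open.

procedure new_range (var r : range );
procedure add_possession (sym : symbol ; def : definition ; r : range );
procedure enter_range (r : range );
procedure leave_range ;
function current_definition (sym : symbol ) : definition ;
function definition_in_range (sym : symbol ; r : range ) : definition ;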
The first two of these operations are used to build the range lists. The next three have been
discussed in detail above. The last is needed for field selection in languages such as Pascal
and LAX. Recall the treatment of field selection in Figure 9.17. There the environment in
which the field identifier was sought consisted only of the field identifiers defined in the record
yielded by name . This is exactly the function of definition in range . If we were to enter
the range corresponding to the record and then use current definition , we would not
achieve the desired effect. If the identifier sought were not defined in the record's range, but
was defined in an enclosing range, the latter definition would be found!
Unfortunately, definition in range must perform a search. (Actually, the search is
slightly cheaper than the incorrect implementation outlined in the previous paragraph.) It
might linearly search the list of definitions for the range representing the record type. This
technique is advantageous if the number of fields in the record is not too large. Alternatively,
we could associate a list of pairs (record type, pointer to a definition entry for a field with
this selector) with each identifier and search that. This would be advantageous if the number
of record types in which an identifier occurred was, on the average, smaller than the number
of fields in a record.
In many compilers the semantic analysis is not treated as a separate task but as a by-
product of parsing or code generation. The result is generally that the static semantic condi-
tions are not fully verified, so erroneous programs are sometimes accepted as correct. We have
taken the view here that semantic analysis is the fundamental target-independent task of the
compiler, and should be the controlling factor in the development of the analysis module.
type
  possession = record
    range : ^range_header ;
    next : ^possession ;
    possessing_symbol : symbol ;
    possessed_entity : entity ;
    level : integer ;
    containing_structure : ^possession
  end;
  symbol_array = array [1..max_qualifiers ] of symbol ;
Many of the techniques presented here for describing specific language facilities were the
result of experience with attribute grammars for PEARL [DIN, 1980], Pascal [Kastens et al.,
1982], and Ada [Uhl et al., 1982] developed at the Universitat Karlsruhe. The representation
of arbitrarily many types by lists was first discussed in conjunction with ALGOL 68 com-
pilers [Peck, 1971]. Koster [1969] described the recursive algorithm for ALGOL 68 mode
equivalence using this representation.
The attribution process for Ada operator identification sketched in Section 9.1.4 is due
to Persch and his colleagues [Persch et al., 1980]. Baker [1982] has proposed a similar
algorithm that computes attributes containing pointers to the operator nodes that must be
identified. The claimed advantage is that, if the nodes can be accessed randomly, a complete
second traversal becomes unnecessary. Operator identification cannot be considered in
isolation, however. It is not at all clear that a second complete traversal will not be required
by other attribution, giving us the operator identification `for free'. This illustrates the
importance of constructing the complete attribute grammar without regard to number of
traversals, and then processing it to determine the overall evaluation order.
Most authors combine the symbol and definition tables into a single `symbol table' [Gries,
1971; Bauer and Eickel, 1976; Aho and Ullman, 1977]. Separate tables appear in descrip-
tions of multi-pass compilers and serve above all to reduce the main storage requirements
[Naur, 1964]; the literature on ALGOL 68 [Peck, 1971] is an exception. In his description of
a multi-pass compiler for `sequential Pascal', Hartmann [1977] separates the tables both to
reduce the storage requirement and to simplify the compiler structure.
The basic structure of the definition table was developed for ALGOL 60 [Randell and
Russell, 1964; Grau et al., 1967; Gries, 1971]. We have refined this structure to allow
it to handle record types and incompletely-qualified identifiers [Busam, 1971]. An algebraic
specification of a module similar to that sketched at the end of Section 9.2 was given by
Guttag [1975, 1977].
Exercises
9.1 Determine the visibility properties of Pascal labels. Write attribution rules that embody
these properties. Treat the prohibition against jumping into a compound statement as
a restriction on the visibility of the label definition (as opposed to the label declaration,
which appears in the declaration part of the block).
9.2 Write the function current definition (Figure 9.1c).
9.3 Write the function unambiguous (Figure 9.2a).
9.4 Note that Figure 9.5 requires additional information: the implicit type of an identifier.
Check the FORTRAN definition to find out how this information is determined. How
would you make it available in the attribute grammar? Be specific, discussing the role
of the lexical analyzer and parser in the process.
9.5 [Sale, 1979] Give attribution rules and auxiliary functions to verify the definition-before-
use constraint in Pascal. Assume that the environment is being passed along the text,
as illustrated by Figure 9.4.
(a) Add a depth field to the definition record, and provide attribution rules that set
this field to the static nesting depth at which the definition occurred.
(b) Add attribution rules that check the definition depth at each use of an identifier.
Maintain a list of identifiers that have been used at a depth greater than their
definition.
(c) When an identifier is defined, check the list to ensure that the identifier has not
previously been used at a level greater than or equal to the current level when it
was defined at a level less than the current level.
(d) Demonstrate that your rules correctly handle Figure 9.6.
9.6 What extensions to the environment attribute are required to support modules as
defined in MODULA2?
9.7 Extend the representation of LAX types to handle enumerated types and records with
variants, described as in Pascal.
9.8 Develop type representations analogous to Figure 9.7 for FORTRAN, ALGOL 60 and
Ada.
9.9 Modify the procedure type equivalent to handle the following alterations in the LAX
definition:
(a) Structural type equivalence similar to that of ALGOL 68 is specified instead of
the equivalence of A.3.1.
(b) Union types union(t1, ..., tn) similar to those of ALGOL 68. The sequence
of types is arbitrary and union(t1, union(t2, t3)) = union(union(t1, t2), t3) =
union(t1, t2, t3).
(d) Given the modified rules of (c), do any of the attributes you listed in (b) satisfy
the conditions for implementation as global variables? As global stacks? How do
your answers to these questions bear upon the implementation of the definition
table as a package vs. an abstract data type?
9.13 Develop definition tables for BASIC, FORTRAN, COBOL and Pascal.
9.14 Add the use-before-definition check of Exercise 9.5 to the definition table of Figure 9.21.
9.15 Give a detailed explanation of the problems encountered when analyzing Figure 9.22 if
possession relation entries are stacked directly.
9.16 How must a Pascal definition table be set up to handle the with statement? (Hint:
Build a stack of with expressions for each record type.)
9.17 Show the development during compilation of the definition table for the program of
Figure 9.23 by giving a sequence of snapshots.
Chapter 10
Code Generation
The code generator creates a target tree from a structure tree. This task has, in principle,
three subtasks:
Resource allocation: Determine the resources that will be required and used during
execution of instruction sequences. (Since in our case the resources consist primarily of
registers, we shall speak of this as register allocation.)
Execution order determination: Specify the sequence in which the descendants of a node
will be evaluated.
Code selection: Select the final instruction sequence corresponding to the operations
appearing in the structure tree under the mapping discussed in Chapter 3.
In order to produce code that is optimal under a cost criterion such as minimum program
length or minimum execution time, these subtasks must be intertwined and iterated. The
problem is NP-complete even for simple machine architectures, so the cost of generating truly
optimal code must be expected to grow exponentially with the number of structure tree nodes.
In view of the simple form of the expressions that actually occur in programs, however, it is
usually sufficient to employ linear-cost algorithms that do not necessarily produce the
optimum code in all cases.
The approach taken in this chapter is to first map the source-language objects onto the
memory of the target machine. An estimate of register usage is then made, and the execution
order determined on the basis of that estimate. Finally, the behavior of the target machine is
simulated during an execution-order traversal of the structure tree, driving the code selection
and register assignment. The earlier estimate of register usage must guarantee that all register
requirements can actually be met during the final traversal. The code may be suboptimal in
some cases because the final register assignment cannot affect the execution order.
The computation graph discussed in Section 4.1.3 is implicit in the execution-order struc-
ture tree traversal. Chapter 13 will make the computation graph explicit, and discuss opti-
mizing transformations that can be applied to it. If a compiler writer follows the strategies of
Chapter 13, some of the optimization discussed here becomes redundant. Nevertheless, the
three code generation subtasks introduced above remain unchanged.
Section 10.1 shows how the memory map is built up, starting with the storage requirements
for elementary objects given by the implementor in the mapping specification of Section 3.4.
We present the basic register usage estimation process in Section 10.2, and show how addi-
tional attributes can be used to improve the generated code. Target machine simulation and
code selection are covered in Section 10.3.
type
  area = ... ;
  size = ... ;
  location = ... ;
  direction = (up , down );
  strategy = (align , pack );

procedure new_area (d : direction ; s : strategy ; var a : area );
(* Establish a new memory area
   On entry -
     d = growth direction for this area
     s = growth strategy for this area
   On exit -
     a specifies the new area
*)
... ;

procedure add_block (a : area ; s : size ; alignment : integer ;
                     var l : location );
(* Allocate a block in an area
   On entry -
     a specifies the area to which the block is to be added
     s = size of the block
     alignment = alignment of the block
   On exit -
     l = relative location of the first cell of the block
*)
... ;

procedure end_area (a : area ; var s : size ; var alignment : integer );
(* Terminate an area
   On entry -
     a specifies the area to be terminated
   On exit -
     s = size of the resulting block
     alignment = alignment of the resulting block
*)
... ;

procedure mark (a : area );
(* Mark the current growth point of an area *)
... ;

procedure back (a : area );
(* Reset the growth point of an area to the last outstanding mark *)
... ;

procedure combine (a : area );
(* Erase the last outstanding mark in an area and
   reset the growth point to the maximum of all previous growths
*)
... ;

Figure 10.1: Memory Mapping Module
adding two output parameters to both back and combine (Figure 10.1), making their calling
sequences identical to that of end_area .
In areas that will become activation records, storage must be reserved for pointers to
static and dynamic predecessors, plus the return address and possibly a template pointer.
The size and alignment of this information is fixed by the mapping specification, which may
also require space for saving registers and for other working storage. It is usually placed
either at the beginning of the record or between the parameters and local variables. (In
the latter case, the available access paths must permit both negative and positive offsets.)
Finally, it is convenient to leave an activation record area open during the generation of code
for the procedure body, so that compiler-generated temporaries may be added. Only upon
completion of the code selection will the area be closed and the size and alignment of the
activation record finally determined.
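The following fragment sketches how the module of Figure 10.1 might be driven when laying out such an activation record; the growth direction and the constants ptr_size, ptr_align, int_size and int_align are assumptions made for illustration.

var
  ar : area ;
  loc : location ;
  ar_size : size ;
  ar_align : integer ;
begin
  new_area (up , align , ar );                  (* one area per activation record *)
  add_block (ar , ptr_size , ptr_align , loc ); (* static predecessor *)
  add_block (ar , ptr_size , ptr_align , loc ); (* dynamic predecessor *)
  add_block (ar , ptr_size , ptr_align , loc ); (* return address *)
  (* ... blocks for parameters and local variables ... *)
  (* the area stays open during code selection, so that compiler-generated
     temporaries can still be added: *)
  add_block (ar , int_size , int_align , loc );
  end_area (ar , ar_size , ar_align )           (* size and alignment now final *)
end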
In principle, the storage module is invoked at the beginning of code generation to fix
the length, relative address and alignment of all declared objects and types. For languages
like Ada, integration with the semantic analyzer is essential because object size may be
interrogated by the program and must be used in verifying semantic conditions. Even in this
case, however, we must continue to regard the storage module as a part of the synthesis task
of the compiler; only the location of the calls, not the modular decomposition, is changed.
(x + y) / (a*b + c*d)
a) A LAX expression

LE  0,x                LE  2,a
AE  0,y                ME  2,b
LE  2,a                LE  0,c
ME  2,b                ME  0,d
LE  4,c                AER 2,0
ME  4,d                LE  0,x
AER 2,4                AE  0,y
DER 0,2                DER 0,2
(uses 3 registers)     (uses 2 registers)
b) Two possible IBM 370 implementations
Figure 10.2: Dependence of Register Usage on Evaluation Order
Global register allocation begins with values specified by the implementation as being held
permanently in registers. This might result in the following allocations for the IBM 370:
Register 15: Subprogram entry address
Register 14: Return address
Register 13: Local activation record base address
Register 12: Global activation record base address
Register 11: Base address for constants
Register 10: Code base address
Register 9: Code offset (Section 11.1.3)
Only two registers are allocated globally as activation record bases; registers for access to
the activation records of intermediate contours are obtained from the local allocation, as are
registers for stack and heap pointers.
Most compilers use no additional global register allocation. Further global allocation
might, for example, be appropriate because most of a program's execution time is spent in
the innermost loops. We could therefore stretch the register usage considerably and shorten
the code if we reserved a xed number of registers (say, 3) for the most-frequently used values
of the innermost loops. The controlled variable of the loop is often one of these values. The
simple approach of assigning the controlled variables of the innermost loops to the reserved
registers gives very good results in practice; more complex analysis is generally unnecessary.
Upon completion of the global allocation, we must ensure that at least n registers always
remain for local allocation. Here n is the maximum number of registers used in a single
instruction. (For the IBM 370, n = 4 in the MVCL instruction.) A rule of thumb says that
we should actually guarantee that n+1 registers remain for local allocation, which allows at
least one additional intermediate result or base address to be held in a register.
Pre-planning of local register allocation would be unnecessary if the number of available
registers always sufficed for the number of simultaneously-existing intermediate results of an
expression. Given a limited number of registers, however, we can guarantee this only for some
subtrees. Outside of these, the register requirement is not fixed unambiguously: Altering the
sequence of operations may change the number of registers required. Figure 10.2 shows an
example.
The general strategy for local register allocation is to seek subtrees evaluable, possibly
with rearrangement, using only the number of registers available to hold intermediate results.
These subtrees can be coded without additional store instructions. We choose the largest,
and generate code to evaluate it and store the result. All registers are then again available to
hold intermediate results in the next subtree.
Consider an expression represented as a structure tree and a machine with n identical
registers ri . The machine's instructions have one of the following forms:
Load: ri := memory location
Store: memory location := ri
Compute: ri := op(vj , ..., vk ), where vh may be either a register or a memory location.
The machine has various computation instructions, each of which requires specific
operands in registers and memory locations. (Note that a load instruction can be consid-
ered to compute the identity function, and require a single operand in a memory location.)
We say that a program fragment is in normal form if it is written as P1 J1 ... Ps-1 Js-1 Ps
such that each J is a store instruction, each P is a sequence containing no store instructions,
and all of the registers are free immediately after each store instruction. Let I1 ... In be one
of the sequences containing no stores. We term this sequence strongly contiguous if, whenever
Ii is used to compute an operand of Ik (i < k), all Ij such that i ≤ j < k are also used in the
computation of operands of Ik . The sequence P1 J1 ... Ps is in strong normal form if Pq is
strongly contiguous for all 1 ≤ q ≤ s.
Aho and Johnson [1976] showed that, provided no operand or result has a size exceeding
the capacity of a single register, an optimal program to evaluate an expression tree on our
assumed machine can be written in strong normal form. (The criterion for optimality is
minimum program length.) Thus to achieve an optimal program it suffices to determine a
suitable sequence in which to evaluate the operands of each operator and, in case the register
requirements exceed n, to introduce store operations at the proper points. The result can
be described in terms of three attributes: register count , store and operand sequence .
Register count specifies the maximum number of registers needed simultaneously at any
point during the computation of the subtree. Store is a Boolean attribute that is true if the
result of this node must be stored. Operand sequence is an array of integers giving the order
in which the operands of the node should be evaluated. (A Boolean attribute suffices if the
maximum number of operands is 2.)
The conditions for a strong normal form stated above are fulfilled on most machines by
floating point expressions with single-length operands and results. For integer expressions
they generally do not hold, since multiplication of single-length values produces a double-
length result and division requires a double-length dividend. Under these conditions the
optimal instruction sequence may involve `oscillation'. Figure 10.3a shows a tree that requires
oscillation in any optimal program. The square nodes produce double-length values, the round
nodes single-length values. An optimal PDP11 program to evaluate the expression appears as
Figure 10.3b. The PDP11 is an `even/odd machine': one that requires double-length values
to be held in a pair of adjacent registers, the first of which has an even register number. No
polynomial algorithm that yields an optimal solution in this case is known.
Under the conditions that the strong normal form theorem holds and, with the exception of
the load instruction, all machine instructions take their operands from registers, the following
register allocation technique leads to minimum register requirements: For the case of two
operands with register requirements k1 > k2, always evaluate the one requiring k1 registers
first. The result remains as an intermediate value in a register, so that while evaluating the
other operand, k2 + 1 registers are actually required. Since k2 < k1, however, the total register
requirement cannot exceed k1.
When k1 = k2, either operand may be evaluated first. The evaluation of the first operand
will still require k1 registers and the result remains in a register. Thus k1 + 1 registers will be
needed to evaluate the second operand, leading to an overall requirement for k1 + 1 registers.
If k1 = k2 = n then it is not possible to evaluate the entire expression in the registers available,
although either subexpression can be evaluated entirely in registers. We therefore evaluate
one operand (usually the second) and store the result. This leaves all n registers free to
evaluate the other operand. Figure 10.4 formalizes the computation of these attributes.
[Figure 10.3 (diagram): an expression tree over operands E, F, G, H, I and J built from DIV, * and + nodes; square nodes produce double-length values, round nodes single-length values. The optimal PDP11 program of part b) is not reproduced.]
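Although Figure 10.4 is not reproduced here, the computation it formalizes can be sketched in Pascal for a binary node. register_count and store are the attributes named above; the Boolean left_first stands in for the operand sequence attribute, and the node representation is an assumption.

procedure allocate (nd : node_ptr ; n : integer );
  (* n = number of registers available for intermediate results *)
var k1 , k2 : integer ;
begin
  if nd^.is_leaf then begin
    nd^.register_count := 1;  (* the operand must be loaded *)
    nd^.store := false
  end else begin
    allocate (nd^.left , n ); allocate (nd^.right , n );
    k1 := nd^.left^.register_count ; k2 := nd^.right^.register_count ;
    if k1 = k2 then nd^.register_count := k1 + 1
    else if k1 > k2 then nd^.register_count := k1
    else nd^.register_count := k2 ;
    nd^.left_first := k1 >= k2 ;  (* larger requirement evaluated first *)
    nd^.store := false ;
    if nd^.register_count > n then begin
      (* only possible when k1 = k2 = n : evaluate one operand
         separately and store its result *)
      nd^.right^.store := true ;
      nd^.register_count := n
    end
  end
end;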
If the second operand may be either in a register or in memory we apply the same rules,
but begin with simple operands having a register count of 0; further, the left operand
count is replaced by max(expression[2].register count , 1), since the first operand must
always be loaded and therefore has a cost of at least one register. Extension to the case in
which the second operand must be in memory (as for halfword arithmetic on the IBM 370)
presents some additional problems (Exercise 10.3). For integer multiplication and division we
must take account of the fact that the result (respectively the first operand) requires two
registers. The resulting sequence is not always optimal in this case.
Several independent sets of registers can also be dealt with in this manner; examples
are general registers and floating point registers, or general registers and index registers. The
problem of the Univac 1108, in which the index registers and general registers overlap, requires
additional thought.
On machines like the PDP11 or Motorola 68000, which have stack instructions in addition
to registers or the ability to execute operations with all operands and the result in memory,
optimization of the local register allocation is a very difficult problem. The minimum register
requirement in these cases is always 0, so that we must include the program length or execution
time as cost criteria. The result is that in general memory-to-memory operations are only
reasonable if no operands are available in registers, and also the result does not appear in a
register and will not be required in one. Operations involving the stack usually have longer
execution time than operations of the same length involving registers. On the other hand,
the operations to move data between registers and the stack are usually shorter and faster
than register-memory moves. As a general principle, then, intermediate results that must be
stored because of insufficient registers should be placed on the stack.
10.2.2 Targeting
Targeting attributes are inherited attributes used to provide information about the desired
destination of a result or target of a jump.
We use the targeting attribute desire to indicate that a particular operand should be in a
register of a particular class. If a descendant can arrange to have its result in a suitable register
at no extra cost, this should be done. Figure 10.5 gives the attribution rules for expressions
containing the four basic arithmetic operations, assuming the IBM 370 as the target machine.
This machine requires a multiplicand to be in an odd register, and a dividend to be in a
register pair. We therefore target a single-length dividend to the even-numbered register of
the pair, so that it can be extended to double-length with a simple shift.
In the case of the commutative operators addition and multiplication, we target both operands
to the desired register class. Then if the register allocation can satisfy our preference for
the second operand but not the first, we make use of commutativity (Section 10.2.3) and
interchange the operands. If neither of the preferences can be satisfied, then an instruction to
move the information to the proper register will be generated as a part of the coding of the
multiplication or division operator. No disadvantages arise from inability to satisfy the stated
preference. This example illustrates the importance of the non-binding nature of targeting
information. We propagate our desire to both branches in the hope it will be satisfied on one
of them. If it is satisfied on one branch then it is actually spurious on the other, and no cost
should be incurred by trying to satisfy it there.
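For multiplication and division the rules of Figure 10.5 might be sketched as follows; the enumeration reg_class and the procedural rendering are illustrative assumptions, since Figure 10.5 itself is written as attribution rules.

type
  reg_class = (any_reg , even_reg , odd_reg , pair_reg );

(* expression[1] ::= expression[2] '*' expression[3] : the multiplicand
   must end up in an odd register; since multiplication is commutative,
   the preference is expressed to both operands *)
procedure target_times (var left_desire , right_desire : reg_class );
begin
  left_desire := odd_reg ;
  right_desire := odd_reg
end;

(* expression[1] ::= expression[2] '/' expression[3] : target the
   single-length dividend to the even register of the pair, so that a
   shift can extend it to double length *)
procedure target_divide (var left_desire , right_desire : reg_class );
begin
  left_desire := even_reg ;
  right_desire := any_reg
end;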
Many Boolean expressions can be evaluated using conditional jumps (Section 3.2.3), and it
is necessary to specify the address at which execution continues after each jump. Figure 10.6
shows the attribution used to obtain short-circuit evaluation, in the context of a conditional
jump. (If short-circuit evaluation is not permitted by the language, the only change is to delay
generation of the conditional jumps until after all operands not containing Boolean operators
have been evaluated, as discussed in Section 3.2.3.) Labels (and procedure entry points) are
specified by references to target tree elements, for which the assembler must later substitute
addresses. Thus the type assembler symbol is defined not by the code generator, but by the
assembler (Section 11.1.1).
Given the attribution of Figure 10.6, it is easy to see how code is generated: A conditional
jump instruction is produced following the code to evaluate each operand that contains no
further Boolean operators (e.g. a relation). The target of the jump is the label that does not
immediately follow the operand, and the condition is chosen accordingly. Boolean operator
nodes generate no code at all. Moreover, the execution order is fixed; no use of commutativity
is allowed.
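The generation scheme can be sketched in Pascal as follows; true_label and false_label play the role of the targeting attributes of Figure 10.6, while encode_operand, negate and gen_cond_jump are assumed interfaces rather than definitions taken from the text.

procedure encode_condition (nd : node_ptr ;
                            true_label , false_label : assembler_symbol ;
                            true_follows : boolean );
  (* true_follows: the code for the 'true' continuation immediately
     follows the code generated for this node *)
begin
  if nd^.op = and_op then begin
    (* a false left operand decides the result; a true one falls
       through into the right operand *)
    encode_condition (nd^.left , true_label , false_label , true );
    encode_condition (nd^.right , true_label , false_label , true_follows )
  end
  else if nd^.op = or_op then begin
    encode_condition (nd^.left , true_label , false_label , false );
    encode_condition (nd^.right , true_label , false_label , true_follows )
  end
  else begin
    (* a relation: evaluate it, then jump to the label that does not
       immediately follow, choosing the condition accordingly *)
    encode_operand (nd );
    if true_follows then gen_cond_jump (negate (nd^.relation ), false_label )
    else gen_cond_jump (nd^.relation , true_label )
  end
end;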
10.2.3 Use of Algebraic Identities
The goal of the attribution discussed in Section 10.2.1 was to reduce the register requirements
of an expression, which usually leads to a reduction in the length of the code sequence. The
length of the code sequence can often be reduced further through use of the algebraic identities
shown in Figure 10.7.
x + y = y + x
x - y = x + (-y) = -(y - x)
-(-x) = x
x * y = y * x = (-x) * (-y)
-(x * y) = (-x) * y = x * (-y)
a) Identities for integer and real operands

L   1,x
LNR 1,1
L   2,y
S   2,z
MR  0,2
b) Computation of (-x) * (y - z)

L   2,z
S   2,y
L   1,x
MR  0,2
c) Computation of x * (z - y), which is equivalent to (b)

L   1,z
S   1,y
M   0,x
d) Computation of (z - y) * x, which is equivalent to (c)
Figure 10.7: Algebraic Identities
The number of computational instructions can be reduced by, for example, using the
identities of Figure 10.7a to remove a change of sign or combine it with a load instruction
(unary complement elimination). Load operations can be avoided by applying commutativity
when the right operand of a commutative operator is already in a register and the left operand
is still in memory. Figures 10.7b-d give a simple example of these ideas.
None of the identities of Figure 10.7a involve the associative or distributive laws of algebra.
Computers do not obey these axioms, and hence transformations based upon them are not
safe. Also, if the target machine uses a radix-complement representation for negative numbers
then the identity -(-x) = x fails when x is the most negative representable value, leaving
commutativity of addition and multiplication as the only safe identities. As implementors,
however, we are free to specify the range of values representable using a given type. By simply
stating that the most negative value does not lie in that range, we can use all of the identities
listed in Figure 10.7a. This does not unduly constrain the programmer, since its only effect is
to make the range symmetric and thus remove an anomaly of the hardware arithmetic. (We
normally remove the analogous anomaly of sign-magnitude representation, the negative zero,
without debate.)
Although use of algebraic identities can reduce the register requirement, the decisive cost
criterion is the code size. Here we assume that every instruction has the same cost; in prac-
tical applications the respective instruction lengths must be introduced. Let us also assume,
for the moment, a machine that only provides register-register arithmetic instructions. All
operands must therefore be loaded into registers before they are used. We shall restrict our-
selves to addition, subtraction, multiplication and negation in this example and assume that
multiplication yields a single-length result. The basic idea consists of attaching a synthesized
attribute, cost , to each expression. Cost specifies the minimum cost (number of instruc-
tions) to compute the result of the expression in its correct and inverse (negated) forms. It is
determined from the costs of the operation, the operand computations, and any complement-
ing required. An inherited attribute, decision , is then computed on the basis of these costs
and specifies the actual form (correct or inverse) that should be used.
To generate code for a node, we must know which operation to actually implement. (In
general this may differ from the operator appearing in the structure tree.) If the actual
operation is not commutative then we have to know whether the operands are to be taken
in the order given by the structure tree or not. Finally, we need to know whether the result
must be complemented. As shown in Table 10.1, all of this information can be deduced from
the structure tree operator and the forms of the operands and result.
Tree   Result  Operand  k  Reverse    Negate  Actual     Method
Node   Form    Forms       Operands           Operation
a+b    c       cc       1  false      false   plus       a + b
               ci       1  false      false   minus      a - (-b)
               ic       1  true       false   minus      b - (-a)
               ii       2  false      true    plus       -(-a + (-b))
       i       cc       2  false      true    plus       -(a + b)
               ci       1  true       false   minus      -b - a
               ic       1  false      false   minus      -a - b
               ii       1  false      false   plus       -a + (-b)
a-b    c       cc       1  false      false   minus      a - b
               ci       1  false      false   plus       a + (-b)
               ic       2  false      true    plus       -(-a + b)
               ii       1  true       false   minus      -b - (-a)
       i       cc       1  true       false   minus      b - a
               ci       2  false      true    plus       -(-a + (-b))
               ic       1  false      false   plus       -a + b
               ii       1  false      false   minus      -a - (-b)
a*b    c       cc       1  false      false   times      a * b
               ci       2  false      true    times      -(a * (-b))
               ic       2  false      true    times      -(-a * b)
               ii       1  false      false   times      -a * (-b)
       i       cc       2  false      true    times      -(a * b)
               ci       1  false      false   times      a * (-b)
               ic       1  false      false   times      -a * b
               ii       2  false      true    times      -(-a * (-b))
c means that the sign of the operand is not inverted
i means that the sign of the operand is inverted
k is a typical cost of the operation in instructions
Table 10.1: Unary Complement Elimination
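The cost computation that drives these decisions might be sketched as follows for an addition node; representing cost as an array indexed by form, and the lookup function k_of for the k column of Table 10.1, are assumptions made for illustration.

type
  form = (correct , inverse );
  cost_pair = array [form ] of integer ;

function plus_cost (a , b : cost_pair ; result_form : form ) : integer ;
  (* minimum instruction count over the four operand-form combinations
     of Table 10.1, for the requested result form *)
var best , c : integer ;
    fa , fb : form ;
begin
  best := maxint ;
  for fa := correct to inverse do
    for fb := correct to inverse do begin
      c := a [fa ] + b [fb ] + k_of (plus_op , result_form , fa , fb );
      if c < best then best := c
    end;
  plus_cost := best
end;

The inherited decision attribute is then computed on the way down by recording, for the form demanded by the context, which operand forms achieved the minimum.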
We took the operand location as fixed in deriving Table 10.2. This meant, for example,
that when the correct left operand was in memory and the inverted right operand was in a
register we used the sequence subtract , negate to obtain the correct value of the expression
(Table 10.2, row 7). We could also have used the sequence load , subtract , but this would
have increased the register requirements. If we allow the unary complement elimination to
alter the register requirements then it must be integrated with the local register allocation,
increasing the number of attribute dependencies and possibly requiring a more complex tree
traversal. Our approach is optimal provided that the cost of a load instruction is never less
than the cost of negating a value in a register.
Result  Operand  Operand    k  Reverse    Negate  Actual     Method
Form    Forms    Locations     Operands           Operation
c       cc       rr         1  false      false   plus       a + b
                 rm         1  false      false   plus       a + b
                 mr         1  true       false   plus       b + a
                 mm         2  false      false   plus       a + b
        ci       rr         1  false      false   minus      a - (-b)
                 rm         1  false      false   minus      a - (-b)
                 mr         2  true       true    minus      -(-b - a)
                 mm         2  false      false   minus      a - (-b)
        ic       rr         1  true       false   minus      b - (-a)
                 rm         2  false      true    minus      -(-a - b)
                 mr         1  true       false   minus      b - (-a)
                 mm         2  true       false   minus      b - (-a)
        ii       rr         2  false      true    plus       -(-a + (-b))
                 rm         2  false      true    plus       -(-a + (-b))
                 mr         2  true       true    plus       -(-b + (-a))
                 mm         3  false      true    plus       -(-a + (-b))
i       cc       rr         2  false      true    plus       -(a + b)
                 rm         2  false      true    plus       -(a + b)
                 mr         2  true       true    plus       -(b + a)
                 mm         3  false      true    plus       -(a + b)
        ci       rr         1  true       false   minus      -b - a
                 rm         2  false      true    minus      -(a - (-b))
                 mr         1  true       false   minus      -b - a
                 mm         2  true       false   minus      -b - a
        ic       rr         1  false      false   minus      -a - b
                 rm         1  false      false   minus      -a - b
                 mr         2  true       true    minus      -(b - (-a))
                 mm         2  false      false   minus      -a - b
        ii       rr         1  false      false   plus       -a + (-b)
                 rm         1  false      false   plus       -a + (-b)
                 mr         1  true       false   plus       -b + (-a)
                 mm         2  false      false   plus       -a + (-b)
c means that the sign of the operand is not inverted
i means that the sign of the operand is inverted
r means that the value of the operand is in a register
m means that the value of the operand is in memory
k is a typical cost of the operation in instructions
Table 10.2: Addition on a Machine with Both Memory and Register Operands
When we apply algebraic identities on a machine with both register-register and register-
memory instructions, the local register allocation process should assume that each computa-
tional instruction can accept any of its operands either in a register or in memory, and returns
its result to a register (the general model proposed in Section 10.2.1). This assumption leads
to the proper register requirement, and allows complete freedom in applying the identities.
Local register allocation decides the evaluation order of the operands, but leaves open the
question of which operand is left and which is right. Algebraic identities, on the other hand,
deal with the choice of left and right operands but make no decisions about evaluation order.
IBM 370 instructions. The situation could be different on the PDP11, where explicit assign-
ments to the program counter are possible. Computers like the Motorola 68000 and PDP11,
which provide stack instructions, also require information about the storage class `stack'. The
actual representation in the descriptor depends upon how many stacks there are and whether
only the top element or also lower elements can be accessed. We restrict ourselves here to
two storage classes: `main storage' and `registers'. Similar techniques can be used for other
storage classes.
type
  main_storage_access = record
    base , index : ^value_descriptor ;
    displacement : internal_int ;
  end;
  value_descriptor = record
    tmode : target_type ; (* Pointer to target definition table *)
    case class : value_class of
      literal_value :
        (lval : internal_int );
      label_reference , procedure_reference :
        (code : assembler_symbol ;
         environment : ^value_descriptor );
      general_register , register_pair , floating_point_register :
        (reg : ^register_descriptor );
      memory_address , memory_value :
        (location : main_storage_access )
  end;
  register_descriptor = record
    state : register_state ;
    content : ^value_descriptor ;
    memory_copy : main_storage_access ;
  end;
Figure 10.10: Descriptors for Implementing LAX on the IBM 370
When an access function is realizable within a given addressing structure, we say that the
accessed object is addressable within that structure. If an object required by the computation
is not addressable then the code generator must issue instructions to manipulate the state,
making it addressable, before it can be used. These manipulations can be divided into two
groups, those required by source language concepts and those required by limitations on the
addressing structure of the target machine. Implementing a reference with a pointer variable
would be an example of the former, while loading a value into an index register illustrates
the latter. The exact division between the groups is determined by the structure of the main
storage access function implemented in the descriptors. We assume that every non-literal leaf
of the structure tree is addressable by this access function. The main storage access function
of Figure 10.10 is stated in terms of a base, an index and a displacement. The base refers
to an allocatable object (Section 10.1) whose address may, in general, be computed during
execution. The index is an integer value computed during execution, while the displacement
is fixed at compile time. Index and displacement values are summed to yield the relative
address of the accessed location within the allocatable object referred to by the base.
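On the IBM 370 the addressability test implied by this access function might be sketched as follows; the nil conventions follow Figure 10.10, while the 12-bit displacement bound is a property of the 370 operand format assumed here.

function addressable (a : main_storage_access ) : boolean ;
  (* true if the access fits a base-index-displacement operand:
     base and index (when present) already in general registers,
     displacement representable in 12 bits *)
begin
  addressable :=
    ((a.base = nil ) or (a.base^.class = general_register )) and
    ((a.index = nil ) or (a.index^.class = general_register )) and
    (a.displacement >= 0) and (a.displacement <= 4095)
end;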
If the access is to statically-allocated storage then the `allocatable object' to which the
accessed object belongs is the entire memory. We indicate this special case by a nil base, and
the displacement becomes the static address. A more interesting situation arises when the
access is to storage in the activation record of a LAX procedure.
Figure 10.11 shows a LAX program with five static nesting levels. If we associate activation
records only with procedures (Section 3.3.2) then we need consider only three levels. Value
descriptors for the three components of the assignment in the body of q could be constructed
as shown in Figure 10.11b.
The level array is built into the compiler with an appropriate maximum size. When the
compiler begins to translate a procedure, it ensures one value descriptor for each level up to the
level of the procedure. Initially, the descriptor at level 1 indicates that the global activation
record base address can be found in register 12 and the descriptor at the procedure's level
indicates that the local activation record base address can be found in register 13. Base
addresses for other activation records can be found by following the static chain, as indicated
by the descriptor at level 2. This initial condition is determined by the mapping specification.
We are assuming here that the LAX-to-IBM 370 mapping specification makes the global
register allocation proposed at the beginning of Section 10.2.1.
When a value descriptor is created for a variable, its base is simply a copy of the level
array element corresponding to the variable's static nesting depth. (The program is assumed
at level 0 here.) The index field for a simple variable's access function is nil (indicated in
Figure 10.11b by an empty field) and the displacement is the offset of the variable within the
activation record. For array variables, the index field points to the value descriptor of the
index, and the displacement is the fictitious offset discussed in Section 3.2.2.
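The construction of such a descriptor might be sketched as follows; value_descriptor_ptr is an assumed alias for ^value_descriptor, level_array is the compiler's level array, and the variant tag is set directly for brevity.

function variable_descriptor (depth : integer ;
                              offset : internal_int ;
                              index : value_descriptor_ptr ) : value_descriptor_ptr ;
var v : value_descriptor_ptr ;
begin
  new (v );
  v^.class := memory_address ;
  v^.location.base := level_array [depth ];  (* copied, not recomputed *)
  v^.location.index := index ;               (* nil for a simple variable *)
  v^.location.displacement := offset ;
  variable_descriptor := v
end;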
The access function for a value may change as instructions that manipulate the value
are generated. For example, suppose that we generate code to carry out the assignment in
Figure 10.11a, starting from the machine state described by Figure 10.11b. We might first
consider generating a load instruction for b. Unfortunately, b is not addressable; the IBM 370
load instruction requires that the base be in a register. Thus we must first obtain a register
(say, general register 1) and load the base address for the activation record at level 2 into
it. When this instruction has been generated, we change the value descriptor for the base to
have a value class of general register and indicate general register 1. Generation of the
load for b is now possible, and the value descriptor for b must be altered to reflect the fact
that it is in (say) general register 3.
There is one register descriptor for each register used by the code generator. This includes
both the registers controlled by the local register allocation and globally-assigned registers.
declare
a : integer ;
procedure p;
declare
b : integer ;
procedure q (c : integer ); a := b + c
begin
b := 1; q (2)
end
begin
p
end
a) A LAX program
[Figure 10.11b (diagram): the level array holds one value descriptor per static nesting level; the level-1 entry is a general register descriptor for register 12, the level-3 entry one for register 13, and the level-2 entry a memory value reached through the static chain; the descriptors for a, b and c are memory addresses formed from the corresponding base plus the variable's offset.]
Immediately after a load or store instruction, the contents of a register are a copy of
the contents of some memory location. This `copy' relationship represents a condition that
occurs during execution, and to specify it the register descriptor must be able to define a
memory access function. This access function is copied into the register descriptor from a
value descriptor at the time the two are linked; it might describe the location from which the
register was loaded or that to which it was stored. Some care must be exercised in deciding
when to establish such a relationship: The code generator must be able to guarantee that
the value in memory will not be altered by side effects without explicitly terminating the
relationship. Use of programmer-defined variables is particularly dangerous because of this
requirement, but use of compiler-generated temporaries and activation record bases is safe.
if free registers exist then choose one arbitrarily
else if copy registers exist then choose the least-recently accessed
else
  begin
  choose the least-recently accessed unique register;
  allocate a temporary memory location;
  emit a store instruction;
  end;
if chosen register has an associated value descriptor then
  de-link the value descriptor;
lock the chosen register;
Figure 10.12: Register Management
The register assignment algorithm should not make a random choice when asked to assign
a register (Figure 10.12). If some register is in state free , it may be assigned without penalty.
A register whose state is copy may be assigned without storing its value, but if this value is
needed again it will have to be reloaded. The contents of a register whose state is unique must
be stored before the register can be reassigned, and a locked register cannot be reassigned
at all. All globally-allocated registers are locked throughout the simulation. The states of
locally-allocated registers change during the simulation; they are always free at a label.
As shown in Figure 10.12, the register assignment algorithm locks a register when it is
assigned. The code selection routine requesting the register then links it to the proper value
descriptor, generating any code necessary to place the value into the register. If the value is
the result of a node with the store attribute then the register descriptor state is changed to
unique . This makes the register available for reassignment, and guarantees that the value
will be saved if the register is actually reassigned. When a value descriptor is destroyed, it is
rst de-linked from any associated register descriptor. The state of the register descriptor is
changed to free if the register descriptor species no memory copy; otherwise it is changed
to copy . In either case it is available for reassignment without any requirement to store
its contents. The local register allocation algorithm of Section 10.2.1 guarantees that the
simulator can never block due to all registers being locked.
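Rendered in Pascal, the assignment algorithm of Figure 10.12 might read as follows; find_register, oldest_register, store_into_temporary and de_link are assumed helpers hiding the register scan and the descriptor bookkeeping.

function assign_register : register_descriptor_ptr ;
var r : register_descriptor_ptr ;
begin
  r := find_register (free );        (* any free register will do *)
  if r = nil then
    r := oldest_register (copy );    (* least-recently accessed copy *)
  if r = nil then begin
    r := oldest_register (unique );  (* must spill: emit a store and
                                        record the memory copy *)
    store_into_temporary (r )
  end;
  if r^.content <> nil then
    de_link (r^.content );           (* detach the old value descriptor *)
  r^.state := locked ;
  assign_register := r
end;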
10.3.2 Code Transformation
We traverse the structure tree in execution order, carrying out a simulation of the target
machine's behavior, in order to obtain the final transformation of the structure tree into a
sequence of instructions. When the traversal reaches a leaf of the tree, we construct a value
descriptor for the object that the leaf represents. When the traversal reaches an interior node,
a decision table specific to that kind of node is consulted. There is at least one decision table
for every abstract operation, and if the traversal visits the node more than once then each
visit may have its own decision table. The condition stubs of these decision tables involve
attributes of the node and its descendants.
Result correct    Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y N N N N N N N N N N N N N N N N
l correct         Y Y Y Y Y Y Y Y N N N N N N N N Y Y Y Y Y Y Y Y N N N N N N N N
r correct         Y Y Y Y N N N N Y Y Y Y N N N N Y Y Y Y N N N N Y Y Y Y N N N N
l in register     Y Y N N Y Y N N Y Y N N Y Y N N Y Y N N Y Y N N Y Y N N Y Y N N
r in register     Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N Y N
swap(l, r)            X       X   X   X X     X       X   X   X X     X       X
lreg(l, desire)         X       X       X       X       X       X       X       X
gen(A, l, r)        X X X                   X X X   X X X                   X X X
gen(AR, l, r)     X                       X       X                       X
gen(S, l, r)                X X X   X X X                   X X X   X X X
gen(SR, l, r)             X       X                       X       X
gen(LCR, l, r)                X     X     X X X X X X X X   X         X
free(r)           X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X
result(l, store)  X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X
"correct" means the sign is not inverted
l = value descriptor of the left operand, r = value descriptor of the right operand
desire = desire attribute of the current node
store = store attribute of the current node
A, AR, S, SR and LCR are IBM 370 instructions
Figure 10.13: IBM 370 Decision Table for + (integer, integer) → integer, Based on Table 10.2
Figure 10.13 shows a decision table for integer addition on the IBM 370 that is derived
from Table 10.2. The condition stub uses the form and location attributes discussed in
Section 10.2.3 to select a single column, and the elements of the action stub corresponding to
X's in that column are carried out in sequence from top to bottom. These actions are based
primarily upon the value descriptors for the operands, but they may interrogate any of the
node's attributes. They are basically of two kinds, machine state manipulation and instruction
generation, although instructions must often be generated as a side effect of manipulating the
machine state.
Four machine state manipulation actions appear in Figure 10.13: swap (l, r) simply
interchanges the contents of the value descriptors for the left and right operands. A regis-
ter is allocated by lreg (l, desire) , taking into account the preference discussed in Sec-
tion 10.2.2. This action also generates an instruction to load the allocated register with the
value specified by value descriptor l , and then links that value descriptor to the register
descriptor of the allocated register. After the code to carry out the addition has been gen-
erated, registers that might have been associated with the right operand must be freed and
the descriptor for the register holding the left operand must be linked to the value descriptor
for the result. If the store attribute is true then the result register descriptor state is set to
unique ; otherwise it remains locked as discussed in Section 10.3.1.
Figure 10.13 contains one action to generate the RR-format of the add instruction and
another to generate the RX-format. A single action could have been used instead, deferring
the selection to assembly. The choice between having the code generator select the instruction
format and having the assembler select it is made on grounds of convenience. In our case the
code generator possesses all of the information necessary to make the selection; for machines
with several memory addressing formats this is not always true because the proper format
generation tasks. The specifications of the two tasks remain distinct; their merging, however,
is an implementation decision that can be carried out automatically.
`Peephole optimization' [McKeeman, 1965] uses a machine simulation, and capitalizes
upon relationships that arise when certain code fragments are joined together. Wilcox
[1971] proposed a code generator consisting of two components, a transducer (which essen-
tially evaluates attributes) and a simulator (which performs the machine simulation and code
selection). He introduced the concepts of value and register descriptors in a form quite similar
to that discussed here. Davidson and Fraser [1980] use a simulation following a simple
code selector based upon a depth-first, left-to-right traversal of the structure tree with no
attempt to be clever about register allocation. They claim that this approach is easier to
automate, and gives results approaching those of far more sophisticated techniques.
Formulation of the code selection process in terms of decision tables is relatively rare in
the literature, although they seem to be the natural vehicle for describing it. A number of
authors [Elson and Rake, 1970; Wilcox, 1971; Waite, 1976] have proposed special code
generator description languages that effectively lead to programmed decision trees. Gries
[1971] mentions decision tables, but only in the context of a rather specialized implementation
used by the IBM FORTRAN H compiler [Lowry and Medlock, 1969]. This technique,
known as `bit strips', divides the conditions into two classes. Conditions in the first class
select a column of the table, while those in the second are substituted into particular rows of
the selected column. It is useful only when a condition applies to some (but not all) elements
of a row. The technique precludes the use of a bit matrix because it requires each element to
specify one of three possibilities (execute, skip and substitute) instead of two.
Glanville and Graham [1978] use SLR(1) parse tables as a data structure implementa-
tion of the decision tables; this approach has also been used in the context of LALR(1) parse
tables by Jansohn et al. [1982].
Exercises
10.1 Complete the definition of the memory mapping module outlined in Figure 10.1 for a
machine of your choice.
10.2 Devise a linear algorithm to rearrange the fields of a record to minimize waste space,
assuming that the only possible alignments are 1 and 2. (The DEC PDP11 and Intel
8086 have this property.)
10.3 [Aho and Johnson, 1976] Consider an expression tree attributed according to the
rules of Figure 10.4.
(a) State an execution-order traversal algorithm that will produce optimum code when
arithmetic instructions are emitted at the postfix encounters of interior nodes.
(b) State the conditions under which LOAD and STORE instructions will be emitted
during the traversal of (a).
(c) Show that the attribution of Figure 10.4 is inadequate in the case where some
arithmetic operations can be carried out only by instructions that require one
operand in memory.
(d) Show that optimum code can be produced in case (c) if it is possible to create
a queue of pointers to the tree and use this queue to guide the execution-order
traversal.
10.4 Extend the attribution of Figure 10.4 to handle expression nodes with arbitrary num-
bers of operands, all of which must be in registers.
10.5 [Bruno and Lassagne, 1975] Suppose that the target computer has a stack of fixed
depth instead of a set of registers. (This is the case for most floating point chips
available for microprocessors.) Show that your algorithm of Exercise 10.4 will still
work if extra constraints are placed upon the allowable permutations.
10.6 What changes would you make in your solution to Exercise 10.4 if some of a node's
operands had to be in memory and others in registers?
10.7 Show that the attribution rules of Figure 10.6 obey DeMorgan's law, i.e. that either
member of the following pairs of LAX expressions leads to the same set of attributes
for a and b:
not (a and b), not a or not b
not (a or b), not a and not b
10.8 Modify Figure 10.6 for a language that does not permit short-circuit evaluation. What
corresponding changes must be made in the execution-order determination?
10.9 [Elson and Rake, 1970] The PL/1 LENGTH function admits optimizations of string ex-
pressions analogous to short-circuit evaluation of Boolean expressions: LENGTH (A||B )
becomes LENGTH (A) + LENGTH (B ). ('||' is the concatenation operator.) Devise targeting
attributes to carry this information and show how they are propagated.
10.10 Show that the unary complement elimination discussed in Section 10.2.3 also minimizes
register requirements.
10.11 Extend Table 10.1 to include division.
10.12 Show that the following relation holds for the cost attribute (Figure 10.9) of any ex-
pression node:
|cost[correct].length - cost[inverse].length| <= L
where L is the length of a negation operator. (This condition must hold for all op-
erations, not just those illustrated in Table 10.1.) What follows from this if register-
memory instructions are also allowed?
10.13 What changes would be required in Figure 10.9 for a machine with a `load negative'
instruction that places the negative of a memory value into a register?
10.14 Modify Figure 10.8 for a machine with both register-register and register-memory in-
structions. Write a single set of attribution rules incorporating the tasks of both Fig-
ure 10.4 and Figure 10.9.
10.15 Specify descriptors to be used in implementing LAX on some computer other than
the IBM 370. Carefully explain any difference between your specification and that of
Figure 10.10.
10.16 Under what circumstances could a LAX code generator link register values to
programmer-defined variables? Do you believe that the payoff would justify the analysis
required?
10.17 There is no guarantee that the heuristic of Figure 10.12 will produce optimal code.
Under what circumstances would the code improve when unique registers were chosen
before copy registers?
10.18 Give, for a machine of your choice, the remaining decision tables necessary to translate
LAX trees involving simple integer operands and operators from Table A.2.
Chapter 11
Assembly
The task of assembly is to convert the target tree produced by the code generator into the
target code required by the compiler specification. This target code may be a sequence of bit
patterns to be interpreted by the control unit of the target machine, or it may be text subject
to further processing by a link editor or loader. In either case, the assembler must determine
operand addresses and resolve any issues left open by the code generator.
Since the largest fraction of the compilers for most machines originates from the manufac-
turer, the manufacturer's target code format provides a de facto standard that the compiler
writer should use: If the manufacturer's representation is abandoned then all access to the
software already developed using other compilers, and probably all that will be developed in
the future at other installations, is lost. For the same reason, it is best to use manufacturer-
supplied link editors and loaders to carry out the external address resolution. Otherwise, if
the target code format is extended or changed then we must alter not only the compilers,
but also the resolution software that we had developed. We shall therefore assume that the
output of the assembly task is a module rather than a whole program, and that external ad-
dress resolution is to be provided by other software. (If this is not the case, then the encoding
process is somewhat simplied.)
Assembly is essentially independent of the source language, and should be implemented by
a common module that can be used in any compiler for the given machine. To a large extent,
this module can be made machine-independent in design. Regardless of the particular com-
puter, it must be able to resolve operand addresses and encode instructions. The information
required by different link editors and loaders does not vary significantly in content. In this
chapter we shall discuss the two main subtasks of assembly, internal address resolution and
instruction encoding, in some detail. We shall sketch the external address resolution problem
briefly in order to indicate the kind of information that must be provided by the compiler;
two specic examples of the way in which this information is represented can be found in
Chapter 14.
type
  label_element = record
    uid : integer ;              (* Unique identification for the label *)
    base : integer ;             (* Sequence to which the label belongs *)
    relative_address : integer ; (* Address of the label in the sequence *)
  end;
  origin_element = record
    uid : integer ;              (* Unique identification for the sequence *)
    length : integer ;           (* Space occupied by the sequence *)
    case k : origin_class of
      arbitrary : ();
      based : (origin : address_exp )
  end;
a) Types used in the environments of Figure 11.1
type
  address_exp = record
    case k : expr_class of
      absolute :
        (value : integer_value );  (* Provided by the constant table *)
      relative :
        (label : integer );        (* Unique identification of the referenced label *)
      computation :
        (rator : (add , sub );
         right , left : ^address_exp )
  end;
b) Types used to represent address expressions
Figure 11.2: The Environment Attributes
location of its target; in rare cases the length of a constant-setting instruction may depend
upon the value of an expression (LABEL1 - LABEL2 ). In the remainder of this section we shall
consider only the former situation, and restrict the operand of the span-dependent instruction
to a simple label.
Span-dependence does not change the basic attribution of Figure 11.1, but it requires that
an extra attribute be constructed. This attribute, called mod list , consists of linked records
whose form is given in Figure 11.3a. Mod list is initialized and propagated in exactly the
same way as label env . Elements are added to it at span-dependent instructions as shown in
Figure 11.3b. The function instr size returns the minimum length of the span-dependent
instruction, and this value is used to determine origin values as discussed in Section 11.1.1.
The next step is to construct a relocation table that can be consulted whenever a label
value must be determined. Each relocation table entry specifies the total increase in size for
all span-dependent instructions lying below a given address (relative or absolute). When the
label address calculation of Section 11.1.1 indicates an address lying between two relocation
table entries, it is increased by the amount specied in the lower entry.
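As an illustration, the consultation step might be coded as follows. This is only a sketch: the fixed-size table, the linear scan, the assumption that entries are sorted by increasing address, and the names (reloc_entry, relocated) are ours, not part of the book's figures.

program reloc_sketch;
const
  max_reloc = 100;
type
  reloc_entry = record
    address : integer ;  (* address below which the increase applies *)
    increase : integer   (* total growth of span-dependent instructions below it *)
  end;
var
  table : array [1 .. max_reloc] of reloc_entry;
  entries : integer;

(* Adjust a label address computed as in Section 11.1.1: apply the increase
   recorded in the highest entry whose address lies at or below the label. *)
function relocated (addr : integer) : integer;
var
  i, adjust : integer;
begin
  adjust := 0;
  for i := 1 to entries do
    if table[i].address <= addr then adjust := table[i].increase;
  relocated := addr + adjust
end;

begin
  entries := 2;
  table[1].address := 50;  table[1].increase := 1;  (* one word of growth below 50 *)
  table[2].address := 90;  table[2].increase := 3;  (* three words of growth below 90 *)
  writeln (relocated (40));   (* 40: below all growth *)
  writeln (relocated (70));   (* 71: the lower entry applies *)
  writeln (relocated (100))   (* 103 *)
end.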
type
  mod_element = record
    base : integer ;              (* Sequence in which instruction appears *)
    relative_address : integer ;  (* Address of the instruction in the sequence *)
    operand : integer ;           (* Unique identification for the operand label *)
    instr : machine_op ;          (* Characterization of the instruction *)
  end;
a) Type used in mod list
rule nodes ::= nodes span_dependent_operation
attribution
  nodes[1].length ←
    nodes[2].length + instr_size (span_dependent_operation.instr );
  nodes[1].mod_list ←
    nodes[2].mod_list &
      N_mod_element (
        nodes[1].base ,
        nodes[2].length ,
        span_dependent_operation.operand_uid ,
        span_dependent_operation.instr );
b) Calculation of mod list
Figure 11.3: Span-Dependent Instructions
The properties of the span-dependent instructions are embodied in a module that provides
two operations:
Too short (machine op , integer ) boolean : Yields true if the instruction defined by
machine op cannot have its operand at the (signed) distance from the instruction given
by the integer.
Lengthen (machine op , integer ) integer : Updates the given machine op , if necessary, so
that the instruction defined can have its operand at the (signed) distance given by the
integer. Yields the increase in instruction size resulting from the change.
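For a machine with a short branch (one word, operand within -128..127 words) and a long branch (two words), these operations might be sketched as follows. The machine_op encoding, the distance limits and the demonstration values are invented for the example; the book specifies only the two signatures above.

program span_ops_sketch;
type
  machine_op = (short_branch , long_branch );

function too_short (op : machine_op ; distance : integer) : boolean;
begin
  too_short := (op = short_branch) and
               ((distance < -128) or (distance > 127))
end;

function lengthen (var op : machine_op ; distance : integer) : integer;
begin
  if too_short (op , distance) then
    begin
      op := long_branch;   (* the long form reaches any operand *)
      lengthen := 1        (* the instruction grows by one word *)
    end
  else
    lengthen := 0
end;

var
  op : machine_op;
begin
  op := short_branch;
  writeln (lengthen (op , 300));   (* 1: the long form is now required *)
  writeln (lengthen (op , 300))    (* 0: already long enough *)
end.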
The relocation table is built by the following algorithm:
1. Establish an empty relocation table.
2. Make the first element of mod list current.
3. Calculate the addresses of the span-dependent instruction represented by the current
element of mod list and its operand, using the current environments and relocation
table.
4. Apply too short to the (signed) distance between the span-dependent instruction and
its operand. If the result is false , go to step 6.
5. Lengthen the instruction and update the relocation table accordingly. Go to step 2.
6. If elements remain in mod list , make the next element current and go to step 3.
Otherwise stop.
This algorithm has running time proportional to n^2 in the worst case (n is the number
of span-dependent instructions), even when each span-dependent instruction has more than
two lengths.
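The loop might be rendered in the style of the book's figures as follows. This is a sketch only: the next field linking mod list elements, first, and the helper routines (clear_relocation_table, instruction_address, operand_address, update_relocation_table) stand for the machinery described above and are not defined here.

procedure resolve_spans;
var
  m : ^mod_element;
  distance , growth : integer;
begin
  clear_relocation_table;                        (* step 1 *)
  m := first (mod_list );                        (* step 2 *)
  while m <> nil do
    begin
      distance := operand_address (m^) -
                  instruction_address (m^);      (* step 3 *)
      if too_short (m^.instr , distance ) then   (* step 4 *)
        begin
          growth := lengthen (m^.instr , distance );
          update_relocation_table (m^, growth ); (* step 5 *)
          m := first (mod_list )                 (* go to step 2 *)
        end
      else
        m := m^.next                             (* step 6 *)
    end
end;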
Span-dependency must be resolved separately in each portion of the program that depends
upon a different origin (see the end of Section 11.1.1). If span-dependent instructions provide
cross-references between portions based on different origins then either all analysis of span-
dependence must be deferred to external address resolution or some arbitrary assumption
must be made about the cross-referencing instructions. The usual approach is to optimize
span-dependent instructions making internal references and use the longest version of any
cross-referencing instruction.
taken as advice that the compiler writer should design his own representation! As noted at
the beginning of the chapter, we strongly advocate use of manufacturer-supplied link editors
and loaders for external address resolution.
11.2.1 Cross-Referencing
In many respects, external address resolution is analogous to internal address resolution: Each
module is a single code sequence with certain locations (usually called entry points, although
they may be either data or code addresses) distinguished. These locations are analogous
to the label nodes in the internal address resolution case. The module may also contain
address expressions that depend upon values (usually called external references ) not defined
within that module. These values are analogous to the label references in the internal address
resolution case. When the modules are combined, they can be considered to be a list of
independent code sequences and all of the techniques discussed in Section 11.1 can be carried
over.
There can be some benefit in going beyond the analogy discussed in the previous para-
graph, and simply deferring the internal address resolution until all modules have been gath-
ered together. Under those circumstances one could optimize the length of inter-module
references as well as intra-module references (Section 11.1.2). We believe that the benefits
are not commensurate with the costs, however, since inter-module references should be
relatively rare.
Two basic mechanisms are available for establishing inter-module references: transfer
vectors and direct substitution. A transfer vector is best suited to references involving a
transfer of control. It is a block of memory, included in each module that contains external
references, consisting of one element for each distinct external symbol referenced (Figure 11.4).
The internal address resolution process replaces every external reference with a reference to
the corresponding element of the transfer vector, and the external address resolution process
fills each transfer vector element with the address of the proper entry point. When the
machine architecture permits indirect addressing, the initial reference is indirect and may
be either a control or a data reference. If the machine does not provide indirect addressing
via main memory, the transfer vector address must be loaded into a base register for the
access. When the address length permits jumps to arbitrary addresses, we might also place
an unconditional jump to the entry point in the transfer vector and implement a call as a call
to that transfer vector entry.
Direct substitution avoids the indirection inherent in the transfer vector mechanism: The
actual address of an entry point is determined during external address resolution and stored
into the instruction that references it. Even with the transfer vector mechanism, direct
substitution is required within the transfer vector itself. In the final analysis, we use a
transfer vector because it reduces to one the number of changes that must be made when the
address of an entry point changes, and concentrates these changes at a particular point in the
program. Entry point addresses may change statically, as when a module is newly compiled
and bound without altering the program, or they may change dynamically, as when a routine
resides in memory temporarily. For example, service routines in an operating system are
often `transient': they are brought into memory only when needed. The operating system
provides a transfer vector, and all invocations of service routines must go via this transfer
vector. When a routine is not in memory, its transfer vector entry is replaced by a jump to
a loader. Even if the service routines are not transient, a transfer vector is useful: When
changes made to the operating system result in moving the service routine entry points, only
the transfer vector is altered; there is no need to fix up the external references of all user
programs. (Note that in this case the transfer vector is a part of the operating system, not
of each module using the operating system as discussed in the previous paragraph. If the
vector occupies a fixed location in memory, however, it may be regarded either as part of the
module or as part of the operating system.)
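A minimal sketch of the mechanism follows, with addresses modelled as plain integers and all names invented: external address resolution fills the vector (bind), every compiled reference goes through it (call_target), and moving an entry point therefore requires exactly one change.

program transfer_vector_sketch;
const
  vector_size = 3;
type
  address = integer;
var
  transfer_vector : array [1 .. vector_size] of address;

(* External address resolution fills a vector element with an entry point. *)
procedure bind (slot : integer; entry_point : address);
begin
  transfer_vector[slot] := entry_point
end;

(* A compiled call references the slot, never the entry point itself. *)
function call_target (slot : integer) : address;
begin
  call_target := transfer_vector[slot]
end;

begin
  bind (1, 4096);               (* routine currently loaded at 4096 *)
  writeln (call_target (1));    (* the indirect reference yields 4096 *)
  bind (1, 8192);               (* routine moved: one change suffices *)
  writeln (call_target (1))     (* all references now reach 8192 *)
end.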
In the remainder of this section we shall consider the details of the direct substitution
mechanism. As pointed out earlier, this is analogous to internal address resolution. We shall
therefore concern ourselves only with the differences between external and internal resolution.
These differences lie mainly in the representation of the modules.
A control dictionary is associated with each module to provide the following information:
Length of the module.
Locations of entry points relative to the beginning of the module.
Symbols used to denote entry points and external values.
Fields within the module that represent addresses relative to the beginning of the mod-
ule.
Fields within the module that represent external references.
Additional information about the size of external areas may also be carried, to support
external static data areas such as FORTRAN COMMON.
The module length, relative entry point addresses and symbols are used to establish
an attribute analogous to label element . Note that this requires a traversal of the list
of modules, but not of the individual modules themselves. After this attribute is known,
the fields representing relative and external addresses must be updated. A relative address
is updated by adding the address of the module origin; the only information necessary to
characterize the field is the fact that it contains a relative address. One common way of
encoding this information is to associate relocation bits with the module text. The precise
relationship between relocation bits and fields depends upon the machine architecture. For
example, on the PDP11 a relative address occurring in an instruction must occupy one word.
We might therefore use one relocation bit per word, 1 indicating a relative address. Note
that this encoding precludes other placement of relative addresses, and may therefore impose
constraints upon the code generator's mapping of data structures to be initialized by the
compiler.
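The update itself is simple, as the following sketch shows for the one-bit-per-word scheme just described; the module contents and all names are assumptions chosen for the example.

program relocation_sketch;
const
  module_length = 4;
var
  text : array [1 .. module_length] of integer;   (* module text, one word each *)
  reloc : array [1 .. module_length] of boolean;  (* true: word holds a relative address *)
  origin : integer;
  i : integer;
begin
  text[1] := 17;  reloc[1] := false;  (* literal constant: unchanged *)
  text[2] := 2;   reloc[2] := true;   (* relative address: relocate *)
  text[3] := 100; reloc[3] := false;
  text[4] := 0;   reloc[4] := true;
  origin := 1000;                     (* module origin assigned when modules are combined *)
  for i := 1 to module_length do
    if reloc[i] then text[i] := text[i] + origin;
  for i := 1 to module_length do writeln (text[i])
end.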
To characterize an external reference we must specify the particular external symbol in-
volved in addition to the fact that an external reference occurs in the field. The concept of
a relocation bit can be extended to cover the existence of an external reference by adding a
third state: For a particular field the possibilities are `no change', `relative' and `external'.
The field itself then contains an integer specifying the particular external symbol.
There are two disadvantages to this strategy for characterizing external references. The
most important is that it does not permit an address relative to an external symbol, since the
field must be used to define the symbol itself. Data references, especially those to external
arrays like FORTRAN COMMON, tend to violate this constraint. A second disadvantage is
that the number of relocation bits for every field is increased, although only a small minority
of the fields may actually contain external references. Both disadvantages may be overcome
by maintaining a list of all fields containing external references relative to a particular symbol.
The field itself contains the relative address and the symbol address is simply added to it,
exactly as a relative address is updated. (This same strategy can be used instead of relocation
bits for relative addresses on machines whose architectures tend to make relative addresses
infrequent; the IBM 370 is an example.)
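The corresponding sketch for external references keeps, per symbol, a list of the fields that reference it; each field already holds the relative part, so resolution is a simple addition. All names and values here are invented for the illustration.

program fixup_sketch;
const
  module_length = 4;
  max_refs = 2;
var
  text : array [1 .. module_length] of integer;
  ref_field : array [1 .. max_refs] of integer;  (* fields that reference symbol SYM *)
  sym_address, i : integer;
begin
  text[1] := 0;  text[2] := 8;  text[3] := 0;  text[4] := 8;
  ref_field[1] := 2;  ref_field[2] := 4;  (* words 2 and 4 each hold SYM+8 *)
  sym_address := 5000;                    (* resolved entry point of SYM *)
  for i := 1 to max_refs do
    text[ref_field[i]] := text[ref_field[i]] + sym_address;
  for i := 1 to module_length do writeln (text[i])
end.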
The result of the cross-referencing process could be a ready-to-run program, with all
addresses absolute, or it could be a single module with relative addresses, entry points and
external references that can be used as input to further linkage steps. In the latter case, the
input must specify not only the modules to be linked but also the entry points to be retained
after linkage. External references will be retained automatically if and only if they do not
refer to entry points of other input modules.
simply result in invocations of the set location operation. The remaining nodes must be
encoded by invoking one or more of the last three operations dened in the previous section.
Constants may appear as literal values to be incorporated directly into the target code, or
they may be components of address expressions. In the latter case, the result of the expression
could be used as data or as an operand of an instruction. Literal values must be converted
using the internal-to-target conversion operations of the constant table (Section 4.2.2), and
then inserted into the target code by absolute text . An address expression is evaluated
as outlined in Exercise 11.9. If the result is used as data then the appropriate target code
operation is used to insert it; otherwise it is handled by the instruction encoding.
In the simplest case the abstract instructions correspond to unique operation codes of the
real machine. In general, however, the correspondence is not so simple: One abstract opera-
tion can represent several instructions, or one of several operation codes could be appropriate
depending upon the operand access paths. Decisions are thus made during instruction en-
coding on the basis of the abstract operator and the attributes of the operand(s) just as in
the case of code generation.
The basic instruction encoding operations are called formats. They are procedures that
take sets of values and add them to the target code so that the result is a single instruction.
These procedures sometimes correspond to the instruction formats recognized by the target
machine's control unit, and hence their name. In many cases, however, the instruction format
shows regularities that can be exploited to reduce the number of encoding formats. For
example, the five instruction formats of the IBM 370 (Figure 11.5a) might correspond to only
three encoding formats (Figure 11.5b).
RR opcode R1 R2
RX opcode R1 X2 B2 D2
RS opcode R1 R3 B2 D2
SI opcode I2 B1 D1
SS opcode L1 L2 B1 D1 B2 D2
a) Instruction formats
FR opcode R1 R2
FI opcode I
FM B D
b) Encoding formats
Figure 11.5: IBM 370 Formats
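As a concrete illustration, the three encoding formats of Figure 11.5b might be realized as procedures that append bytes to the target code. This is a sketch: the byte buffer, the packing of two 4-bit register numbers into one byte, and the assumption that the base register has already been chosen (where the text says the assembler picks it) are ours.

program format_sketch;
const
  code_size = 100;
var
  code : array [1 .. code_size] of integer;  (* target code, one byte per element *)
  next : integer;

procedure emit (b : integer);
begin
  code[next] := b;
  next := next + 1
end;

procedure FR (opcode , r1 , r2 : integer);    (* opcode byte, two 4-bit register fields *)
begin
  emit (opcode);
  emit (r1 * 16 + r2)
end;

procedure FI (i : integer);                   (* one immediate byte *)
begin
  emit (i)
end;

procedure FM (base , displacement : integer); (* base/displacement packed into two bytes *)
begin
  emit (base * 16 + displacement div 256);    (* B (4 bits) and high 4 bits of D *)
  emit (displacement mod 256)                 (* low 8 bits of D *)
end;

begin
  next := 1;
  FR ($1A, 1, 2);   (* AR 1,2 : an RR instruction is a single FR call *)
  FR ($5A, 1, 3);   (* A 1,100(3,13) : an RX instruction is FR ... *)
  FM (13, 100)      (* ... followed by FM, as in Figure 11.6 *)
end.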
An instruction is encoded by calling a sequence of one or more format-encoding proce-
dures. The process can be described in a language resembling a normal macro assembly
language. Figure 11.6 shows a portion of a description of the IBM 370 instruction encod-
ing cast in this form. Each macro body species the sequence of format invocations, using
constants or macro parameters (denoted by the character `%' followed by the position of the
parameter) as arguments. A separate directive, NAME, is used to associate the macro body
with an instruction because many instructions can often use the same encoding procedure.
AR NAME 1AH
SR NAME 1BH
MACRO ; Register,Register
FR %0,%1,%2
ENDM
A NAME 5AH
S NAME 5BH
MACRO ; Register,Memory,Index
FR %0,%1,%3
FM %2
ENDM
AP NAME 0FAH
SP NAME 0FBH
MACRO ; Memory,Length,Memory,Length
FR %0,%2,%4
FM %1
FM %3
ENDM
Note : Suffix `H' denotes hexadecimal.
Figure 11.6: IBM 370 Instruction Encoding
NAME directives may specify an argument, which becomes parameter 0 of the macro. In
Figure 11.6 the NAME directive has been used to supply the hexadecimal operation code for
each instruction. (A hexadecimal constant begins with a digit and ends with `H'.) We use the
IBM mnemonics to denote the instructions; in practice these macros would be represented
by tables and the node type of an abstract operation would appear in place of the symbolic
operation code.
Formal parameters of the macros in Figure 11.6 are described by comments. (Strings
following `;' on the same line are comments.) The corresponding actual parameters are the
operands of the target tree node, and their values will have been established during code
generation or address resolution. Note that a `memory' operand includes its base register but
not an index register. Thus the `FM' format takes a single memory address and encodes it
as a base and displacement. This reflects the fact that the index register is assigned by the
code generator, while the base register is determined during assembly. In other words, the
abstract IBM 370 from which these macros were derived did not have the concept of a based
access.
Consider the LAX expression a + b↑[c]. If a were in register 1, b↑ in register 2 and
c (multiplied by the appropriate element length) in register 3 then the addition could be
performed by a single IBM 370 add instruction with R1 = 1, B2 = 2, X2 = 3 and D2 a
displacement appropriate to the lower bound of the array being referenced. Given the macros
of Figure 11.6, however, this instruction could not be encoded because the abstract machine
has no concept of a based access. Clearly one solution to this problem is to give FM two
arguments and make the base register explicit in the abstract machine; another is to provide
the abstract machine with two kinds of memory address: one in the code sequence and the
other in data memory. We favor the latter solution because these two kinds of memory address
are specified differently. The code generator defines the former by a label and the latter by
a base register and displacement. The assembler must pick a base register for the former but
A NAME 5AH
S NAME 5BH
MACRO ,LABEL ; Register,Memory,Index
FR %0,%1,%3
FM1 %2
ENDM
MACRO ; Register,Base,Index,Displacement
FR %0,%1,%3
FM2 %2,%4
ENDM
a) Selection of different macros
A NAME 5AH
S NAME 5BH
MACRO ; Either pattern
FR %0,%1,%3
IF @%2=LABEL
FM1 %2
ELSE
FM2 %2,%4
ENDIF
ENDM
b) Conditional within a macro
Figure 11.7: Two Memory Operand Types
not the latter. Because of these differences it is probably useful to have distinct target node
formats for the two cases.
Figure 11.7 shows a modification of the macros of Figure 11.6 to allow our second solution.
In Figure 11.7a the add instruction is associated with two macro bodies, and the attribute of
one of the parameters of the first is specified. The specification gives the attribute that the
operand must possess if this macro is to be selected. By convention, the macros associated
with a given name are checked in the order in which they appeared in the definition; param-
eters with no specied attributes match anything. Figure 11.7b combines the two bodies,
using a conditional to select the proper format invocation. Here the operator `@' is used to
select the attribute rather than the value of the parameter. This emphasizes the fact that
there are two components of an operand, attribute and value, which must be distinguished.
What constitutes an attribute of an operand, and what constitutes a value? These ques-
tions depend intimately upon the design of the abstract machine and its relationship to the
actual target instructions. We shall sketch a specific mechanism for defining and dealing with
attributes as an illustration.
The value and attribute of an operand are arbitrary bit patterns of a specified length.
They may be accessed and manipulated individually, using the normal arithmetic and bitwise-
logical operators. Any expression yields a value consisting of a single bit pattern. Two
expressions may be formed into a value/attribute pair by using the quote operator: e1 "e2 .
(See Figure 11.8 for examples.) An operand is compatible with a parameter of a macro if the
following expression yields true :
(@operand and @parameter ) = parameter
Thus the operand R2 would be compatible with the parameters R2, EVENGR and GENREG
in Figure 11.8; it would not be compatible with ODDGR or LABEL. Clearly any operand is
compatible with ANY, and it is this object that is supplied when a parameter specication
is omitted.
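The test might be sketched as follows, with attributes as small bit patterns; the particular bit assignments for GENREG, EVENGR and LABEL are invented for the example (LABEL is spelled LABEL_ below because label is a Pascal reserved word).

program compat_sketch;
const
  GENREG = 1;   (* any general register *)
  EVENGR = 3;   (* general register with even number: GENREG plus an `even' bit *)
  LABEL_ = 4;   (* code-sequence address *)
  ANY    = 0;   (* matches every operand *)

(* the operand must possess every attribute bit the parameter demands *)
function compatible (operand_attr , parameter_attr : integer) : boolean;
begin
  compatible := (operand_attr and parameter_attr) = parameter_attr
end;

begin
  writeln (compatible (EVENGR, GENREG));  (* TRUE: R2 is a general register *)
  writeln (compatible (EVENGR, LABEL_));  (* FALSE: a register is not a label *)
  writeln (compatible (LABEL_, ANY))      (* TRUE: ANY matches anything *)
end.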
Macro languages similar to the one sketched here have been used to specify instruction
encoding in many contexts. Experience shows that they are useful, but if not carefully imple-
mented can lead to very slow processors. It is absolutely essential to implement the formats
by routines coded in the implementation language of the compiler. Macros can be interpreted,
but the interpretive code must be compact and carefully tailored to the interpretation pro-
cess. The normal implementation of a macro processor as a string manipulator is inadequate.
Names should be implemented as a compact set of integers so that access to lists of macro
bodies is direct. Since the number of bodies associated with a name is usually small, linear
search is adequate. Note that a tradeoff is possible between selection on the basis of the name
and selection on the basis of attributes.
As a by-product of the encoding, it is possible to produce a symbolic assembly code version
of the program to aid in the debugging and maintenance of the compiler itself. If the macro
names are specified symbolically, as in Figures 11.6 and 11.7, these can be used as symbolic
operation codes in the listing. The uid that appears as an intrinsic attribute of the label
nodes can be converted into a normal identifier by prefixing a letter. Only constants need
special treatment: a set of target value-to-character conversion procedures must be provided.
Exercises
11.1 Complete Figure 11.1 by adding rules to describe address expressions and construct
the attribute expression.expr .
11.2 [Galler and Fischer, 1964] Consider the problem of mapping storage described by
FORTRAN COMMON, DIMENSION, EQUIVALENCE and DATA statements onto
a sequence of contiguous blocks of storage (one for each COMMON area and one for
local variables).
(a) How can these statements be translated into a target tree of the form discussed
in Section 4.1.4 and Figure 11.1?
(b) Will the translation you describe in (a) ever produce more than one arbitrary -
origin sequence? Carefully explain why or why not.
(c) Does your target tree require any processing by the assembler in addition to that
described in Section 11.1.1? If so, explain why.
11.3 [Talmadge, 1963] Consider the concatenation of all arbitrary -origin sequences dis-
cussed in Section 11.1.1.
(a) Write a procedure to determine the length of an arbitrary -origin sequence.
(b) Write a procedure to scan origin env , nding two arbitrary -origin sequences
and concatenating them by altering the origin element record for the second.
11.4 Consider the implementation of the span-dependence algorithm of Section 11.1.2.
(a) Show that the algorithm has running time proportional to n^2 in the worst case,
where n is the number of span-dependent instructions.
(b) Define a relocation table entry and write the update routine mentioned in step
(5) of the algorithm.
11.5 [Szymanski, 1978] Modify the span-dependence analysis to allow target expressions of
the form label ± constant.
11.6 Consider the code basing problem of Section 11.1.3.
(a) Define any attributes necessary to maintain the state of q within a code sequence,
and modify the rules of Figures 11.1 and 11.3 to include them.
(b) Explain how the operations too short and lengthen (Section 11.1.2) must be
altered to handle this case. Would you prefer to define other operations instead?
Explain.
11.7 [Robertson, 1979] The Data General Nova has an 8-bit address field; addressing
relative to the program counter is allowed, and any address may be indirect. Constants
must be placed in the code sequence within 127 words of the instruction that references
them. If a jump target is further than 127 words from the jump then the address must
be placed in the code sequence as a constant and the jump made indirect. (The size of
the jump instruction is the same in either case.)
(a) Give an algorithm for placing constants that takes advantage of any unconditional
jumps already present in the code, placing constants after them.
(b) Indicate how the constant blocks might be considered span-dependent instruc-
tions, whose size varies depending upon whether or not they contain jump target
addresses.
(c) Show that the problem of optimizing the span-dependence in (b) is NP-complete.
11.8 [Talmadge, 1963] Some symbolic assemblers provide `multiple location counters',
where each location counter defines a sequence in the sense of Section 11.1.1. Pseudo
operations are available that allow the user to switch arbitrarily from one location
counter to another.
(a) Show how a target tree could represent arbitrary sequence changes by using
internally-generated labels to associate `pieces' of the same sequence.
(b) Some computers (such as the Control Data Cyber series) have instructions that
are smaller than a single memory element, but an address refers only to an entire
memory element. How could labels be represented for such a machine? How does
the choice of label representation impact the solution to (a)?
(c) What changes to Figure 11.1 would be needed if we chose not to represent arbitrary
sequence changes by internally-generated labels, but instead gave every `piece' of
the same sequence the same uid ?
(d) If we used the representation for sequences suggested in (c), how would the answer
to (b) change?
11.9 The ultimate value of an address embedded in the target code must be either a number
or a pair (external symbol, number). A number alone may represent either a numeric
operand or a relative address.
(a) Suppose that A, B and C are labels. What form does the value of (A+B)-C take?
Why is (A+B)+C a meaningless address expression?
(b) Specify an attribute that could be used to distinguish the cases mentioned in (a).
(c) If A were an external symbol, would your answer to (a) change? Would your
answer to (b) change? How?
(d) Would you allow the expression (A+B)-(A+C), A an external symbol, B and C
labels? What form would its value take?
end
i := 1;
a) A legal fragment of an ALGOL 60 program
end;
i := 1;
b) The probable intent of (a)
for i := 1 step 1 until 2 × n + 1
c) A probable ineciency in SIMULA
Figure 12.1: Anomalies
and report them as anomalies before their symptoms arise. An example of such a case is the
fragment of ALGOL 60 shown in Figure 12.1a. Since ALGOL 60 treats text following end
as a comment (terminated by else, end or ;), there is no inconsistency here. However, the
appearance of := in the comment makes one suspicious that the user actually intended the
fragment of Figure 12.1b. Many ALGOL 60 compilers will therefore report an anomaly in
this case.
Note that a detectable error may appear as an anomaly before its symptoms arise: A
LAX compiler could report the expression (1/0) as an anomaly even though its symptoms
would not be detected until execution time. Reports of anomalies therefore differ from error
reports in that they are simply warnings that the user may choose to suppress.
Anomalies may be reported even though there is no reason whatever to believe that
they represent true errors; some compilers are quite prepared to simply comment on the
programmer's style. The SIMULA compiler for the Univac 1108, for example, diagnoses
Figure 12.1c as poor style because, as in ALGOL 60, the upper limit of the iteration is
evaluated 2n + 1 times even though its value probably does not change during execution of
the loop. Such reports may also be used to call the programmer's attention to nonstandard
constructs supported by the particular system on which he is running.
A particular implementation normally places some limitations on the language definition,
due to the finite resources at its disposal. (Examples include the limitation of finite-precision
arithmetic, a limit on the number of identifiers in a program, the number of dimensions in
an array or the maximum depth of parentheses in an expression.) Although violations of
implementation-imposed constraints are not errors in the sense discussed above, they have
the same effect for the user. A major design goal is therefore to minimize the number of such
limitations, and to make them as `reasonable' as possible. They should not be imposed lightly,
simply to ease the task of the implementor, but should be based upon a careful analysis of
the cost/benet ratio for user programs.
12.1.2 Responses
We distinguish three possible levels of response to a symptom:
1. Report: Provide the user with an indication that an error has occurred. Specify the
symptom, locate its position precisely, and possibly attempt a diagnosis.
2. Recover: Make the state of the process (compilation, execution) consistent and continue
in an attempt to find further errors.
3. Repair: On the basis of the observed symptom, attempt a diagnosis of the error. If
confident that the diagnosis is correct, make an appropriate alteration in the program
or data and continue.
Both the compiler and the run-time system must at least report every symptom they
detect (level 1). Recovery (level 2) is generally provided only by the compiler, while repair
may be provided by either. The primary criterion for recovery techniques is that the system
must not collapse, since in so doing it may take the error message (and even the precise
location of the symptom) with it. There is nothing more frustrating than a job that aborts
without telling you why!
A compiler that reports the first symptom detected and then terminates compilation is not
useful in practice, since one run would be needed for each symptom. (In an interactive setting,
however, it may be reasonable for the compiler to halt at the rst symptom, requiring the
programmer to deal with it before continuing.) The compiler should therefore recover from
almost all symptoms, allowing detection of as many as possible in a single run. Some errors
(or restrictions) make it impossible for the compiler to continue; in this case it is best to give
a report and terminate gracefully. We shall term such errors deadly, and attempt to minimize
their number by careful language and compiler design.
Recovery requires that the compiler make some alteration of its state to achieve con-
sistency. This alteration may cause spurious errors to appear in later text that is actually
correct. Such spurious errors constitute an avalanche, and one of the major design criteria
for a recovery scheme is to minimize avalanches. We shall discuss this point in more detail in
Section 12.2.
If the compiler is able to diagnose and repair all errors with a high probability of success,
then the program could safely be executed to permit detection of further errors. We must,
however, be quite clear that a repair is not a correction. Much of the early literature on this
subject used these terms interchangeably. This has unfortunate connotations, particularly for
the novice, indicating that the compiler is capable of actually determining the programmer's
intent.
Repair requires some circumspection, since the cost of execution could be very high and
the particular nature of the repair could render that execution useless or could cause it to
destroy important data files. In general, repair should not be attempted unless the user
specifically requests it.
As in the case of recovery, we may classify certain errors as uneconomic or impossible
to repair. These are termed fatal, and may cause us to refuse to execute the program. If a
program containing a fatal error is to be executed, the compiler should produce code to abort
the program when the error location is reached.
user's natural language), restrained and polite. It should be stated in terms of what the user
has done (or not done) rather than in terms of the compiler's internal state. If the compiler
has recovered from the error, the nature of the recovery should be made clear so that any
resulting avalanche will be understandable.
Ideally, error reports should occur in two places: at the point where the compiler noticed
the symptom, and in a summary at the end of the program. By placing a report at the point
of detection, the compiler can identify the coordinates of the symptom in a simple manner
and spare the programmer the task of switching his attention from one part of the listing to
another. The summary report directs the programmer to the point of error without requiring
him to scan the entire listing, reducing the likelihood that errors will be missed.
Compiler error reports may be classied into several levels according to severity:
1. Note
2. Comment
3. Warning
4. Error
5. Fatal error
6. Deadly error
Levels 1-3 are reports of anomalies: Notes refer to nonstandard constructs, and are only
important for programs that will be transported to other implementations; comments criticize
programming style; warnings refer to possible errors. The remaining levels are reports of
actual errors or violations of limits. Errors at level 4 can be repaired, fatal errors suppress
production of an executable program (but the compiler will recover from them), and deadly
errors cause compilation to terminate.
The user should be able to suppress messages below a given severity level. Both the default
severity cutoff and the number of reports possible on each level will vary with the design goals
of the compiler. A compiler for use in introductory programming courses should probably
have a default cutoff of 0 or 1, and produce a plethora of comments and warnings; one for
use in a production operation with a single type of computer should probably have a cutoff
of 3, and do very little repair. The ability to vary these characteristics is a key component in
the adaptability of a compiler.
The programmer's ability to cope with errors seems to be inversely proportional to the
density of errors. If the error density becomes very large, the compiler should probably
abandon the program and let the programmer deal with those errors found so far. (There is
always the chance that a job control error has been made, and the `program' is really a data file
or a program in another language!) It is difficult to state a precise criterion for abandonment,
but possibly one should consider this response when the number of errors exceeds one-tenth
of the number of lines processed and is greater than 10.
The error report file is maintained by a module that provides a single operation:
Error (position , severity , code )
position : The source text position for the message.
severity : One of the numbers 1-6, as discussed above.
code : An integer defining the error.
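A minimal sketch of such a module follows; the fixed-size report table, the cutoff variable and the demonstration values are assumptions, not part of the book's specification.

program error_module_sketch;
const
  max_reports = 200;
type
  severity_level = 1 .. 6;
  report = record
    position : integer;         (* source text coordinate *)
    severity : severity_level;
    code : integer              (* integer defining the error *)
  end;
var
  reports : array [1 .. max_reports] of report;
  report_count : integer;
  cutoff : integer;             (* user-settable suppression level *)

(* Record a report for the end-of-compilation summary unless suppressed. *)
procedure error (position : integer; severity : severity_level; code : integer);
begin
  if severity >= cutoff then
    begin
      report_count := report_count + 1;
      reports[report_count].position := position;
      reports[report_count].severity := severity;
      reports[report_count].code := code
    end
end;

begin
  report_count := 0;
  cutoff := 3;            (* suppress notes and comments *)
  error (120, 2, 7);      (* a comment: suppressed *)
  error (155, 4, 13);     (* an error: recorded for the summary *)
  writeln (report_count)  (* prints 1 *)
end.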
error was detected. We can retain position information for certain constructs and then use
that information later when we have sufficient context to diagnose an error. For example,
suppose that a label was declared in a Pascal program and then never used. The error would
be diagnosed at the end of the procedure declaring the label, but we would give the position
of the declaration in the report and therefore the message `label never used' would point
directly to the declaration.
none of whose associated operators is consistent with the pattern of operand types given. This
symptom could result from an error in one of the operand expressions, or from an erroneous
operator indication. There is no way to be certain which error has occurred, although the
probability of the former is enhanced if one of the operands is consistent with some operator
associated with the indication. In this case, the choice of operator should be based upon the
consistent operand, and might take into account the use of the result. If this choice is not
correct, however, spurious errors may occur later in the analysis. To prevent an avalanche
in this case, we should carry along the information that a semantic error has been repaired.
Further error messages involving type mismatches of this result should then be suppressed.
Another important class of semantic error is the undeclared identifier. We have already
noted (Section 12.1.1) that this error may arise in several ways. Clearly we should produce
an error message if the problem was that the identifier was misspelled on this use, but if
the declaration were misspelled or omitted the messages attached to each use of the variable
constitute an avalanche, and should be suppressed.
In order to distinguish between these cases, we might set up a definition table entry for the
undeclared identifier specifying as many properties as could be determined from the context
of the use. Subsequent occurrences could then be used to refine the properties, but error
messages would not be issued unless the properties were inconsistent. This strategy attempts
to distinguish the cases on the basis of frequency of use of an identifier: At the first use an
error will be reported; thereafter we assume that the declaration is missing or erroneous and
do not make further reports. This method works well in practice. It breaks down when the
programmer chooses an identifier susceptible to a consistent misspelling, or when the text
is entered into the machine by a typist prone to a certain type of error (usually a character
transposition or replacement).
The specific details of the consistency check are language dependent. As a concrete ex-
ample, consider the algorithm used by the Whetstone Compiler for ALGOL 60 [Randell
and Russell, 1964]. (There the algorithm is not used to suppress avalanches, but rather
to resolve forward references to declared identifiers in a one-pass compilation.) The Whet-
stone Compiler created a property set upon the first use of an (as yet) undeclared identifier,
with each element specifying a distinct property that could be deduced from local context
(Table 12.1). The first three elements of Table 12.1 determine the form of the use, while the
remaining nine elements retain information about its context. For each successive occurrence,
a new set A' was established and checked for consistency with the old one, A: The union of
the two must be identical to either set (i.e. A must be a subset of A' or A' must be a subset
of A). If A' is a superset of A, then the new use provides additional information.
Suppose that we encounter the assignment p := q where neither p nor q have been seen
before. We deduce that both p and q must have the form of simple variables, and that
values could be assigned to each; the type must therefore be real, integer or Boolean. If
the assignment r := p + s; were encountered later, we could deduce that p must possess an
arithmetic (i.e. real or integer) value. This use of p is consistent with the former use, and
provides additional information. (Note that the same deduction can be applied to q, but this
relationship is a bit too devious to pursue.) Figures 12.2a and 12.2b show the sets established
for the first and second occurrences of p. If the statement p[i] := 3; were now encountered,
the union of Figure 12.2c with Figure 12.2b would indicate an inconsistency.
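Pascal's set types make this check direct. The following sketch reproduces the example just given; the property names follow Table 12.1, with Pascal reserved words renamed, and the function name is our own.

program property_sketch;
type
  prop = (simple , arrayref , proc , value , variable , arithmetic ,
          boolean_ , integer_ , location , normal , string_ , nopar );
  prop_set = set of prop;

(* Two uses are consistent iff the union equals one of the two sets,
   i.e. one set contains the other. *)
function consistent (a , a1 : prop_set) : boolean;
begin
  consistent := (a <= a1) or (a1 <= a)
end;

var
  first_use , second_use , third_use : prop_set;
begin
  first_use := [simple , value , variable ];                (* from p := q *)
  second_use := [simple , value , variable , arithmetic ];  (* from r := p + s; *)
  third_use := [arrayref , value , variable ];              (* from p[i] := 3; *)
  writeln (consistent (first_use , second_use ));  (* TRUE: the second use refines the first *)
  writeln (consistent (second_use , third_use ))   (* FALSE: an inconsistency is detected *)
end.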
If a declaration is available, we are usually not able to accept additional information about
the variable. There is one case in ALGOL 60 (and in many other languages) in which the
declaration does not give all of the necessary information: A procedure used as a formal
parameter might or might not have parameters of its own, so the declaration does not specify
which of the properties {simple, proc} should appear (Figure 12.2d). That decision must be
deferred until a call of the procedure is encountered.
Property     Meaning
simple       The use takes the form of a simple variable.
array        The use takes the form of an array reference.
proc         The use takes the form of a procedure call.
value        The object may be used in a context where a value is required.
variable     The object may be used in a context where a value is assigned to it.
arithmetic   The object has an arithmetic (i.e. integer or real) value.
Boolean      The object has a Boolean value.
integer      The object has an integer value.
location     The object is either a label or a switch.
normal       The object is not a label, switch or string.
string       The object is a string.
nopar        The object is a parameterless procedure.
Table 12.1: Identifier Properties in the Whetstone ALGOL Compiler
{simple, value, variable}
a) Property set for both p and q derived from p := q
{simple, value, variable, arithmetic}
b) Property set for p derived from r := p + s;
{array, value, variable}
c) Property set for p derived from p[i] := 3;
procedure x(p); procedure p;
d) A declaration that leaves properties unspecified
Figure 12.2: Consistency Checks
to precede it with comment! For ALGOL-like languages simpler methods that can change
more symbols are often superior. On the other hand, global minimum-distance correction
minimizes avalanches.
The symptom of a syntactic error is termed a parser-defined error. Since we parse a
program deterministically from left to right, the parser-defined error is the first symbol t such
that ω is a head of some string in the language, but ωt is not. For example, the string ω
of Figure 12.3a is certainly a head of a legal FORTRAN program, which might continue as
shown in Figure 12.3b. If t is the end-of-statement marker, #, then ωt is not the head of
any legal program. Hence # constitutes a parser-defined error. Possible minimum-distance
corrections are shown in Figure 12.3d. From the programmer's point of view, the first has
the highest probability of being a correct program. This shows that a parser-defined error
may not always coincide with the point of the error in the user's eyes. This is especially true
for bracketing errors, which are generally the most difficult to repair.
DO 10 I = J(K,L
a) A head, ω, of a FORTRAN program
ω)#
b) A possible continuation (# is end-of-statement)
ω#
c) A parser-defined error
DO 10 I = J,K,L
DO 10 I = J(K,L)
d) Two minimum-distance corrections
Figure 12.3: Syntax Errors
Ad hoc parsing techniques, and even some of the older formal methods, may fail to detect
any errors at all in certain strings not belonging to the language. Other approaches (e.g. simple
precedence) may delay the point of detection arbitrarily. The LL and LR algorithms will
detect the error immediately, and fail to accept t. This not only simplifies the localization of
the symptom in the listing, but also avoids the need to process any syntactically incorrect text.
Recovery is eased, since the immediate context of the error is still available for examination
and alteration.
If ωtη ∈ (T* - L) is an erroneous program with parser-defined error t, then to effect
recovery the parser must alter either ω or tη such that ω'tη ∈ L or ωt'η' ∈ L. Alteration of ω
is unpleasant, since it may involve undoing the effects of connection points. It will also slow
the processing of correct programs to permit backtrack when an error is detected. Thus we
shall only consider alteration of the erroneous symbol t and the following string η.
Our basic technique will be to recover from each error by the following sequence of steps:
1. Determine a continuation, γ, such that ωγ ∈ L.
2. Construct a set of anchors D = {d ∈ T | γ' is a head of γ and ωγ'd is a head of some
string in L}.
3. Find the shortest string δ ∈ T* such that tη = δt''η'; t'' ∈ D.
4. Discard δ from the input string and insert the shortest string μ ∈ T* such that ωμt''
is a head of some string in L.
5. Resume the normal parse.
This procedure can never cause the error recovery process to loop indefinitely, since at
least one symbol (t'') of the input string is consumed each time the parser is restarted. Note
also that it is never necessary to actually alter the input string during step (4); the parser
is simply advanced through the required steps. A dummy symbol of the appropriate kind is
created at each symbol connection encountered during this advance.
The sequence of connection points reported by the parser is always consistent when this
error recovery technique is used. Semantic analysis can therefore proceed without checking
for inconsistent input. Generated symbols, however, must be recognized as having arbitrary
attributes. This is guaranteed by using special `erroneous' attribute values as discussed in
the previous section.
It is clear from the example of Figure 12.3 that we can make no claim regarding the
`correctness' of the continuation determined during step (1). The quality of the recovery in the
eyes of the user depends upon the particular continuation chosen, but it seems unlikely that we
will find an algorithm that `optimizes' this choice at acceptable cost. We therefore advocate
a process that can be incorporated into a parser generator and applied automatically without
any effort on the part of the compiler writer. The most important benefit is a guarantee
that the parser will recover from all syntactic errors, presenting only consistent input to the
semantic analyzer. This guarantee cannot be made with ad hoc error recovery techniques.
P = {Z → E#,
     E → FE',
     E' → +FE', E' → ε,
     F → i, F → (E)}
a) Productions of the grammar
Z → E#
E → FE'
E' → ε
F → i
b) Designated productions
*q0 i → q1 q2 i,   q0 ( → q1 q2 (,
*q1 → ,
*q2 i → q3 q4 i,   q2 ( → q3 q5 (,
*q3 # → q6 q7 #,   q3 ) → q6 q7 ),   q3 + → q6 q8 +,
*q4 i → q9 ,
*q5 ( → q10 ,
*q6 → ,
*q7 → ,
*q8 + → q11 ,
*q9 → ,
*q10 i → q12 q2 i,   q10 ( → q12 q2 (,
*q11 i → q13 q4 i,   q11 ( → q13 q5 (,
*q12 ) → q14 ,
*q13 # → q15 q7 #,   q13 ) → q15 q7 ),   q13 + → q15 q8 +,
*q14 → ,
*q15 →
c) The transitions of the parsing automaton (compare Figure 7.4)
Figure 12.4: Adding Error Recovery to an LL(1) Parser
q0               i+#
q1 q2            i+#
q1 q3 q4         i+#
q1 q3 q9         +#
q1 q3            +#
q1 q6 q8         +#
q1 q6 q11        #
a) Parse to the point of error detection
q1 q6 q11        D = {i, (}
q1 q6 q13 q4
q1 q6 q13 q9
q1 q6 q13        D = {i, (, #, ), +}
q1 q6 q15 q7
q1 q6 q15
q1 q6
q1
b) Continuation to the final state
q1 q6 q11        #
q1 q6 q13 q4     #
q1 q6 q13 q9     #     i is generated by q4 i → q9
q1 q6 q13        #     the normal parse may now continue
c) Continuation to the resume point
Figure 12.5: Recovery Using Figure 12.4c
We begin by designating one production for each nonterminal, such that the set of desig-
nated productions contains no recursion. For example, in the production set of Figure 12.4a
we would designate the productions listed in Figure 12.4b. (With this example the desig-
nation is unique, a condition seldom encountered in larger grammars.) We then reorder the
productions for each nonterminal so that the designated production is rst, and apply the
parser generation algorithms of Chapters 5 and 7. As the transitions of the parsing automata
are derived, certain of them are marked. When an error occurs during the parse, we choose
a valid continuation by allowing the parsing automaton to carry out the marked transitions
until it reaches its final state. No input is read during this process, but at each step the set
of input symbols that could be accepted is added to the set of anchors.
Construction 5.23, as modified in Section 7.2.1 for strong LL(1) grammars, was used to
generate the automaton of Figure 12.4c. The transitions were marked as follows (marked
transitions are preceded by an asterisk in Figure 12.4c):
Any transition introduced by step 3 or step 4 of the construction was marked.
The elements of H in step 5' are listed in the order discussed in the previous paragraph.
The first transition qω → qh[1]ω of a group introduced by step 5' was marked.
To see the details of the recovery, consider the erroneous sentence i+#. Figure 12.5a traces
the actions of the automaton up to the point at which the error is detected. The continuation
is traced in Figure 12.5b. Note that the input is simply ignored, and the stack is updated
as though the parser were reading symbols that caused it to make the marked transition. At
each step, all terminal symbols that could be accepted are added to D. Figure 12.5c shows
(1) Z → E#
(2) E → E + F,   (3) E → F
(4) F → i,   (5) F → (E)
a) The grammar
0: Z → ·E ; #           4: F → (·E) ; #+)
   E → ·F ; #+             E → ·F ; )+
   E → ·E + F ; #+         E → ·E + F ; )+
   F → ·i ; #+             F → ·i ; )+
   F → ·(E) ; #+           F → ·(E) ; )+
1: Z → E· ; #           5: E → E + ·F ; #+)
   E → E· + F ; #+         F → ·i ; #+)
                           F → ·(E) ; #+)
2: E → F· ; #+)
                        6: F → (E·) ; #+)
3: F → i· ; #+)            E → E· + F ; )+
7: E → E + F· ; #+)
8: F → (E)· ; #+)
b) States of the Automaton
       i     (     )     +     #     E     F
0     -4'    4     .     .     .     1    -3
1      .     .     .     5    *1'
4     -4'    4     .     .     .     6    -3
5     -4'    4     .     .     .          -2
6      .     .    -5'    5     .
c) The transition function for the parser
Figure 12.6: Error Recovery in an LR(0) Parser
the remainder of the recovery. No symbols are deleted from the input string, since # is in
the set of anchors. The parser now follows the continuation again, generating any terminal
symbols needed to cause it to make the marked transitions. When it reaches a point where
the rst symbol of the input string can be accepted, the normal parse resumes.
Let us now turn to the LR case. Figure 12.6a shows a left-recursive grammar for the same
language as that defined by the grammar of Figure 12.4a. The designated productions are
1, 3 and 4. If we reorder productions 2 and 3 and then apply Construction 5.33, we obtain
the states of Figure 12.6b. The situations are given in the order induced by the ordering of
the productions and the mechanics of Construction 5.33. Figure 12.6c shows the transition
table of the automaton generated from Figure 12.6b, incorporating shift-reduce transitions.
The marked transition in each state (indicated by a prime) was the first shift, reduce or
shift-reduce transition generated in that state considering the situations in order.
An example of the LR recovery is given in Figure 12.7, using the same format as Fig-
ure 12.5. The erroneous sentence is i+)i#. In this case, ) does not appear in the set of
anchors and is therefore deleted.
One obvious question raised by use of automatic syntactic error recovery is that of provid-
ing meaningful error reports for the user. Fortunately, the answer is also obvious: Describe
q0               i+)i#
q0 q1            +)i#
q0 q1 q5         )i#
a) Parse to the point of error detection
q0 q1 q5         D = {i, (}
q0 q1            D = {i, (, +, #}
b) Continuation to the final state
q0 q1 q5         i#     the normal parse may now continue
c) Continuation to the resume point
Figure 12.7: LR Error Recovery
the repair that was made! This description requires one error number per token class (Sec-
tion 4.1.1) to report insertions, plus a single error number to report deletions. Since token
classes are usually denoted by a nite type, the obvious choice is to use the ordinal of the
token class as the error number to indicate that a token of that class has been inserted.
Missing or superfluous closing brackets always present the danger that avalanches will
occur because brackets are inserted in (globally) unsuitable places. For this reason we must
take cognizance of error recovery when designing the grammar. In particular, we wish to
make bracketed constructs `visible' as such to the error recovery process. Thus the grammar
should be written to ensure that closing brackets appear in the anchor sets for any errors
that could cause them to be deleted from the input string. This condition guarantees that an
opening bracket will not be deleted by mistake and lead to an avalanche error at the matching
closing bracket. It is easy to see that the grammar of Figure 12.4a satisfies the condition, but
that it would not if F were defined as follows:
F → i, F → (F',
F' → E)
12.2.3 Lexical Errors
The lexical analyzer recognizes two classes of lexical error: Violations of the regular grammar
for the basic symbols and illegal characters not belonging to the terminal vocabulary of the
language or, in languages with stropping conventions, misspelled keywords.
Violations of the regular grammar for the basic symbols (`structural' errors), such as the
illegal LAX floating point number .E2, are recovered in essentially the same way as syntax
errors. Characters are not usually deleted from the input string, but insertions are made as
required to force the lexical analyzer to either a final state or a state accepting the next input
character. If a character can neither form part of the current token, nor appear as the first
character of any token, then it must be discarded. A premature transition to a final state
can make two symbols out of one, usually resulting in syntactic avalanche errors. A third
possibility is to skip to a symbol terminator like `space' and then return a suitable symbol
determined in an ad hoc manner. This is interesting because in most languages lexical errors
occur primarily in numbers, where the kind of symbol is known.
Invalid characters are usually deleted without replacement. Occasionally these characters
are returned to the parser so it can give a more informative report. This behavior violates
the important basic principle that each analysis task should cope with its own errors.
When keywords are distinguished by means of underlines or bracketed by apostrophes,
the compiler has sufficient information available to attempt a more complete recovery by
checking for certain common misspellings. If we restrict ourselves to errors consisting of
a : array [1 : 4, 1 : 4] of real;
...
b := a[3, i] / (1 + c * 2)
a) A LAX fragment
1 ≤ 3 ≤ 4
1 ≤ i ≤ 4
1 + c * 2 ≠ 0
b) Relationships implied by the LAX definition and (a)
J=K*L
c) A FORTRAN statement
|K * L| < 2^48
d) Relationship implied by the Control Data 6000 FORTRAN implementation and (c)
ASSERT m = n
e) Relationship explicitly stated by the programmer
Figure 12.8: Implicit and Explicit Relationships
in the program. The numbers may be chosen in various ways: One of the simplest is to
use the address of the first instruction generated by the source line. (This numbering, like
others discussed below, may contain gaps.) The contents of the location counter provide a
direct reference to the program line if the compiler produces absolute code. If the compiler
produces relocatable code and the final target program is drawn from several sources, then
the conversion f(z) first requires identification of the (separately compiled) program unit by
means of a load map produced when the units are linked. This map gives the absolute address
of each program unit. The relative address appearing on the listing is obtained by subtracting
the starting address from the address of the erroneous instruction.
If the compiler has used several areas for instructions (Section 11.2), the monotonicity of
the (relative) addresses is no longer guaranteed and we must use arbitrary sequence numbers.
These numbers could be provided by the programmer himself or supplied by the compiler. In
the latter case the number could be incremented for each line or for each construct of a given
class (for example, assignments).
When arbitrary sequence numbers are used, the compiler must either store f(z) in tabular
form accessible to the run-time system or insert instructions into the target program to place
the current sequence number into some specified memory location. If the table is given in a
file, a relationship between the table and the program must be established by the run-time
system; no further cost is incurred. In the second case all information is held within the
program and a run-time overhead in both time and space is implied.
The line number, and even the position within the line, can be given for each instruction
if a table is used. For dynamic determination of line numbers, the line number must be set
in connection with a suitable syntactic unit of the source program. The instructions making
up an assignment, for example, do not always occur in the order in which they appear in the
source program. This is noticeable when the assignment is spread over several source lines.
Of course the numbering need only be updated at those syntactic units that might fail; it
may be omitted for the empty statement in ALGOL 60, for example.
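As an illustration of the tabular variant, the run-time system can locate the line by binary
search in a compiler-emitted table of (address, line) pairs sorted by address. The following
Pascal sketch assumes such a table; all names and the sample entries are ours:

program linelookup(output);
const
  maxentry = 4;
type
  entry = record addr, line: integer end;
var
  f: array [1..maxentry] of entry;   { sorted by addr }

  { Return the source line of the instruction at address z:  }
  { the line of the last table entry whose address is <= z.  }
  function line_of(z: integer): integer;
  var lo, hi, mid: integer;
  begin
    lo := 1; hi := maxentry;
    while lo < hi do begin
      mid := (lo + hi + 1) div 2;
      if f[mid].addr <= z then lo := mid else hi := mid - 1
    end;
    line_of := f[lo].line
  end;

begin
  f[1].addr := 0;  f[1].line := 1;
  f[2].addr := 12; f[2].line := 2;
  f[3].addr := 20; f[3].line := 5;
  f[4].addr := 36; f[4].line := 6;
  writeln(line_of(22))   { prints 5 }
end.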
provide redundant information, for example special bit patterns in particular places, to aid
in this process. Further details may be found in the literature on symbolic debugging packages
[Hall, 1975; Pierce, 1974; Satterthwaite, 1972; Balzer, 1969; Gaines, 1969].
Exercises
12.1 Define the class of detectable errors for some language available at your installation.
Which of these are detected at compile time? At run time? Are any of the detectable
errors left undetected? Have you made any such errors in your programming?
12.2 We have classified the LAX expression (1/0) as a compile-time anomaly, rather than a
compile-time error. Some authors disagree, arguing that if the expression is evaluated at
run time it will lead to a failure and that if it can never be evaluated then the program
is erroneous for other reasons. Write a cogent argument for or against (whichever you
prefer) our classification.
12.3 The definition of the programming language Euclid specifies minimum limitations that
may be placed on programs by an implementation. For example, the definition requires
that any compiler accept expressions having parentheses nested to depth 7, and
programs having environments nested to depth 31. The danger of setting such minimum
limits is pointed out by Sale [1977], who demonstrates that the requirement
for environments nested to depth 31 effectively precludes implementation of Euclid
on Burroughs 6700 and 7700 equipment. Comment on the advantages and disadvantages
of the Euclid approach, indicating the scope of the problem and possible compromise
solutions.
12.4 Consider some compiler running at your installation. How are its error messages com-
municated to the user? If the result gives less information than the model we discussed
in Section 12.1.3, argue for or against its adequacy. Were there any constraints on the
implementor forcing him to his choice?
12.5 Experiment with some compiler running at your installation, attempting to create
an avalanche based upon a semantic error. If you succeed, analyze the cause of the
avalanche. Could it have been avoided? How? At what cost to correct programs?
If you do not succeed, analyze the cause of your failure. Is the language subject to
avalanches from semantic errors? Is the implementation very clever, possibly at some
cost to correct programs?
12.6 Under what conditions might a simple precedence analyzer [Gries, 1971] delay detec-
tion of an error?
12.7 [Röhrich, 1980] Give an algorithm for designating productions of a grammar so that
there is one production designated for each nonterminal, and the set of designated
productions contains no recursion.
12.8 Apply the syntactic error recovery technique of Section 12.2.2 to a recursive descent
parser based upon extended BNF (Section 7.2.2).
12.9 Apply both the automaton of Figure 12.4c and that of Figure 12.6c to the string
(i(i + i#. Do you feel that the recovery is reasonable?
12.10 [Dunn and Waite, 1981] Consider the modication of Figure 7.9 to support automatic
error recovery.
(a) Assuming that the form of the table entry remained unchanged, how would you
incorporate the definition of the continuation into the tables?
(b) Based upon your answer to (a), write procedures parser_error, get_anchor and
advance_parser to actually carry out the recovery. These procedures should be
nested in parser as follows, and parser should be modified appropriately to
invoke them:
parser
parser_error
get_anchor
advance_parser
(c) Carefully explain your mechanism for generating symbols. Does it require access
to information known only to the lexical analysis module? If so, how do you obtain
this information?
12.11 [Morgan, 1970] Design an algorithm for checking the equivalence of two strings under
the transformations discussed in Section 12.2.3. How would you interface this algorithm
to the analysis process discussed in Chapters 6 and 7? Be specific!
12.12 Consider some compiler running at your installation. How is the static location of
a run-time error determined when using that compiler? To what extent could the
determination be automated without making any change to the compiler? What (if
anything) would such automation add to the cost of running a correct program?
12.13 [Kruseman-Aretz, 1971] A run-time error-reporting system for ALGOL 60 programs
uses a variable lnc to hold the line number of the rst basic symbol of the smallest
statement whose execution has begun but not yet terminated. We wish to minimize
the number of assignments to lnc . Give an algorithm that decides when assignments
to lnc must be generated.
12.14 Consider some compiler running at your installation. How is the dynamic environment
of a run-time error determined when using that compiler? To what extent could the
determination be automated without making any change to the compiler? What (if
anything) would such automation add to the cost of running a correct program?
12.15 [Bayer et al., 1967] Consider some language and machine with which you are familiar.
Define a reasonable symbolic dump format for that language, and specify the information
that a compiler must supply to support it. Give a detailed encoding of the
information for the target computer, and explain the cost increase (if any) for running
a correct program.
Chapter 13
Optimization
Optimization seeks to improve the performance of a program. A true optimum may be too
costly to obtain because most optimization techniques interact, and the entire process of
optimization must be iterated until there is no further change. In practice, therefore, we
restrict ourselves to a fixed sequence of transformations that leads to useful improvement in
commonly-occurring cases. The primary goal is to compensate for inefficiencies arising from
the characteristics of the source language, not to lessen the effects of poor coding by the
programmer. These inefficiencies are inherent in the concept of a high level language, which
seeks to suppress detail and thereby simplify the task of implementing an algorithm.
Every optimization is based upon a cost function, a meaning-preserving transformation,
and a set of relationships occurring within some component of the program. Code size,
execution time and data storage requirements are the most commonly used cost criteria; they
may be applied individually, or combined according to some weighting function.
The boundary between optimization and competent code generation is fuzzy. We have
chosen to regard techniques based upon processing of an explicit computation graph as opti-
mizations. A computation graph is implicit in the execution-order traversal of the structure
tree, as pointed out at the beginning of Chapter 10, but the code generation methods dis-
cussed so far do not require that it ever appear as an explicit data structure. In this chapter
we shall consider ways in which a computation graph can be manipulated to improve the
performance of the generated code.
Our treatment in this chapter differs markedly from that in the remainder of the text.
The nature of most optimization problems makes computationally efficient algorithms highly
unlikely, so the available techniques are all heuristic. Each has limited applicability and
many are quite complex. Rather than selecting a particular approach and exploring it in
detail, we shall try to explain the general tasks and show how they fit together. Citations to
appropriate literature will be given along with the discussion. In Section 13.1 we motivate the
characteristics of the computation graph and sketch its implementation. Section 13.2 focuses
on optimization within a region containing no jumps, while Section 13.3 expands our view
to a complete compilation unit. Finally, Section 13.4 gives an assessment of the gains to be
expected from various optimizations and the costs involved.
program being optimized. In the first place, the structure tree reflects the semantics of the
source language and therefore suppresses detail. Secondly, execution-order tree traversals
depend upon the values of specified attributes and hence cannot be generated mechanically
by the tools of Chapter 8.
Data access operations are often implicit in the target machine code as well: They are
incorporated into the access paths of instructions, rather than appearing as separate
computations. Because of this, it is difficult to isolate them and discover patterns that can be
optimized. The target tree is thus also an unsuitable representation for use by an optimizer.
To avoid these problems, we define the computation graph to have the following properties:
All source operations have been replaced by (sequences of) operations from the instruction
set of the target machine. Coercions appear as machine operations only if
they result in code. Other coercions, which only alter the interpretation of the binary
representation of a value, are omitted.
Every operation appears individually, with the appropriate number of operands.
Operands are either intermediate results or directly-accessible values. Each value has a
specified target type.
All address computations are explicit.
Assignments to program variables are separated from other operations.
Control flow operations are represented by conditional and unconditional jumps.
Although based upon target machine operations, the computation graph is largely
machine-independent because the instruction sets of most Von Neumann machines are very
similar.
We assume that every operation has no more than one result. To satisfy this assumption,
we either ignore any side effects of the machine instruction(s) implementing the operation or
we create a sequence of operations making those side effects explicit. In both cases we rely
upon subsequent processing to generate the proper instructions. For example, the arithmetic
operations of some machines set the condition code as a side effect. We ignore this, producing
comparison operators (whose one result is placed in the condition code) where required.
Peephole optimization (Section 13.2.3) will remove superfluous comparisons in cases where a
preceding arithmetic operation has properly set the condition code. The second approach is
used to deal with the fact that on many machines the integer division instruction yields both
the quotient and the remainder. Here we create a sequence of two operations for both div
and mod. The first operation in each case is divmod; the second is a unary selector, div or
mod respectively, that operates on the result of divmod. Common subexpression elimination
(Section 13.2.1) will remove any superfluous divmod operators.
The atoms of the computation graph are tuples. A tuple consists of an operator of the
(abstract) target machine and one or more operands, each of which is either a value known
to the compiler or the result of a computation described by a tuple. Each appearance of a
tuple in the computation graph is called a program point, and given an integer index greater
than 0.
Let o1 and o2 be operands in a computation graph. These operands are congruent if
they are the same known value, or if they are the results of tuples t1 and t2 with the same
numbers of operands for which operator(t1) = operator(t2) and operand_i(t1) is congruent to
operand_i(t2) for all i. A unique operand identifier is associated with each set of congruent
operands, and this identifier is used to denote all of the operands in the set.
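The definition translates directly into a recursive test. The following Pascal sketch uses a
tuple representation of our own devising; it is not the book's data structure:

program congruencetest(output);
const
  maxop = 2;
type
  tupleptr = ^tuple;
  operand = record
    known: boolean;          { a value known to the compiler? }
    value: integer;          { the known value, if so }
    source: tupleptr         { otherwise the tuple computing it }
  end;
  tuple = record
    op: char;                { abstract target machine operator }
    nops: integer;
    opnd: array [1..maxop] of operand
  end;
var
  t1, t2: tupleptr;
  a, b: operand;

  { direct recursive rendering of the congruence definition }
  function congruent(var x, y: operand): boolean;
  var i: integer; ok: boolean;
  begin
    if x.known or y.known then
      congruent := x.known and y.known and (x.value = y.value)
    else if (x.source^.op <> y.source^.op) or
            (x.source^.nops <> y.source^.nops) then
      congruent := false
    else begin
      ok := true;
      for i := 1 to x.source^.nops do
        ok := ok and congruent(x.source^.opnd[i], y.source^.opnd[i]);
      congruent := ok
    end
  end;

begin
  new(t1); new(t2);                       { two occurrences of 1 + 2 }
  t1^.op := '+'; t1^.nops := 2;
  t1^.opnd[1].known := true; t1^.opnd[1].value := 1;
  t1^.opnd[2].known := true; t1^.opnd[2].value := 2;
  t2^ := t1^;
  a.known := false; a.source := t1;
  b.known := false; b.source := t2;
  writeln(congruent(a, b))                { prints TRUE }
end.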
Figure 13.1b has 12 program points and 9 distinct tuples. Values known to the compiler
have the corresponding source language constructs as their operand identifiers. The full
definition of a tuple is given only at its first occurrence; subsequent occurrences are denoted
procedure has the same properties as a LAX or ALGOL 68 pointer in most languages, except
that the accessibility is limited to objects outside the current activation record. A procedure
call must be assumed to use and potentially modify every variable visible to that procedure,
as well as every variable passed to it as a reference parameter.
To construct the computation graph, we apply the storage mapping, target attribution
and code selection techniques of Sections 10.1-10.3. These methods yield the tuples in an
execution order determined by the target attributes, in particular the register estimate. The
only changes lie in the code selection process (Section 10.3), where the abstract nature of the
computation graph must be reflected.
A new value class, generated, must be introduced in Figure 10.10. If the class of
a value descriptor is generated, the variant part contains a single id field specifying an
operand identifier. Decision tables (such as Figure 10.13) do not have tests of operand value
class in their condition stubs, nor do they generate different instructions for memory and
register operands. The result is a significant reduction in the table size (Figure 13.3). Note
that the gen routine calls in Figure 13.3 still specify machine operation codes, even though
no instruction is actually being produced. This is done to emphasize the fact that the tuple's
operator is actually a machine operator. In this case we have chosen `A' to represent IBM
370 integer addition. A tuple whose operator was A might ultimately be coded using an AR
instruction or appear as an access path of an RX-format instruction, but it would never result
in (say) a floating add.
Result correct Y Y Y Y N N N N
l correct Y Y N N Y Y N N
r correct Y N Y N Y N Y N
swap(l,r) X X
gen(A,l,r) X X X X
gen(S,l,r) X X X X
gen(LCR,l,l) X X
Figure 13.3: Decision Table for +(integer, integer) → integer Based on Figure 10.13
The gen routine's behavior is controlled by the operator and the operand descriptor
classes. When the operands are literal values and the operator is one made available by the
constant table, then the specified computation is performed and the appropriate literal value
delivered as the result. In this case, nothing is added to the computation graph. Memory
operands (either addresses or values) are checked to determine whether they are directly
addressable. If not, tuples are generated to produce the specified results. In any case, the
value descriptors are altered to class generated and an appropriate operand identifier is
inserted. Finally a tuple is generated to describe the current operation and the proper operand
identifier is inserted into the value descriptor for the left operand.
Although we have not shown it explicitly, part of the input to the gen routine specifies
the program variables potentially used and destroyed. This information is used to derive the
dependency sets. An example giving the flavor of the process can be found in the description
of Bliss-11 [Wulf et al., 1975].
Our strategy for optimizing a basic block is to carry out the following steps in the order
indicated:
1. Value Numbering: Perform a `symbolic execution' of the block, propagating symbolic
values and eliminating redundant computations.
2. Coding: Collect access paths for program variables and combine them with operations
to form valid target machine instructions, assuming an infinite set of registers.
3. Peephole Optimization: Attempt to combine sequences of instructions into single
instructions having the same effect.
4. Register Allocation: Map the register set resulting from the coding step onto the
available target machine registers, generating spill code (code to save and/or restore
registers) as necessary.
Throughout this section we assume that all program variables are potentially accessed
after the end of the basic block, and that no tuple values are. The latter assumption fails for
an expression-oriented language, and in that case we must treat the tuple representing the
final value of the expression computed by the block as a program variable. Section 13.3 will
consider the more general case occurring as a result of global optimization.
13.2.1 Value Numbering
Access computations for composite objects are rich sources of common subexpressions. One
classic example is the code for the following FORTRAN statement, used in solving three-
dimensional boundary value problems:
invalid := initialize_vn;
for o ∈ ∪[U_t ∪ D_t] do PV[o] := invalid;
for t := first_tuple to last_tuple do
  begin
  if (t = "v↑") and (PV[v] ≠ invalid) then
    for o ∈ D_t do PV[o] := PV[v]
  else
    begin
    T := evaluate(t);
    if not is_value(T, PV[t]) then
      begin
      result := new_value(T);
      for o ∈ X_t do PV[o] := invalid;
      for o ∈ D_t do PV[o] := result;
      end
    end
  end;
a) The algorithm
Operation                             Meaning
initialize_vn : value_number          Clear the output block and return the first
                                      value number.
evaluate(tuple) : tuple               Create a new tuple by replacing each t in
                                      the argument by PV[t]. Return the newly-
                                      created tuple.
is_value(tuple, operand) : boolean    If the last occurrence of tuple in the out-
                                      put block was associated with PV[operand]
                                      then return true, otherwise return false.
new_value(tuple) : value_number       Add tuple to the output block, associating
                                      it with a new value number. Return the
                                      new value number.
b) Operations of the output module
Figure 13.4: Value Numbering
and t12 result in the last two tuples of Figure 13.5c. As can be seen from this example, value
numbering recognizes common subexpressions even when they are written differently in the
source program.
In more complex examples than Figure 13.5, the precise identity of the accessed object
may not be known. For example, the value of a[i] in Figure 13.2a might be altered even
though none of the assignment tuples in the corresponding straight-line segment has a[i] as
a target. The analysis uses X_t10 to account for this phenomenon, yielding the basic block
of Figure 13.6. Note that the algorithm correctly recognizes the address of a[i] as being a
common subexpression.
The last step in the value numbering process is to delete redundant assignments to program
variables (such as v1 in Figure 13.5c) and, as a byproduct, to develop use counts for all of the
tuples. Figure 13.7 gives the algorithm. Since each tuple value is defined exactly once, and
never used before it is defined, USECOUNT[v] will give the number of uses of v at the end
of the algorithm. The entries for program variables, on the other hand, may not be accurate
because they include potential uses by procedures and pointer assignments.
a := 2;
b := a * X + 1;
a := 2 * X;
c := a + 1 + b;
a) A sequence of assignments
Tuple              U            D           X
t1 : a := 2        {}           {a}         {t2, t4, t5, t6, t9, t11, t12}
t2 : a↑            {a}          {t2}        {}
t3 : X↑            {X}          {t3}        {}
t4 : t2 * t3       {t2, t3}     {t4}        {}
t5 : t4 + 1        {t4}         {t5}        {}
t6 : b := t5       {t5}         {b, t6}     {t10, t11, t12}
t7 : 2 * t3        {t3}         {t7}        {}
t8 : a := t7       {t7}         {a, t8}     {t2, t4, t5, t6, t9, t11, t12}
t9 : t2 + 1        {t2}         {t9}        {}
t10 : b↑           {b}          {t10}       {}
t11 : t9 + t10     {t9, t10}    {t11}       {}
t12 : c := t11     {t11}        {c, t12}    {}
b) Tuples and sets for (a)
v1 : a := 2       v5 : b := v4
v2 : X↑           v6 : a := v3
v3 : 2 * v2       v7 : v4 + v4
v4 : v3 + 1       v8 : c := v7
c) Transformed computation graph
Figure 13.5: Common Subexpression Elimination
The analysis discussed in this section can be easily generalized to extended basic blocks.
Each path through the tree of basic blocks is treated as a single basic block; when the control
flow branches, we save the current information in order to continue the analysis on the other
branch. Should constant folding determine that the condition of a conditional jump is fixed,
we replace this conditional jump by an unconditional jump or remove it. In either case one of
the alternatives and the corresponding basic block is superfluous and its code can be deleted.
These situations arise most frequently in automatically-generated code, or when the if ...
then ... else construct, controlled by a constant defined at the beginning of the program, is
used for conditional compilation.
To generalize Figure 13.7, we begin by analyzing the basic blocks at the leaves of the
extended basic block. The contents of USECOUNT are saved, and analysis restarted on a
for o ∈ ∪_v [U_v ∪ D_v] do USECOUNT[o] := 0;
for o ∈ {program variables} do USECOUNT[o] := 1;
for v := last_tuple downto first_tuple do
  begin
  c := 0;
  for o ∈ D_v do
    begin
    c := c + USECOUNT[o];
    if o is a program variable then USECOUNT[o] := 0;
    end;
  if c = 0 then
    delete tuple v
  else
    for o ∈ U_v do USECOUNT[o] := USECOUNT[o] + 1;
  end;
Figure 13.7: Redundant Assignment Elimination and Use Counting
predecessor block by resetting each element of USECOUNT to the maximum of the saved values
for the successors. We cannot guarantee consistency in the use counts by this method, since
not all of the use counts must reach their maxima along the same execution path. It turns
out, however, that this inconsistency is irrelevant for our purposes.
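For concreteness, the algorithm of Figure 13.7 can be exercised on a small block as follows.
This is a sketch with a fixed representation of our own (operand identifiers 1..4 denote
tuples, 13 and 14 the program variables a and X); it is not the book's implementation:

program usecounts(output);
const
  ntuples = 4; nids = 20;
type
  opset = set of 1..nids;
var
  U, D: array [1..ntuples] of opset;
  progvar: opset;
  usecount: array [1..nids] of integer;
  v, o, c: integer;
begin
  { v1: a := 2;  v2: X^;  v3: 2 * v2;  v4: v3 + 1 }
  U[1] := [];    D[1] := [1, 13];
  U[2] := [14];  D[2] := [2];
  U[3] := [2];   D[3] := [3];
  U[4] := [3];   D[4] := [4];
  progvar := [13, 14];
  for o := 1 to nids do
    if o in progvar then usecount[o] := 1 else usecount[o] := 0;
  for v := ntuples downto 1 do begin
    c := 0;
    for o := 1 to nids do
      if o in D[v] then begin
        c := c + usecount[o];
        if o in progvar then usecount[o] := 0
      end;
    if c = 0 then
      writeln('tuple ', v, ' deleted')   { v4, v3, v2: results unused }
    else
      for o := 1 to nids do
        if o in U[v] then usecount[o] := usecount[o] + 1
  end
end.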
13.2.2 Coding
The coding process is very similar to that of Section 10.3. We maintain a value descriptor
for each operand identifier, and simulate the action of the target computer using these value
descriptors as a data base. There is no need to maintain register descriptors, since we are
assuming an infinite supply.
Figure 13.8 gives two possible codings of Figure 13.1a for the IBM 370. Our notation for
describing the instructions is essentially that of Davidson and Fraser [1980]: `R[...]' means
`contents of register ...' and `M[...]' means `contents of the memory location addressed by
...'. Register numbers greater than 15 represent `abstract registers' of the infinite-register
machine, while those less than 15 represent actual registers whose usage is prescribed by the
mapping specification. (As discussed in Section 10.2.1, register 13 is used to address the local
activation record.)
The register transfer notation of Figure 13.8 is independent of the target machine (although
the particular descriptions of Figure 13.8b are specific to the IBM 370), and is useful
for the peephole optimization discussed at the end of this section. Figure 13.8b is not a
complete description of the register transfers for the given instructions, but it suffices for the
current example. Later we shall show an example that uses a more complete description.
The differences between the left and right columns of Figure 13.8b stem from the choice
of the left operand of the multiply instruction, made when the second line was generated.
Because the multiply is a two-address instruction, the value of the left operand will be replaced
by the value of the result. Wulf et al. [1975] call this operand the target path.
In generating the left column of Figure 13.8b, we used Wulf's criterion: Operand v2 has a
use count greater than 1, and consequently it cannot be destroyed by the operation because it
will be needed again. It should not lie on the target path, because then an extra instruction
would be needed to copy it. Since v3 is only used once, no extra instructions are required
when it is chosen as the target path. Nevertheless, the code in the right column is two bytes
shorter. Why? The byte counts for the first six rows reflect the extra instruction required to
preserve v2 when it is chosen as the target path. However, that instruction is an LR rather
than an L and thus its cost is only two bytes. It happens that the last use of v2 involves an
operation with two memory operands, one of which must be loaded at a cost of 4 bytes! If
the last use involved an operation whose other operand was in a register, we could use an RR
instruction for that operation and hence the byte counts of the two codings would be equal.
This example points up the fact that the criteria for target path selection depend strongly
upon the target computer architecture. Wulf's criterion is the proper one for the DEC PDP11,
but not for the IBM 370.
Figure 13.8b does not account for the fact that the IBM 370 multiply instruction requires
the multiplicand to be in an odd register and leaves the product in a register pair. The
register allocation process must enforce these conditions in any event, and it does not appear
useful to introduce extra notation for them at this stage. We shall treat the problem in detail
in Section 13.2.4.
... := ... + 1
a) Incrementing an arbitrary location
ti : tj↑         tj is the address ...
tk : ti + 1      Increment the value
tl : tj := tk    Store the result
b) The tuple sequence for (a) after value numbering
R[8] := M[R[9]];
R[8] := R[8]+1;
M[R[9]] := R[8];
c) Register transfers for (b) after redundant transfer elimination
M[R[9]] := M[R[9]]+1;
d) The overall effect of (c)
Figure 13.11: Generating an Increment
Figure 13.11 shows how an increment instruction is generated. The `...' in Figure 13.11a
stands for an arbitrarily complex address expression that appears on both sides of the
assignment. This expression is recognized as common during value numbering, and the address it
describes appears as an operand identifier (Figure 13.11b).
Davidson and Fraser [1980] assert that windows larger than 3 are not required. Additional
evidence for this position comes from Tanenbaum's 1982 table of 123 optimization
patterns. Only seven of these were longer than three instructions, and none of the seven
resulted in just a single output instruction. Three of them converted addition or subtraction
of 2 to two increments or decrements; the other four produced multi-word move instructions
from successive single-word moves when the addresses were adjacent. All of these patterns
were applied rather infrequently.
The optimizations of Figures 13.10 and 13.11 could be specified by the following patterns if
we used the second peephole optimization method mentioned at the beginning of this section:
MOV a,b ; CMP a,0            ⇒  MOV a,b
MOV a,b ; ADD 1,b ; MOV b,a  ⇒  INC a
(The second pattern assumes that b is not used elsewhere.)
Any finite-state pattern matching technique, such as that of Aho and Corasick [1975],
can be modified to efficiently match patterns such as these. (Modification is required to
guarantee that the item matching the first occurrence of a or b also matches subsequent
occurrences.) A complete description of a particular algorithm is given by Ramamoorthy
and Jahanian [1976]. As indicated earlier, an extensive set of patterns may be required.
(Tanenbaum and his coauthors 1982 give a representative example.) The particular set of
patterns that will prove useful depends upon the source language, compiler code generation
and optimization strategies, and target machine. It is developed over time by examining
the code output by the compiler and recognizing areas of possible improvement. There is
never any guarantee that significant optimizations have not been overlooked, or that useless
patterns have not been introduced. On the other hand, the processing is significantly faster
than that for the first method because it is unnecessary to `rediscover' the patterns for each
pair of instructions.
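A minimal sketch of such a match for the first pattern above, using an instruction
representation of our own devising (operand identifiers are small integers, 0 standing for
the literal zero):

program peephole(output);
type
  opcode = (movop, cmpop, nopop);
  instr = record
    op: opcode;
    a, b: integer          { operand identifiers; 0 stands for literal 0 }
  end;
var
  code: array [1..2] of instr;
begin
  { window contents:  MOV 5,7 ; CMP 5,0 }
  code[1].op := movop; code[1].a := 5; code[1].b := 7;
  code[2].op := cmpop; code[2].a := 5; code[2].b := 0;
  { pattern  MOV a,b  CMP a,0  =>  MOV a,b :                       }
  { the requirement that both occurrences of a match the same item }
  { appears as the equality test between the two instructions      }
  if (code[1].op = movop) and (code[2].op = cmpop) and
     (code[2].a = code[1].a) and (code[2].b = 0) then
    code[2].op := nopop;               { the comparison is deleted }
  if code[2].op = nopop then writeln('CMP deleted')
end.

A table-driven matcher in the style of Aho and Corasick would replace the explicit if by
state transitions, but the equality constraint on a must still be checked separately.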
13.2.4 Local Register Allocation
The classical approach to register allocation determines the register assignment `on the fly'
as the final code is being output to the assembler. This determination is based upon attributes
calculated by previous traversals of the basic block, and uses value descriptors to maintain
the state of the allocation. We solve the register pair problem by computing a size and
alignment for each abstract register. (Thus the abstract register becomes a block in the sense
of Section 10.1.) In the right column of Figure 13.8b, R[16] and R[17] each have size 1 and
alignment 1 but R[18] has size 2 and alignment 2 because of its use as a multiplicand. Other
machine-specific attributes may be required. For example, R[16] is used as a base register
and thus cannot be assigned to register 0 on the IBM 370.
A register assignment algorithm similar to that described in Section 10.3.1 can be used.
The only modification lies in the choice of a register to free. In Figure 10.12 we chose the
least-recently accessed register; here we should choose the one whose next access is furthest in
the future. (Belady [1966] has shown this strategy to be optimal in the analogous problem
of determining which page to replace in a virtual memory system.) We can easily obtain
this information at the same time we compute the other attributes mentioned in the previous
paragraph. Note that all of the attributes used in register allocation must be computed after
peephole optimization; the peephole optimizer, by combining instructions, may alter some of
the attribute values.
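A sketch of the modified choice, assuming the next-use attribute has been precomputed
(names and the sentinel value are ours):

program spillchoice(output);
const
  nregs = 4; horizon = 1000;
var
  nextuse: array [1..nregs] of integer;  { program point of next access; }
                                         { horizon if never used again   }
  r, victim: integer;
begin
  nextuse[1] := 12; nextuse[2] := 40; nextuse[3] := horizon; nextuse[4] := 17;
  { free the register whose next access lies furthest in the future }
  victim := 1;
  for r := 2 to nregs do
    if nextuse[r] > nextuse[victim] then victim := r;
  writeln('spill register ', victim)   { prints 3 }
end.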
Figure 10.12 makes use of a register state copy that indicates existence of a memory
copy of the register content. If it has been necessary to spill a register then the assignment
algorithm knows that it is in the copy state. However, as the example of Figure 13.8 shows,
a register (e.g. R[16]) may be in the copy state because it has been loaded from a memory
location whose content will not be altered. In order to make use of this fact, we must guarantee
that no side effect will invalidate the memory copy. The necessary information is available in
the sets D and X associated with the original tuples, and must be propagated by the value
numbering and coding processes.
When we are dealing with a machine like the IBM 370, the algorithm of Figure 10.12
should make an effort to maximize the number of available pairs by appropriate choice of a
free register to allocate. Even when this is done, however, we may reach a situation in which
no pair is free but at least two registers are free. We can therefore free a pair by freeing one
register, and we might free that register by moving its content to the second free register
at a cost of two bytes. If the state of one of the candidate registers is copy, then it can
be freed at a cost of two bytes if and only if its next use is the proper operand of an RR
instruction (either operand if the operation is commutative). It appears that we cannot lose
by using an LR instruction. However, suppose that the value being moved must ultimately
(due to other conflicts) be saved in memory. In that case, we are simply paying to postpone
the inevitable! We conclude that the classical strategy cannot be guaranteed to produce an
optimum assignment on a machine with double-length results.
propagation is a good example of this kind of analysis. As the computation graph is being
built, we accumulate a list of all of the program points at which an operand is given a constant
value. During global data flow analysis we define a set USES(o, p) at each program point
p as the set of program points potentially using the value of o defined at p. Similarly, a set
DEFS(o, p) is the set of program points potentially defining the value of operand o used at
program point p. For each element of the list of constant definitions, we can then find all of
the potential uses. For each potential use, in turn, we can find all other potential definitions.
If all definitions yield the same constant then this constant can be substituted for the operand
use in question. Finally, if we substitute constants for all operand uses in a tuple then the
tuple can be evaluated and its program point added to the list. The process terminates when
the list is empty.
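The worklist structure of this process can be seen in skeleton form below; the set
manipulations are left as comments because they depend entirely on the chosen
representation of USES and DEFS (the skeleton is ours, not the book's):

program constprop;
const
  maxpoint = 100;
var
  worklist: array [1..maxpoint] of integer;  { points defining constants }
  top, p: integer;
begin
  top := 0;
  { ... push every program point that assigns a literal constant ... }
  while top > 0 do begin
    p := worklist[top]; top := top - 1;
    { let o be the operand defined at p, with constant value c:   }
    {   for each use point q in USES(o, p):                       }
    {     if every point in DEFS(o, q) yields this same c then    }
    {       substitute c for the use of o at q;                   }
    {     if all operand uses at q have now become constants then }
    {       evaluate q's tuple and push q onto the worklist       }
  end
end.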
For practical reasons, global data flow analysis is carried out in two parts. The first
part gathers information within a single basic block, summarizing it in sets defined at the
entry and/or exit points. This drastically reduces the number of sets that must be processed
during the second part, which propagates the information over the flow graph. The result of
the second part is then again sets defined at the entry and/or exit points of basic blocks. These
sets are finally used to distribute the information within the block. A complete treatment of
the algorithms used to propagate information over the flow graph is beyond the scope of this
book. Kennedy [1981] gives a good survey, and Hecht [1977] covers the subject in depth.
As an example, consider the computation of LIVE(b). We characterize the flow graph
for this computation by two sets:
PRED(b) = {h | h is an immediate predecessor of b in the flow graph}
SUCC(b) = {h | h is an immediate successor of b in the flow graph}
An operand is then live on exit from a block b if it is used by any block in SUCC(b) before
it is either defined or invalidated. Moreover, if a block h ∈ SUCC(b) neither defines nor
invalidates the operand, then it is live on exit from b if it is live on exit from h. Symbolically:

LIVE(b) = ∪_{h ∈ SUCC(b)} [IN(h) ∪ (THRU(h) ∩ LIVE(h))]     (13.1)

IN(h) is the set of operand identifiers used in h before being defined or invalidated, and
THRU(h) is the set of operand identifiers neither defined nor invalidated in h.
We can solve the system of set equations (13.1) iteratively as shown in Figure 13.12. This
algorithm is O(n²), where n is the number of basic blocks: At most n − 1 executions of
the repeat statement are needed to make a change in a basic block b available to another
arbitrary basic block b'. The actual number of iterations depends upon the sequence in which
the basic blocks are considered and the complexity of the program. For programs without
explicit jumps the cost can be reduced to two iterations, if the basic blocks are ordered so
that inner loops are processed before the loops in which they are contained.
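The iteration might look as follows in Pascal. This is a sketch of our own, not Figure 13.12
itself; the flow graph and the IN and THRU sets are illustrative:

program liveness(output);
const
  nblocks = 4; nids = 15;
type
  idset = set of 1..nids;
  blockset = set of 1..nblocks;
var
  succ: array [1..nblocks] of blockset;
  insets, thru, live: array [1..nblocks] of idset;  { IN, THRU, LIVE }
  changed: boolean;
  b, h: integer;
  acc: idset;
begin
  for b := 1 to nblocks do begin
    succ[b] := []; insets[b] := []; thru[b] := []; live[b] := []
  end;
  succ[1] := [2, 3]; succ[2] := [4]; succ[3] := [4];   { sample flow graph }
  insets[2] := [1]; thru[2] := [2]; insets[4] := [2];
  repeat                                  { iterate equation (13.1) }
    changed := false;
    for b := nblocks downto 1 do begin
      acc := [];
      for h := 1 to nblocks do
        if h in succ[b] then
          acc := acc + insets[h] + (thru[h] * live[h]);
      if acc <> live[b] then begin
        live[b] := acc; changed := true
      end
    end
  until not changed;
  writeln(1 in live[1], ' ', 2 in live[1])   { TRUE TRUE }
end.

Processing the blocks in reverse order, as here, lets information flow backward quickly;
with inner loops ordered first the repeat converges in few passes, as noted above.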
Computation of the sets USES(o, p) and DEFS(o, p) provides a more complex example
of global flow analysis. We begin by computing REACHES(b), the set of program points
that define values valid at the entry point of basic block b. Let DEF(b) be the set of program
points within b whose definitions remain valid at the end of b, and let VALID(b) be the
set of program points whose definitions are not changed or invalidated in b. REACHES(b) is
then defined by:

REACHES(b) = ∪_{h ∈ PRED(b)} [DEF(h) ∪ (VALID(h) ∩ REACHES(h))]     (13.2)
unsafe, however, because if k were zero the transformed program would terminate abnormally
and the original would not.
We can think of code motion as a combination of insertions and deletions. An insertion
is safe if the expression being inserted is available at the point of insertion. An expression is
available at a given point if it has been computed on every path leading to that point and
none of its operands have been altered since the last computation. Clearly the program's
result will not be changed by the inserted code if the inserted expression is available, and
if the inserted code were to terminate abnormally then the original program would have
terminated abnormally at one of the earlier computations. This argument guarantees the
safety of the first transformation in Figure 13.14b. We first insert the address computation
and assignment to a[i, j], making it an epilogue of the conditional. The original computations
in the two branches are then redundant and may be removed.
C : array [operand_identifier] of program_point;

for all operand identifiers o do DF(o) := ∅;
for all basic blocks b do
  begin
  for all operand identifiers o do C[o] := 0;
  for i := first program point of b to last program point of b do
    begin
    for o ∈ X_t(i) do C[o] := 0;
    for o ∈ D_t(i) do C[o] := i;
    end;
  DEF(b) := ∅;
  for all operand identifiers o do
    if C[o] ≠ 0 then
      begin
      DEF(b) := DEF(b) ∪ {C[o]};
      DF(o) := DF(o) ∪ {C[o]};
      end;
  end;
for all basic blocks b do
  begin
  VALID(b) := ∅;
  for all operand identifiers o do
    if o ∈ THRU(b) then VALID(b) := VALID(b) ∪ DF(o);
  end;
a) Computation of DEF(b) and VALID(b)

TR := REACHES(b);
for i := first program point of b to last program point of b do
  begin
  DEFS(o, i) := ∅;
  for o ∈ U_t(i) do DEFS(o, i) := TR ∩ DF(o);
  for o ∈ D_t(i) ∪ X_t(i) do TR := TR - DF(o);
  for o ∈ D_t(i) do TR := TR ∪ {i};
  end;
b) Computation of DEFS(o, p)
Figure 13.13: Computing a Set of Program Points
The second transformation in Figure 13.14b involves an insertion where the inserted
expression is not available, but where it is anticipated. An expression is anticipated at a given
point if it appears on every execution path leaving that point and none of its operands could
be altered between the point in question and the first computation on each path. In our
example, (i - 1) * n is anticipated in the prologue of the j loop, but i div k is not. Therefore
it is safe to insert the former but not the latter. Once the insertion has been made, the
corresponding computation in the epilogue of the conditional is redundant because its value
is available.
Let AVAIL(b) be the set of operand identifiers available on entry to basic block b and
ANTIC(b) be the set of operand identifiers anticipated on exit from b. These sets are defined
by the following systems of equations:

AVAIL(b) = ∩_{h ∈ PRED(b)} [OUT(h) ∪ (THRU(h) ∩ AVAIL(h))]

ANTIC(b) = ∩_{h ∈ SUCC(b)} [ANLOC(h) ∪ (THRU(h) ∩ ANTIC(h))]

Here OUT(b) is the set of operand identifiers defined in b and not invalidated after their last
definition, and ANLOC(b) is the set of operand identifiers for tuples computed in b before
any of their operands are defined or invalidated.
The main task of the optimizer is to find code motions that are safe and profitable (reduce
the cost of the program according to the desired measure). Wulf et al. [1975] consider
`α, ω' code motions that move computations from branched constructs to prologues and
epilogues. (The center column of Figure 13.14 illustrates an ω motion; an α motion would
have placed the computation of a[i, j] before the compare instruction.) They also discuss
the movement of invariant computations out of loops, as illustrated by the right column of
Figure 13.14. If loops are nested, invariant code is moved out one region at a time. Morel
and Renvoise [1979] present a method for moving a computation directly to the entrance
block of the outermost strongly-connected region in which it is invariant.
for i := 1 to n do
  for j := 1 to n do
    if j > k then a[i, j] := 0 else a[i, j] := i div k;
a) A Pascal fragment
LA R0,1 LA R0,1 LA R0,1
C R0,n(R13) C R0,n(R13) C R0,n(R13)
BH ENDI BH ENDI BH ENDI
B BODI B BODI B BODI
INCI A R0,=1 INCI A R0,=1 INCI A R0,=1
BODI ST R0,i(R13) BODI ST R0,i(R13) BODI ST R0,i(R13)
C R0,n(R13) C R0,n(R13) C R0,n(R13)
BH ENDJ BH ENDJ BH ENDJ
L R5,i(R13)
S R5,=1
M R4,n(R13)
B BODJ B BODJ B BODJ
INCJ A R0,=1 INCJ A R0,=1 INCJ A R0,=1
BODJ ST R0,j (R13) BODJ ST R0,j (R13) BODJ ST R0,j (R13)
C R0,k(R13) C R0,k(R13) C R0,k(R13)
BNH ELSE BNH ELSE BNH ELSE
SR R1,R1 SR R1,R1 SR R1,R1
L R3,i(R13)
S R3,=1
M R2,n(R13)
A R3,j (R13)
SLA R3,2
ST R1,a-4(R3,R13)
B ENDC B ENDC B ENDC
ELSE L R0,i(R13) ELSE L R0,i(R13) ELSE L R0,i(R13)
SRDA R0,32 SRDA R0,32 SRDA R0,32
D R0,k(R13) D R0,k(R13) D R0,k(R13)
L R3,i(R13) ENDC L R3,i(R13)
S R3,=1 S R3,=1
M R2,n(R13) M R2,n(R13)
A R3,j (R13) A R3,j (R13)
ENDC L R3,j (R13)
AR R3,R5
SLA R3,2 SLA R3,2 SLA R3,2
ST R1,a-4(R3,R13) ST R1,a-4(R3,R13) ST R1,a-4(R3,R13)
ENDC L R0,j (R13) L R0,j (R13) L R0,j (R13)
C R0,n(R13) C R0,n(R13) C R0,n(R13)
BL INCJ BL INCJ BL INCJ
ENDJ L R0,i(R13) ENDJ L R0,i(R13) ENDJ L R0,i(R13)
C R0,n(R13) C R0,n(R13) C R0,n(R13)
BL INCI BL INCI BL INCI
ENDI ENDI ENDI
(142 bytes) (118 bytes) (120 bytes)
b) IBM 370 implementations
Figure 13.14: Code Motion
LA R0,1
C R0,n(R13)
BH ENDI
SR R5,R5 (i - 1) * n initially 0
B BODI
INCI A R0,=1
A R5,=n Increment (i - 1) * n
BODI ST R0,i(R13)
LA R0,1
C R0,n(R13)
BH ENDJ
B BODJ
INCJ A R0,=1
BODJ ST R0,j (R13)
C R0,k(R13)
BNH ELSE
SR R1,R1
B ENDIF
ELSE L R0,i(R13)
SRDA R0,32
D R0,k(R13)
ENDIF L R3,j (R13)
AR R3,R5
SLA R3,2
ST R1,a-4(R3,R13)
L R0,j (R13)
C R0,n(R13)
BL INCJ
ENDJ L R0,i(R13)
C R0,n(R13)
BL INCI
ENDI
(118 bytes)
Figure 13.15: Strength Reduction Applied to Figure 13.14b
Here j and k are either induction values or region constants and i is an induction variable.
The set of induction values is determined by assuming that all values defined in the region
are induction values, and then deleting those that do not satisfy the conditions [Allen et al.,
1981]. The induction values in Figure 13.16 are i, t2, t3 and t7.
To perform a strength reduction transformation on Figure 13.16, we define a variable
V1 to hold the value t9. An assignment must be made to this variable prior to entering
the strongly-connected region, and at program points where t9 has been invalidated and yet
t2 * d1 is anticipated. For example, t9 is invalidated by t8 in Figure 13.16, and yet t2 * d1
is anticipated at that point. An assignment V1 := t2 * d1 should therefore be inserted just
before l2. Since t2 is the value of i↑, i := t7; V1 := t2 * d1 is equivalent to V1 := (t2 + 1) * d1;
i := t7. Using the distributive law, and recalling the invariant that V1 always holds the value
of t9 (= t2 * d1), this sequence can be written as V1 := V1 + d1; i := t7. Figure 13.17 shows the
result of the transformation, after appropriate decomposition into tuples.
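In source terms the effect is the familiar replacement of a multiplication by a running sum.
A small self-contained Pascal illustration of the invariant (our own, not from the text):

program reduce(output);
const
  n = 8; d1 = 10;
var
  i, v1, before, after: integer;
begin
  { before: recompute i * d1 on every iteration }
  before := 0;
  for i := 1 to n do before := before + i * d1;

  { after: v1 holds i * d1 and is incremented by d1 when i is }
  after := 0;
  v1 := d1;                      { value for i = 1 }
  for i := 1 to n do begin
    after := after + v1;
    v1 := v1 + d1                { maintain v1 = i * d1 for the next i }
  end;
  writeln(before, ' ', after)    { both print 360 }
end.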
We could now apply exactly the same reasoning to Figure 13.17, noting that V1, t28, t29,
t31, t35 and t49 are now induction values. The obvious variables then hold t32, t36 and t41.
porating such asymmetries in order to avoid having to exclude certain registers from the
allocation altogether. One allocation scheme [Chaitin et al., 1981; Chaitin, 1982] that
avoids the problem is based on graph coloring (Section B.3.3). The constraints on allocation
are expressed as an interference graph, a graph with one node for each register, both abstract
and actual. An edge connects two nodes if they interfere (i.e. if they exist simultaneously).
Clearly all of the machine registers interfere with each other. In the left column of Figure 13.8,
R[17] and R[18] do not interfere with each other, although they both interfere with R[16];
all abstract registers interfere with each other in the right column. If there are n registers, a
register assignment is equivalent to an n-coloring (Section B.3.3) of the interference graph.
Many asymmetry constraints are easily introduced as interferences. For example, any
abstract register used as a base register on the IBM 370 interferes with machine register
0. Similarly, we can solve a part of the multiplication problem by making the abstract
multiplicand interfere with every even machine register and defining another abstract register
that interferes with every odd machine register and every abstract register that exists during
the multiply. This guarantees that the multiplicand goes into an odd register and that an
even register is free, but it does not guarantee that the multiplicand and free register form a
pair.
The coloring algorithm [Chaitin et al., 1981] used for this problem differs from that of
Section B.3.3 because the constraints are different: There we are trying to find the minimum
number of colors, assuming that the graph is fixed; here we are trying to find an n-coloring,
and the graph can be changed to make that possible. (Spilling a value to memory removes
some of the interferences, changing the graph.) Any node with fewer than n interferences does
not affect the coloring, since there will be a color available for it regardless of the colors chosen
for its neighbors. Thus it (and all edges incident upon it) can be deleted without changing
whether the graph can be n-colored. If we can continue to delete nodes in this manner until
the entire graph disappears, then the original was n-colorable. The coloring can be obtained
by adding the nodes back into the graph in the reverse order of deletion, coloring each as it
is restored.
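A sketch of the two phases (simplify and select) on an adjacency matrix; spill handling is
omitted and the graph is a small example of our own:

program colorgraph(output);
const
  nnodes = 4; ncolors = 2;              { try a 2-coloring of a path }
var
  adj: array [1..nnodes, 1..nnodes] of boolean;
  removed: array [1..nnodes] of boolean;
  stack, color: array [1..nnodes] of integer;
  used: array [1..ncolors] of boolean;
  sp, v, w, deg, c: integer;
  found: boolean;

  procedure edge(x, y: integer);
  begin
    adj[x, y] := true; adj[y, x] := true
  end;

begin
  for v := 1 to nnodes do begin
    removed[v] := false; color[v] := 1;
    for w := 1 to nnodes do adj[v, w] := false
  end;
  edge(1, 2); edge(2, 3); edge(3, 4);   { the interference graph }
  { simplify: repeatedly delete a node with fewer than n neighbours }
  sp := 0;
  repeat
    found := false;
    for v := 1 to nnodes do
      if not removed[v] then begin
        deg := 0;
        for w := 1 to nnodes do
          if (not removed[w]) and adj[v, w] then deg := deg + 1;
        if deg < ncolors then begin
          removed[v] := true; sp := sp + 1; stack[sp] := v; found := true
        end
      end
  until not found;    { nodes left over here would force a spill decision }
  { select: restore in reverse order, giving each node a free colour }
  while sp > 0 do begin
    v := stack[sp]; sp := sp - 1; removed[v] := false;
    for c := 1 to ncolors do used[c] := false;
    for w := 1 to nnodes do
      if (not removed[w]) and adj[v, w] then used[color[w]] := true;
    for c := ncolors downto 1 do
      if not used[c] then color[v] := c
  end;
  for v := 1 to nnodes do write(color[v], ' ');   { prints 2 1 2 1 }
  writeln
end.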
If the coloring algorithm encounters a node with n or more interferences, it must make a
decision about which node to spill. A separate table is used to give the cost of spilling each
register, and the register is chosen for which cost/(incident edges) is as small as possible.
Some local intelligence is included: When a computation is local to a basic block, and no
abstract register lifetimes end between its definition and last use, the cost of spilling it is set
to infinity. The cost algorithm also accounts for the facts that some computations can be
redone instead of being spilled and reloaded, and that if the source or target of a register
copy operation is spilled then that operation can be deleted. It is possible that a particular
spill can have negative cost!
Unfortunately, the introduction of spill code changes the conditions of the problem. Thus,
after all spill decisions are made, the original program is updated with spill code and the
allocation re-run. Chaitin claims that the second iteration usually succeeds, but it may
be necessary to insert more spill code and try again. To reduce the likelihood of multiple
iterations, one can make the first run with n − k registers instead of n registers.
benefits of some of the techniques we have discussed and leave it to the compiler writer to
strike, under pressure from the marketplace, a reasonable balance.
By halving the code size required to implement a language element that accounts for 1%
of a program we reduce the code size of that program by only 0.5%, which certainly does not
justify a high compilation cost. Thus it is important for the compiler writer to know the milieu
in which his compiler will operate. For example, elimination of common subexpressions, code
motion and strength reduction might speed up a numerical computation solving a problem
in linear algebra by a factor of 2 or 3. The same optimizations often improve non-numeric
programs by scarcely 10%. Carter's 1982 measurements of 95,000 lines of Pascal, primarily
non-numeric code, show that the compiler would typically be dealing with basic blocks
containing 2-4 assignments, 10-15 tuples and barely 2 common subexpressions!
Static analysis does not, of course, tell the whole story. Knuth [1971a] found in his study
of FORTRAN that less than 4% of a program generally accounts for half of its running time.
This phenomenon was exploited by Dakin and Poole [1973] to implement an interactive text
editor as a mixture of interpreted and directly-executed code. Their measurements showed
that in a typical editing session over 97% of the execution involved less than 10% of the
code, and more than half of the code was never used at all. Finally, Knuth discovered that
over 25% of the running times of the FORTRAN programs he profiled was spent performing
input/output.
Measure                         Ratios
                     Local/None  Global/None  Global/Local
Compilation time  Min.   0.8        1.0          1.2
                  Avg.   0.9        1.4          1.4
                  Max.   1.0        1.6          1.6
Code space        Min.   0.42       0.38         0.89
                  Avg.   0.54       0.55         1.02
                  Max.   0.69       0.66         1.19
Execution time    Min.   0.32       0.19         0.58
                  Avg.   0.50       0.42         0.82
                  Max.   0.72       0.61         0.94
Table 13.1: Evaluation of PL/1L [Cocke and Markstein, 1980]
Actual measurements of optimization efficacy and cost are rare in the literature, and the
sample size is invariably small. It is thus very difficult to draw general conclusions. Table 13.1
summarizes a typical set of measurements. Cocke and Markstein's [1980] PL/1L, an
experimental optimizing compiler for a PL/1-like language, was run over each of four programs
several times. A different level of optimization was specified for each compilation of a given
program, and measurements made of the compilation time, code space used for the resulting
object program, and execution time of the resulting object program on a set of data. At
every level the compiler allocated registers globally by the graph coloring algorithm sketched
in Section 13.3.4. No other optimizations were performed at the `None' level. The `Local'
optimizations were those discussed in Section 13.2.1, and the `Global' optimizations were those
discussed in Sections 13.3.1 through 13.3.3. It is not clear what (if any) peephole optimization
was done, although the global register allocation supposedly deleted redundant comparisons
following arithmetic operations by treating the condition code as another allocatable register
[Chaitin et al., 1981]. The reduction in compilation time for local optimization clearly
illustrates the strong role that global register allocation played in the compilation time figures.
Local optimization reduced the number of nodes in the interference graph, thus more than
covering its own cost. One of the test programs was also compiled by the standard optimizing
PL/1 compiler in a bit less than half of the time required by the PL/1L compiler. OPT=0
was selected for the PL/1 compiler, and local optimization for the PL/1L compiler. This
ratio changed slightly in favor of the PL/1 compiler (0.44 to 0.38) when OPT=2 and `global'
were selected. When the same program was rewritten in FORTRAN and compiled using
FORTRAN H, the ratios OPT=0/local and OPT=2/global were almost identical at 0.13.
(Section 14.2.3 discusses the internals of FORTRAN H.)
In the late 1970's, Wulf and his students attempted to quantitatively evaluate the size of
the object code produced by an optimizing compiler. They modeled the optimization process
by the following equation:

K(C, P) = Ku(C, P) * ∏_i O_i(C)

K(C, P) is the cost (code space) of program P compiled with compiler C, and Ku is the
corresponding unoptimized cost. Each O_i(C) is a measure of how effectively compiler C applies
optimization i to reduce the code size of a typical program, assuming that all optimizations
1, ..., i − 1 have already been done. They were never able to validate this model to their
satisfaction, and hence the work never reached publication. They did, however, measure the
factors O_i(C) for Bliss-11 [Wulf et al., 1975] (Table 13.2).
Index Description Factor
1 Evaluating constant expressions 0.938
2 Dead code elimination 0.98
3 Peephole optimization 0.88
4 Algebraic laws 0.975
5 CSE in statements 0.987
6 CSE in basic blocks 0.973
7 Global CSE 0.987
8 Global register allocation 0.975
9 Load/store motion 0.987
10 Cross jumping 0.972
11 Code motion 0.985
12 Strength reduction -
Table 13.2: Optimization Factors for Bliss-11 Wulf et al. [1975]
We have considered optimizations 1 and 4 of Table 13.2 to precede formation of the
computation graph; the remainder of 1-6 constitute the local optimizations of Section 13.2. Thus
the product of these factors (roughly 0.76) should approximate the effect of local optimization
alone. Similarly, the product of factors 7-12 (roughly 0.91) should approximate the additional
improvement due to global optimization. Comparing this latter figure with the last column
of Table 13.1 shows the deleterious effect of strength reduction on code space discussed in
Section 13.3.3.
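Multiplying out the entries of Table 13.2 confirms these approximations (factor 12
contributes nothing, having no measured value):

0.938 * 0.98 * 0.88 * 0.975 * 0.987 * 0.973 ≈ 0.76
0.987 * 0.975 * 0.987 * 0.972 * 0.985 ≈ 0.91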
The first column of Table 13.1 shows a code size improvement significantly better than
0.76, implying that the PL/1L compiler generates poorer initial code than Bliss-11, leaving
more to be gained by simple optimizations. This should not be taken as a criticism. After
all, using a sophisticated code generator with an optimizer is a bit like vacuuming the office
before the cleaning crew arrives! Davidson and Fraser [1980] take the position that code
generation should be trivial, producing instructions to simulate a simple stack machine on
an infinite-register analog of the target computer. They then apply the optimizations of
Section 13.2, using a fragment bounded by labels (i.e. a path in an extended basic block) in
lieu of a basic block.
Exercises
13.1 Show how the dependency sets would be derived when building a computation graph
that represents a LAX program for a target machine of your choice.
13.2 Assume that the FORTRAN assignment statement
A(I,J,K) = (A(I,J,K-1) + A(I,J,K+1) +
            A(I,J-1,K) + A(I,J+1,K) +
            A(I-1,J,K) + A(I+1,J,K)) / 6.0
constitutes a single basic block.
(a) Write the initial tuple sequence for the basic block.
(b) Derive a new tuple sequence by the algorithm of Figure 13.4a.
(c) Code the results of (b), using register transfers that describe the instructions of
some machine with which you are familiar.
13.3 Give an example, for some machine with which you are familiar, of a common
subexpression satisfying each of the following conditions. If this is impossible for one or more
of the conditions, carefully explain why.
(a) Always cheaper to recompute than save.
(b) Never cheaper to recompute than save.
(c) Cheaper to recompute iff it must be saved in memory.
13.4 Explain how the first method of peephole optimization described in Section 13.2.3 could
be used to generate patterns for the second. Would it be feasible to combine the two
methods, backing up the second with the first? Explain.
13.5 Assume that the register management algorithm of Figure 10.12 is to be used in an
optimizing compiler. Define precisely the conditions under which all possible changes
in register state will occur.
13.6 Show how the D and X sets are propagated through the value numbering and coding
processes to support the decisions of Exercise 13.5, as described in Section 13.2.4.
13.7 Give examples of safe code motions in which the following behavior is observed:
(a) The transformed program terminates abnormally in a different place than the
original, but with the same error.
(b) The transformed program terminates abnormally in a different place than the
original, with a different error.
13.8 Consider a Pascal for statement with integer constant bounds. Assume that the lower
bound is smaller than the upper bound, which is smaller than maxint. Instead of using
the schema of Figure 3.10c, the implementor chooses the following:
i := e1; t := e3;
l1 : ... (* Body of the loop *)
i := i + 1;
if i ≤ t then goto l1;
(a) Explain why no strength reduction can be carried out in this loop.
(b) Suppose that we ignore the explanation of (a) and carry out the transformation
anyway. Give a specic example in which the transformed program terminates
abnormally but the original does not. Restrict the expressions in your example
to those arising from array subscript calculations. Your array bounds must be
reasonable (i.e. arrays with maxint elements are unreasonable).
Chapter 14
Implementation
In earlier chapters we have developed a general framework for the design of a compiler. We
have considered how the task and its data structures could be decomposed, what tools and
strategies are available to the compiler writer, and what problems might be encountered.
Given a source language, target machine and performance goals for the generated code we
can design a translation algorithm. The result of the design is a set of module specications.
This chapter is concerned with issues arising out of the implementation of these specifica-
tions. We first discuss the decisions that must be made by the implementors and the criteria
that guide these decisions. Unfortunately, we can give no quantitative relationship between
decisions and criteria! Compiler construction remains an art in this regard, and the successful
compiler writer must simply develop a feel for the inevitable compromises. We have there-
fore included three case studies of successful compilers that make very different architectural
decisions. For each we have tried to identify the decisions made and show the outcome.
14.1 Implementation Decisions
14.1.1 Criteria
Maintainability, performance and portability are the three main criteria used in making
implementation decisions. The first is heavily influenced by the structure of the program, and
depends ultimately on the quality of the modular design. Unfortunately, given current imple-
mentation languages, it is sometimes necessary to sacrifice some measure of maintainability
to achieve performance goals. Such tradeoffs run counter to our basic principles. We do not
lightly recommend them, but we recognize that in some cases the compiler will not run at all
unless they are made. We do urge, however, that all other possibilities be examined before
such a decision is taken.
Performance includes memory requirements, secondary storage requirements and process-
ing time. Hardware constraints often place limits on performance tradeoffs, with time the
only really free variable. In Sections 14.1.2 and 14.1.3 we shall be concerned mainly with
tradeoffs between primary and secondary storage driven by such constraints.
Portability can be divided into two sub-properties often called rehostability and retar-
getability. Rehosting is the process of making the compiler itself run on a different machine,
while retargeting is the process of making it generate code for a different machine. Rehosta-
bility is largely determined by the implementation language and the performance tradeoffs
that have been made. Suppose, for example, that we produce a complete design for a Pascal
compiler, specifying all modules and interfaces carefully. If this design is implemented by
writing a FORTRAN program that uses only constructs allowed by the FORTRAN standard,
then there is a good chance of its running unchanged on a wide variety of computers. If, on
the other hand, the design is implemented by writing a program in assembly language for the
Control Data 6000 series then running it on another machine would involve a good deal of
effort.
Even when we fix both the design and the implementation language, performance consid-
erations may affect rehostability. For example, consider the use of bit vectors (say as parser
director sets or error matrices, or as code generator decision table columns) when the imple-
mentation language is Pascal. One possible representation is a set, another is a packed array
of Boolean. Unfortunately, some Pascal implementations represent all sets with the same
number of bits. This usually precludes large sets, and the bit vectors must be implemented
as arrays of sets or packed arrays of Boolean. Other implementations only pack arrays to the
byte level, thus making a packed array of Boolean eight times as large as it should be. Clearly
when the compiler is rehosted from a machine with one of these problems to a machine with
the other, different implementations of bit vectors may be needed to meet performance goals.
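Such representation changes can be confined to a single module that exports the bit vector
operations. The following sketch is ours rather than from any particular compiler, and the
names are chosen for illustration; rehosting then means re-implementing this one module:

const vecmax = 255;                       (* highest element index *)
type bitvector = set of 0 .. vecmax;
     (* alternative for implementations with restricted set size:
        bitvector = packed array [0 .. vecmax] of boolean *)

procedure insert (var v: bitvector; i: integer);
begin
  v := v + [i]                            (* packed array version: v[i] := true *)
end;

function member (var v: bitvector; i: integer): boolean;
begin
  member := i in v                        (* packed array version: member := v[i] *)
end;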
Neither of the situations in the two previous paragraphs affected the design (set of mod-
ules and interfaces). Rehostability is thus quite evidently a property of the implementation.
Retargetability, on the other hand, is more dependent upon the design. It requires a clean
separation between the analysis and synthesis tasks, since the latter must be redesigned in
order to retarget the compiler. If the target machine characteristics have been allowed to
influence the design of the analysis task as well as the synthesis task, then the redesign will
be more extensive. For example, suppose that the design did not contain a separate constant
table module. Operations on constants were carried out wherever they were needed, following
the idiosyncrasies of the target machine. Retargeting would then involve redesign of every
module that performed operations on constants, rather than redesign of a single module.
Although the primary determinant of retargetability is the design, implementation may
have an effect in the form of tradeoffs between modularity and performance that destroy the
analysis/synthesis interface. Such tradeoffs also degrade the maintainability, as indicated at
the beginning of this section. This should not be surprising, because retargeting a compiler is,
after all, a form of maintenance: The behavior of the program must be altered to fit changing
customer requirements.
14.1.2 Pass Structure
One strategy for organizing the compilation is analogous to that of a dental office in which
the patient sits in a chair and is visited in turn by the dentist, hygienist and x-ray technician: The program
is placed in the primary storage of the machine and the phases of the compiler are `passed
by the program', each performing a transformation of the data in memory. This strategy is
appropriate for systems with restricted secondary storage capability. It does not require that
intermediate forms of the program be written and then reread during compilation; a single
read-only file to hold the compiler itself is sufficient. The size of the program that can be
compiled is limited, but it is generally possible to compile programs that will completely fill
the machine's memory at execution time. (Source and intermediate encodings of programs
are often more compact than the target encoding.)
Another strategy is analogous to that of a bureau of motor vehicles in which the applicant
first goes to a counter where application forms are handed in, then to another where written
tests are given, and so on through the eye test, driving test, cashier and photographer: The
compiler `passes over the program', repeatedly reading and writing intermediate forms, until
the translation is complete. This strategy is appropriate for systems with secondary storage
that can support several simultaneously-open sequential files. The size of the program that
can be compiled is limited by the filing system rather than the primary memory. (Of course
primary memory will limit the complexity of the program as discussed in Chapter 1.)
Either strategy requires us to decompose the compilation into a sequence of transforma-
tions, each of which is completed before the next is begun. One fruitful approach to the
decomposition is to consider relationships between tasks and large data structures, organiz-
ing each transformation around a single data structure. This minimizes the information flow
between transformations, narrowing the interfaces. Table 14.1 illustrates the process for a
typical design. Each row represents a transformation. The first column gives the central data
structure for the tasks in the second column. It participates in only the transformation cor-
responding to its row, and hence no two of these data structures need be held simultaneously.
Our second strategy places an extra constraint upon the intermediate representations of
the program: They must be linear, and each will be processed sequentially. The transforma-
tions are carried out by passes, where a pass is a single scan, in either direction, of a linear
intermediate representation of the program. Each pass corresponds to a traversal of the
structure tree, with forward passes corresponding to depth-first, left-to-right traversals and
backward passes corresponding to depth-first, right-to-left traversals. Under this constraint
we are limited to AAG(n) attribution; the attribute dependencies determine the number of
passes and the tasks carried out in each. It is never necessary to build an explicitly-linked
structure tree unless we wish to change traversals. (An example is the change from a depth-
first, left-to-right traversal of an expression tree to an execution-order traversal based upon
register counts.)
The basic Pascal file abstraction is a useful one for the linear intermediate representations
of the program. A module encapsulates the representation, providing an element type and a
single window variable of that type. Operations are available to empty the sequence, add the
content of the window to the sequence, get the first element of the sequence into the window,
get the next element of the sequence into the window, and test for the end of the sequence.
This module acts as a `pipeline' between the passes of the compiler, with each operating
directly on the window. By implementing the module in different ways we can cause the
communicating passes to operate as coroutines or to interact via a file.
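A minimal sketch of such a module follows, with assumed names and an in-memory body
(the element record stands for the real intermediate-representation element). Replacing the
bodies with operations on a Pascal file yields the version in which the passes communicate
via secondary storage:

const maxseq = 1000;                      (* assumed capacity *)
type element = record tag: integer end;   (* stands for the real element type *)

var window: element;                      (* the single window variable *)
    buffer: array [1 .. maxseq] of element;
    count, current: integer;

procedure emptyseq;                       (* empty the sequence *)
begin count := 0; current := 0 end;

procedure putelement;                     (* add the window to the sequence *)
begin count := count + 1; buffer[count] := window end;

procedure getfirst;                       (* first element into the window;
                                             assumes a nonempty sequence *)
begin current := 1; window := buffer[1] end;

procedure getnext;                        (* next element into the window *)
begin current := current + 1; window := buffer[current] end;

function atend: boolean;                  (* test for the end of the sequence *)
begin atend := current >= count end;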
While secondary storage is larger than primary storage, constraints on space are not
uncommon. Moreover, a significant fraction of the passes may be I/O-bound and hence
any reduction in the size of an intermediate representation will be reflected directly in the
compilation time. Our communication module, if it writes information to a file, should
therefore encode that information carefully to avoid redundancy. In particular, the element
will usually be a variant record and the communication module should transmit only the
information present in the stated variant (rather than always assuming the largest variant).
Further compression may be possible given a knowledge of the meanings of the fields. For
example, in the token of Figure 4.1 the line number field of coordinates changes only rarely,
and need be included only when it does change. The fact that the line number is present can
be encoded by the classification field in an obvious way. Because most tokens are completely
specified by the classification field alone, this optimization can reduce the size of a token file
by 30%.
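A sketch of such an encoding routine, with an assumed token layout and marker value (it
abbreviates the actual fields of Figure 4.1):

const linemarker = 1023;                  (* assumed code for `line number follows' *)
type tokenclass = (ident, intconst, addop, mulop);   (* abbreviated *)
     token = record
       lineno: integer;
       case class: tokenclass of
         ident:        (sym: integer);    (* symbol table index *)
         intconst:     (value: integer);
         addop, mulop: ()                 (* classification alone suffices *)
     end;

procedure writetoken (var f: file of integer; var t: token; var curline: integer);
begin
  if t.lineno <> curline then begin       (* transmit the line number only on change *)
    write(f, linemarker); write(f, t.lineno);
    curline := t.lineno
  end;
  write(f, ord(t.class));
  case t.class of
    ident:    write(f, t.sym);
    intconst: write(f, t.value);
    addop, mulop:                         (* nothing further to transmit *)
  end
end;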
resulting block placed in the current element. In any case the pointer to the most-recently
delivered element is advanced along the list. Thus the list acts like a stack, and its final length
is the maximum number of entries the table required at one point in the compilation.
The disadvantage of this strategy is that the storage requirements are those that would
obtain if all tables in each pass reached their maximum requirement simultaneously. Often
this is not the case, and hence larger programs could have been accommodated if storage for
unused entries had been returned to the operating system.
Every pass that manipulates constant values must include the necessary operations of
the abstract data type constant table discussed in Section 4.2.2. Constant table defines an
internal representation for each type of value. This representation can be used as an attribute
value, but any manipulation of it (other than assignment) must be carried out by constant
table operations. We pointed out in Section 4.2.2 that the internal representation might sim-
ply describe an access function for a data structure within the constant table module. This
strategy should be used carefully in a multipass compiler to avoid broadening the interface
between passes: The extra data structure should usually not be retained intact and trans-
mitted from one pass to the next via a separate file. Instead, all of the information about
a constant should be added to the linearized form of the attributed structure tree at an ap-
propriate point. The extra data structure is then reconstituted as the linearized tree is read
in.
The string table is a common exception to the approach suggested above. Careful design
of the compiler can restrict the need for string table access to two tasks: lexical analysis
and assembly. (This is true even though it may be used to store literal strings and strings
representing the fractions of floating point numbers as well as identifiers.) Thus the string
table is often written to a separate file at the completion of lexical analysis. It is only retrieved
during assembly when the character representations of constants must be converted to target
code, and identifiers must be incorporated into external symbol dictionaries.
Note the interdependence of the decisions about representation of tokens and form of the
intermediate code. A 10-bit byte allows values in the range [0,1023]. By using the subrange
[512,1022] for identifiers, one effectively combines the classification and symbol fields of
Figure 4.1. Values less than 512 classify non-identifier tokens, in most cases characterizing
them completely. Only constants need more than a single byte using this scheme, and we
know that constants occur relatively infrequently. Interestingly, only string constants are
handled in pass 1. Those whose machine representations do not exceed 40 bits are replaced
by a marker byte followed by 4 bytes holding the representation. Longer strings are saved
on the drum and replaced in the code by a marker byte followed by 4 bytes giving the drum
track number and relative address. In the terminology of Section 4.2.2, the constant table has
separate fixed-length representations for long and short strings. Numeric constants remain in
the text as strings of bytes, one corresponding to each character of the constant.
Pass 3 performs the normal syntactic analysis, and also converts numeric and logical
constants to a flag byte followed by 4 bytes giving the machine representation. Again in the
terminology of Section 4.2.2, the internal and target representations of numeric constants are
identical. (The flag byte simply serves as the classification field of Figure 4.1; it is not part
of the constant itself.) Naur's description of the compiler strongly suggests that parsing is
carried out by the equivalent of a pushdown automaton while the lexical analysis of pass 1
is more ad-hoc. As we have seen, numeric constants can be handled easily by a pushdown
automaton. The decision to process numeric and logical constants in pass 3 rather than in
pass 1 was therefore probably one of convenience.
The intermediate output from pass 3 consists of the unchanged identifiers and constants,
and a transformed set of delimiters that precisely describe the program's structure. It is
effectively a sequence of connection point numbers and tokens, with the transformed delimiters
specifying structure connections and each identifier or constant specifying a single symbol
connection plus the associated token.
Attribute flow is generally from declaration to use. Since declaration may follow use
in ALGOL 60, reverse attribute flow may occur. Pass 4 is a reverse pass that collects all
declarative information of a block at the head of the block. It merely simplifies subsequent
processing.
In pass 5, the definition table is actually distributed through the text. Each identifier is
replaced by a 4-byte group that is the corresponding definition table entry. It gives the kind
(e.g. variable, procedure), result type, block number, relative address and possibly additional
information. Thus GIER ALGOL does not abstract entities as proposed in Section 4.2.3, but
deposits the necessary information at the leaves of the structure tree. This example empha-
sizes the fact that possessions and definitions are separate. GIER ALGOL uses possessions
virtually identical to those discussed in connection with Figure 9.21 to control placement of
the attributes during pass 5, but it has no explicit definition table at all.
Given the attribute propagation performed by passes 4 and 5, the attribution of pass 6 is
LAG(1). This illustrates the interaction between attribute flow and pass structure. Given an
attribute grammar, we must attempt to partition the relationships and semantic functions
so that they fall into separable components that can be fit into the overall implementation
model. This partitioning is beyond the current state of the art for automatic generators.
We can only carry out the partitioning by hand and then use analysis tools based upon the
theorems of Chapter 8 to verify that we have not made any mistake.
Address calculations are carried out during both pass 7 and pass 8. Backward references
are resolved by pass 7; pass 8 is backward over the program, and hence can trivially resolve
forward references. Literal pooling is also done during pass 7. All of the constants used in
the code on one drum track appear in a literal pool on that track.
basic symbol
program
block
constant
type
simple type
field list
label declaration
constant declaration
type declaration
variable declaration
procedure declaration
parameter list
body
statement
selector
variable
call
expression
simple expression
term
factor
assignment
compound statement
goto statement
if statement
case statement
while statement
repeat statement
for statement
with statement
Figure 14.1: The Structure of the Zurich Pascal Compilers
The overall structure of the compiler was established in step 1; Figure 14.1 shows this
structure. Each line represents a procedure, and nesting is indicated by indentation. At this
step the procedure bodies had the form discussed in Section 7.2.2, and implemented an EBNF
description of the language.
Lexical analysis is carried out by a single procedure that follows the outline of Chapter 6.
It has no separate scanning procedures, and it incorporates the constant table operations
for conversion from source to internal form. Internal form and target form are identical. No
internal-to-target operators are used, and the internal form is manipulated directly via normal
Pascal operations.
There is no symbol table. Identifiers are represented internally as packed arrays of 10
characters - one 60-bit word. If the identifier is shorter than 10 characters then it is padded
on the right with spaces; if it is longer then it is truncated on the right. (We have already
deplored this strategy for a language whose definition places no constraints upon identifier
length.) Although the representation is fixed-length, it still does not define a small enough
address space to be used directly as a pointer or table index. Name analysis therefore requires
searching and, because there may be duplicate identifiers in different contexts, the search space
may be larger than in the case of a symbol table. Omission of the symbol table does not save
much storage because most of the symbol table lookup mechanism must be included in the
name analysis.
Syntactic error recovery is carried out using the technique of Section 12.2.2. A minor
modification was needed because the stack is not accessible when an error is detected: Each
procedure takes an anchor set as an argument. This set describes the anchors after reduction
of the nonterminal corresponding to the procedure. Symbols must be added to this set
to represent anchors within the production currently being examined. Of course all of the
code to update the anchors, check for errors, skip input symbols and advance the parse was
produced by hand. This augmentation of the basic step 1 routines constituted step 2 of
the compiler development. The basic structure of Figure 14.1 remained virtually unchanged;
common routines for error reporting and skipping to an anchor were introduced, with the
former preceding the basic symbol routine (so that lexical errors could be reported) and the
latter following it (so that the basic symbol routine could be invoked when skipping).
Step 3 was concerned with building the environment attribute discussed in Section 9.1.1.
Two record types, identrec and structrec, were added to the existing compiler. The envi-
ronment is a linked data structure made up of records of these types. There is one identrec
per declared identifier, and those for identifiers declared in the same range are linked as an
unbalanced binary tree. An array of pointers to tree roots constitutes the definition of the
current addressing environment. Three of the definition table operations discussed in Sec-
tion 9.2 (add a possession to a range, search the current environment, search a given range)
are implemented as common routines while the others are coded in line. Entering and leaving
a range are trivial operations, involving pointer assignment only, while searching the current
environment is complex. This is exactly the opposite of Figure 9.21, which requires complex
behavior on entry to and exit from a range with simple access to the current environment.
The actual discrepancy between the two techniques is reduced, however, when we recall that
the Zurich compiler does not perform symbol table lookups.
Each identrec carries attribute information as well as the linkages used to implement the
possession table. Thus the possessions and definitions are combined in this implementation.
The type attribute of an identifier is represented by a pointer to a record of type structrec,
and there is one such record for every defined type. Certain types (as for example scalar
types) are defined in terms of identifiers and hence a structrec may point to an identrec. The
identrec contains an extra link field, beyond those used for the range tree, to implement lists
of identifiers such as scalar constants, record fields and formal parameters.
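The following declarations suggest the flavor of these two record types; the field names are
ours, and the actual records carry many more attributes:

type alfa = packed array [1 .. 10] of char;
     identref = ^identrec;
     structref = ^structrec;
     identrec = record
       name: alfa;                 (* the 10-character internal representation *)
       llink, rlink: identref;     (* unbalanced binary tree of one range *)
       next: identref;             (* extra link: scalar constants, record fields,
                                      formal parameters *)
       idtype: structref           (* the type attribute *)
     end;
     structrec = record
       size: integer;              (* block size; its actual form varies with the type *)
       firstelem: identref         (* e.g. a scalar type points to its constants,
                                      so a structrec may point to an identrec *)
     end;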
The procedures of Figure 14.1 can be thought of as carrying out a depth-first, left-to-
right traversal of the parse tree even though that tree never has an explicit incarnation.
Since only one pass is made over the source program, the attribution rules must meet the
LAG(1) condition. They were simply implemented by Pascal statements inserted into the
procedures of Figure 14.1 at the appropriate points. Thus at the conclusion of step 3 the
bodies of these procedures still had the form of Section 7.2.2, but contained additional Pascal
code to calculate the environment attribute. As discussed in Section 8.3.2, attribute storage
optimization led to the representation of the environment attribute as a linked, global data
structure rather than an item stored at each parse tree node. The interesting part of the
structure tree is actually represented by the hierarchy of activation records of the recursive
descent procedures. Attribute values attached to the nodes are stored as values of local
variables of these procedures.
During step 4 of the refinement the remainder of the semantic analysis was added to
the routines of Figure 14.1. This step involved additional attribution and closely followed
the discussion of Chapter 9. Type definitions were introduced for the additional attributes,
global variables were declared for those attributes whose storage could be optimized, and local
variables were declared for the others. The procedures of Figure 14.1 were augmented by the
Pascal code for the necessary attribution rules, and functions were added to implement the
recursive attribute functions.
Ammann [1975] reports that steps 1-4 occupied a bit more than 6 months of the 24-month
project and accounted for just over 2000 of the almost 7000 lines in Pascal-6000. Steps 5 and
6 for Pascal-P were carried out in less than two and a half months and resulted in about 1500
lines of Pascal, while the corresponding numbers for Pascal-6000 were thirteen months and
4000 lines. Step 7 added another three and a half months to the total cost of Pascal-6000,
while increasing the number of lines by less than 1000.
The abstract stack computer that is the target for the Pascal-P compiler is carefully
matched to Pascal. Its elementary operators and data types are those of Pascal, as are
its memory access paths. There are special instructions for procedure entry and exit that
provide exactly the effect of a Pascal procedure invocation, and an indexed jump instruction
for implementing a case selection. Code generation for such a machine is clearly trivial, and
we shall not consider this part of the project further.
Section 10.1 describes storage allocation in terms of blocks and areas. A block is an object
whose size and alignment are known, while an area is an object that is still growing. In Pascal,
blocks are associated with completely-defined types, whereas areas are associated with types
in the process of definition and with activation records. Thus Pascal-6000 represents blocks
by means of a size field in every structrec. The actual form of this field varies with the
type defined by the structrec; there is no uniform "size" attribute like that of Figure 10.1.
Because of the recursive descent architecture and the properties of Pascal, the lifetime of an
area coincides with the invocation of one of the procedures of Figure 14.1 in every case. For
example, an area corresponding to a record type grows only during an invocation of the field
list procedure. This means that the specification of an area can be held in local variables
of a procedure. Step 5 added these local variable declarations and the code to process area
growth to the procedures of Figure 14.1. The size field was also added to structrec in this
step.
Step 6 was the first point at which a `foreign' structure - the structure of the target
machine - appeared. This refinement was thus the first that added a significant number
of procedures to those of Figure 14.1. The added procedures effectively act as modules for
simulation and assembly.
As we pointed out earlier, no explicit structure tree is ever created by Pascal-6000. This
means that the structure tree cannot be decorated with target attributes used to determine
an improved execution order and then traversed according to this execution order for code
selection. Pascal-6000 thus computes no target attributes other than the value descriptors of
Section 10.3.1. They are used in conjunction with a set of register descriptors and register
allocation operations to perform a machine simulation exactly as discussed in Section 10.3.1.
The recursive descent architecture once again manifests itself in the fact that global storage
is provided for only one value descriptor. Most value descriptors are held as local variables of
procedures appearing in Figure 14.1, with the global variable describing the `current' value -
the one that would lie at the `top of the stack'.
The decision tables describing code selection are hand-coded as Pascal conditionals and
case statements within the analysis procedures. Code is generated by invoking register alloca-
tion procedures, common routines such as load and store, and assembly interface procedures
from Table 14.4.
The first four operations of Table 14.4 assemble target code sequentially; Pascal-6000 does
not have the concept of separate sequences discussed in Section 11.1.1. A `location counter'
holds the current relative address, which may be accessed by any routine and saved as a
label. The third operand of a 30-bit instruction may be either an absolute value or a relative
address, and gen30 has a fourth parameter to distinguish these cases. Forward references are
Procedure Description
noop Force code alignment to a word boundary
gen15 Assemble a 15-bit instruction
gen30 Assemble a 30-bit instruction
gen60 Assemble a 60-bit constant
searchextid Set up an external reference
ins Satisfy a given forward reference
lgohead Output PIDL and ENTR
lgotext Output TEXT
lgoend Output XFER and LINK
Table 14.4: Pascal-6000 Assembly Operations
handled by ins, which allows a relative address to be stored at a given position in the code
already assembled.
In keeping with the one-pass architecture, Pascal-6000 retains all of the code for a single
procedure. The assembly `module' is initialized when the `body' procedure (Figure 14.1) is
invoked, and a complete relocatable deck is output at the end of this invocation to finalize
the `module'. Pascal-6000 uses Control Data's standard relocatable binary text as its target
code, in keeping with our admonition at the beginning of Section 11.2. We shall discuss the
layout of that text here in some detail as an illustration; another example, the IBM 370 object
module, will be given at the end of the next section.
[Figure omitted: the layouts of the five table kinds of a relocatable subprogram - PIDL
(code 34), ENTR (code 36), TEXT (code 40), LINK (code 44) and XFER (code 46). A TEXT
table carries a load address, relocation bits and up to n = 15 text words; an ENTR table
pairs each entry point symbol with its address; a LINK table lists each external symbol
followed by the operand fields it must modify; an XFER table names the start symbol.]
Figure 14.2: Control Data 6000 Series Relocatable Binary Code
A relocatable subprogram is a logical record composed of a sequence of tables (Figure 14.2),
which are simply blocks of information with various purposes. The first word of each table
contains an identifying code and specifies the number of additional 60-bit words in the table.
As with any record, a relocatable subprogram may be preceded by a prefix table containing
arbitrary information (such as the date compiled, version of the compiler, etc.), but the first
component of the subprogram proper is the program identification and length (PIDL) table.
PIDL is conventionally followed by an entry point (ENTR) table that associates entry point
symbols with the locations they denote (Section 11.2.1), but in fact the loader places no
constraints on either the number or the position(s) of any tables other than PIDL.
The body of the subprogram is made up of TEXT tables. Each TEXT table specifies a
block of up to 15 words, the first of which should be loaded at the specified address. Four
relocation bits are used for each text word (hence the limit of 15 text words). References to
external symbols are not indicated by the relocation bits, which only distinguish absolute and
signed relative addresses. External references are specified by LINK tables: For each external
symbol, a sequence of operand field definitions is given. The loader will add the address of
the external symbol to each of the fields so defined. Thus a call of "sqrt", for example, would
appear in the TEXT table as an RJ (return jump) instruction with the absolute value 0 as
its operand. This 0-field would then be described in a LINK table by one of the operand field
definitions following the symbol sqrt. When the loader had determined the address of sqrt it
would add it to the 0-field, thus changing the instruction into RJ sqrt. There is no restriction
on the number of LINK tables, the number of times a symbol may appear or the number of
field definitions that may follow a single symbol. As shown in Figure 14.2, each field definition
occupies 30 bits, each symbol occupies 60 bits, and a symbol may be split between words.
The transfer (XFER ) table is conventionally associated with a main program. It gives
the entry point to which control is transferred after the loader has completed loading the
program. Again, however, the loader places no restriction on the number of XFER tables
or the subprograms with which they are associated. An XFER table is ignored if its start
symbol begins with a space, or if a new XFER whose start symbol does not begin with a
space is encountered. The only requirement is that, by the time the load is completed, a start
symbol that is an entry point of some loaded subprogram has been specied.
Internal and external references, either of which may occur in a 30-bit instruction, are
represented quite differently in the target code. This is reflected at the assembly interface
by the presence of searchextid. When a 30-bit instruction is emitted, gen30 checks a global
pointer. If it is not nil then it points to an external symbol, and gen30 adds the target
location of the current instruction's third operand to a list rooted in that symbol. This list
will ultimately be used by lgoend to generate a LINK table. The global pointer checked by
gen30 is set by searchextid and cleared to nil by gen30. When the code generator emits a
30-bit instruction containing an external reference it therefore first invokes searchextid with
the external identifier and then invokes gen30 with the absolute value 0 as the third operand.
Section 11.3.1 gives an alternative strategy.
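In outline the protocol might be implemented as follows; extrec, lookupext and emit30 stand
for details that are elided here, and none of these declarations are the compiler's actual ones:

type alfa = packed array [1 .. 10] of char;
     extref = ^extrec;
     extrec = record
       name: alfa;
       nrefs: integer;
       ref: array [1 .. 100] of integer   (* locations whose third operand must
                                             receive this symbol's address *)
     end;

var extptr: extref;                       (* set by searchextid, cleared by gen30 *)
    loc: integer;                         (* the location counter *)

function lookupext (id: alfa): extref;
  var e: extref;
begin
  new(e);                                 (* a real version first searches a table *)
  e^.name := id; e^.nrefs := 0;
  lookupext := e
end;

procedure emit30 (op, i, j, k: integer; relative: boolean);
begin
  (* pack the fields into the code image; elided *)
end;

procedure searchextid (id: alfa);
begin
  extptr := lookupext(id)
end;

procedure gen30 (op, i, j, k: integer; relative: boolean);
begin
  if extptr <> nil then begin             (* an external reference: remember the spot *)
    extptr^.nrefs := extptr^.nrefs + 1;   (* lgoend later emits these lists as LINK *)
    extptr^.ref[extptr^.nrefs] := loc;
    extptr := nil
  end;
  emit30(op, i, j, k, relative);
  loc := loc + 1
end;

A call of sqrt would thus be emitted as searchextid followed by gen30 with the RJ operation
code and the absolute value 0 as the third operand.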
spaces, and then classified. Based upon the classification, ad hoc analysis routines are used to
deal with the parts of the statement. All of these routines have similar structures: They scan
the statement from left to right, extracting each operand and making an entry for it in the
definition table if one does not already exist, and building a linear list of operator/operand
pairs. The operator of the pair is the operator that preceded the operand; for the first pair
it is the statement class. An operand is represented by a pointer to the definition table plus
its type and kind (constant, simple variable, array, etc.). The type and kind codes are also in
the definition table entry, and are retained in the list solely to simplify access.
Phase 10 performs only a partial syntactic analysis of the source program. It does not
determine the tree structure within a statement, but it does extract the statement number
and classify some delimiters that have multiple meaning. For example, it replaces `(' by `left
arithmetic parenthesis', `left subscript parenthesis' or `function parenthesis' as appropriate.
Name analysis is rudimentary in FORTRAN because the meaning of an identifier is inde-
pendent of the structure of a program unit. This means that no possessions are required, and
the symbol and definition tables can be integrated without penalty. Symbol lookup uses a
simple linear scan of the chained definition table entries, but the organization of the chains is
FORTRAN-specific: There is one ordered chain for each of the six possible identifier lengths,
and each chain is doubly-linked with the header pointing to the center of the chain. Thus a
search on any chain only involves half the entries. (The header is moved as entries are added
to a chain, in order to maintain the balance.) Constants, statement numbers and common
block names also have entries in the definition table. Three chains are used for constants, one
for each allowable length (4, 8 or 16 bytes), and one each for statement numbers and common
block names.
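A sketch of such a centered search (declarations assumed for illustration; insertion, which
keeps the header pointing at the center, is omitted):

type alfa6 = packed array [1 .. 6] of char;
     entryref = ^entry;
     entry = record
       name: alfa6;
       prev, next: entryref               (* doubly-linked, ordered chain *)
     end;

function search (center: entryref; id: alfa6): entryref;
  var p: entryref; done: boolean;
begin
  p := center; done := false;
  while not done do
    if p = nil then done := true
    else if p^.name = id then done := true
    else if id < center^.name then p := p^.prev   (* walk toward smaller names *)
    else p := p^.next;                            (* walk toward larger names *)
  search := p                  (* nil if id is absent; a real version would stop
                                  at the first name past id *)
end;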
The only semantic analysis done during Phase 10 is `declaration processing'. Type, di-
mension, common and equivalence statements are completely processed and the results sum-
marized in the definition table. Because FORTRAN does not require that identifiers be
declared, attribute information must also be gathered from applied occurrences. A minor use
of the attribute information is in the classification of left parentheses (mentioned above), be-
cause FORTRAN does not make a lexical distinction between subscript brackets and function
parentheses.
Phase 15 completes the syntactic analysis, converting the lists of operator/operand pairs
to lists of quadruples where appropriate. Each quadruple consists of an operator, a target type
and three pointers to the definition table. This means that phase 15 also creates a definition
table entry for every anonymous intermediate result. Such `temporary names' are treated
exactly like programmer-defined variables in subsequent processing, and may be eliminated by
various optimizations. The quadruples are chained in a correct (but not necessarily optimum)
execution order and gathered into basic blocks.
Semantic analysis is also completed during phase 15, with all operator identification and
consistency checking done as the quadruples are built. The target type is expressed as a
general type (logical, integer, real) plus an operand type (short, long) for each operand and
for the result.
The syntactic and semantic analysis tasks of phase 15 are carried out by an overlay segment
known as PHAZ15, which also gathers defined/used information for common subexpression
and dead variable analysis. This information is stored in basic block headers as discussed
in Chapter 13. Finally, PHAZ15 links the basic block headers to both their predecessors
and their successors, describing the flowgraph of the program and preparing for dominance
analysis.
CORAL is the second overlay segment of phase 15, which carries out the memory mapping
task. The algorithm is essentially that discussed in Section 10.1, but its only function is to
assign addresses to constants and variables (in other words, to map the activation record).
There are no variant records, but equivalence statements cause variables to share storage. By
convention, the activation record base is in register 13. The layout of the activation record is
given in Figure 14.3. It is followed immediately by the code for the program unit. (Remember
that storage allocation is static in FORTRAN.) The size of the save area (72 bytes) and its
alignment (8) are fixed by the implementation, as is the size of the initial contents for register
12 (discussed below). Storage for the computed GOTO tables and the parameter lists has
already been allocated by Phase 10. CORAL allocates storage for constants first, then
for simple variables and then for arrays. Local variables and arrays mentioned in equivalence
statements come next, completing this part of the activation record. Finally the common
blocks specified by the program unit are mapped as separate areas.
Save area
Initial contents for register 12
Branch tables for computed GOTO's
Parameter lists
Constants and local variables
Address values (`adcons')
Namelist dictionaries
Compiler-generated temporaries
Label addresses
Figure 14.3: FORTRAN H Activation Record
System/360 access paths limit the maximum displacement to 4095. When a larger dis-
placement is generated during CORAL processing, the compiler defines an adcon variable
- a new activation record base - and resets the displacement relative to this new base for
further processing. CORAL does not place either adcons or temporaries into the activation
record at this time, because they may be deleted during optimization.
Phase 20 assigns operands to registers. If the user has specified optimization level 0, the
compiler treats the machine as having one accumulator, one base register and one register for
specifying jump addresses (Table 14.6). Machine simulation (Section 10.3.1) is used to avoid
redundant loads and stores, but no change is made in the execution order of the quadruples.
Attributes are added to the quadruples, specifying the register or base register used for each
operand and for the result.
Level 1 optimization makes use of a pool of general-purpose registers, as shown in Ta-
ble 14.6. Register 13 is always reserved as the base of the activation record. A decision
about whether to reserve some or all of registers 9-12 is made on the basis of the number of
quadruples output by phase 15. This statistic is available prior to register allocation, and it
predicts the size of the subprogram code. Once the register pool is fixed, phase 20 performs
local register assignment within basic blocks and global assignment over the entire program
unit. Again, the order of the quadruples is unchanged and attributes giving the registers used
for each operand or memory access path are added to the quadruples.
Common subexpression elimination, live/dead analysis, code motion and strength reduc-
tion are all performed at optimization level 2. The register assignment algorithms used on
the entire program unit at level 1 are then applied to each loop of the modified program,
starting with the innermost and ending with the entire program unit. This guarantees that
the register assignment within an inner loop will be determined primarily by the activity
of operands within that loop, whereas at level 1 it may be influenced by operand activity
elsewhere in the program.
The basic implementation used for a branch is to load the target address of the branch
into a register and then execute an RR-format branch instruction. This requires an adcon for
every basic block whose first instruction is a branch target. If a register already happened to
hold an address less than 4096 bytes lower than the branch target, however, both the load and
the adcon would be unnecessary. A single RX-format branch instruction would suffice. Thus
the compiler reserves registers to act as code bases. To understand the mechanism involved,
we must consider the layout of information in storage more carefully.
Register   Assignment at optimization level 0        Assignment at levels 1,2
 0         Operands and results
 1
 2
 3         Not used
 4                                                   Operands and results
 5         Branch addresses                          Selected logical operands
 6                                                   Operands representing index values
 7                                                   Base addresses
 8
 9         Not used
10                                                   Code bases or operands and results
11
12                                                   Adcon base
13         Activation record base
14         Computed GOTO, logical results            Operands and results
           of comparisons
15         Computed GOTO
Table 14.6: General-Purpose Register Assignment by FORTRAN H
We have already seen that phase 15 allocates activation record storage for constants
and programmer-defined variables, generating adcons as necessary to satisfy the displace-
ment limit of 4095. When register allocation is complete, all adcons and temporary vari-
ables that have not been eliminated are added to the activation record. The adcons
must all be directly addressable, since they must be loaded to provide base addresses for
memory access. If they are not all within 4095 bytes of the activation record base then
the reserved register 12 is assumed to contain either the address of the first adcon or
(base address of the activation record + 4096), whichever is larger. It is assumed that
the number of adcons will never exceed 1024 (although this is theoretically possible, given
the address space of System/360) and hence all adcons will be directly accessible via either
register 12 or register 13. (Note that a fail-safe decision to reserve register 12 can be made
on the basis of the phase 15 output, without regard to the number of quadruples.)
If the number of quadruples output from phase 15 is large enough, register 11 will be
reserved and initialized to address the 4096th byte beyond that addressed by register 12.
Similarly, for a larger number of quadruples, register 10 will be reserved and initialized to an
address 4096 larger than register 11. Finally, register 9 will be reserved and initialized for an
even larger number of quadruples. Phase 20 can calculate the maximum possible address of
each basic block. Those lying within 4096 bytes of one of the reserved registers are marked
with the register number and displacement. The adcon corresponding to the basic block label
is then deleted. (These deletions, plus the ultimate shortening of the basic blocks due to
optimization of the branch instructions, can never invalidate the addressability conditions on
the basic blocks.)
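The addressability computation itself is simple arithmetic. The following hypothetical helper
(interface and names ours) yields the covering reserved register and displacement, or 0 if the
basic branch implementation must be used:

function coveringreg (blockaddr, base12, lowreg: integer;
                      var disp: integer): integer;
  var r, base: integer;
begin
  coveringreg := 0;
  base := base12;                         (* register 12 covers base12 .. base12+4095 *)
  for r := 12 downto lowreg do begin      (* register 11 starts 4096 higher, and so on *)
    if (blockaddr >= base) and (blockaddr < base + 4096) then begin
      disp := blockaddr - base;
      coveringreg := r
    end;
    base := base + 4096
  end
end;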
The branch optimization described in the previous paragraphs is carried out only at
optimization levels 1 and 2. At optimization level 0 the basic implementation is used for all
branches.
Phase 25 uses decision tables to select the proper sequence of machine instructions. The
algorithm is basically that of Section 10.3.2, except that the action stub of the decision table
is simply a sequence of instruction templates. Actions such as swap and lreg (Figure 10.13)
have already been carried out during phase 20. There is conceptually one table for every
quadruple operator. Actually, several tables are associated with families of operators, and
the individual operator modifies the skeletons as they are extracted. The condition is selected
by a 4-bit status, which may have somewhat different meanings for different operators. It is
used as an index to select the proper column of the table, which in turn identifies the templates
to be used in implementing the operator.
FORTRAN H generates System/360 object modules, which are sequences of 80-character
card images (Figure 14.4). Each card image is output by a normal FORTRAN formatted
write statement. The first byte contains 2, which is the communication control character
STX (start of text). All other fields left blank in Figure 14.4 are unused. Columns 2-4 and
73-80 contain alphanumeric information as indicated, with the serial number consisting of
a four-character deck identifier and a four-digit sequence number. The remaining columns
simply contain whatever character happens to have the value of the corresponding byte as its
EBCDIC code. Thus 24-bit (3-byte) addresses occupy three columns and halfword (2-byte)
integers occupy two columns. Even though the length field n has a maximum value of 56, it
occupies a halfword because System/360 has no byte arithmetic.
Comparing Figure 14.4 with Figure 14.2, we see that essentially the same elements are
present. END optionally carries a transfer address, thus subsuming XFER. ESD plays the
roles of both PIDL and ENTR, and also specifies the symbols from LINK. Its purpose is
to describe the characteristics of the control sections associated with global symbols, and
to define short, fixed-length representations (the esdid's) for those symbols. The esdid in
columns 15-16 identifies a deck or external; only one symbol of these types may appear on
an ESD card. Entry symbols identify the control sections to which they belong (ldid), and
therefore they may be placed on any ESD card where space is available.
RLD provides the remaining function of LINK, and also that of the relocation bits in
TEXT. Each item of relocation information modifies the field at the absolute location specified
in the position esdid and address by either adding or subtracting the value identified by the
relocation esdid. Byte f determines whether the value will be added or subtracted, and also
specifies the width of the field being modified (which may be 1, 2, 3 or 4 bytes). If a sequence
of relocations involve the same esdid's then these specifications are omitted from the second
and subsequent relocations. (The rightmost bit of f is 1 if the following relocation does not
specify esdid's, 0 otherwise.)
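A decoding sketch for f: only the continuation bit is fixed by the description above, and the
positions assumed here for the sign and width bits are illustrative rather than the
authoritative layout:

procedure decodeflag (f: integer; var subtract: boolean;
                      var width: integer; var samenext: boolean);
begin
  samenext := odd(f);              (* rightmost bit: the next item omits the esdids *)
  subtract := odd(f div 2);        (* assumed position of the add/subtract bit *)
  width := (f div 4) mod 4 + 1     (* assumed: two bits encode the widths 1 .. 4 *)
end;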
The decision to use relocation bits on the Control Data machine and the RLD mechanism
on System/360 reflects a fundamental difference in the instruction sets: 30-bit instructions
on the 6000 Series often reference memory directly, and therefore relocatable addresses are
common in the text. On System/360, however, all references to memory are via values in
registers. Only the adcons are relocatable and therefore relocatable addresses are quite rare
in the text.
Compiler writers often have incorrect ideas about the source of bottlenecks in their code. Measurement of critical
parameters of the compiler as it is running is thus imperative. These parameters include the
sizes of various data structures and the states of various allocation and lookup mechanisms,
as well as an execution histogram [Waite, 1973b]. The only description of GIER ALGOL in
the open literature is the paper by Naur [1964] cited earlier, but a very similar compiler for
a variant of Pascal was discussed in great detail by Hartmann [1977].
Ammann [1975] gives an excellent account in German of the development of Zurich Pascal,
and partial descriptions are available in English [Ammann, 1974, 1977].
In addition to the Program Logic Manual [IBM, 1968], descriptions of FORTRAN H have
been given by Lowry and Medlock [1969] and Scarborough and Kolsky [1980]. These
treatments concentrate on the optimization performed by Phase 20, however, and give very
little information about the compiler as a whole.
Appendix A
The Sample Programming
Language LAX
In this Appendix we define the sample programming language LAX (LAnguage eXample),
upon which the concrete compiler design examples in this book are based. LAX illustrates
the fundamental problems of compiler construction, but avoids uninteresting complications.
We shall use extended Backus-Naur form (EBNF) to describe the form of LAX. The
differences between EBNF and normal BNF are:
Each rule is terminated by a period.
Terminal symbols of the grammar are delimited by apostrophes. (Thus the metabrackets
`<' and `>' of BNF are superfluous.)
The following abbreviations are permitted:
Abbreviation       Meaning
X ::= α(β)γ.       X ::= αYγ.  Y ::= β.
X ::= α[β]γ.       X ::= αγ | α(β)γ.
X ::= u+.          X ::= Y.  Y ::= u | Yu.
X ::= u*.          X ::= [u+].
X ::= α||t.        X ::= α(tα)*.
Here α, β and γ are arbitrary right-hand sides of rules, Y is a symbol that does not
appear elsewhere in the specification, u is either a single symbol or a parenthesized
right-hand side, and t is a terminal symbol.
For a more complete discussion of EBNF see Section 5.1.4.
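As an illustration of the last abbreviation, the rule
X ::= expression || ','.
is shorthand for
X ::= expression (',' expression)*.
so that X derives one or more expressions separated by commas; this form is used for bound
pairs, parameters and fields in Section A.3.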
The axiom of the grammar is program. EBNF rules marked with an asterisk in this
Appendix are included to aid in the description of the language, but they do not participate
in the derivation of any sentence. Thus they define useless nonterminals in the sense of
Chapter 5.
one described here for the given computation. The meaning of constructs that do not satisfy
the rules given here is undefined. Whether, and in what manner, a particular implementation
of LAX gives meaning to undefined constructs is outside the scope of this definition.
Before translation, a LAX program is embedded in the following block, which is then
translated and executed:
declare standard declarations begin program end
The standard declarations provide defining occurrences of the predefined identifiers given
in Table A.1. These declarations cannot be expressed in LAX.
Identifier   Meaning
boolean Logical type
false Falsity
integer Integer type
nil Reference to no object
real Floating point type
true Truth
Table A.1: Predefined Identifiers
1. Let R be the text of A, and let B be the block in which the LAX program is embedded.
2. Let R′ be the smallest range properly containing R, and let T be the text of R′ excluding
the text of all ranges nested within it.
3. If T does not contain a defining occurrence of I, and R′ is not B, then let R be R′ and
go to step (2).
4. If T contains a defining occurrence of I then that defining occurrence is D.
Identifier is a defining occurrence in the productions for label definition (A.2.0.6), itera-
tion (A.2.0.7), variable declaration (A.3.0.2), identity declaration (A.3.0.7), procedure dec-
laration (A.3.0.8), parameter (A.3.0.10), type declaration (A.3.0.12) and field (A.3.0.14).
All other instances of identifier are applied occurrences.
A.2.3 Blocks
The execution of a block begins with a consistent renaming: If an identifier has defining
occurrences in this block (excluding all blocks nested within it) then those defining occurrences
and all applied occurrences identifying them are replaced by a new identifier not appearing
elsewhere in the program.
After the consistent renaming, the declarations of the block are executed in the sequence
they were written and then the statements are executed as described for a statement list
(Section A.2.4). The result of this execution is the result of the block. The extent of the
result of a block must be larger than the execution of that block.
A.2.5 Iterations
The iteration
while expression do statement list end
is identical in meaning to the conditional clause:
if expression then
statement list;
while expression do statement list end
end
The iteration
for identifier from initial value to final value do statement list end
is identical in meaning to the block:
A.3 Declarations
A.3.0.1 declaration ::= variable declaration
    | identity declaration
    | procedure declaration
    | type declaration.
A.3.0.2 variable declaration ::= identifier ':' type specification
    | identifier ':' 'array' '[' (bound pair || ',') ']' 'of' type specification.
A.3.0.3 type specification ::= identifier
    | 'ref' type specification
    | 'ref' array type
    | procedure type.
A.3.0.4 bound pair ::= expression ':' expression.
A.3.0.5 array type ::= 'array' '[' ','* ']' 'of' type specification.
A.3.0.6 procedure type ::=
    'procedure' ['(' (type specification || ',') ')'] [result type].
A.3.0.7 identity declaration ::=
    identifier 'is' expression ':' type specification.
A.3.0.8 procedure declaration ::= 'procedure' identifier procedure.
A.3.0.9 procedure ::= ['(' (parameter || ',') ')'] [result type] ';' expression.
A.3.0.10 parameter ::= identifier ':' type specification.
A.3.0.11 result type ::= ':' type specification.
A.3.0.12 type declaration ::= 'type' identifier '=' record type.
A.3.0.13 record type ::= 'record' (field || ',') 'end'.
A.3.0.14 field ::= identifier ':' type specification.
A.3.0.15 * type ::= type specification | array type | procedure type.
See Section A.4 for Expressions.
A.3.1 Values, Types and Objects
Values are abstract entities upon which operations may be performed, types classify values
according to the operations that may be performed upon them, and objects are the concrete
instances of values that are operated upon. Two objects are equal if they are instances of the
same value. Two objects are identical if references (see below) to them are equal. Every object
has a specified extent, during which it can be operated upon. The extents of denotations,
the value nil (see below) and objects generated by new (Section A.4.4) are unbounded; the
extents of other objects are determined by their declarations.
The predefined identifiers boolean, integer and real represent the types of truth values,
integers and floating point numbers respectively. Values of these types are called primitive
values, and have the usual meanings.
An instance of a value of type ref t is a variable that can refer to (or contain) an object
of type t. An assignment to a variable changes the object to which the variable refers, but
does not change the identity of the variable. The predefined identifier nil denotes a value of
type ref t, for arbitrary t. Nil refers to no object, and may only be used in a context that
specifies the referenced type t uniquely.
Values and objects of array and record types are composite. The immediate components
of an array are all of the same type, and the simple selectors are integer tuples. The immediate
components of a record may be of different types, and the simple selectors are represented by
identifiers. No composite object may have a component of its own type.
Values of a procedure type are specifications of computations. If the result type is omitted,
then a call of the procedure yields no result and the procedure is called a proper procedure;
otherwise it is called a function procedure.
If two types consist of the same sequence of basic symbols and, for every identifier in that
sequence, the applied occurrences in one type identify the same defining occurrence as the
applied occurrences in the other, then the two types are the same. In all other cases, the two
types are different.
A.3.2 Variable Declarations
A variable referring to an undefined value (of the specified type) is created, and the identifier
represents this object. The extent of the created variable begins when the declaration is
executed and ends when execution of the smallest range containing the declaration is complete.
If the variable declaration has the form
identifier : t
then the created variable is of type ref t, and may refer to any value of type t. If, on the
other hand, it has the form
identifier : array [l1:u1, ..., ln:un] of t
then the created variable is of type ref array type, and may only refer to values having
the specified number of immediate components. The type of the array is obtained from
the variable declaration by deleting `identifier:' and each bound pair e1:e2; array [l1:
u1, ..., ln:un] of t specifies an array of this type with (u1 - l1 + 1) × ... × (un - ln + 1)
immediate components of type t. The bounds li and ui are integers with li ≤ ui.
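For example, the declaration a : array [1:3, 0:2] of integer creates a variable of type ref
array type whose values have (3 - 1 + 1) × (2 - 0 + 1) = 9 immediate components of type
integer.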
A.3.3 Identity Declarations
A new instance of the value (of the specified type) resulting from evaluation of the expression
is created, and the identifier represents this object. If the expression yields an array or
reference to an array, the new instance has the same bounds. The extent of the created
object is identical to the extent of the result of the expression.
A.3.4 Procedure Declarations
A new instance of the value (of the specified procedure type) resulting from copying the basic
symbol sequence of the procedure is created, and the identifier represents this object. The
extent of the created object begins when the declaration is executed and ends when execution
of the smallest block containing the declaration is complete.
Evaluation of the expression of a function procedure must yield a value of the given
result type.
The procedure type is obtained from the procedure declaration by deleting `identifier'
and `; expression', and removing `identifier :' from each parameter.
A.3.5 Type Declarations
The identifier represents a new record type defined according to the given specification.
A.4 Expressions
A.4.0.1 expression ::= assignment | disjunction.
A.4.0.2 assignment ::= name ':=' expression.
A.4.0.3 disjunction ::= conjunction | disjunction 'or' conjunction.
A.4.2 Coercions
The context in which a language element (statement, argument, expression, operand, name
as a component of an indexed object, procedure call, etc.) appears may permit a stated
set of types for the result of that element, prescribe a single type, or require that the result
be discarded. When the a priori type of the result does not satisfy the requirements of the
2. The block (or parenthesized expression) is executed. If it is not left by a jump, the
result is coerced to the result type of P (or voided, in the case of a proper procedure).
3. As soon as execution is completed, possibly by a jump, the substitution of step 1 is
reversed (i.e. the original call is restored).
The value yielded by the coercion in step (2) is the result of the procedure call.
A.4.6 Clauses
The expression in a conditional clause must deliver a Boolean result. If this result is true then the first statement list will be executed and its result will be taken as the result of the conditional clause; otherwise the second statement list will be executed and its result will be taken as the result of the conditional clause. The first alternative of a one-sided conditional clause, in which the second alternative is omitted, is voided.
The expression in a case clause must deliver an integer result. When the value of the expression is i and one of the case labels is i, the statement list associated with that case label will be executed and its result will be taken as the result of the case clause; otherwise the statement list following else will be executed and its result will be taken as the result of the case clause. All case labels in a case clause must be distinct.
The component statement lists of a clause must be balanced to ensure that the type of
the result yielded is the same regardless of which alternative was chosen. Balancing involves
coercing the result of each component statement list to a common type. If there is no one
type to which all of the result types are coercible then all the results are voided. When the
type returned by the clause is uniquely prescribed by the context then this type is chosen
as the common result type for all alternatives. If the context of the expression is such that
several result types are possible, the one leading to the smallest total number of coercions is
chosen.
Appendix B
Useful Algorithms For Directed Graphs
The directed graph is a formalism well-suited to the description of syntactic derivations, data structures and control flow. Such descriptions allow us to apply results from graph theory to a variety of compiler components. These results yield standard algorithms for carrying out analyses and transformations, and provide measures of complexity for many common tasks. In this appendix we summarize the terminology and algorithms most important to the remainder of the book.
B.1 Terminology
B.1 Definition
A directed graph is a pair (K, D), where K is a finite, nonempty set and D is a subset of K × K. The elements of K are called the nodes of the graph, and the elements of D are the edges.
Figure B.1a is a directed graph, and Figure B.1b shows how this graph might be represented pictorially.
K = {1, 2, 3, 4}
D = {(1, 2), (1, 3), (4, 4), (2, 3), (3, 2), (3, 4)}
a) The components of the graph
[pictorial representation of the four nodes and six edges omitted]
b) Pictorial representation
[Figure B.1c, the condensation graph with nodes K[1], K[2], K[3], is referred to later in this section]
In many cases, a label function, f, is defined on the nodes and/or edges of a graph. Such a function associates a label, which is an element of a finite, nonempty set, with each node or edge. We then speak of a graph with node or edge labels. The labels serve as identification of the nodes or edges, or indicate their interpretation. This is illustrated in Figure B.1b, where a function has been provided to map K into the set {1, 2, 3, 4}.
B.2 Definition
A sequence (k0, ..., kn) of nodes in a directed graph (K, D), n ≥ 1, is called a path of length n if (k_{i-1}, k_i) ∈ D, i = 1, ..., n. A path is called a cycle if k0 = kn.
An edge may appear more than once in a path: In the graph of Figure B.1, the sequence of edges (2,3), (3,2), (2,3), (3,4), (4,4), (4,4) defines the path (2,3,2,3,4,4,4) of length 6.
B.3 Definition
Let (K, D) be a directed graph. Partition K into equivalence classes Ki such that nodes u and v belong to the same class if and only if there is a cycle to which u and v belong. Let Di be the subset of edges connecting pairs of nodes in Ki. The directed graphs (Ki, Di) are the strongly connected components of (K, D).
The graph of Figures B.1a and B.1b has three strongly connected components:
K1 = {1}, D1 = {}
K2 = {4}, D2 = {(4, 4)}
K3 = {2, 3}, D3 = {(2, 3), (3, 2)}
Often we deal with graphs in which all nodes of a strongly connected component are
identical with respect to some property of interest. When dealing with this property, we can
therefore replace the original graph with a graph having one node for each strongly connected
component.
B.4 Definition
Let P = {K1, ..., Kn} be a partition of the node set of a directed graph (K, D). The reduction of (K, D) with respect to the partition P is the directed graph (K', D') such that K' = {k1, ..., kn} and D' = {(ki, kj) | i ≠ j, and (u, v) is an element of D for some u ∈ Ki and v ∈ Kj}.
We term the subsets Ki of an (arbitrary) partition blocks. The reduction with respect to
strongly connected components is the condensation graph.
The condensation graph of Figure B.1b is shown in Figure B.1c. Since every cycle lies
wholly within a single strongly connected region, the condensation graph has no cycles.
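The following sketch shows one way to compute strongly connected components and the condensation graph. It uses Tarjan's lowlink-based algorithm (the same idea referred to in Exercise B.5 below); the Python notation and the adjacency-list representation succ are our own assumptions, not the module notation used in this appendix.

def strongly_connected_components(succ):
    # succ: dict mapping each node to the list of its successors.
    # Returns the list of strongly connected components (as sets).
    index, lowlink = {}, {}
    stack, on_stack, sccs = [], set(), []
    counter = [0]

    def visit(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:        # v is the root of a component
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in succ:
        if v not in index:
            visit(v)
    return sccs

def condensation(succ, sccs):
    # The reduction with respect to the partition into strongly
    # connected components (Definitions B.3 and B.4).
    block = {v: i for i, scc in enumerate(sccs) for v in scc}
    return {(block[u], block[v])
            for u in succ for v in succ[u] if block[u] != block[v]}

# The graph of Figure B.1 yields the components {1}, {4} and {2, 3}:
succ = {1: [2, 3], 2: [3], 3: [2, 4], 4: [4]}
print(strongly_connected_components(succ))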
B.5 Definition
A directed acyclic graph is a directed graph that contains no cycles.
B.6 Definition
A directed acyclic graph is called a tree with root k0 if for every node k ≠ k0 there exists exactly one path (k0, ..., k).
These two special classes of graphs are illustrated in Figure B.2.
If a tree has an edge (k, k'), we say that k' is a child of k and k is the parent of k'. Note that Definition B.6 permits a node to have any number of children. Because the path from the root is unique, however, every node k ≠ k0 has exactly one parent. The root, k0, is the only node with no parent. A tree has at least one leaf, which is a node with no children. If
[pictorial representations omitted: a directed acyclic graph on nodes 1-9, and a tree with root 0 and nodes 1-8]
a) A directed acyclic graph
b) A tree
Figure B.2: Special Cases of Directed Graphs
there is a path in a tree from node k to node k', we say that k' is a descendant of k and k is an ancestor of k'.
B.7 Definition
A tree is termed ordered if, for every node, a linear order is defined on the children of that node.
If we list the children of a node k0 in an ordered tree, we shall always do so in the sense of the ordering; we can therefore take the enumeration as the ordering. The first child of k0 is also called the left child; the child node that follows k in the order of successors of k0 is called the right sibling of k. In Figure B.2b, for example, we might order the children of a node according to the magnitude of their labels. Thus 1 would be the left child of 0, 2 would be the right sibling of 1, and 3 the right sibling of 2. 3 has no right siblings and there is no relationship between 6 and 7.
In an ordered tree, the paths leaving the root can be ordered lexicographically: Consider two paths x = (x0, ..., xm) and y = (y0, ..., yn) with m ≤ n and x0 = y0 being the root. Because both paths begin at the root, there exists some i ≥ 0 such that xj = yj, j = 0, ..., i. We say that x < y either if i = m and i < n, or if x_{i+1} < y_{i+1} according to the ordering of the children of xi (= yi). Since there is exactly one path from the root to any node in the tree, this lexicographic ordering of the paths specifies a linear ordering of all nodes of the tree.
B.8 Definition
A cut in a tree (K, D) is a subset, C, of K such that for each leaf km of (K, D) exactly one element of C lies on the path (k0, ..., km) from the root k0 to that leaf.
Examples of cuts in Figure B.2b are {0}, {1, 2, 3}, {1, 2, 7, 8} and {4, 5, 6, 7, 8}.
In an ordered tree, the nodes of a cut are linearly-ordered on the basis of the ordering of
all nodes. When we describe a cut in an ordered tree, we shall always write the nodes of that
cut in the sense of this order.
B.9 Definition
A spanning forest for a directed graph (K, D) is a set of trees {(K1, D1), ..., (Kn, Dn)} such that the Ki's partition K and each Di is a (possibly empty) subset of D.
All of the nodes of a directed graph can be visited by traversing the trees of some spanning forest. The spanning forest used for such a traversal is often the one corresponding to a depth-first search:
procedure depth_first_search (k : node);
begin
  mark k as having been visited;
  for each successor k' of k do
    if k' has not yet been visited then depth_first_search (k')
end; (* depth_first_search *)
To construct a spanning forest, this procedure is applied to an arbitrary unvisited node and
repeated so long as such nodes exist.
A depth-first search can also be used to number the nodes in the graph:
B.10 Definition
A depth-first numbering is a permutation (k1, ..., kn) of the nodes of a directed graph (K, D) such that k1 is the first node visited by a particular depth-first search, k2 the second and so forth.
Once a spanning forest {(K1, D1), ..., (Kn, Dn)} has been defined for a graph (K, D), the set D can be partitioned into four subsets:
Tree edges, elements of D1 ∪ ... ∪ Dn.
Forward edges, (kp, kq) such that kp is an ancestor of kq in some tree Ki, but (kp, kq) is not an element of Di.
Back edges, (kq, kp) such that either kp is an ancestor of kq in some tree Ki or p = q.
Cross edges, (kp, kq) such that kp is neither an ancestor nor a descendant of kq in any tree Ki.
These definitions are illustrated by Figure B.3.
Figure B.3b shows a spanning forest and depth-first numbering for the graph of Figure B.3a. The forest has two trees, whose roots are nodes 1 and 7 respectively. All edges appearing in Figure B.3b are tree edges. In Figure B.3a (using the numbers of Figure B.3b), (1,3) is a forward edge, (4,2) and (7,7) are back edges, and (5,4), (7,3) and (7,6) are cross edges.
[pictorial representations omitted]
a) A directed graph
b) A depth-first numbering and spanning forest for (a)
Figure B.3: Depth-First Numbering
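The depth-first numbering and the four-way edge classification can be sketched as follows (Python, with our own adjacency-list representation; the classification rules follow the four cases above):

def classify_edges(succ):
    # Returns (dfn, kind): dfn is a depth-first numbering of the nodes,
    # kind maps each edge to 'tree', 'forward', 'back' or 'cross'.
    dfn, kind = {}, {}
    active = set()          # nodes whose depth-first search is still open
    counter = [1]

    def visit(v):
        dfn[v] = counter[0]
        counter[0] += 1
        active.add(v)
        for w in succ[v]:
            if w not in dfn:
                kind[(v, w)] = 'tree'
                visit(w)
            elif w in active:
                kind[(v, w)] = 'back'       # ancestor, or p = q (self-loop)
            elif dfn[w] > dfn[v]:
                kind[(v, w)] = 'forward'    # w is a descendant of v
            else:
                kind[(v, w)] = 'cross'
        active.discard(v)

    for v in succ:              # one call per tree of the spanning forest
        if v not in dfn:
            visit(v)
    return dfn, kind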
A directed graph that is a tree can, of course, be represented by the abstract data type of Figure B.4. In this case, however, a simpler representation (Figure B.5) could also be used. This simplification is based upon the fact that any node in a tree can have at most one parent. Note that the edges do not appear explicitly, but are implicit in the node linkage. The abstract data structure is set up by instantiating the module with the proper number of nodes and then invoking define_edge once for each edge to specify the nodes at its head and tail. If it is desired that the order of the sibling list reflect a total ordering defined on the children of a node, then the sequence of calls on define_edge should be the opposite of this order.
A partition is defined by a collection of blocks (sets of nodes) and a membership relation node ∈ block. The representation of the partition must be carefully chosen so that operations upon it may be carried out in constant time. Figure B.6 defines such a representation.
When a partition module is instantiated, its block set is empty. Blocks may be created by invoking new_block, which returns the index of the new block. This block has no members initially. The procedure add_node is used to make a given node a member of a given block. Since each node can be a member of only one block, this procedure must delete the given node from the block of which it was previously a member (if such exists).
The status of a partition can be determined by invoking number_of_blocks, block_containing, node_count, first_node and next_node. If a node does not belong to any block, then block_containing returns 0; otherwise it returns the number of the block of which the node is a member. Application of the function node_count to a block yields the number of nodes in that block. The procedures first_node and next_node work together to access all of the members of a block: A call of first_node returns the first member of a specific block. (If the block is empty then first_node returns 0.) Each subsequent invocation of next_node returns the next member of that block. When all members have been accessed, next_node returns 0.
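Figure B.6 itself is not reproduced here; the sketch below implements the same interface in Python, trading the constant-time doubly-linked representation for plain lists (so add_node is not constant-time in this sketch). The operation names follow the text; everything else is our assumption.

class Partition:
    def __init__(self, n):
        self.block_of = [0] * (n + 1)   # 0 means 'member of no block'
        self.members = []               # members[b-1] lists block b's nodes

    def new_block(self):
        self.members.append([])
        return len(self.members)

    def add_node(self, node, block):
        old = self.block_of[node]
        if old != 0:                    # a node belongs to one block only
            self.members[old - 1].remove(node)
        self.block_of[node] = block
        self.members[block - 1].append(node)

    def number_of_blocks(self):
        return len(self.members)

    def block_containing(self, node):
        return self.block_of[node]

    def node_count(self, block):
        return len(self.members[block - 1])

    def nodes(self, block):
        # stands in for the first_node/next_node pair of the text
        return list(self.members[block - 1])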
[pictorial representation omitted]
Solid lines represent tree edges. Dashed lines represent actual links maintained by the tree module.
a) Pictorial representation
module tree (n : integer);
(* Representation of a tree
   n = Number of nodes in the tree *)
var
  node : array [1 .. n] of record parent, child, sibling : integer end;
  i : integer;
public procedure define_edge (hd, tl : integer);
begin (* define_edge *)
  with node[tl] do
    begin parent := hd; sibling := node[hd].child end;
  node[hd].child := tl;
end; (* define_edge *)
public function parent (n : integer) : integer;
begin parent := node[n].parent end;
public function child (n : integer) : integer;
begin child := node[n].child end;
public function sibling (n : integer) : integer;
begin sibling := node[n].sibling end;
begin (* tree *)
  for i := 1 to n do
    with node[i] do parent := child := sibling := 0;
end; (* tree *)
b) Abstract data type
Figure B.5: Simplification for a Tree
The membership relation is embodied in a doubly-linked list. Each node specifies the block of which it is a member, and each block specifies the number of members. Figure B.6 uses a single array to store both node and block information. This representation greatly simplifies the treatment of the doubly-linked list, since the last and next fields have identical meanings for node and block entries. The member field specifies the number of members in a block entry, but the block of which the node is a member in a node entry. For our problems, the number of partitions can never exceed the number of nodes. Hence the array is allocated with twice as many elements as there are nodes in the graph being manipulated. (Element 0 is included to avoid zero tests when accessing the next element in a node list.) The first half of the array is indexed by the node numbers; the second half is used to specify the blocks of the partition. Note that the user is not aware of this offset in block indices because all necessary translation is provided by the interface procedures.
B.3.2 Refinement
Consider a graph (K, D) and a partition P = {P1, ..., Pm} of K with m ≥ 2. We wish to find the partition R = {R1, ..., Rr} with smallest r such that:
1. Each Rk is a subset of some Pj (`R is a refinement of P').
2. If a and b are elements of Rk then, for each (a, x) ∈ D and (b, y) ∈ D, x and y are elements of some one Rm (`R is compatible with D').
The state minimization problem discussed in Section 6.2.2 and the determination of struc-
tural equivalence of types from Section 9.1.2 can both be cast in this form.
The obvious strategy for making a refinement is to check the successors of all nodes in a single element of the current partition. This element must be split if two nodes have successors in different elements of the partition. To obtain the refinement, split the element so that these two nodes lie in different elements. The refined partition is guaranteed to satisfy condition (1). The process terminates when no element must be split. Since a partition in which each element contains exactly one node must satisfy condition (2), the process of successive refinement must eventually terminate. It can be shown that this algorithm is quadratic in the number of nodes.
By checking predecessors rather than successors of the nodes in an element, it is possible to reduce the asymptotic behavior of the algorithm to O(n log n), where n is the number of nodes. This reduction is achieved at the cost of a more complex algorithm, however, and may not be worthwhile for small problems. In the remainder of this section we shall discuss the O(n log n) algorithm, leaving the simpler approach to the reader (Exercise B.6).
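Before turning to Figure B.9, here is a sketch of the simpler, successor-based strategy just described (essentially Exercise B.6). It assumes a single-valued successor function, as in the discussion following Figure B.9; the Python representation is our own:

def refine_simple(blocks, f):
    # blocks: initial partition, a list of sets of nodes.
    # f: dict giving each node's (single) successor.
    # Splits blocks until the partition is compatible with f.
    changed = True
    while changed:
        changed = False
        block_of = {v: i for i, b in enumerate(blocks) for v in b}
        refined = []
        for b in blocks:
            groups = {}                 # members of b, keyed by the
            for v in b:                 # block of their successor
                groups.setdefault(block_of[f[v]], set()).add(v)
            refined.extend(groups.values())
            changed = changed or len(groups) > 1
        blocks = refined
    return blocks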
procedure refine (p : partition; f : graph);
(* Make p be the coarsest partition compatible with p and f *)
var
  pending : fixed_depth_stack (f.n);
  i : integer;
procedure split (block : integer);
var
  inv : inverse (f, block, p); (* Construct the inverse of block *)
  b, k, n : integer;
begin (* split *)
  k := inv.next_block;
  while k ≠ 0 do
    begin (* Pk ∩ f⁻¹(block) ≠ ∅, but Pk is not a subset of f⁻¹(block) *)
      b := p.new_block;
      while (n := inv.common_node) ≠ 0 do p.add_node (n, b);
      if pending.member (k) or
         (p.element_count (k) < p.element_count (b))
      then pending.push (b)
      else pending.push (k);
      k := inv.next_block;
    end
end; (* split *)
begin (* refine *)
  for i := 1 to p.block_count do pending.push (i);
  repeat pending.pop (i); split (i) until pending.empty
end; (* refine *)
Figure B.9: Refinement Algorithm
The refinement procedure of Figure B.9 accepts a graph G = (K, D) and a partition {P1, ..., Pm} of K with m ≥ 2. The elements of D correspond to a mapping f : K → K for which (k, k') is an element of D if f(k) = k'. Refine inspects the inverse mappings f⁻¹(Pj). A set Pk must be split into two subsets if and only if Pk ∩ f⁻¹(Pj) is nonempty for some j, and yet Pk is not a subset of f⁻¹(Pj). The two subsets are then Pk' = (Pk ∩ f⁻¹(Pj)) and Pk'' = Pk − Pk'. This split must be carried out once for every Pj. If Pj contributes to the splitting of Pk and is itself split later, both subsets must again be used to split other partitions.
The first step in each execution of the split procedure is to construct the inverse of block Pj. Next the blocks Pk for which Pk ∩ f⁻¹(Pj) is nonempty but Pk is not a subset of f⁻¹(Pj) are split, and the smaller of the two components is returned to the stack of blocks yet to be considered.
Figure B.10 defines an abstract data type that can be used to represent f⁻¹(Pj). When inverse is instantiated, it represents an empty set. Nodes are added to the set by invoking inv_node. After all nodes belonging to inverse(j) have been added to the set, we wish to consider exactly those blocks that contain elements of inverse(j) but are not themselves subsets of inverse(j). The module allows us to obtain a block satisfying these constraints by invoking next_block. (If next_block returns 0, no more such blocks exist.) Once a block has been obtained, successive invocations of common_node yield the elements common to that block and inverse(j). Note that each of the operations provided by the abstract data type requires constant time.
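Figure B.10 is not reproduced here, but its behavior can be sketched as follows (in Python, building on the Partition sketch given earlier; pred, giving the predecessors of each node under f, is an assumption, and this version builds the whole set eagerly rather than via inv_node):

class Inverse:
    # Represents f^-1(block): its elements, grouped by the blocks of p
    # that intersect it without being contained in it.
    def __init__(self, pred, block_nodes, p):
        members = {w for v in block_nodes for w in pred[v]}
        groups = {}
        for w in members:
            groups.setdefault(p.block_containing(w), set()).add(w)
        self.pending = [(k, g) for k, g in groups.items()
                        if len(g) < p.node_count(k)]
        self.current = iter(())

    def next_block(self):
        if not self.pending:
            return 0
        k, g = self.pending.pop()
        self.current = iter(g)
        return k

    def common_node(self):
        return next(self.current, 0)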
B.3.3 Coloring
The problem of minimizing the number of rows in a parse table can be cast as a problem in graph theory as follows: Let each row correspond to a node. Two nodes k and k' are adjacent (connected by edges (k, k') and (k', k)) if the corresponding rows are incompatible and therefore cannot be combined. We seek a partition of the graph such that no two adjacent nodes belong to the same block of the partition. The rows corresponding to the nodes in a single block of the partition then have no incompatibilities, and can be merged. Clearly we would like to find such a partition having the smallest number of blocks, since this will result in maximum compression of the table.
This problem is known in graph theory as the coloring problem, and the minimum number of partitions is the chromatic number of the graph. It has been shown that the coloring problem is NP-complete, and hence we seek algorithms that efficiently approximate the optimum partition.
Most approximation algorithms are derived from backtracking algorithms that decide whether a given number of colors is sufficient for the specified graph. If such an algorithm is given a number of colors equal to the number of nodes in the graph then it will never need to backtrack, and hence all of the mechanism for backtracking can be removed. A good backtracking algorithm contains heuristics designed to prune large portions of the search tree, which, in this case, implies using as few colors as possible for trial colorings. But it is just these heuristics that lead to good approximations when there is no backtracking!
A general approach is to make the most constrained decisions rst. This can be done
by sorting the nodes in order of decreasing incident edge count. The rst node colored has
the maximum number of adjacent nodes and hence rules out the use of its color for as many
nodes as possible. We then choose the node with the most restrictive constraint on its color
next, resolving ties by taking the one with most adjacent nodes. At each step we color the
chosen node with the lowest possible color.
Figure B.11 gives the complete coloring algorithm. We assume that g contains no cycles of length 1. (A graph with cycles of length 1 cannot be colored because some node is adjacent to itself and thus, by definition, must have a different color than itself.) First we partition the nodes according to number of adjacencies, coloring any isolated nodes immediately. Because of our assumptions, block g.n of sort must be empty. The coloring loop then scans the nodes in order of decreasing adjacency count, seeking the most restrictive choice of colors. This node is then assigned the lowest available color, and that color is made unavailable to all of the node's neighbors. Note that we mark a node as having been colored by moving it to block g.n of the sort partition.
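A sketch of this greedy strategy in Python (the adjacency-set representation and names are ours; Figure B.11's partition-based bookkeeping is replaced by straightforward searching, so this version trades the efficiency of the original for brevity):

def color_graph(adj):
    # adj: dict mapping each node to the set of its neighbors
    # (no node may be adjacent to itself). Returns node -> color (1, 2, ...).
    color = {}
    uncolored = set(adj)
    while uncolored:
        def constraint(v):
            used = {color[w] for w in adj[v] if w in color}
            # most distinct neighbor colors first; break ties by degree
            return (len(used), len(adj[v]))
        v = max(uncolored, key=constraint)
        used = {color[w] for w in adj[v] if w in color}
        c = 1
        while c in used:            # lowest available color
            c += 1
        color[v] = c
        uncolored.remove(v)
    return color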
Exercises
B.1 The graph module of Figure B.4 is unpleasant when the number of edges is not known
at the time the module is instantiated: If e is not made large enough then the program
will fail, and if it is made too large then space will be wasted.
(a) Change the module definition so that the array edge is not present. Instead, each edge should be represented by a record allocated dynamically by define_edge.
(b) What is the lifetime of the edge storage in (a)? How can it be recovered?
B.2 Modify the module of Figure B.5 to save space by omitting the parent field of each node. Provide access to the parent via the sibling pointer of the last child. What additional information is required? If the two versions of the module were implemented on a machine with which you are familiar, would there be any difference in the actual storage requirements for a node? Explain.
B.3 Consider the partition module of Figure B.6.
(a) Show that if array p is defined with lower bound 1, execution of add_node may abort due to an illegal array reference. How can this error be avoided if the lower bound is made 1? Why is initialization of p[0] unnecessary?
(b) What changes would be required if we wished to remove a node from all blocks by using add_node to add it to a fictitious block 0?
(c) Under what circumstances would the use of first_node and next_node to scan a block of the partition be unsatisfactory? How could this problem be overcome?
B.4 Explain why the elements of stack are initialized to 0 in Figure B.7 and why the pop
operation resets the element to 0. Could top be set to 0 initially also?
B.5 Consider the application of strongly connected components to the graph of Figure B.3a. Assume that the indexes of the nodes in the graph were assigned `by column': The leftmost node has number 1, the next three have numbers 2-4 (from the top) and the rightmost three have numbers 5-7. Also assume that the lists of edges leaving a node are ordered clockwise from the 12 o'clock position.
(a) Show that the nodes will be visited in the order given by Figure B.3b.
(b) Give a sequence of snapshots showing the procedure activations and the changes
in lowlink .
(c) Show that the algorithm partitions the graph correctly.
B.6 Consider the refinement problem of Section B.3.2.
(a) Implement a Boolean procedure split(block) that will refine block according to the successors of its nodes: If all of the successors of nodes in block lie in the same block of p, then split(block) returns false and p is unchanged. Otherwise, suppose that the successors of nodes in block lie in n distinct blocks, n > 1. Add n − 1 blocks to p and distribute the nodes of block among block and these new blocks on the basis of their successor blocks. Split(block) returns true in this case.
(b) Implement refine as a loop that cycles through the blocks of p, applying split to each. Repeat the loop so long as any one of the applications of split yields true. (Note that for each repetition of the loop, the number of blocks in p will increase by at least one.)
B.7 Consider the problem of structural equivalence of types discussed in Section 9.1.2. We can solve this problem as follows:
(a) Define a graph, each of whose nodes represents a single type. There is an edge from node k1 to node k2 if type k1 `depends upon' type k2. (One type `depends upon' another if its definition uses that type. For example, if k1 is declared to be of type ref k2 then k1 `depends upon' k2.)
(b) Define a partition that groups all of the `similarly defined' types. (Two types are `similarly defined' if their type definitions have the same structure, ignoring any type specifications appearing in them. For example, ref k1 and ref k2 are `similarly defined'.)
(c) Apply the refinement algorithm of Section B.3.2. Assume that array types are `similarly defined' if they have the same dimensions, and record types are `similarly defined' if they have the same field identifiers in the same order. Apply the procedure outlined above to the structural equivalence problem of Exercise 2.2.
B.8 Consider the problem of state minimization discussed in Section 6.2.2. The state diagram is a directed graph with node and edge labels. It defines a function f(i, s), where i is an input symbol selected from the set of edge labels and s is a state selected from the set of node labels.
(a) Assume that the state diagram has been completed by adding an error state, so that there is an edge for every input symbol leaving every node. Define a three-block partition on the graph, with the error state in one block, all final states in the second and all other states in the third. Consider the edges of the state diagram to define a set of functions, fi, one per input symbol. Show that the states of the minimum automaton correspond to the nodes of the reduction of the state diagram with respect to the refinement of the three-block partition compatible with all fi.
(b) Show that Definition B.1 permits only a single edge directed from one specific node to another. Is this limitation enforced by Figure B.4? If so, modify Figure B.4 to remove it.
(c) Modify Figure B.4 to allow attachment of integer edge labels.
(d) Modify Figure B.9 to carry out the refinement of a graph with edge labels, treating each edge label as a distinct function.
(e) Modify the result of (d) to make completion of the state diagram unnecessary:
When a particular edge label is missing, assume that its destination is the error
state.
References
We have repeatedly stressed the need to derive information about a language from the definition of that language rather than from particular implementation manuals or textbooks describing the language. In this book, we have used the languages listed below as sources of examples. For each language we give a reference that we consider to be the `language definition'. Any statement that we make regarding the language is based upon the cited reference, and does not necessarily hold for particular implementations or descriptions of the language found elsewhere in the literature.
Ada The definition of Ada was still under discussion when this book went to press. We have based our examples upon the version described by Ichbiah [1980].
ALGOL 60 Naur [1963].
ALGOL 68 van Wijngaarden et al. [1975].
BASIC Almost every equipment manufacturer provides a version of this language, and the strongest similarity among them is the name. We have followed the standard for `minimal BASIC' ANSI [1978b].
COBOL ANSI [1968]
Euclid Lampson et al. [1977]
FORTRAN We have drawn examples from both the 1966 ANSI [1966] and 1978 ANSI [1978a] standards. When we refer simply to `FORTRAN', we assume the 1978 standard. If we are pointing out differences, or if the particular version is quite important, then we use `FORTRAN 66' and `FORTRAN 77' respectively. (Note that the version described by the 1978 standard is named `FORTRAN 77', due to an unforeseen delay in publication of the standard.)
LIS Rissen et al. [1974].
LISP The examples for which we use LISP depend upon its applicative nature, and hence we
rely upon the original description McCarthy [1960] rather than more modern versions.
MODULA-2 Wirth [1980].
Pascal Pascal was in the process of being standardized when this book went to press. We
have relied for most of our examples on the User Manual and Report Jensen and
Wirth [1974] but we have also drawn upon the draft standard Addyman [1980]. The
examples from the latter have been explicitly noted as such.
SIMULA Nygaard et al. [1970].
SNOBOL-4 Griswold et al. [1971].
ACM [1961]. ACM compiler symposium 1960. Communications of the ACM, 4(1):3-84.
Addyman, A.M. [1980]. A draft proposal of Pascal. ACM SIGPLAN Notices, 15(4):1-66.
Aho, Alfred V. and Corasick, M. J. [1975]. Efficient string matching: An aid to bibliographic search. Communications of the ACM, 18(6):333-340.
Aho, Alfred V., Hopcroft, J. E., and Ullman, Jeffrey D. [1974]. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, MA.
Aho, Alfred V. and Johnson, Stephen C. [1976]. Optimal code generation for expression trees. Journal of the ACM, 23(3):488-501.
Aho, Alfred V., Johnson, Stephen C., and Ullman, Jeffrey D. [1977]. Code generation for machines with multiregister operations. Journal of the ACM, pages 21-28.
Aho, Alfred V. and Ullman, Jeffrey D. [1972]. The Theory of Parsing, Translation, and Compiling. Prentice-Hall, Englewood Cliffs.
Aho, Alfred V. and Ullman, Jeffrey D. [1977]. Principles of Compiler Design. Addison-Wesley, Reading, MA.
Allen, F. E., Cocke, J., and Kennedy, K. [1981]. Reduction of operator strength. In [Muchnick and Jones, 1981], pages 79-101. Prentice-Hall, Englewood Cliffs.
Ammann, U. [1974]. The method of structured programming applied to the development of a compiler. In Proceedings of the International Computing Symposium 1973, pages 94-99. North-Holland, Amsterdam, NL.
Ammann, U. [1975]. Die Entwicklung eines Pascal-Compilers nach der Methode des Strukturierten Programmierens. Ph.D. thesis, Eidgenössische Technische Hochschule Zürich.
Ammann, U. [1977]. On code generation in a Pascal compiler. Software-Practice and Experience, 7:391-423.
Anderson, T., Eve, J., and Horning, J. J. [1973]. Efficient LR(1) parsers. Acta Informatica, 2:12-39.
ANSI [1966]. FORTRAN. American National Standards Institute, New York. X3.9-1966.
ANSI [1968]. COBOL. American National Standards Institute, New York. X3.23-1968.
ANSI [1978a]. FORTRAN. American National Standards Institute, New York. X3.9-1978.
ANSI [1978b]. Minimal BASIC. American National Standards Institute, New York. X3.9-1978.
Asbrock, B. [1979]. Attribut-Implementierung und -Optimierung für attributierte Grammatiken. Master's thesis, Fakultät für Informatik, Universität Karlsruhe, FRG.
Baker, T. P. [1982]. A one-pass algorithm for overload resolution in Ada. ACM Transactions on Programming Languages and Systems, 4(4):601-614.
Balzer, R. M. [1969]. EXDAMS - extendable debugging and monitoring system. In Spring Joint Computer Conference, volume 34 of AFIPS Conference Proceedings, pages 567-580. AFIPS Press, Montvale, NJ.
Banatre, J. P., Routeau, J. P., and Trilling, L. [1979]. An event-driven compiling technique. Communications of the ACM, 22(1):34-42.
Barron, D. W. and Hartley, D. F. [1963]. Techniques for program error diagnosis on EDSAC2. Computer Journal, 6:44-49.
Cercone, N., Kraus, M., and Boates, J. [1982]. Lexicon design using perfect hash functions. SIGSOC Bulletin, 13(2):69-78.
Chaitin, Gregory J. [1982]. Register allocation & spilling via coloring. ACM SIGPLAN Notices, 17(6):98-105.
Chaitin, Gregory J., Cocke, John, Chandra, A. K., Auslander, Marc A., Hopkins, Martin E., and Markstein, Peter W. [1981]. Register allocation via coloring. Computer Languages, 6:47-57.
Chomsky, N. [1956]. Three models for the description of language. IRE Transactions on Information Theory, IT-2:113-124.
Cichelli, R. J. [1980]. Minimal perfect hash functions made simple. Communications of the ACM, 23(1):17-19.
Clark, D. W. and Green, C. C. [1977]. An empirical study of list structure in LISP. Communications of the ACM, 20(2):78-87.
Cocke, John and Markstein, Peter W. [1980]. Measurement of code improvement algorithms. In Lavington, S. H., editor, Information Processing 80, pages 221-228. North-Holland, Amsterdam, NL.
Cody, William J. and Waite, William M. [1980]. Software Manual for the Elementary Functions. Prentice-Hall, Englewood Cliffs.
Constantine, L. L., Stevens, W. P., and Myers, G. J. [1974]. Structured design. IBM Systems Journal, 2:115-139.
Conway, R. and Wilcox, T. R. [1973]. Design and implementation of a diagnostic compiler for PL/1. Communications of the ACM, 16(3):169-179.
Dakin, R. J. and Poole, Peter Cyril [1973]. A mixed code approach. Computer Journal, 16(3):219-222.
Damerau, F. [1964]. A technique for computer detection and correction of spelling errors. Communications of the ACM, 7(3):171-176.
Davidson, J. W. and Fraser, C. W. [1980]. The design and application of a retargetable peephole optimizer. ACM Transactions on Programming Languages and Systems, 2(2):191-202.
Day, W. H. E. [1970]. Compiler assignment of data items to registers. IBM Systems Journal, 9(4):281-317.
Dencker, Peter [1977]. Ein neues LALR-System. Master's thesis, Fakultät für Informatik, Universität Karlsruhe, FRG.
DeRemer, F. L. [1969]. Practical translators for LR(k) languages. Technical report, MIT, Cambridge, MA. MAC-TR-65.
DeRemer, F. L. [1971]. Simple LR(k) grammars. Communications of the ACM, 14(7):453-460.
DeRemer, F. L. [1974]. Lexical Analysis, pages 109-120. Springer Verlag, Heidelberg, New York.
Holt, R. C., Barnard, David T., Cordy, James R., and Wortman, David B. [1977]. SP/k: a system for teaching computer programming. Communications of the ACM, 20(5):301-309.
Horning, J. J., Lalonde, W. R., and Lee, E. S. [1972]. An LALR(k) parser generator. In Freiman, C. V., editor, Information Processing 71, pages 513-518. North-Holland, Amsterdam, NL.
Housden, R.J.W. [1975]. On string concepts and their implementation. Computer Journal, 18(2):150-156.
Hunt, H. B. I., Szymanski, Thomas G., and Ullman, Jeffrey D. [1975]. On the complexity of LR(k) testing. In Conference Proceedings of the Second ACM Symposium on Principles of Programming Languages, pages 137-148. ACM.
IBM [1968]. IBM System/360 operating system FORTRAN IV (H) compiler program logic manual. Technical Report Y28-6642-3, IBM Corporation.
ICC [1962]. Symbolic Languages in Data Processing. Gordon and Breach, New York.
Ichbiah, J. D. [1980]. Ada Reference Manual, volume 106 of Lecture Notes in Computer Science. Springer Verlag, Heidelberg, New York.
Irons, E. T. [1961]. A syntax-directed compiler for ALGOL 60. Communications of the ACM, 4(1):51-55.
Irons, E. T. [1963a]. An error correcting parse algorithm. Communications of the ACM, 6(11):669-673.
Irons, E. T. [1963b]. Towards more versatile mechanical translators. In Experimental Arithmetic, High Speed Computing and Mathematics, volume 15 of Proceedings of Symposia in Applied Mathematics, pages 41-50. American Mathematical Society, Providence, RI.
Jansohn, Hans-Stephan, Landwehr, Rudolph, and Goos, Gerhard [1982]. Experience with an automatic code generator generator. ACM SIGPLAN Notices, 17(6):56-66.
Jazayeri, M. [1981]. A simpler construction showing the intrinsically exponential complexity of the circularity problem for attribute grammars. Journal of the ACM, 28(4):715-720.
Jazayeri, M., Ogden, W. F., and Rounds, W. C. [1975]. On the complexity of the circularity test for attribute grammars. In Conference Record of the Second Principles of Programming Languages, pages 119-129. ACM.
Jazayeri, M. and Pozefsky, D. P. [1977]. Algorithms for efficient evaluation of multi-pass attribute grammars without a parse tree. Technical Report TP77-001, Department of Computer Science, University of North Carolina, Chapel Hill, NC.
Jazayeri, M. and Walter, K. G. [1975]. Alternating semantic evaluator. In Proceedings of the ACM National Conference, pages 230-234. ACM.
Jensen, Kathleen and Wirth, Niklaus [1974]. Pascal User Manual and Report, volume 18 of Lecture Notes in Computer Science. Springer Verlag, Heidelberg, New York.
Johnson, D. S. [1974]. Worst case behavior of graph coloring algorithms. In Proceedings of the Fifth Southeastern Conference on Combinatorics, Graph Theory and Computing, pages 513-523. Utilitas Mathematica Publishing, Winnipeg, Canada.
Johnson, W. L., Porter, J. H., Ackley, S. I., and Ross, Douglas T. [1968]. Automatic generation of efficient lexical processors using finite state techniques. Communications of the ACM, 11(12):805-813.
Johnston, J. B. [1971]. Contour model of block structured processes. ACM SIGPLAN Notices, 6(2):55-82.
Joliat, M. L. [1973]. On the Reduced Matrix Representation of LR(k) Parser Tables. Ph.D. thesis, University of Toronto.
Joliat, M. L. [1974]. Practical minimization of LR(k) parser tables. In [Rosenfeld, 1974], pages 376-380. North-Holland, Amsterdam, NL.
Jones, C. B. and Lucas, P. [1971]. Proving correctness of implementation techniques. In Engeler, E., editor, Symposium on Semantics of Algorithmic Languages, volume 188 of Lecture Notes in Mathematics, pages 178-211. Springer Verlag, Heidelberg, New York.
Karp, R. M. [1972]. Reducibility among combinatorial problems. In Miller and Thatcher [1972], pages 85-104. Plenum Press, New York.
Kastens, Uwe [1976]. Systematische Analyse semantischer Abhängigkeiten. In Programmiersprachen, number 1 in Informatik Fachberichte, pages 19-32. Springer Verlag, Heidelberg, New York.
Kastens, Uwe [1980]. Ordered attribute grammars. Acta Informatica, 13(3):229-256.
Kastens, Uwe, Zimmermann, Erich, and Hutt, B. [1982]. GAG: A practical compiler generator. Volume 141 of Lecture Notes in Computer Science. Springer Verlag, Heidelberg, New York.
Kennedy, K. [1981]. A survey of data flow analysis techniques. In Muchnick, Steven S. and Jones, Neil D., editors, Program Flow Analysis: Theory and Applications, pages 5-54. Prentice-Hall, Englewood Cliffs.
Kennedy, K. and Ramanathan, J. [1979]. A deterministic attribute grammar evaluator based on dynamic sequencing. ACM Transactions on Programming Languages and Systems, 1:142-160.
Kennedy, K. and Warren, S. K. [1976]. Automatic generation of efficient evaluators for attribute grammars. In Conference Record of the Third Principles of Programming Languages, pages 32-49. ACM.
Klint, P. [1979]. Line numbers made cheap. Communications of the ACM, 22(10):557-559.
Knuth, D. E. [1962]. History of writing compilers. Computers and Automation, 11:8-14.
Knuth, D. E. [1965]. On the translation of languages from left to right. Information and Control, 8(6):607-639.
Knuth, D. E. [1968a]. Fundamental Algorithms, volume 1 of The Art of Computer Programming. Addison-Wesley, Reading, MA.
Knuth, D. E. [1968b]. Semantics of context-free languages. Mathematical Systems Theory, 2(2):127-146. See [Knuth, 1971b].
Nygaard, K., Dahl, O., and Myrhaug, B. [1970]. SIMULA 67 Common Base Language - Publication S-22. Norwegian Computing Center, Oslo.
Pager, D. [1974]. On eliminating unit productions from LR(k) parsers. In Loeckx, J., editor, Automata, Languages and Programming, volume 14 of Lecture Notes in Computer Science, pages 242-254. Springer Verlag, Heidelberg, New York.
Palmer, E. M., Rahimi, M. A., and Robinson, R. W. [1974]. Efficiency of a binary comparison storage technique. Journal of the ACM, 21(3):376-384.
Parnas, D. L. [1972]. On the criteria to be used in decomposing systems into modules. Communications of the ACM, 15(12):1053-1058.
Parnas, D. L. [1976]. On the design and development of program families. IEEE Transactions on Software Engineering, SE-2(1):1-9.
Peck, J. E. L., editor [1971]. ALGOL 68 Implementation. North-Holland, Amsterdam, NL.
Persch, Guido, Winterstein, Georg, Dausmann, Manfred, and Drossopoulou, Sophia [1980]. Overloading in preliminary Ada. ACM SIGPLAN Notices, 15(11):47-56.
Peterson, T. G. [1972]. Syntax Error Detection, Correction and Recovery in Parsers. Ph.D. thesis, Stevens Institute of Technology, Hoboken, NJ.
Pierce, R. H. [1974]. Source language debugging on a small computer. Computer Journal, 17(4):313-317.
Pozefsky, D. P. [1979]. Building Efficient Pass-Oriented Attribute Grammar Evaluators. Ph.D. thesis, University of North Carolina, Chapel Hill, NC.
Quine, W. V. O. [1960]. Word and Object. Wiley, New York.
Räihä, K. [1980]. Bibliography on attribute grammars. ACM SIGPLAN Notices, 15(3):35-44.
Räihä, K. and Saarinen, M. [1977]. An optimization of the alternating semantic evaluator. Information Processing Letters, 6(3):97-100.
Räihä, K., Saarinen, M., Soisalon-Soininen, E., and Tienari, M. [1978]. The compiler writing system HLP (Helsinki Language Processor). Technical Report A-1978-2, Department of Computer Science, University of Helsinki, Finland.
Ramamoorthy, C. V. and Jahanian, P. [1976]. Formalizing the specification of target machines for compiler adaptability enhancement. In Proceedings of the Symposium on Computer Software Engineering, pages 353-366. Polytechnic Institute of New York.
Randell, B. and Russell, L. J. [1964]. ALGOL 60 Implementation. Academic Press, New York.
Richards, M. [1971]. The portability of the BCPL compiler. Software-Practice and Experience, 1:135-146.
Ripken, K. [1977]. Formale Beschreibung von Maschinen, Implementierungen und optimierender Maschinencoderzeugung aus attributierten Programmgraphen. Ph.D. thesis, Technische Universität München.
Rissen, J. P., Heliard, J. C., Ichbiah, J. D., and Cousot, P. [1974]. The system implementation language LIS, reference manual. Technical Report 4549 E/EN, CII Honeywell-Bull, Louveciennes, France.
Robertson, E. L. [1979]. Code generation and storage allocation for machines with span-dependent instructions. ACM Transactions on Programming Languages and Systems, 1(1):71-83.
Röhrich, J. [1978]. Automatic construction of error correcting parsers. Technical Report Interner Bericht 8, Universität Karlsruhe.
Röhrich, J. [1980]. Methods for the automatic construction of error correcting parsers. Acta Informatica, 13(2):115-139.
Rosen, S. [1967]. Programming Systems and Languages. McGraw-Hill.
Rosenfeld, J. L., editor [1974]. Information Processing 74. North-Holland, Amsterdam, NL.
Rosenkrantz, D. J. and Stearns, R. E. [1970]. Properties of deterministic top-down grammars. Information and Control, 17:226-256.
Ross, D. T. [1967]. The AED free storage package. Communications of the ACM, 10(8):481-492.
Rutishauser, H. [1952]. Automatische Rechenplanfertigung bei programmgesteuerten Rechenmaschinen. Mitteilungen aus dem Institut für Angewandte Mathematik der ETH-Zürich, 3.
Sale, Arthur H. J. [1971]. The classification of FORTRAN statements. Computer Journal, 14:10-12.
Sale, Arthur H. J. [1977]. Comments on `report on the programming language Euclid'. ACM SIGPLAN Notices, 12(4):10.
Sale, Arthur H. J. [1979]. A note on scope, one-pass compilers, and Pascal. Pascal News, 15:62-63.
Salomaa, Arto [1973]. Formal Languages. Academic Press, New York.
Samelson, K. and Bauer, Friedrich L. [1960]. Sequential formula translation. Communications of the ACM, 3(2):76-83.
Satterthwaite, E. [1972]. Debugging tools for high level languages. Software-Practice and Experience, 2:197-217.
Scarborough, R. G. and Kolsky, H. G. [1980]. Improved optimization of FORTRAN object programs. IBM Journal of Research and Development, 24(6):660-676.
Schulz, Waldean A. [1976]. Semantic Analysis and Target Language Synthesis in a Translator. Ph.D. thesis, University of Colorado, Boulder, CO.
Seegmüller, G. [1963]. Some remarks on the computer as a source language machine. In Popplewell, C.M., editor, Information processing 1962, pages 524-525. North-Holland, Amsterdam, NL.
Sethi, Ravi and Ullman, Jeffrey D. [1970]. The generation of optimal code for arithmetic expressions. Journal of the ACM, 17(4):715-728.
Steele, G. L. [1977]. Arithmetic shifting considered harmful. ACM SIGPLAN Notices, 12(11):61-69.
Stephens, P. D. [1974]. The IMP language and compiler. Computer Journal, 17:216-223.
Stevenson, D. A. [1981]. Proposed standard for binary floating-point arithmetic. Computer, 14(3):51-62.
Szymanski, T. G. [1978]. Assembling code for machines with span-dependent instructions. Communications of the ACM, 21(4):300-308.
Talmadge, R. B. [1963]. Design of an integrated programming and operating system part II. The assembly program and its language. IBM Systems Journal, 2:162-179.
Tanenbaum, A. S. [1976]. Structured Computer Organization. Prentice-Hall, Englewood Cliffs.
Tanenbaum, A. S. [1978]. Implications of structured programming for machine architecture. Communications of the ACM, 21(3):237-246.
Tanenbaum, Andrew S., van Staveren, H., and Stevenson, J. W. [1982]. Using peephole optimization on intermediate code. ACM Transactions on Programming Languages and Systems, 4(1):21-36.
Tennent, R. D. [1981]. Principles of Programming Languages. Prentice-Hall, Englewood Cliffs.
Tienari, M. [1980]. On the definition of an attribute grammar. In Semantics-Directed Compiler Construction, volume 94 of Lecture Notes in Computer Science, pages 408-414. Springer Verlag, Heidelberg, New York.
Uhl, Jürgen, Drossopoulou, Sophia, Persch, Guido, Goos, Gerhard, Dausmann, Manfred, Winterstein, Georg, and Kirchgässner, Walter [1982]. An Attribute Grammar for the Semantic Analysis of Ada, volume 139 of Lecture Notes in Computer Science. Springer Verlag, Heidelberg, New York.
van Wijngaarden, A., Mailloux, B. J., Lindsey, C. H., Meertens, L. G. L. T., Koster, C. H. A., Sintzoff, M., Peck, J. E. L., and Fisker, R. G. [1975]. Revised report on the algorithmic language ALGOL 68. Acta Informatica, 5:1-236.
Waite, William M. [1973a]. Implementing Software for Non-Numerical Applications. Prentice-Hall, Englewood Cliffs.
Waite, William M. [1973b]. A sampling monitor for applications programs. Software-Practice and Experience, 3(1):75-79.
Waite, William M. [1976]. Semantic analysis. In [Bauer and Eickel, 1976], pages 157-169. Springer Verlag, Heidelberg, New York.
Wegbreit, B. [1972]. A generalised compactifying garbage collector. Computer Journal, 15:204-208.
Wegner, P. [1972]. The Vienna definition language. ACM Computing Surveys, 4(1):5-63.
DAVID A WATT
University of Glasgow, Scotland
and
DERYCK F BROWN
The Robert Gordon University, Scotland
The rights of David A Watt and Deryck F Brown to be identified as authors of this
Work have been asserted by them in accordance with the Copyright, Designs and
Patents Act 1988.
Typeset by 7
Printed and bound in Great Britain by Biddles Ltd, www.biddles.co.uk
Contents
Preface
1 Introduction
1.1 Levels of programming language
1.2 Programming language processors
1.3 Specification of programming languages
1.3.1 Syntax
1.3.2 Contextual constraints
1.3.3 Semantics
1.4 Case study: the programming language Triangle
1.5 Further reading
Exercises
2 Language Processors
2.1 Translators and compilers
2.2 Interpreters
2.3 Real and abstract machines
2.4 Interpretive compilers
2.5 Portable compilers
2.6 Bootstrapping
2.6.1 Bootstrapping a portable compiler
2.6.2 Full bootstrap
2.6.3 Half bootstrap
2.6.4 Bootstrapping to improve efficiency
2.7 Case study: the Triangle language processor
2.8 Further reading
Exercises
3 Compilation
3.1 Phases
3.1.1 Syntactic analysis
3.1.2 Contextual analysis
3.1.3 Code generation
3.2 Passes
4 Syntactic Analysis
4.1 Subphases of syntactic analysis
4.1.1 Tokens
4.2 Grammars revisited
4.2.1 Regular expressions
4.2.2 Extended BNF
4.2.3 Grammar transformations
4.2.4 Starter sets
4.3 Parsing
4.3.1 The bottom-up parsing strategy
4.3.2 The top-down parsing strategy
4.3.3 Recursive-descent parsing
4.3.4 Systematic development of a recursive-descent parser
4.4 Abstract syntax trees
4.4.1 Representation
4.4.2 Construction
4.5 Scanning
4.6 Case study: syntactic analysis in the Triangle compiler
4.6.1 Scanning
4.6.2 Abstract syntax trees
4.6.3 Parsing
4.6.4 Error handling
4.7 Further reading
Exercises
5 Contextual Analysis
5.1 Identification
5.1.1 Monolithic block structure
5.1.2 Flat block structure
5.1.3 Nested block structure
5.1.4 Attributes
5.1.5 Standard environment
5.2 Typechecking
5.3 A contextual analysis algorithm
5.3.1 Decoration
5.3.2 Visitor classes and objects
5.3.3 Contextual analysis as a visitor object
5.4 Case study: contextual analysis in the Triangle compiler
5.4.1 Identification
5.4.2 Type checking
5.4.3 Standard environment
5.5 Further reading
Exercises
6 Run-Time Organization
6.1 Data representation
6.1.1 Primitive types
6.1.2 Records
6.1.3 Disjoint unions
6.1.4 Static arrays
6.1.5 Dynamic arrays
6.1.6 Recursive types
6.2 Expression evaluation
6.3 Static storage allocation
6.4 Stack storage allocation
6.4.1 Accessing local and global variables
6.4.2 Accessing nonlocal variables
6.5 Routines
6.5.1 Routine protocols
6.5.2 Static links
6.5.3 Arguments
6.5.4 Recursion
6.6 Heap storage allocation
6.6.1 Heap management
6.6.2 Explicit storage deallocation
6.6.3 Automatic storage deallocation and garbage collection
6.7 Run-time organization for object-oriented languages
6.8 Case study: the abstract machine TAM
6.9 Further reading
Exercises
7 Code Generation
7.1 Code selection
7.1.1 Code templates
7.1.2 Special-case code templates
7.2 A code generation algorithm
7.2.1 Representation of the object program
7.2.2 Systematic development of a code generator
7.2.3 Control structures
7.3 Constants and variables
7.3.1 Constant and variable declarations
7.3.2 Static storage allocation
7.3.3 Stack storage allocation
8 Interpretation
8.1 Iterative interpretation
8.1.1 Iterative interpretation of machine code
8.1.2 Iterative interpretation of command languages
8.1.3 Iterative interpretation of simple programming languages
8.2 Recursive interpretation
8.3 Case study: the TAM interpreter
8.4 Further reading
Exercises
9 Conclusion
9.1 The programming language life cycle
9.1.1 Design
9.1.2 Specification
9.1.3 Prototypes
9.1.4 Compilers
9.2 Error reporting
9.2.1 Compile-time error reporting
9.2.2 Run-time error reporting
9.3 Efficiency
9.3.1 Compile-time efficiency
9.3.2 Run-time efficiency
9.4 Further reading
Exercises
Projects with the Triangle language processor
Appendices
Answers 7
Answers 8
Answers 9
Bibliography
Preface
and reusability. Secondly, Java itself has experienced a prodigious growth in popularity
since its appearance as recently as 1994, and that for good technical reasons: Java is
simple, consistent, portable, and equipped with an extremely rich class library. Soon we
can expect all computer science students to have at least some familiarity with Java.
Educational software
A Triangle language processor is available for educational use in conjunction with this
textbook. The Triangle language processor consists of: a compiler for Triangle, which
generates code for TAM (Triangle Abstract Machine); an interpreter for TAM; and a
disassembler for TAM. The tools are written entirely in Java, and will run on any
computer equipped with a JVM (Java Virtual Machine). You can download the Triangle
language processor from our Web site:
Readership
This book and its companions are aimed at junior, senior, and graduate students of com-
puter science and information technology, all of whom need some understanding of the
fundamentals of programming languages. The books should also be of interest to profes-
sional software engineers, especially project leaders responsible for language evaluation
and selection, designers and implementors of language processors, and designers of new
languages and extensions to existing languages.
The basic prerequisites for this textbook are courses in programming and data struc-
tures, and a course in programming languages that covers at least basic language con-
cepts and syntax. The reader should be familiar with Java, and preferably at least one
other high-level language, since in studying implementation of programming languages
it is important not to be unduly influenced by the idiosyncrasies of a particular language.
All the algorithms in this textbook are expressed in Java.
The ability to read a programming language specification critically is an essential
skill. A programming language implementor is forced to explore the entire language,
including its darker corners. (The ordinary programmer is wise to avoid these dark
corners!) The reader of this textbook will need a good knowledge of syntax, and ideally
some knowledge of semantics; these topics are briefly reviewed in Chapter 1 for the
benefit of readers who might lack such knowledge. Familiarity with BNF and EBNF
(which are commonly used in language specifications) is essential, because in Chapter 4
we show how to exploit them in syntactic analysis. No knowledge of formal semantics
is assumed.
The reader should be comfortable with some elementary concepts from discrete
mathematics - sets and recursive functions - as these help to sharpen understanding of,
for example, parsing algorithms. Discrete mathematics is essential for a deeper under-
standing of compiler theory; however, only a minimum of compiler theory is presented
in this book.
This book and its companions attempt to cover all the most important aspects of a
large subject. Where necessary, depth has been sacrificed for breadth. Thus the really
serious student will need to follow up with more advanced studies. Each book has an
extensive bibliography, and each chapter closes with pointers to further reading on the
topics covered by the chapter.
Acknowledgments
Most of the methods described in this textbook have long since passed into compiler
folklore, and are almost impossible to attribute to individuals. Instead, we shall mention
people who have particularly influenced us personally.
For providing a stimulating environment in which to think about programming lan-
guage issues, we are grateful to colleagues in the Department of Computing Science at
the University of Glasgow, in particular Malcolm Atkinson, Muffy Calder, Quintin
Cutts, Peter Dickman, Bill Findlay, John Hughes, John Launchbury, Hermano Moura,
John Patterson, Simon Peyton Jones, Fermin Reig, Phil Trinder, and Phil Wadler. We
have also been strongly influenced, in many different ways, by the work of Peter
Buneman, Luca Cardelli, Edsger Dijkstra, Jim Gosling, Susan Graham, Tony Hoare,
Jean Ichbiah, Mehdi Jazayeri, Robin Milner, Peter Mosses, Atsushi Ohori, Bob Tennent,
Jim Welsh, and Niklaus Wirth.
We wish to thank the reviewers for reading and providing valuable comments on an
earlier draft of this book. Numerous cohorts of undergraduate students taking the
Programming Languages 3 module at the University of Glasgow made an involuntary
but essential contribution by class-testing the Triangle language processor, as have three
cohorts of students taking the Compilers module at the Robert Gordon University.
We are particularly grateful to Tony Hoare, editor of the Prentice Hall International
Series in Computer Science, for his encouragement and advice, freely and generously
offered when these books were still at the planning stage. If this book is more than just
another compiler textbook, that is partly due to his suggestion to emphasize the connec-
tions between compilation, interpretation, and semantics.
Glasgow and Aberdeen D.A.W.
July, 1999 D.F.B.
CHAPTER ONE
Introduction
In this introductory chapter we start by reviewing the distinction between low-level and
high-level programming languages. We then see what is meant by a programming lan-
guage processor, and look at examples from different programming systems. We review
the specification of the syntax and semantics of programming languages. Finally, we
look at Triangle, a programming language that will be used as a case study throughout
this book.
Once written, a program could simply be loaded into the machine and run.
Clearly, machine-code programs are extremely difficult to read, write, and edit. The
programmer must keep track of the exact address of each item of data and each instruc-
tion in storage, and must encode every single instruction as a bit string. For small pro-
grams (consisting of thousands of instructions) this task is onerous; for larger programs
the task is practically infeasible.
Programmers soon began to invent symbolic notations to make programs easier to
read, write, and edit. The above instructions might be written, respectively, as follows:
LOAD x
ADD R1 R2
JUMPZ h
where LOAD, ADD, and JUMPZ are symbolic names for operations, R1 and R2 are sym-
bolic names for registers, x is a symbolic name for the address of a particular item of
data, and h is a symbolic name for the address of a particular instruction. Having written
a program like this on paper, the programmer would prepare it to be run by manually
translating each instruction into machine code. This process was called assembling the
program.
The obvious next step was to make the machine itself assemble the program. For this
process to work, it is necessary to standardize the symbolic names for operations and
registers. (However, the programmer should still be free to choose symbolic names for
data and instruction addresses.) Thus the symbolic notation is formalized, and can now
be termed an assembly language.
Even when writing programs in an assembly language, the programmer is still work-
ing in terms of the machine's instruction set. A program consists of a large number of
very primitive instructions. The instructions must be written individually, and put to-
gether in the correct sequence. The algorithm in the mind of the programmer tends to be
swamped by details of registers, jumps, and so on. To take a very simple example, con-
sider computing the area of a triangle with sides a , b, and c, using the formula:
√(s × (s − a) × (s − b) × (s − c))
where s = (a + b + c) / 2
Written in assembly language, the program must be expressed in terms of individual
arithmetic operations, and in terms of the registers that contain intermediate results:
LOAD R1 a; ADD R1 b; ADD R1 c; DIV R1 #2;
LOAD R2 R1;
LOAD R3 R1; SUB R3 a; MULT R2 R3;
LOAD R3 R1; SUB R3 b; MULT R2 R3;
If the program fails to compile, or misbehaves when run, the user reinvokes the
editor to modify the program; then reinvokes the compiler; and so on. Thus program
development is an edit-compile-run cycle.
There is no direct communication between these language processors. If the program
fails to compile, the compiler will generate one or more error reports, each indicating
the position of the error. The user must note these error reports, and on reinvoking the
editor must find the errors and correct them. This is very inconvenient, especially in the
early stages of program development when errors might be numerous.
The essence of the 'software tools' philosophy is to provide a small number of com-
mon and simple tools, which can be used in various combinations to perform a large
variety of tasks. Thus only a single editor need be provided, one that can be used to edit
programs in a variety of languages, and indeed other textual documents too.
What we have described is the 'software tools' philosophy in its purest form. In
practice, the philosophy is compromised in order to make program development easier.
The editor might have a facility that allows the user to compile the program (or indeed
issue any system command) without leaving the editor. Some compilers go further: if
the program fails to compile, the editor is automatically reinvoked and positioned at the
first error.
These are ad hoc solutions. A fresh approach seems preferable: a fully integrated
language processor, designed specifically to support the edit-compile-run cycle.
widely understood. But contextual constraints and semantics are usually specified infor-
mally, because their formal specification is more difficult, and the available notations
are not yet widely understood. A typical language specification, with formal syntax but
otherwise informal, may be found in Appendix B.
1.3.1 Syntax
Syntax is concerned with the form of programs. We can specify the syntax of a pro-
gramming language formally by means of a context-free grammar. This consists of the
following elements:
A finite set of terminal symbols (or just terminals). These are atomic symbols, the
ones we actually enter at the keyboard when composing a program in the language.
Typical examples of terminals in a programming language's grammar are '>=',
'while', and ';'.
A finite set of nonterminal symbols (or just nonterminals). A nonterminal symbol
represents a particular class of phrases in the language. Typical examples of
nonterminals in a programming language's grammar are Program, Command,
Expression, and Declaration.
A start symbol, which is one of the nonterminals. The start symbol represents the
principal class of phrases in the language. Typically the start symbol in a
programming language's grammar is Program.
A finite set of production rules. These define how phrases are composed from termi-
nals and subphrases.
Grammars are usually written in the notation BNF (Backus-Naur Form). In BNF, a
production rule is written in the form N ::= α, where N is a nonterminal symbol, and
where α is a (possibly empty) string of terminal and/or nonterminal symbols. Several
production rules with a common nonterminal N on their left-hand sides, such as N ::= α
and N ::= β, may be grouped into a single rule N ::= α | β. The BNF symbol '::=' is
pronounced 'may consist of', and '|' is pronounced 'or alternatively'.
Expression         ::= primary-Expression
                     | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name             ::= Identifier
Declaration        ::= single-Declaration
                     | Declaration ; single-Declaration
single-Declaration ::= const Identifier ~ Expression
                     | var Identifier : Type-denoter
Type-denoter       ::= Identifier
Operator           ::= + | - | * | / | < | > | = | \
Identifier         ::= Letter | Identifier Letter | Identifier Digit
Integer-Literal    ::= Digit | Integer-Literal Digit
Comment            ::= ! Graphic* eol
Production rule (1.3f) tells us that a single-command may consist of the terminal
symbol 'begin', followed by a command, followed by the terminal symbol 'end'.
Production rule (1.3a) tells us that a single-command may consist of a value-or-
variable-name, followed by the terminal symbol ':=', followed by an expression.
A value-or-variable-name, represented by the nonterminal symbol V-name, is the
name of a declared constant or variable. Production rule (1.6) tells us that a value-or-
variable-name is just an identifier. (More complex value-or-variable-names can be writ-
ten in full Triangle.)
Production rules (1.2a-b) tell us that a command may consist of a single-command
alone, or alternatively it may consist of a command followed by the terminal symbol ' ;'
followed by a single-command. In other words, a command consists of a sequence of
one or more single-commands separated by semicolons.
In production rules (1.11a-c), (1.12a-b), and (1.13):
eol stands for an end-of-line 'character';
Letter stands for one of the lowercase letters 'a', 'b', ..., or 'z';
This AST's root node is labeled WhileCommand, signifying the fact that this is a while-
command. The root node's second child is labeled SequentialCommand, signifying the
fact that the body of the while-command is a sequential-command. Both children of the
SequentialCommand node are labeled AssignCommand.
When we write down the above command, we need the symbols 'begin' and 'end'
to bracket the subcommands 'n := 0' and 'b := false'. These brackets distinguish
the above command from:
while b do n := 0; b := false
whose meaning is quite different. (See Exercise 1.5.) There is no trace of these brackets
in the abstract syntax, nor in the AST of Figure 1.5. They are not needed because the
AST structure itself represents the bracketing of the subcommands.
A program's AST represents its phrase structure explicitly. The AST is a convenient
structure for specifying the program's contextual constraints and semantics. It is also a
convenient representation for language processors such as compilers. For example, con-
sider again the command 'while E do C'. The meaning of this command can
be specified in terms of the meanings of its subphrases E and C. The translation of this
command into object code can be specified in terms of the translations of E and C into
object code. The command is represented by an AST with root node labeled 'While-
Command' and two subtrees representing E and C, so the compiler can easily access
these subphrases.
In Chapter 3 we shall use ASTs extensively to discuss the internal phases of a com-
piler. In Chapter 4 we shall see how a compiler constructs an AST to represent the
source program. In Chapter 5 we shall see how the AST is used to check that the
program satisfies the contextual constraints. In Chapter 7 we shall see how to translate
the program into object code.
[Figure: ASTs of Mini-Triangle phrases, with nodes such as Program, LetCommand, VarDeclaration, AssignCommand, BinaryExpression, VnameExpression, SimpleVname, Identifier, Operator, and Integer-Literal.]
The function call at point (2) also doubles its argument, because the applied occurrence
of m inside the function f always denotes 2, regardless of what m denotes at the point of
call.
In a language with dynamic binding, on the other hand, the applied occurrence of m
would denote the value to which m was most recently bound. In such a language, the
function call at (1) would double its argument, whereas the function call at (2) would
triple its argument.
Every programming language has a universe of discourse, the elements of which we
call values. Usually these values are classified into types. Each operation in the language
has an associated type rule, which tells us the expected operand type(s), and the type of
the operation's result (if any). Any attempt to apply an operation to a wrongly-typed
value is called a type error.
A programming language is statically typed if a language processor can detect all
type errors without actually running the program; the language is dynamically typed if
type errors cannot be detected until run-time.
The fact that a programming language is statically typed implies the following:
Every well-formed expression E has a unique type T, which can be inferred without
actually evaluating E.
Whenever E is evaluated, it will yield a value of type T. (Evaluation of E might fail
due to overflow or some other run-time error, or it might diverge, but its evaluation
will never fail due to a type error.)
In this book we shall generally assume that the source language exhibits static bind-
ing and is statically typed.
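To make the distinction concrete, here is a small illustration in Java, itself a statically typed language; the program and its names are ours, purely for illustration:

    class StaticTypingDemo {
        public static void main(String[] args) {
            int n = 7;
            double d = n / 2.0;    // type of 'n / 2.0' is inferred to be double,
                                   // without evaluating the expression
            // boolean b = n + 1;  // a type error: rejected at compile time,
                                   // so the faulty program is never run
            System.out.println(d);
        }
    }

In a dynamically typed language, by contrast, the faulty assignment would be detected only when (and if) that statement was actually executed.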
1.3.3 Semantics
Semantics is concerned with the meanings of programs, i.e., their behavior when run.
Many notations have been devised for specifying semantics formally, but so far none
has achieved widespread acceptance. Here we show how to specify the semantics of a
programming language informally.
Our first task is to specify, in general terms, what will be the semantics of each class
of phrase in the language. We may specify the semantics of commands, expressions, and
declarations as follows:
A command is executed to update variables. [It may also have the side effect of per-
forming input-output.]
An expression is evaluated to yield a value. [It may also have the side effect of updat-
ing variables.]
Triangle has the usual variety of operators, standard functions, and standard proce-
dures. These behave exactly like ordinary declared functions and procedures; unlike
Pascal, they have no special type rules or parameter mechanisms. In particular, Triangle
operators behave exactly like functions of one or two parameters.
... such as expressions, commands, declarations - rather than individual lines. You probably spend a lot of time on chores such as good layout. Also think of the common syntactic errors that might reasonably be detected immediately.)

1.4 According to the context-free grammar of Mini-Triangle in Example 1.3, which of the following are Mini-Triangle expressions?
(a) true
(b) sin(x)
(c) -n
(d) m 2 = n
(e) m - n * 2
Draw the syntax tree and AST of each one that is an expression.
Similarly, which of the following are Mini-Triangle commands?
(f) n := n + 1
(g) halt
(h) put(m, n)
(i) if n > m then m := n
(j) while n > 0 do n := n-1
Similarly, which of the following are Mini-Triangle declarations?
(k) const pi ~ 3.1416
(l) const y ~ x+1
(m) var b: Boolean
(n) var m, n: Integer
(o) var y: Integer; const dpy ~ 365
1.5 Draw the syntax tree and AST of the Mini-Triangle command:
while b do n := 0; b := false
cited at the end of Example 1.5. Compare with Figures 1.2 and 1.5.
1.6 According to the syntax and semantics of Mini-Triangle in Examples 1.3 and 1.8, what value is written by the following Mini-Triangle program? (The standard procedure putint writes its argument, an integer value.)
let
  const m ~ 2;
  const n ~ m + 1
in
  putint(m + n * 2)
(Note: Do not be misled by your knowledge of any other languages.)
We use the term x86 to refer to the family of processors represented by the Intel 80386
processor and its successors.
An S-into-T translator is itself a program, and can run on machine M only if it is ex-
pressed in machine code M. When the translator runs, it translates a source program P,
expressed in the source language S, to an equivalent object program P, expressed in the
target language T. This is shown in Figure 2.5. (The object program is shaded gray, to
emphasize that it is newly generated, unlike the translator and source program, which
must be given at the start.)
The second stage of the diagram shows the object program being run, also on an x86
machine.
A cross-compiler is a compiler that runs on one machine (the host machine) but gen-
erates code for a dissimilar machine (the target machine). The object program must be
generated on the host machine but downloaded to the target machine to be run. A cross-
compiler is a useful tool if the target machine has too little memory to accommodate the
compiler, or if the target machine is ill-equipped with program development aids. (Com-
pilers tend to be large programs, needing a good programming environment to develop,
and needing ample memory to run.)
We can now translate the interpreter into some machine code, say M, using the C
compiler:
Figure 2.8 An abstract machine is functionally equivalent to a real machine.
character set or different arithmetic. Written with care, however, application programs
expressed in high-level languages should achieve 95-99% portability.
Similar points apply to language processors, which are themselves programs. Indeed,
it is particularly important for language processors to be portable, because they are
especially valuable and widely-used programs. For this reason language processors are
commonly written in high-level languages such as Pascal, C, and Java.
Unfortunately, it is particularly hard to make compilers portable. A compiler's
function is to generate machine code for a particular machine, a function that is
machine-dependent by its very nature. If we have a C-into-x86 compiler expressed in a
high-level language, we should be able to move this compiler quite easily to run on a
dissimilar machine, but it will still generate x86 machine code! To change the compiler
to generate different machine code would require about half the compiler to be
rewritten, implying that the compiler is only about 50% portable.
It might seem that highly portable compilers are unattainable. However, the situation
is not quite so gloomy: a compiler that generates intermediate language is potentially
much more portable than a compiler that generates machine code.
How can we make this work? It seems that we cannot compile Java programs until
we have an implementation of JVM-code, and we cannot use the JVM-code interpreter
until we can compile Java programs! Fortunately, a small amount of work can get us out
of this chicken-and-egg situation.
Suppose that we want to get the system running on machine M, and suppose that we
already have a compiler for a suitable high-level language, such as C, on this machine.
Then we rewrite the interpreter in C:
JVM → M
(This is a substantial job, but only about half as much work as writing a complete Java-
into-M compiler.) Next, we compile this translator using the existing interpretive
compiler:
This gives an Ada-S compiler for machine M. We can test it by using it to compile and
run Ada-S test programs.
But we prefer not to rely permanently on version 1 of the Ada-S compiler, because it
is expressed in C, and therefore is maintainable only as long as a C compiler is
available. Instead, we make version 2 of the Ada-S compiler, expressed in Ada-S itself:
We compile the modified compiler, using the original compiler, to obtain a cross-
compiler:
The compiler translates Triangle source programs into TAM code. TAM (Triangle
Abstract Machine) is an abstract machine, implemented by an interpreter. TAM has
been designed to facilitate the implementation of Triangle - although it would be
equally suitable for implementing Algol, Pascal, and similar languages. Like JVM-code
(Example 2.15), TAM's primitive operations are more similar to the operations of a
high-level language than to the very primitive operations of a typical real machine. As a
consequence, the translation from Triangle into TAM code is straightforward and fast.
The Triangle-into-TAM compiler and the TAM interpreter together constitute an
interpretive compiler, much like the one described in Example 2.15. (See Exercise 2.2.)
The TAM disassembler translates a TAM machine code program into TAL (Triangle
Assembly Language). It is used to inspect the object programs produced by the
Triangle-into-TAM compiler.
[Tombstone diagrams: a Triangle-into-TAM compiler and a TAM-into-TAL disassembler, both expressed in Java.]
Further reading
A number of authors have used tombstone diagrams to represent language processors
and their interactions. The formalism was fully developed, complete with mathematical
underpinnings, by Earley and Sturgis (1970). Their paper also presents an algorithm that
systematically determines all the tombstones that can be generated from a given initial
set of tombstones.
A case study of compiler development by full bootstrap may be found in Wirth
(1971). A case study of compiler development by half bootstrap may be found in Welsh
and Quinn (1972). Finally, a case study of compiler improvement by bootstrapping may
be found in Ammann (1981). Interestingly, all these three case studies are interlinked:
Wirth's Pascal compiler was the starting point for the other two developments.
Bootstrapping has a longer history, the basic idea being described by several authors
in the 1950s. (At that time compiler development itself was still in its infancy!) The first
well-known application of the idea seems to have been a program called eval, which
was a Lisp interpreter expressed in Lisp itself (McCarthy et al. 1965).
Sun Microsystems' Java Development Kit (JDK) consists of a compiler that trans-
lates Java code to JVM code, a JVM interpreter, and a number of other tools. The
compiler (javac) is written in Java itself, having been bootstrapped from an initial
2.3 Assume that you have the following: a machine M; a C compiler that runs on
machine M and generates machine code M; and a Java-into-C translator ex-
pressed in C. Use tombstone diagrams to represent these language processors.
Also show how you would use these language processors to:
(a) compile and run a program P expressed in C;
(b) compile the Java-into-C translator into machine code;
(c) compile and run a program Q expressed in Java.
2.4 Assume that you have the following: a machine M; a C compiler that runs on
machine M and generates machine code M; a TAM interpreter expressed in C;
and a Pascal-into-TAM compiler expressed in C. Use tombstone diagrams to
represent these language processors. Also show how you would use these lan-
guage processors to:
(a) compile the TAM interpreter into machine code;
2.5 The Gnu compiler kit uses a machine-independent register transfer language,
RTL, as an intermediate language. The kit includes translators from several
high-level languages (such as C, C++, Pascal) into RTL, and translators from
RTL into several machine codes (such as Alpha, PPC, SPARC). It also
includes an RTL 'optimizer', i.e., a program that translates RTL into more
efficient RTL. All of these translators are expressed in C.
(a) Show how you would install these translators on a SPARC machine,
given a C compiler for the SPARC.
Now show how you would use these translators to:
(b) compile a program P, expressed in Pascal, into SPARC machine code;
(c) compile the same program, but using the RTL optimizer to generate more
efficient object code;
(d) cross-compile a program Q, expressed in C++, into PPC machine code.
2.6 The Triangle language processor (see Section 2.7) is expressed entirely in Java.
Use tombstone diagrams to show how the compiler, interpreter, and disassem-
bler would be made to run on machine M. Assume that a Java-into-M compiler
is available.
2.7 Draw tombstone diagrams to illustrate the use of a Java JIT (just-in-time)
compiler. Show what happens when a Java program P is compiled and stored
on a host machine H, and subsequently downloaded for execution on the user's
CHAPTER THREE
Compilation
In this chapter we study the internal structure of compilers. A compiler's basic function
is to translate a high-level source program to a low-level object program, but before
doing so it must check that the source program is well-formed. So compilation is
decomposed into three phases: syntactic analysis, contextual analysis, and code gener-
ation. In this chapter we study these phases and their relationships. We also examine
some possible compiler designs, each design being characterized by the number of
passes over the source program or its internal representation, and discuss the issues
underlying the choice of compiler design.
In this chapter we restrict ourselves to a shallow exploration of compilation. We
shall take a more detailed look at syntactic analysis, contextual analysis, and code
generation in Chapters 4, 5, and 7, respectively.
Inside any compiler, the source program is subjected to several transformations before
an object program is finally generated. These transformations are called phases. The
three principal phases of compilation are as follows:
Syntactic analysis: The source program is parsed to check whether it conforms to the
source language's syntax, and to determine its phrase structure.
Contextual analysis: The parsed program is analyzed to check whether it conforms to
the source language's contextual constraints.
Code generation: The checked program is translated to an object program, in accor-
dance with the semantics of the source and target languages.
The three phases of compilation correspond directly to the three parts of the source
language's specification: its syntax, its contextual constraints, and its semantics.¹
¹ Some compilers include a fourth phase, code optimization. Lexical analysis is sometimes
treated as a distinct phase, but in this book we shall treat it as a sub-phase of syntactic analysis.
let
var n: Integer
in ! ill-formed program
while n / 2 do
m := 'n' > 1
Figure 3.5 An ill-formed Triangle source program.
[Figure: the AST of the program of Figure 3.5, annotated with the types inferred during contextual analysis; errors are detected at points (1), where the while-expression 'n / 2' has type int rather than bool, and (2).]
Figure 3.6 Discovering errors during contextual analysis of the Triangle program of Figure 3.5.
(7) It generates the instruction 'CALL add'. (When executed, this instruction will add
the two previously-fetched values.)
(5) By following the link to the declaration of n, it retrieves this variable's address,
namely 0[SB]. Then it generates the instruction 'STORE 0[SB]'. (When exe-
cuted, this instruction will store the previously-computed value in that variable.)
In this way the code generator translates the whole program into object code.
3.2 Passes
In the previous section we examined the principal phases of compilation, and the flow
of data between them. In this section we go on to examine and compare alternative
compiler designs.
In designing a compiler, we wish to decompose it into modules, in such a way that
each module is responsible for a particular phase. In practice there are several ways of
doing so. The design of the compiler affects its modularity, its time and space require-
ments, and the number of passes over the program being compiled.
A pass is a complete traversal of the source program, or a complete traversal of an
internal representation of the source program (such as an AST). A one-pass compiler
makes a single traversal of the source program; a multi-pass compiler makes several
traversals.
In practice, the design of a compiler is inextricably linked to the number of passes it
makes. In this section we contrast multi-pass and one-pass compilation, and summarize
the advantages and disadvantages of each.
A structure diagram summarizes the modules and module dependencies in a system. The
higher-level modules are those near the top of the structure diagram. A connecting line
represents a dependency of a higher-level module on a lower-level module. This dependency
consists of the higher-level module using the services (e.g., types or methods) provided by the
lower-level module.
After parsing the assignment command 'c := '&'', the syntactic analyzer calls
the contextual analyzer to check type compatibility. It then calls the code generator
to generate instruction 'STORE 1[SB]', using the address retrieved at point (3).
After parsing the value-or-variable-name n, the syntactic analyzer infers (by
calling the contextual analyzer) that it is a variable of type int. It then calls the code
generator to retrieve the variable's address, 0[SB].
While parsing the expression n+1, the syntactic analyzer infers (by calling the
contextual analyzer) that the subexpression n is of type int, that the operator '+' is
of type int × int → int, that the subexpression 1 is of type int, and hence that the
whole expression is of type int. It calls the code generator to generate instructions
'LOAD 0[SB]', 'LOADL 1', and 'CALL add'.
A one-pass Triangle compiler would have been perfectly feasible, so the choice of a
three-pass design needs to be justified. The Triangle compiler is intended primarily for
educational purposes, so simplicity and clarity are paramount. Efficiency is a secondary
consideration; in any case, efficiency arguments for a one-pass compiler are inconclu-
sive, as we saw in Section 3.2.3. So the Triangle compiler was designed to be as
modular as possible, allowing the different phases to be studied independently of one another.
Exercises
3.1 In Examples 3.2 and 3.4, the first assignment command 'c := '&'' was
ignored. Describe how this command would have been subjected to contextual
analysis and code generation.
3.2 The Mini-Triangle source program below left would be compiled to the object
program below right:

    let                          PUSH 1
      const m ~ 7;               LOADL 7
      var x: Integer             LOAD 0[SB]
    in                           CALL mult
      x := m * x                 STORE 0[SB]
                                 POP 1
                                 HALT
Describe the compilation in the same manner as Examples 3.1, 3.2, and 3.4.
(You may ignore the generation of the PUSH and POP instructions.)
3.3 Choose a compiler with which you are familiar. Find out and describe its
phases and its pass structure. Draw a data flow diagram (like Figure 3.1) and a
structure diagram (like Figure 3.8 or Figure 3.9).
CHAPTER FOUR
Syntactic Analysis
In Chapter 3 we saw how compilation can be decomposed into three principal phases,
one of which is syntactic analysis. In this chapter we study syntactic analysis, and
further decompose it into scanning, parsing, and abstract syntax tree construction.
Section 4.1 explains this decomposition.
The main function of syntactic analysis is to parse the source program in order to
discover its phrase structure. Thus the main topic of this chapter is parsing, and in
particular the simple but effective method known as recursive-descent parsing. Sec-
tion 4.3 explains how parsing works, and shows how a recursive-descent parser can be
systematically developed from the programming language's grammar. This
development is facilitated by a flexible grammatical notation (EBNF) and by various
techniques for transforming grammars, ideas that are introduced in Section 4.2.
In a multi-pass compiler, the source program's phrase structure must be represented
explicitly in some way. This choice of representation is a major design decision. One
convenient and widely-used representation is the abstract syntax tree. Section 4.4 shows
how to make the parser construct an abstract syntax tree.
In parsing it is convenient to view the source program as a stream of tokens: symbols
such as identifiers, literals, operators, keywords, and punctuation. Since the source
program text actually consists of individual characters, and a token may consist of
several characters, scanning is needed to group the characters into tokens, and to discard
other text such as blank space and comments. Scanning is the topic of Section 4.5.
literal, and '+' is of kind operator. The criterion for classifying tokens is simply this: all
tokens of the same kind can be freely interchanged without affecting the program's
phrase structure. Thus the identifier 'y' could be replaced by 'x' or 'banana', and the
integer-literal '1' by '7' or '100', without affecting the program's phrase structure. On
the other hand, the token 'let' could not be replaced by 'lot' or 'led' or anything
else; 'let' is the only token of its kind.
Each token is completely described by its kind and spelling. Thus a token can be
represented simply by an object with these two fields. The different kinds of token can
be represented by small integers.
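A sketch of such a Token class in Java follows. It is modeled on the class Token of Example 4.2, but the particular kind codes shown here are illustrative assumptions rather than the book's exact set:

    public class Token {
        public byte kind;          // a small-integer code classifying the token
        public String spelling;    // the characters making up the token

        public Token(byte kind, String spelling) {
            this.kind = kind;
            this.spelling = spelling;
        }

        // Representative kind codes; the full set covers every kind of token.
        public static final byte IDENTIFIER = 0, INTLITERAL = 1,
                                 OPERATOR = 2, BECOMES = 3, SEMICOLON = 4;
    }

For example, the token 'y' would be represented by new Token(Token.IDENTIFIER, "y"), and the token ':=' by new Token(Token.BECOMES, ":=").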
Figure 4.2 The program of Figure 4.1 represented by a stream of tokens.
[Figure: the same program represented by a syntax tree, with Declaration and Expression subtrees built over its stream of tokens.]
In summary:
A regular language - a language that does not exhibit self-embedding - can be
generated by an RE.
A language that does exhibit self-embedding cannot be generated by any RE. To
generate such a language, we must write recursive production rules in either BNF or
EBNF.
Identifier ::= a | b | c | d | e
Operator   ::= + | - | * | /
This grammar generates expressions such as:
e
a + b
a - b - c
a + (b * c)
a * (b + c) / d
a - (b - (c - (d - e)))
Because the production rules defining Expression and primary-Expression are
mutually recursive, the grammar can generate self-embedded expressions.
EBNF combines the advantages of both BNF and REs. It is equivalent to BNF in
expressive power. Its use of RE notation makes it more convenient than BNF for
specifying some aspects of syntax.
These production rules are equivalent in the sense that they generate exactly the
same languages. The production rule N ::= X | N Y states that an N-phrase may consist
either of an X-phrase or of an N-phrase followed by a Y-phrase. This is just a roundabout
way of stating that an N-phrase consists of an X-phrase followed by any number of Y-
phrases. The production rule N ::= X (Y)* states the same thing more concisely.
This production rule is a little more complicated than the form shown above, but we can
left-factorize it:
Identifier ::= Letter
             | Identifier (Letter | Digit)
We can easily generalize this to define the starter set of an extended RE. There is
only one case to add:
starters[[N]] = starters[[X]], where N is a nonterminal symbol defined by
production rule N ::= X
In Example 4.4:
starters[[Expression]] = starters[[primary-Expression
                                    (Operator primary-Expression)*]]
                       = starters[[primary-Expression]]
                       = starters[[Identifier]] ∪ starters[[( Expression )]]
                       = starters[[a | b | c | d | e]] ∪ { ( }
                       = { a, b, c, d, e, ( }
4.3 Parsing
In this section we are concerned with analyzing sentences in some grammar. Given an
input string of terminal symbols, our task is to determine whether the input string is a
sentence of the grammar, and if so to discover its phrase structure. The following
definitions capture the essence of this.
With respect to a particular context-free grammar G:
Recognition of an input string is deciding whether or not the input string is a sentence
of G.
Parsing of an input string is recognition of the input string plus determination of its
phrase structure. The phrase structure can be represented by a syntax tree, or other-
wise.
We assume that G is unambiguous, i.e., that every sentence of G has exactly one
syntax tree. The possibility of an input string having several syntax trees is a compli-
cation we prefer to avoid.
Parsing is a task that humans perform extremely well. As we read a document, or
listen to a speaker, we are continuously parsing the sentences to determine their phrase
structure (and then determine their meaning). Parsing is subconscious most of the time,
but occasionally it surfaces in our consciousness: when we notice a grammatical error,
or realize that a sentence is ambiguous. Young children can be taught consciously to
parse simple sentences on paper.
In this section we are interested in parsing algorithms, which we can use in syntactic
analysis. Many parsing algorithms have been developed, but there are only two basic
parsing strategies: bottom-up parsing and top-down parsing. These strategies are
characterized by the order in which the input string's syntax tree is reconstructed. (In
[Diagram: a Noun-tree built over the input terminal 'cat', with 'the' preceding it.]
(Input terminal symbols not yet examined by the parser are shaded gray.)
(2) Now the parser can apply the production rule 'Subject ::= the Noun' (4.2c), com-
bining the input terminal symbol 'the' and the adjacent Noun-tree into a Subject-
tree:
[Diagram: a Subject-tree built over 'the cat'.]
(3) Now the parser moves on to the next input terminal symbol, 'sees'. Here it can
apply the production rule 'Verb ::= sees' (4.5d), forming a Verb-tree:
[Diagram: the Subject-tree over 'the cat', and a new Verb-tree over 'sees'.]
(4) The next input terminal symbol is 'a'. The parser cannot do anything with this
terminal symbol yet, so it moves on to the following input terminal symbol, 'rat'.
Here it can apply the production rule 'Noun ::= rat' (4.4c), forming a Noun-tree:
[Diagram: the Subject-tree and Verb-tree, and a new Noun-tree over 'rat', built over 'the cat sees a rat'.]
(6) The leftmost stub is now the (second) node labeled Noun. If the parser chooses to
apply production rule 'Noun ::= rat' (4.4c), it can connect the input terminal sym-
bol 'rat' to the tree. This step leaves the parser with a stub labeled '.' that matches
the next (and last) input terminal symbol:
[Diagram: the Sentence-tree under construction, with Subject and Object subtrees and a stub for the final '.'.]
Now let us see how to implement the parser. We need a class to contain all of the
parsing methods; let us call it Parser. This class will also contain an instance variable,
currentTerminal, that will range over the terminal symbols of the input string. (For
example, given the input string of Figure 4.5, currentTerminal will first contain
'the', then 'cat', then 'sees', etc., and finally '.'.) The Parser class, containing
currentTerminal, is declared as follows:
public class Parser {
This type style indicates a command or expression not yet refined into Java. We will use this
convention to suppress minor details.
The parser is initiated using the following method:
public void parse() {
    currentTerminal = first input terminal;
    parseSentence();
    check that no terminal follows the sentence
}
This parser does not actually construct a syntax tree. But it does (implicitly) deter-
mine the input string's phrase structure. For example, parseNoun whenever called
finds the beginning and end of a phrase of class Noun, and parseSubject whenever
called finds the beginning and end of a phrase of class Subject. (See Figure 4.5.)
In general, the methods of a recursive-descent parser cooperate as follows:
The variable currentTerminal will successively contain each input terminal. All
parsing methods have access to this variable.
On entry to method parseN, currentTerminal is supposed to contain the first
terminal of an N-phrase. On exit from parseN, currentTerminal is supposed to
contain the input terminal immediately following that N-phrase.
On entry to method accept with argument t, currentTerminal is supposed to
contain the terminal t. On exit from accept, currentTerminal is supposed to
contain the input terminal immediately following t.
If the production rules are mutually recursive, then the parsing methods will also be
mutually recursive. For this reason (and because the parsing strategy is top-down), the
algorithm is called recursive descent.
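The following self-contained sketch shows one way to realize this contract, including the variant acceptIt that appears in the parsing methods below; the representation of terminals as bare strings, and all names used here, are our assumptions for illustration only:

    public class TerminalStream {
        private final String[] input;     // the input terminal symbols
        private int position = 0;         // index of the current terminal
        public String currentTerminal;

        public TerminalStream(String[] input) {
            this.input = input;
            currentTerminal = input.length > 0 ? input[0] : "";
        }

        private void advance() {
            position++;
            currentTerminal = position < input.length ? input[position] : "";
        }

        // Checks that the current terminal is the expected one, then
        // advances past it, satisfying accept's contract above.
        public void accept(String expected) {
            if (currentTerminal.equals(expected))
                advance();
            else
                throw new RuntimeException("syntactic error: expected " + expected);
        }

        // Advances unconditionally; used where the caller has already
        // inspected currentTerminal (e.g., in a switch or loop condition).
        public void acceptIt() {
            advance();
        }
    }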
These transformations are justified because they will make the grammar more
suitable for parsing purposes. After making similar transformations to other parts of the
grammar, we obtain the following complete EBNF grammar of Mini-Triangle:
Program            ::= single-Command                                (4.6)
Command            ::= single-Command (; single-Command)*            (4.7)
single-Command     ::= Identifier (:= Expression | ( Expression ))   (4.8)
                     | if Expression then single-Command
                          else single-Command
                     | while Expression do single-Command
                     | let Declaration in single-Command
                     | begin Command end
Expression         ::= primary-Expression                            (4.9)
                          (Operator primary-Expression)*
primary-Expression ::= Integer-Literal                               (4.10)
                     | Identifier
                     | Operator primary-Expression
                     | ( Expression )
Declaration        ::= single-Declaration (; single-Declaration)*    (4.11)
single-Declaration ::= const Identifier ~ Expression                 (4.12)
                     | var Identifier : Type-denoter
Type-denoter       ::= Identifier                                    (4.13)
We have excluded production rules (1.10) through (1.13), which specify the syntax
of operators, identifiers, literals, and comments, all in terms of individual characters.
This part of the syntax is called the language's lexicon (or microsyntax). The lexicon is
of no concern to the parser, which will view each identifier, literal, and operator as a
single token. Instead, the lexicon will later be used to develop the scanner, in Section
4.5.
We shall assume that the scanner returns tokens of class Token, defined in Exam-
ple 4.2. Each token consists of a kind and a spelling. The parser will examine only the
kind of each token.
Step (2) is to convert each EBNF production rule to a parsing method. The parsing
methods will be as follows:
private void parseProgram();
private void parseCommand();
private void parseSingleCommand();
private void parseExpression();
private void parsePrimaryExpression();
private void parseDeclaration();
private void parseSingleDeclaration();
        parseSingleCommand();              single-Command
    }                                      )*
}
This method illustrates something new. The EBNF notation '(; single-Command)*'
signifies a sequence of zero or more occurrences of '; single-Command'. To parse this
we use a while-loop, which is iterated zero or more times. The condition for continuing
the iteration is simply that the current token is a semicolon.
Method parseDeclaration is similar to parseCommand. The remaining
methods are as follows:
private void parseProgram() {                Program ::=
    parseSingleCommand();                      single-Command
}

private void parseSingleCommand() {          single-Command ::=
    switch (currentToken.kind) {
    case Token.IDENTIFIER:
        {
            parseIdentifier();                 Identifier
            switch (currentToken.kind) {
            case Token.BECOMES:
                {
                    acceptIt();                :=
                    parseExpression();         Expression
                }
                break;
            case Token.LPAREN:
                {
                    acceptIt();                (
                    parseExpression();         Expression
                    accept(Token.RPAREN);      )
                }
                break;
            default:
                report a syntactic error
            }
        }
        break;
        parseOperator();                       Operator
        parsePrimaryExpression();              primary-Expression
    }                                          )*
}
...
        acceptIt();                            (
        parseExpression();                     Expression
        accept(Token.RPAREN);                  )
        break;
    default:
        report a syntactic error
    }
}
private void parseTypeDenoter() {              Type-denoter ::=
    parseIdentifier();                           Identifier
}
The nonterminal symbol Identifier corresponds to a single token, so the method
parseIdentifier is similar to accept:
private void parseIdentifier() {
    if (currentToken.kind == Token.IDENTIFIER)
        currentToken = scanner.scan();
    else
        report a syntactic error
}
Having worked through a complete example, let us now study in general terms how
we systematically develop a recursive-descent parser from a suitable grammar. The two
main steps are: (1) express the grammar in EBNF, performing any necessary transform-
ations; and (2) convert the EBNF production rules to parsing methods. It will be con-
venient to examine these steps in reverse order.
The reasoning behind this is simple. The input must consist of an X-phrase followed
by a Y-phrase. Since the parser works from left to right, it must parse the X-phrase and
{
    acceptIt();
    parseSingleCommand();
}
In this situation we know already that the current token is a semicolon, so
'acceptIt();' is a correct alternative to 'accept(Token.SEMICOLON);'.
This eliminates the problem, assuming that starters[[Declaration ;]] is disjoint from
starters[[Command]].
The above examples are quite typical. Although the LL(1) condition is quite restric-
tive, in practice most programming language grammars can be transformed to make
them LL(1) and thus suitable for recursive-descent parsing.
In general, a grammar that exhibits left recursion cannot be LL(1). Any attempt to
convert left-recursive production rules directly into parsing methods would result in an
incorrect parser. It is easy to see why. Given the left-recursive production rule:
N ::= X | N Y
we find:
starters[[N Y]] = starters[[N]] = starters[[X]] ∪ starters[[N Y]]
so starters[[X]] and starters[[N Y]] cannot be disjoint.
4.4.1 Representation
The following example illustrates how we can define ASTs in Java.
[Figure: the forms of Mini-Triangle ASTs, e.g., a Program node (1.14) with a Command subtree, and AssignCommand (1.15a) and CallCommand (1.15b) nodes with V, E, and Identifier subtrees; terminal nodes carry a spelling.]
A node with tag 'ConstDeclaration' is the root of a Declaration AST with two
subtrees: an Identifier AST and an Expression AST.
A node with tag 'Identifier' is the root of an Identifier AST. This is just a terminal
node, whose only content is its spelling.
We need to define Java classes that capture the structure of Mini-Triangle ASTs. We
begin by introducing an abstract class, called AST, for all abstract syntax trees:
public abstract class AST {
Program has only a single form, consisting simply of a Command, so the class
Program simply contains an instance variable for the command that is the body of the
program.
For each nonterminal in the Mini-Triangle abstract syntax that has several forms
(such as Command), we introduce an abstract class (such as Command), and several
concrete subclasses.
Command ASTs:
public abstract class Command extends AST { ... }
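For instance, a concrete subclass for sequential commands might be sketched as follows, consistent with the constructor call new SequentialCommand(...) used in parseCommand below; the exact field names are an assumption:

    public class SequentialCommand extends Command {
        public Command C1, C2;    // the two subcommands, in order

        public SequentialCommand(Command c1AST, Command c2AST) {
            C1 = c1AST;
            C2 = c2AST;
        }
    }

Each concrete subclass thus holds one instance variable per subtree, initialized by its constructor.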
This method is fairly typical. It has been enhanced with a local variable, declAST, in
which the AST of the single-declaration will be stored. The method eventually returns
this AST as its result. Local variables iAST, eAST, and tAST are introduced where
required to contain the ASTs of the single-declaration's subphrases.
Here is the enhanced method parseCommand:
private Command parseCommand() {
    Command c1AST = parseSingleCommand();
    while (currentToken.kind == Token.SEMICOLON) {
        acceptIt();
        Command c2AST = parseSingleCommand();
        c1AST = new SequentialCommand(c1AST, c2AST);
    }
    return c1AST;
}
This method contains a loop, arising from the iteration '*' in production rule (4.7),
which in turn was introduced by eliminating the left recursion in (1.2a-b). We must be
careful to construct an AST with the correct structure. The local variable c1AST is used
to accumulate this AST.
Suppose that the command being parsed is 't := x; x := y; y := t'. Then after
the method parses 't := x', it sets c1AST to the AST for 't := x'; after it parses 'x :=
y', it updates c1AST to the AST for 't := x; x := y'; and after it parses 'y := t', it
updates c1AST to the AST for 't := x; x := y; y := t'.
Here is an outline of the enhanced method parseSingleCommand:
private Command parseSingleCommand() {
    Command comAST;
    switch (currentToken.kind) {
    case Token.IDENTIFIER: {
            Identifier iAST = parseIdentifier();
            switch (currentToken.kind) {
            case Token.BECOMES: {
                    acceptIt();
                    Expression eAST = parseExpression();
                    comAST = new AssignCommand(iAST, eAST);
                }
                break;
            case Token.LPAREN: {
                    acceptIt();
                    Expression eAST = parseExpression();
                    accept(Token.RPAREN);
                    comAST = new CallCommand(iAST, eAST);
                }
                break;
            default:
                report a syntactic error
            }
        }
        break;
    case Token.IF:
        ...
    case Token.WHILE:
        ...
    case Token.LET: {
            acceptIt();
            Declaration dAST = parseDeclaration();
            accept(Token.IN);
            Command cAST = parseSingleCommand();
            comAST = new LetCommand(dAST, cAST);
        }
        break;
    case Token.BEGIN: {
            acceptIt();
            comAST = parseCommand();
            accept(Token.END);
        }
        break;
    default:
        report a syntactic error
    }
    return comAST;
}
If the single-command turns out to be of the form 'begin C end', there is no need to
construct a new AST, since the 'begin' and 'end' are just command brackets. So in
this case the method immediately stores C's AST in comAST.
The method parseIdentifier constructs an AST terminal node:
private Identifier parseIdentifier() {
    Identifier idAST;
    if (currentToken.kind == Token.IDENTIFIER) {
        idAST = new Identifier(currentToken.spelling);
        currentToken = scanner.scan();
    } else
        report a syntactic error
    return idAST;
}
The methods parseIntegerLiteral and parseOperator do likewise.
The lexical grammar of Triangle expressed in EBNF may be found in Section B.8.
Before developing the scanner, the lexical grammar was modified in two respects:
The production rule for Token was modified to add end-of-text as a distinct token.
Keywords were grouped with identifiers. (See Exercise 4.18 for an explanation.)
Most nonterminals were eliminated by substitution. The result was a lexical grammar
containing only individual characters, nonterminals that represent individual characters
(i.e., Letter, Digit, Graphic, and Blank), and the nonterminals Token and Separator:
Token ::= Letter (Letter | Digit)* | Digit Digit* |                  (4.24)
          Op-character Op-character* | ' Graphic ' |
          . | , | ; | : | := | ~ | ( | ) | [ | ] | { | } |
          end-of-text
Separator ::= ! Graphic* end-of-line | Blank                         (4.25)
The Triangle scanner was then developed from this lexical grammar, following the
procedure described in Section 4.5.
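As a small illustration of that procedure, the following self-contained Java sketch scans just the first two alternatives of rule (4.24), identifiers and integer-literals; all names here are ours, not the Triangle compiler's:

    public class MiniScanner {
        private final String source;
        private int pos = 0;

        public MiniScanner(String source) { this.source = source; }

        private char currentChar() {
            return pos < source.length() ? source.charAt(pos) : '\u0000';
        }

        // Scans one token and returns its spelling: an identifier is a
        // maximal Letter (Letter | Digit)*, an integer-literal a maximal
        // Digit Digit*; anything else is taken as a one-character token.
        public String scanToken() {
            int start = pos;
            if (Character.isLetter(currentChar())) {
                do { pos++; } while (Character.isLetterOrDigit(currentChar()));
            } else if (Character.isDigit(currentChar())) {
                do { pos++; } while (Character.isDigit(currentChar()));
            } else if (pos < source.length()) {
                pos++;
            }
            return source.substring(start, pos);
        }
    }

The real scanner additionally records each token's kind, handles the remaining alternatives of (4.24), and skips Separators.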
4.6.3 Parsing
The Parser class contains a recursive-descent parser, as described in Section 4.3. The
parser calls the scan method of the Scanner class to scan the source program.
Exercises
Section 4.1
4.1 Perform syntactic analysis of the Mini-Triangle program:
begin while true do putint(1); putint(0) end
along the lines of Figures 4.1 through 4.4.
4.2 Modify the class Token (Example 4.2) so that the instance variable
spelling is left empty unless the token is an identifier, literal, or operator.
4.10* The following EBNF grammar generates a subset of the UNIX shell command
language:
Script   ::= Command*
Command  ::= Filename Argument* eol
           | Variable = Argument eol
           | if Filename Argument* then eol
               Command*
             else eol
               Command*
             fi eol
           | for Variable in Argument* eol
             do eol
               Command*
             od eol
Argument ::= Filename | Literal | Variable
The start symbol is Script. The token eol corresponds to an end-of-line.
Construct a recursive-descent parser for this language. Treat filenames, literals,
and variables as single tokens.
4.11* Consider the rules for converting EBNF production rules to parsing methods
(Section 4.3.4).
(a) Suggest an alternative refinement rule for 'parse X | Y', using an if-
statement rather than a switch-statement.
(b) In some variants of EBNF, [X] is used as an abbreviation for X | ε.
Suggest a refinement rule for 'parse [X]'.
(c) In some variants of EBNF, X+ is used as an abbreviation for X X*.
Suggest a refinement rule for 'parse X+'.
In each case, state any condition that must be satisfied for the refinement rule
to be correct.
another AST node. A terminal node contains a tag and a spelling. The tag dis-
tinguishes between an identifier, a literal, and an operator.
(a) Reimplement the class AST for Mini-Triangle.
(b) Provide this class with a method display, as specified in Exercise 4.15.
Section 4.5
4.17 The Mini-Triangle scanner (Example 4.21) stores the spellings of separators,
including comments, only to discard them later. Modify the scanner to avoid
this inefficiency.
4.18* Suppose that the Mini-Triangle lexical grammar (Example 4.21) were modified
as follows, in an attempt to distinguish between identifiers and keywords (such
as 'if', 'then', 'else', etc.):
Token ::= Identifier | Integer-Literal | Operator |
          if | then | else | ... |
          ; | : | := | ~ | ( | ) | eot
Identifier ::= Letter (Letter | Digit)*
Point out a serious problem with this lexical grammar. (Remember that the ter-
minal symbols are individual characters.) Can you see any way to remedy this
problem?
4.19 (a) Modify the Mini-Triangle lexical grammar (Example 4.21) as follows.
Allow identifiers to contain single embedded underscores, e.g., 'set_up'
(but not 'set__up', nor 'set_', nor '_up'). Allow real-literals, with a
decimal point surrounded by digits, e.g., '3.1416' (but not '4.', nor
'.125').
General
4.20* Consider a hypothetical programming language, Newspeak, with an English-
like syntax (expressed in EBNF) as follows:
Program        ::= Command .
Command        ::= single-Command single-Command*
single-Command ::= do nothing
                 | store Expression in Variable
                 | if Condition : single-Command
                     otherwise : single-Command
                 | do Expression times : single-Command
4.21** Design and implement a complete syntactic analyzer for your favorite pro-
gramming language.
4.22"" A cross-referencer is a language processor that lists each identifier that occurs
in the source program, together with the line numbers where that identifier oc-
curs. Starting with either the Mini-Triangle syntactic analyzer or the syntactic
analyzer you implemented in Exercise 4.21:
(a) Modify the scanner so that every token contains a field for the line number
where it occurs.
(b) Develop a simple cross-referencer, reusing appropriate parts of your syn-
tactic analyzer.
(c) Now make your cross-referencer distinguish between defining and applied
occurrences of each identifier.
A programming language must also specify the appropriate scope rule for the
standard environment. Most programming languages consider the standard environment
to be a scope enclosing the whole program, so that the source program may contain a
declaration of an identifier present in the standard environment without causing a scope
error. Some other programming languages (such as C) introduce the standard environ-
ment at the same scope level as the global declarations of the source program.
If the standard environment is to be at a scope enclosing the whole program, the
declarations of the standard environment should be entered at scope level 0 in the
identification table.
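In outline, the initialization might look like this; the method names (enter, openScope) and the declaration objects named here are illustrative assumptions about the identification table's interface:

    // Scope level 0: the standard environment.
    idTable.enter("Integer", stdIntegerDecl);
    idTable.enter("Boolean", stdBooleanDecl);
    idTable.enter("putint", stdPutintDecl);
    // ... and so on for the remaining standard identifiers.

    // Scope level 1: ready for the global declarations of the source program.
    idTable.openScope();

A source-program declaration of, say, Integer then merely hides the level-0 entry, instead of provoking a 'declared twice' error.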
(We show the inferred type T by annotating the AST node with ': T'.)
If I is declared in a variable declaration, whose right side is type T, then the type of
the applied occurrence of I is T:
[Diagrams: a SimpleVname whose Identifier is linked to a VarDeclaration with right side of type T is annotated ': T'.]
[Diagram: a BinaryExpression 'E1 < E2' whose two subexpressions are annotated ': int' is itself annotated ': bool'.]
The operator '<' is of type int × int → bool. Having checked that the type of E1 is
equivalent to int, and that the type of E2 is equivalent to int, the type checker infers that
the type of 'E1 < E2' is bool. Other operators would be handled similarly.
Of course, Mini-Triangle type checking is exceptionally simple: the representation
of types is trivial, and testing for type equivalence is also trivial. Type checking is more
complicated if the source language has composite types. For example, Triangle array
and record types have component types, which are unrestricted. Thus we need to
represent types by trees.
Furthermore, there are two possible definitions of type equivalence.
Some programming languages (such as Triangle) adopt structural equivalence,
whereby two types are equivalent if and only if their structures are the same. If types are
represented by trees, structural equivalence can be tested by comparing the structures of
these trees. If the implementation language is Java, then this kind of equality is conven-
tionally tested by an equals method in the Type class.
Other programming languages (such as Pascal and Ada) adopt name equivalence.
Every occurrence of a type constructor (e.g., array or record) creates a new and
distinct type. In this case type equivalence can be tested simply by comparing pointers
to the objects representing the types: distinct objects (created at different times)
represent types that are not equivalent, even if they happen to be structurally similar. If
the implementation language is Java, then this kind of equality is tested by the '=='
operator applied to objects of class Type.
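The contrast between the two tests can be seen in a small self-contained Java sketch; the type classes here are ours, for illustration only:

    abstract class Type { }

    class IntType extends Type {
        public boolean equals(Object other) {
            return other instanceof IntType;
        }
    }

    class ArrayType extends Type {
        final int size;
        final Type element;
        ArrayType(int size, Type element) {
            this.size = size;
            this.element = element;
        }
        // Structural equivalence: same constructor, same components.
        public boolean equals(Object other) {
            return other instanceof ArrayType
                && size == ((ArrayType) other).size
                && element.equals(((ArrayType) other).element);
        }
    }

Given two textually separate occurrences of the same type constructor:

    Type t1 = new ArrayType(8, new IntType());
    Type t2 = new ArrayType(8, new IntType());

t1.equals(t2) yields true (structural equivalence), whereas t1 == t2 yields false (name equivalence: the two objects are distinct).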
The work of the contextual analyzer will be done by a set of visitor methods. There
will be exactly one visitor method, visitA, for each concrete AST subclass A. The
visitor methods will cooperate to traverse the AST in the desired order.
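The shape of this design can be sketched as follows; the names are illustrative, not the Triangle compiler's exact interfaces:

    interface Visitor {
        // One method per concrete AST subclass:
        Object visitAssignCommand(AssignCommand ast, Object arg);
        // ... and so on for every other concrete subclass.
    }

    abstract class AST {
        public abstract Object visit(Visitor v, Object arg);
    }

    class AssignCommand extends AST {
        public Object visit(Visitor v, Object arg) {
            return v.visitAssignCommand(this, arg);    // call back the visitor
        }
    }

The contextual analyzer implements Visitor, so a traversal of the whole AST is driven by these mutually recursive calls.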
In Triangle type equivalence is structural. Of the types shown in Figure 5.5, only (5)
and (6) are equivalent to each other. To test whether two types are equivalent, the type
checker just compares their ASTs structurally. This test is performed by defining an
equals method in each subclass of TypeDenoter. Class TypeDenoter itself is
enhanced as follows:
public abstract class TypeDenoter extends AST {
public abstract boolean equals(Object other);
Type identifiers in the AST would complicate the type equivalence test. To remove
this complication, the visitor/checking methods for type-denoters are made to eliminate
all type identifiers. This is achieved by replacing each type identifier by the type it
denotes.
Figure 5.6 shows the ASTs representing the following Triangle declarations:
type Word ~ array 8 of Char;
var w1: Word;
var w2: array 8 of Char
Initially the type subtrees (1) and (2) in the two variable declarations are different. After
these subtrees have been checked, however, the type identifiers 'Char' and 'Word'
have been eliminated. The resulting subtrees (3) and (4) are structurally similar. The
elimination of type identifiers makes it clear that the types of variables w1 and w2 are
equivalent.
A consequence of this transformation is to make each type 'subtree' (and hence the
whole AST) into a directed acyclic graph. Fortunately, this causes no serious complic-
ation in the Triangle compiler. (But recursive types - as found in Pascal, Ada, and ML -
would cause a complication: see Exercise 5.9.)
The Triangle type checker infers and checks the types of expressions and value-or-
variable-names in much the same way as in Example 5.8. Types are tested for structural
equivalence by using the equals method of the TypeDenoter class. (Instead,
comparing types by means of '==' would implement name equivalence.)
Before analyzing a source program, the contextual analyzer initializes the identifi-
cation table with entries for the standard identifiers, at scope level 0, as shown in Figure
5.8. The attribute stored in each of these entries is a pointer to the appropriate 'declar-
ation'. Thus standard identifiers are treated in exactly the same way as identifiers dec-
lared in the source program.
[Figure content: small ASTs numbered (1)-(8), including a TypeDeclaration for Boolean, ConstDeclarations with EmptyExpressions, a FuncDeclaration for eof, ProcDeclarations for get and put with char parameters, (7) an 'operator declaration' for the unary operator '\', and (8) a BinaryOpDeclaration for the binary operator '<'.]
Figure 5.7 Small ASTs representing the Triangle standard environment (abridged).
Figure 5.8 Identification table for the Triangle standard environment (abridged).
The Triangle standard environment also includes a collection of unary and binary
operators. It is convenient to treat operators in much the same way as identifiers, as
shown in Figures 5.7 and 5.8.¹
The representation of the Triangle standard environment therefore includes small
ASTs representing 'operator declarations', such as one for the unary operator ' \ ' (7),
and one for the binary operator '<' (8). (See Figure 5.7.) An 'operator declaration'
merely defines the types of the operator's operand(s) and result. Entries are also made
for operators in the identification table. (See Figure 5.8.) At an application of operator
0, the identification table is used to retrieve the 'operator declaration' of 0,and thus to
find the operand and result types for type checking.
Further reading
For a more detailed discussion of declarations, scope, and block structure, see Chapter 4
of the companion textbook by Watt (1990). Section 2.5 of the same textbook discusses
simple type systems (of the kind found in Triangle, Pascal, and indeed most program-
ming languages). Chapter 7 goes on to explore more advanced type systems. Coercions
(found in most languages) are implicit conversions from one type to another. Overload-
ing (found in Ada and Java) allows several functions/procedures/methods with different
bodies and different types to have a common identifier, even in the same scope. In a
function/procedure/method call with this common identifier, a technique called overload
resolution is needed to identify which of several functions/procedures/methods is being
called. Parametric polymorphism (found in ML) allows a single function to operate

¹ Indeed, some programming languages, such as ML and Ada, actually allow operators to be
declared like functions in the source program. This emphasizes the analogy between operators
and function identifiers.
uniformly on arguments of a family of types (e.g., the list types). Moreover, the types of
functions, parameters, etc., need not be declared explicitly. Polymorphic type inference
is a technique that allows the types in a source program to be inferred in the context of a
polymorphic type system.
For a comprehensive account of type checking, see Chapter 6 of Aho et al. (1985).
As well as elementary techniques, the authors discuss techniques required by the more
advanced type systems: type checking of coercions, overload resolution, and polymor-
phic type inference. For some reason, however, Aho et al. defer discussion of identifi-
cation to Chapter 7 (run-time organization).
A classic paper on polymorphic type inference by Milner (1978) was the genesis of
the type system that was adopted by ML, and borrowed by later functional languages.
For a good short account of contextual analysis in a one-pass compiler for a Pascal
subset, see Chapter 2 of Welsh and McKeag (1980). The authors clearly explain ways of
representing the identification table, attributes, and types. They also present a simple
error recovery technique that enables the contextual analyzer to generate sensible error
reports when an identifier is declared twice in the same scope, or not declared at all.
The visitor pattern used to structure the Triangle compiler is not the only possible
object-oriented design. One alternative design, explained in Appel (1997), is to associate
the checking methods (and the encoding methods in the code generator) for a particular
AST object with the AST object itself. This design is initially easier to understand than
the visitor design pattern, but it has the disadvantage that the checking methods (and
encoding methods) are spread all over the AST subclass definitions instead of being
grouped together in one place.
You should be aware of a lack of standard terminology in the area of contextual
analysis. Identification tables are often called 'symbol tables' or 'declaration tables'.
Contextual analysis itself is often misnamed 'semantic analysis'.
Exercises
Section 5.1
5.1 Consider a source language with monolithic block structure, as in Section
5.1.1, and consider the following ways of implementing the identification table:
(a) an ordered list;
(b) a binary search tree;
(c) a hash table.
In each case implement the IdentificationTable class, including the
methods enter and retrieve.
In efficiency terms, how do these implementations compare with one another?
5.2 Consider a source language with flat block structure, as in Section 5.1.2.
Devise an efficient way of implementing the identification table. Implement the
IdentificationTable class, including the methods enter, retrieve,
openScope, and closeScope.
5.3* For a source language with nested block structure, as in Section 5.1.3, we could
implement the identification table by a stack of binary search trees (BSTs).
Each BST would contain entries for declarations at one scope level. Consider
the innermost block of Figure 5.3, for example. At the stack top there would be
a BST containing the level-3 entries; below that there would be a BST
containing the level-2 entries; and at the stack bottom there would be a BST
containing the level-1 entries.

Implement the IdentificationTable class, including the methods enter,
retrieve, openScope, and closeScope.

In efficiency terms, how does this implementation compare with that used in
the Triangle compiler (Section 5.4.1)?
5.4* For a source language with nested block structure, we can alternatively imple-
ment the identification table by a sparse matrix, with columns indexed by scope
levels and rows indexed by identifiers. Each column links the entries at a par-
ticular scope level. Each row links the entries for a particular identifier, in order
from innermost scope to outermost scope. In the innermost block of Figure 5.3,
for example, the table would look like Figure 5.9.

Implement the IdentificationTable class, including the methods enter,
retrieve, openScope, and closeScope.

In efficiency terms, how does this implementation compare with that used in
the Triangle compiler (Section 5.4.1), and with a stack of binary search trees
(Exercise 5.3)?
5.5* Outline an identification algorithm that does not use an identification table, but
instead searches the AST. For simplicity, assume monolithic block structure.
In efficiency terms, how does this algorithm compare with one based on an
identification table?
CHAPTER SIX
Run-Time Organization
Programming languages provide high-level data types such as truth values, integers,
characters, records, and arrays, together with operations over these types. Target
machines provide only machine 'types' such as bits, bytes, words, and double-words,
together with low-level arithmetic and logical operations. To bridge the semantic gap
between the source language and the target machine, the implementor must decide how
to represent the source language's types and operations in terms of the target machine's
types and operations.
In the following subsections we shall survey representations of various types. As we
study these representations, we should bear in mind the following fundamental
principles of data representation:

Nonconfusion: Different values of a given type should have different representations.

Uniqueness: Each value should always have the same representation.
The nonconfusion requirement should be self-evident. If two different values are
confused, i.e., have the same representation, then comparison of these values will
incorrectly treat the values as equal.
Nevertheless, confusion does arise in practice. A well-known example is the approx-
imate representation of real numbers: real numbers that are slightly different mathemat-
ically might have the same approximate representation. This confusion is inevitable,
however, given the design of our digital computers. So language designers must formul-
ate the semantics of real-number operations with care; and programmers on their part
must learn to live with the problem, by avoiding naive comparisons of real numbers.
On the other hand, confusion can and must be avoided in the representations of
discrete types, such as truth values, characters, and integers.
If the source language is statically typed, the nonconfusion requirement refers only
to values of the same type; values of distinct types need not have distinct represent-
ations. Thus the binary word 00...00 may represent the truth value false, the integer 0,
the real number 0.0, and so on. Compile-time type checks will ensure that values of
different types cannot be used interchangeably at run-time, and therefore cannot be
confused. Thus we can be sure that if 00...00 turns up as an operand of a logical
operation, it represents false, whereas if it turns up as an operand of an arithmetic
operation, it represents the integer 0.
The uniqueness requirement is likewise self-evident. Comparison of values would be
complicated by the possibility of any value having more than one representation. Cor-
rect comparison is possible, however, so uniqueness is desirable rather than essential.
Figure 6.1 (a) Direct representation of a value x; (b) indirect representation of a value x;
(c) indirect representation of a value y, of the same type as x but requiring
more space.
Indirect representation is essential for types whose values vary greatly in size. For
example, a list or dynamic array may have any number of elements, and clearly the total
amount of space depends on the number of elements. For types such as this, indirect
representation is the only way to satisfy the constant-size requirement. This is illustrated
in Figure 6.1(b) and (c) where, although the values x and y occupy different amounts of
space, the handles to x and y occupy the same amount of space.
We now survey representations of the more common types found in programming
languages. We shall assume direct representation wherever possible, i.e., for primitive
types, records, disjoint unions, and static arrays. But we shall see that indirect represen-
tation is necessary for dynamic arrays and recursive types.
We shall use the following notation:

#T stands for the cardinality of type T, i.e., the number of distinct values of type T.
For example, #[[Boolean]] = 2.

size T stands for the amount of space (in bits, bytes, or words) occupied by each value
of type T. If indirect representation is used, only the handle is counted.

We use emphatic brackets [[...]] to enclose a specific type-denoter, as in #[[Boolean]]
or size[[Boolean]] or size[[array 8 of Char]].
If a direct representation is chosen for values of type T, we can assert the inequality:

    size T >= log2 (#T), or equivalently 2^(size T) >= #T    (6.1)

where size T is expressed in bits. This follows from the nonconfusion requirement: in n
bits we can represent at most 2^n distinct values if we are to avoid confusion.
The values of the type Char are the elements of a character set. Sometimes the
source language specifies a particular character set. For example, Ada specifies the ISO-
Latin1 character set, which consists of 2^8 distinct characters, and Java specifies the
Unicode character set, which consists of 2^16 distinct characters. Most programming
languages, however, are deliberately unspecific about the character set. This allows the
compiler writer to choose the target machine's 'native' character set. Typically this
consists of 2^7 or 2^8 distinct characters. In any case, the choice of character set
determines the representation of individual characters. For example, ISO defines the
representation of character 'A' to be 01000001 (in binary). We can represent a character
by one byte or one word.
The values of the type Integer are integer numbers. Obviously we cannot repre-
sent an unbounded range of integers within a fixed amount of space. All major program-
ming languages take account of this in their semantics: Integer denotes an
implementation-defined bounded range of integers. The binary representation of
integers is determined by the target machine's arithmetic unit, and almost always
occupies one word. The source language's integer operations can then, for the most part,
be implemented by the corresponding machine operations.

In Pascal and Triangle, Integer denotes the range -maxint, ..., -1, 0, +1, ...,
+maxint, where the constant maxint is implementation-defined. In this case we have
#[[Integer]] = 2 x maxint + 1, and therefore we can specialize (6.1) as follows:

    2^(size[[Integer]]) >= 2 x maxint + 1    (6.2)

If the word size is w bits, then size[[Integer]] = w. To ensure that (6.2) is satisfied, the
implementation should define maxint = 2^(w-1) - 1.
In Java, int denotes the range -2^31, ..., -1, 0, +1, ..., +2^31 - 1. In this case we have
#[[int]] = 2^32.
Some programming languages allow the programmer to define new primitive types.
An example is the enumeration type of Pascal. The values of such a type are called
enumerands. Enumerands can be represented by small integers. Consider an
enumeration type T with enumerands I1, ..., In. We can represent each Ii by (the binary
equivalent of) i. Since #T = n, size T >= log2 n bits.
The enumeration type is equipped with operations such as succ (which returns the
successor of the given enumerand) and ord (which returns the integer representation of
the given enumerand). The representation chosen allows succ to be implemented by
the target machine's INC operation (if available). The ord operation becomes a NOP.
We must distinguish between the identifiers and the enumerands they denote, because the
identifiers could be redeclared.
6.1.2 Records
Now we proceed to examine the representation of composite types. These are types
whose values are composed from simpler values.
A record consists of several fields, each of which has an identifier. A record type
designates the identifiers and types of its fields, and all the records of a particular type
have fields with the same identifiers and types. The fundamental operation on records is
field selection, whereby we use one of the field identifiers to access the corresponding
field.
Records occur obviously in Pascal, Ada, and Triangle, and as s t r u c t s in C.
There is an obvious and good direct representation for records: we simply juxtapose
the fields, i.e., make them occupy consecutive positions in storage. This representation
is compact, and makes it easy to implement field selection very efficiently.
Assume for simplicity that each primitive value occupies one word. Then the
variables today and her (after initialization) would look like this:

    today:  [ y ][ m ][ d ]
    her:    [ female ][ dob.y ][ dob.m ][ dob.d ][ status ]
Each box in the diagram is a word. A variable of type Date (such as today) occupies
three consecutive words, one for each of its fields. A variable of type Details (such
as her) occupies five consecutive words: one for its field female, three for its field
dob, and one for its field status.
We can predict not only the total size of each record variable, but also the position of
each field relative to the base of the record. If today is located at address 100 (i.e., it
occupies the words at addresses 100 through 102), then today.y is located at address
100, today.m is located at address 101, and today.d is located at address 102. In
other words, the fields y, m, and d have offsets of 0, 1, and 2, respectively, within any
record of type Date. Likewise, the fields female, dob, and status have offsets of
0, 1, and 4, respectively, within any record of type Details.
Summarizing:

    size[[Date]] = 3 words
    address[[today.y]] = address[[today]] + 0
    address[[today.m]] = address[[today]] + 1
    address[[today.d]] = address[[today]] + 2

    size[[Details]] = 5 words
    address[[her.female]] = address[[her]] + 0
    address[[her.dob]] = address[[her]] + 1
    address[[her.dob.y]] = address[[her.dob]] + 0 = address[[her]] + 1
    address[[her.dob.m]] = address[[her.dob]] + 1 = address[[her]] + 2
    address[[her.dob.d]] = address[[her.dob]] + 2 = address[[her]] + 3
    address[[her.status]] = address[[her]] + 4
□
We shall use the notation address v to stand for the address of variable v. If the
variable occupies several words, this means the address of the first word. We use
emphatic brackets [[...]] to enclose a specific variable-name, as in address[[her.dob]].
Let us now generalize from this example. Consider a record type T and variable r:

    type T = record I1: T1, ..., In: Tn end;    (6.3)
    var r: T

We represent each record of type T by juxtaposing its n fields, as shown in Figure 6.2. It
is clear that:

    size T = size T1 + ... + size Tn    (6.4)

This satisfies the constant-size requirement. If size T1, ..., and size Tn are all constant,
then size T is also constant.
The implementation of field selection is simple and efficient. To access field Ii of the
record r, we use the following address computation:

    address[[r.Ii]] = address r + (size T1 + ... + size Ti-1)    (6.5)

Since size T1, ..., and size Ti-1 are all constant, the address of the field r.Ii is just a
constant offset from the base address of r itself. Thus, if the compiler knows the address
of the record, it can determine the exact address of any field, and can generate code to
access the field directly. In these circumstances, field selection is a zero-cost operation!
However, note that some machines have alignment restrictions, which may force
unused space to be left between record fields. Such alignment restrictions invalidate
equations (6.4) and (6.5). (See Exercise 6.9.)
[Figure 6.2 not reproduced: a record of type T, represented by juxtaposing a value of
type T1, a value of type T2, ..., and a value of type Tn.]
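To make equations (6.4) and (6.5) concrete, here is a small illustrative sketch. (It is not
part of the Triangle compiler; the class RecordLayout and all its names are invented for
this example.)

    // A record type viewed simply as a list of field identifiers and
    // field sizes (in words), all known at compile-time.
    class RecordLayout {
      final String[] fieldIds;    // I1, ..., In
      final int[] fieldSizes;     // size T1, ..., size Tn

      RecordLayout (String[] ids, int[] sizes) {
        fieldIds = ids; fieldSizes = sizes;
      }

      // Equation (6.4): size T = size T1 + ... + size Tn.
      int size () {
        int total = 0;
        for (int s : fieldSizes) total += s;
        return total;
      }

      // Equation (6.5): the offset of field Ii is
      // size T1 + ... + size Ti-1.
      int offsetOf (String id) {
        int offset = 0;
        for (int i = 0; i < fieldIds.length; i++) {
          if (fieldIds[i].equals(id)) return offset;
          offset += fieldSizes[i];
        }
        throw new IllegalArgumentException("no field " + id);
      }
    }

For the Details record above, offsetOf("status") would yield 1 + 3 = 4, and size()
would yield 5 words, in agreement with the offsets and size computed earlier.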
Some values of type Number occupy two words; others occupy three words. This
apparently contradicts the constant-size requirement, a contradiction we wish to avoid at
all costs. We want the compiler to allocate a fixed amount of space to each variable of
type Number, and let it change form within this space. To be safe we must allocate three
words: one word for the tag field, and two words for the variant part. The fields i and r
can be overlaid within the latter two words. When the tag is true, one word is unused
(shaded gray in the diagram), but this is a small price to pay for satisfying the constant-
size requirement. Thus:

    size[[Number]] = 3 words
Now consider the following variant record type, which illustrates an empty variant
and a variant with more than one field:

    type Shape  = (point, circle, box);
         Figure = record
                    case s: Shape of
                      point:  ();
                      circle: (r: Integer);
                      box:    (h, w: Integer)
                  end;
    var fig: Figure
Every value of type Figure has a tag field, named s, and a variant part. The value of
the tag (point, circle, or box) determines the form of the variant part. If the tag is point,
the variant part is empty. If the tag is circle, the variant part is an integer field named r.
If the tag is box, the variant part is a pair of integer fields named h and w.
Assume that each primitive value occupies one word. Then the variable fig would
look like this: [diagram not reproduced]

(The enumerands point, circle, and box would be represented by small integers, as
discussed in Section 6.1.1.)
It is easy to see that:

    size[[Figure]] = 3 words
    address[[fig.s]] = address[[fig]] + 0
    address[[fig.r]] = address[[fig]] + 1
    address[[fig.h]] = address[[fig]] + 1
    address[[fig.w]] = address[[fig]] + 2
Let us now generalize. Consider a Pascal variant record type T and variable u:

    type T = record
               case Itag: Ttag of
                 v1: (I1: T1);
                 ...
                 vn: (In: Tn)
             end;
    var u: T

where each variant is labeled by one possible value of the type Ttag = {v1, ..., vn}. We
represent each record of type T by juxtaposing its tag field and variant part. Within the
variant part we overlay the different variants, which are of types T1, T2, ..., and Tn. This
representation is shown in Figure 6.3. It is clear that:

    size T = size Ttag + max (size T1, ..., size Tn)    (6.8)

This satisfies the constant-size requirement. If size Ttag, size T1, ..., and size Tn are all
constant, then size T is also constant.
The operations on variant records are easily implemented. To access the tag and
variant fields of the variant record u, we use the following address computations:

    address[[u.Itag]] = address u + 0    (6.9)
    address[[u.Ii]] = address u + size Ttag    (6.10)

- both being constant offsets from the base address of u.
This analysis can easily be generalized to variants with no fields or many fields, as in
Example 6.5.
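A sketch of equation (6.8) in the same illustrative style (again an invented helper, not
code from the book):

    // Equation (6.8): a variant record needs space for the tag plus
    // space for its largest variant, since the variants are overlaid.
    static int variantRecordSize (int tagSize, int[] variantSizes) {
      int largest = 0;
      for (int s : variantSizes)
        if (s > largest) largest = s;
      return tagSize + largest;
    }

For Figure, variantRecordSize(1, new int[] {0, 1, 2}) yields 3 words, as
computed above.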
But what is the significance of this number 102? It is just address[[grade[0]]]. We
call this address the origin of the array grade. An array's origin coincides with its base
address only if its lower bound is zero.

Similarly, address[[gnp[i]]] = address[[gnp[0]]] + i, where the origin of the array
gnp is address[[gnp[0]]] = address[[gnp]] - 2000. Of course, this particular array has
no element with index 0, but that does not prevent us from using its origin (which is just
a number!) to compute the addresses of its elements at run-time.
□
Let us now generalize. Consider a Pascal array type T and array variable a:

    type T = array [l..u] of Telem;    (6.14)
    var a: T

The constants l and u are the lower and upper index bounds, respectively, of the array
type. Each array of type T has (u - l + 1) elements, indexed from l through u. As before,
we represent each array by juxtaposing its elements, as shown in Figure 6.5. It is clear
that:

    size T = (u - l + 1) x size Telem    (6.15)

Again, this satisfies the constant-size requirement, since l and u are constant.

The element of array a with index i is addressed as follows:

    address[[a[i]]] = address a + (i - l) x size Telem
                    = address a - (l x size Telem) + (i x size Telem)

From this we can determine the origin address[[a[0]]], and use it to simplify the
formula:

    address[[a[0]]] = address a - (l x size Telem)    (6.16)
    address[[a[i]]] = address[[a[0]]] + (i x size Telem)    (6.17)

Equation (6.17) has the same form as (6.13). The only difference is that a[0] no longer
need be the first element of the array a. Indeed, a[0] might not even exist! But that
does not matter, as we saw in Example 6.7, because address[[a[0]]] is just a number.
There is more to array indexing than an address computation. An index check is also
needed, to ensure that the evaluated index lies within the array's index bounds. When an
array of the type T of (6.14) is indexed by i, the index check must ensure that:

    l <= i <= u

Since the index bounds l and u are known at compile-time, the compiler can easily
generate such an index check.
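As an illustration of equations (6.16) and (6.17) together with the index check, here is a
sketch with invented names (it mimics, in ordinary Java, what the generated object code
would do):

    // Address of a[i] for 'type T = array [l..u] of Telem',
    // where 'base' is the address of the array variable a.
    static int elementAddress (int base, int l, int u,
                               int elemSize, int i) {
      if (i < l || i > u)                      // the index check
        throw new IndexOutOfBoundsException("index " + i);
      int origin = base - l * elemSize;        // equation (6.16): a[0]
      return origin + i * elemSize;            // equation (6.17)
    }

When the compiler knows the array's address, the origin is a compile-time constant, so
only the check and the multiplication survive into the object code.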
The values of type String are arrays of characters, indexed by integers. Different
arrays of type String may have different index bounds; moreover, these index bounds
may be evaluated at run-time. Operations such as concatenation and lexicographic
comparison are applicable to any arrays of type String, even if they have different
numbers of elements. But any attempt to assign one array of type String to another
will fail at run-time unless they happen to have the same number of elements.
A suitable representation for arrays of type String is as follows. Each array's
handle contains the array's origin, i.e., the address of the (possibly notional) element
with index 0. The handle also contains the array's lower and upper index bounds. The
array's elements are stored separately.
Suppose that the variables k, m, and n turn out to have values 7, 0, and 4, respec-
tively. Then the array d will have index bounds 1 and 7, and the array s will have index
bounds 0 and 3. The arrays will look like this:
[Diagram not reproduced: each array's handle, containing the origin, lower bound, and
upper bound, points to the separately stored elements.]
Each array's handle occupies exactly 3 words (assuming that integers and addresses
occupy one word each). The elements of d occupy 7 words, whereas the elements of s
occupy 4 words (assuming that characters occupy one word each). Since the elements
are stored separately, we take size[[String]] to be the size of the handle:

    size[[String]] = 3 words
Likewise, we shall take address[[d]] to be the address of d's handle. The address of
element d(0) is stored at offset 0 within the handle. Thus the address of an arbitrary
element can be computed as follows:

    address[[d(i)]] = origin of d + (i x size[[Char]])

where the origin of d is fetched from offset 0 of d's handle.
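The run-time counterpart of this computation could look like the following sketch
(invented names; the handle layout is the 3-word one described above):

    // Index a dynamic array through its handle: word 0 holds the
    // origin, words 1 and 2 hold the lower and upper bounds.
    static int dynElementAddress (int[] store, int handleAddr,
                                  int elemSize, int i) {
      int origin = store[handleAddr];
      int lower  = store[handleAddr + 1];
      int upper  = store[handleAddr + 2];
      if (i < lower || i > upper)              // run-time index check
        throw new IndexOutOfBoundsException("index " + i);
      return origin + i * elemSize;
    }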
Let us now generalize. Consider an Ada array type T and array variable a:

    type T is array (Integer range <>) of Telem;    (6.19)
The object code for expression evaluation in registers is efficient but rather compli-
cated. A compiler generating such code must assign a specific register to each
intermediate result. It is important to do this well, but quite tricky. In particular, a
problem arises when there are not enough registers for all the intermediate results. (See
Exercise 6.11.)
A very different kind of machine is one that provides a stack for holding
intermediate results. This allows us to evaluate expressions in a very natural way. Such
a machine typically provides instructions like those listed in Table 6.2.
In Figure 6.7 and throughout this book, the stack is shown growing downwards, with the stack
top nearest the bottom of the diagram. If this convention seems perverse, recall the convention
for drawing trees in computer science textbooks! Shading indicates the unused space beyond
the stack top.
These desirable and simple properties of evaluation on the stack hold true regardless
of how complicated the expression is. An expression involving function calls can be
evaluated in just the same way. Likewise, an expression involving operands of different
types (and therefore different sizes) can be evaluated in just the same way.
And so on.
The collection of registers LB, L1, L2, ..., and SB is often called the display. The
display allows access to local, nonlocal, and global variables. The display changes
whenever a routine is called or returns.
The critical property of the display is that the compiler can always determine which
register to use to access any variable. A global variable is always addressed relative to
SB. A local variable is always addressed relative to LB. A nonlocal variable is addressed
relative to one of the registers L1, L2, .... The appropriate register is determined entirely
by the nesting levels of the routines in the source program.
We assign routine levels as follows: the main program is at routine level 0; the body
of each routine declared at level 0 is at routine level 1; the body of each routine declared
at level 1 is at routine level 2; and so on.
Let v be a variable declared at routine level l, and let v's address displacement be d.
Then the current value of v is fetched by various parts of the code as follows:

    If l = 0 (i.e., v is a global variable):
        LOAD d[SB]    - for any code to fetch the value of v
    If l = cl, the routine level of the fetching code (i.e., v is a local variable):
        LOAD d[LB]
    If 0 < l < cl (i.e., v is a nonlocal variable):
        LOAD d[Ln], where n = cl - l
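In other words, the register is a pure compile-time function of the two routine levels. A
sketch (an invented helper returning the register's name, not code from this book):

    // Which display register addresses a variable declared at routine
    // level l, from code at routine level cl?
    static String displayRegister (int l, int cl) {
      if (l == 0)       return "SB";            // global variable
      else if (l == cl) return "LB";            // local variable
      else              return "L" + (cl - l);  // nonlocal: L1, L2, ...
    }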
Routines
A routine (or subroutine) is the machine-code equivalent of a procedure or function in a
high-level language. Control is transferred to a routine by means of a call instruction (or
instruction sequence). Control is transferred back to the caller by means of a return
instruction in the routine.
When a routine is called, some arguments may be passed to it. An argument could
be, for example, a value or an address. There may be zero, one, or many arguments. A
routine may also return a result - that is, if it corresponds to a function in the high-level
language.
We have already studied one aspect of routines, namely allocation of storage for
local variables. In this section we study other important aspects:
protocols for passing arguments to routines and returning their results
how static links are determined
[Diagram not reproduced: seven snapshots of the stack, taken just before and after calls
to and returns from the routines involved.]
6.26* Using the class definitions from Exercise 6.25, consider the following hypo-
thetical class definition:

    class TeachingAssistant extends Staff, Student
A special-case code template is worth having if phrases of the special form occur fre-
quently, and if they allow translation into particularly efficient object code. The follow-
ing example illustrates another common special case.
Code template (7.12a) specifies that the code 'elaborate [[const n ~ 7]]' will deposit
the value 7 in a suitable cell (at the current stack top). Whenever n is used, code
template (7.10) specifies that the value will be loaded from that cell. The following
translation illustrates these code templates:
    execute [[let const n ~ 7;
                  var i: Integer
              in  i := n*n]] =

        LOADL 7        elaborate [[const n ~ 7]]
        PUSH 1         elaborate [[var i: Integer]]
        LOAD n         \
        LOAD n          | execute [[i := n*n]]
        CALL mult       |
        STORE i        /
        POP(0) 2

The first instruction 'LOADL 7' makes space for the constant n on the stack top.
Instructions of the form 'LOAD n' fetch the constant's value, wherever required. The
final instruction 'POP(0) 2' pops the constant and variable off the stack.
A much better translation is possible: simply use the literal value 7 wherever n is
fetched. This special treatment is possible whenever an identifier is bound to a known
value in a constant declaration. This is expressed by the following special-case code
templates:

    fetch [[I]] =                                          (7.16)
        LOADL v    where v = value bound to I (if known)

    elaborate [[const I ~ IL]] =                           (7.17)
        (i.e., no code)

In (7.17) no code is required to elaborate the constant declaration. It is sufficient that the
value of the integer-literal IL is bound to I for future reference. In (7.16) that value is
incorporated into a LOADL instruction. Thus the object code is more efficient in both
places. The following alternative translation illustrates these special-case code
templates:

    execute [[let const n ~ 7;
                  var i: Integer
              in  i := n*n]] =

        PUSH 1         elaborate [[var i: Integer]]
        LOADL 7        \
        LOADL 7         | execute [[i := n*n]]
        CALL mult       |
        STORE i        /
        POP(0) 1
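In the code generator, this special case shows up wherever an applied identifier is
fetched: if its entity description is a KnownValue, a LOADL instruction can be emitted
directly. A sketch (the field name value is an assumption, and the emit operands follow
the pattern used elsewhere in this chapter):

    // Inside a fetch routine of the encoder: template (7.16).
    if (entity instanceof KnownValue) {
      short v = ((KnownValue) entity).value;    // assumed field name
      emit(Instruction.LOADLop, 0, 0, v);       // LOADL v
    }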
Many of these visitor methods will simply be encoding methods. For example, the
visitor/encoding methods for commands will be visitAssignCommand, visit-
CallCommand, etc., and their implementations will be determined by the code
templates for 'execute [[V := E]]', 'execute [[I(E)]]', etc.
Table 7.2 Summary of visitor/encoding methods for the Mini-Triangle code generator.

    Phrase class    Visitor/encoding method    Behavior of visitor/encoding method
    Program         visitProgram               Generate code as specified by 'run P'.
    Command         visit...Command            Generate code as specified by 'execute C'.
    Expression      visit...Expression         Generate code as specified by 'evaluate E'.
    V-name          visit...Vname              Return an entity description for the given
                                               value-or-variable-name (explained in Section 7.3).
    Declaration     visit...Declaration        Generate code as specified by 'elaborate D'.
    Type-denoter    visit...TypeDenoter        Return the size of the given type.
for control structures. Thereafter, Sections 7.3 and 7.4 deal with the problems of
generating code for declared constants and variables, procedures, functions, and
parameters.
      short g = nextInstrAddr;                  // g:
      com.C.visit(this, arg);                   //   execute C
      short h = nextInstrAddr;                  // h:
      patch(j, h);
      com.E.visit(this, arg);                   //   evaluate E
      emit(Instruction.JUMPIFop, 1,
           Instruction.CBr, g);                 //   JUMPIF(1) g
      return null;
    }

    public Object visitIfCommand               // execute [[if E then C1
        (IfCommand com, Object arg) {          //            else C2]] =
      com.E.visit(this, arg);                  //   evaluate E
      short i = nextInstrAddr;                 // i:
      emit(Instruction.JUMPIFop, 0,
           Instruction.CBr, 0);                //   JUMPIF(0) g
      com.C1.visit(this, arg);                 //   execute C1
      short j = nextInstrAddr;                 // j:
      emit(Instruction.JUMPop, 0,
           Instruction.CBr, 0);                //   JUMP h
      short g = nextInstrAddr;                 // g:
      patch(i, g);
      com.C2.visit(this, arg);                 //   execute C2
      short h = nextInstrAddr;                 // h:
      patch(j, h);
      return null;
    }
Here we have used the following auxiliary method for patching instructions:

    private void patch (short addr, short d) {
      // Store d in the operand field of the instruction at address addr.
      code[addr].d = d;
    }
[Figure not reproduced: a decorated AST in which one declaration's entity description
records a known address and another's records an unknown value.]

Figure 7.2 Entity descriptions for a known address and an unknown value.
Consider now a source language with procedures and local variables. As explained in
Section 6.4, stack storage allocation is appropriate for such a language. The code
generator cannot predict a local variable's absolute address, but it can predict the
variable's address displacement relative to the base of a frame - a frame belonging to
the procedure within which the variable was declared. At run-time, a display register
will point to the base of that frame, and the variable can be addressed relative to that
register. The appropriate register is determined entirely by a pair of routine levels
known to the code generator: the routine level of the variable's declaration, and the
routine level of the code that is addressing the variable. (See Section 6.4.2 for details.)
To make the code generator implement stack storage allocation, we must modify the
form of addresses in entity descriptions. The address of a variable will now be held as a
pair (l, d), where l is the routine level of the variable's declaration, and d is the
variable's address displacement relative to its frame base. As in Section 6.4.2, we assign
a routine level of 0 to the main program, a routine level of 1 to the body of each
procedure or function declared at level 0, a routine level of 2 to the body of each
procedure or function declared at level 1, and so on.
The following methods show how the entity descriptions are now set up:

    public Object visitConstDeclaration
        (ConstDeclaration decl, Object arg) {
      Frame frame = (Frame) arg;
      if (decl.E instanceof IntegerExpression) {
        IntegerLiteral IL =
            ((IntegerExpression) decl.E).IL;
        decl.entity = new KnownValue
            (1, valuation(IL.spelling));
        return new Short((short) 0);
      } else {
        short s =
            shortValueOf(decl.E.visit(this, frame));
        decl.entity = new UnknownValue
            (s, frame.level, frame.size);
        return new Short(s);
      }
    }
    public Object visitVarDeclaration
        (VarDeclaration decl, Object arg) {
      Frame frame = (Frame) arg;
      short s = shortValueOf(decl.T.visit(this, null));
      emit(Instruction.PUSHop, 0, 0, s);
      decl.entity = new KnownAddress
          (1, frame.level, frame.size);
      return new Short(s);
    }
When the appropriate visitor/encoding method is called to translate a procedure body,
the frame level must be incremented by one and the frame size set to 3, leaving just
enough space for the link data:

    Frame outerFrame = ... ;
    Frame localFrame = new Frame(outerFrame.level + 1, 3);
Finally, method encode starts off with a frame at level 0 and with no storage
allocated:

    public void encode (Program prog) {
      Frame globalFrame = new Frame(0, 0);
      prog.visit(this, globalFrame);
    }
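For reference, the Frame argument threaded through these methods can be pictured as
the following minimal class (a sketch consistent with the uses of frame.level and
frame.size above; the exact declaration is not shown in this excerpt):

    public class Frame {
      public byte level;    // routine level of the frame
      public short size;    // number of words allocated in the frame so far

      public Frame (int level, int size) {
        this.level = (byte) level;
        this.size = (short) size;
      }
    }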
    elaborate [[proc I () ~ C]] =               (7.23)
            JUMP g
        e:  execute C
            RETURN(0) 0
        g:
The generated routine body consists simply of the object code 'execute C' followed by a
RETURN instruction. The two zeros in the RETURN instruction indicate that the routine
has no result and no arguments. Since we do not want the routine body to be executed at
the point where the procedure is declared, only where the procedure is called, we must
generate a jump round the routine body. The routine's entry address, e, must be bound
to I for future reference.
The code template specifying translation of a procedure call would be:

    execute [[I()]] =                           (7.24)
        CALL(SB) e    where e = entry address of routine bound to I

This is straightforward. The net effect of executing this CALL instruction will be simply
to execute the body of the routine bound to I.
□
Example 7.18 Object code for Mini-Triangle plus global procedures

The following extended Mini-Triangle program illustrates a procedure declaration and
call:

    let
        var n: Integer;
        proc P () ~
            n := n * 2
    in
        begin
            n := 9;
            P()
        end
The corresponding object program illustrates code templates (7.23) and (7.24):

    elaborate [[var n: Integer]]        0: PUSH 1
    elaborate [[proc P () ~             1: JUMP 7
        n := n*2]]                      2: LOAD 0[SB]   \
                                        3: LOADL 2       | execute [[n := n*2]]
                                        4: CALL mult     |
                                        5: STORE 0[SB]  /
                                        6: RETURN(0) 0
    execute [[begin n := 9;             7: LOADL 9
        P() end]]                       8: STORE 0[SB]
                                        9: CALL(SB) 2
                                       10: POP(0) 1
                                       11: HALT

The corresponding decorated AST and entity descriptions are shown in Figure 7.4.
□
A function is translated in much the same way as a procedure. The only essential
difference is in the code that returns the function result.
[Figure 7.4 not reproduced: the decorated AST of the program, in which the entity
description of P records a known routine address with entry address 2.]
7.4.3 Parameters
Now let us consider how the code generator implements parameter passing. Every
source language has one or more parameter mechanisms, the means by which
arguments are associated with the corresponding formal parameters.
As explained in Section 6.5.1, a routine protocol is needed to ensure that the calling
code deposits the arguments in a place where the called routine expects to find them. If
the operating system does not impose a routine protocol, the language implementor must
design one, taking account of the source language's parameter mechanisms and the
target machine architecture.
¹ In principle, nul in this example could be treated as bound to a known value. However, the
code generator would have to be enhanced to evaluate the expression 'chr(0)' itself, using a
technique called constant folding.
now nearly all programs - even operating systems - are written in high-level languages.
So it makes more sense for the machine to support the code generator by, for example,
providing a simple regular instruction set. A lucid discussion of the interaction between
code generation and machine design may be found in Wirth (1986).
Almost all real machines have general-purpose and/or special-purpose registers;
some have a stack as well. The number of registers is usually small and always limited.
It is quite hard to generate object code that makes effective use of registers. Code
generation for register machines is therefore beyond the scope of this introductory
textbook. For a thorough treatment, see Chapter 9 of Aho et al. (1985).
The code generator described in this chapter works in the context of a multi-pass
compiler: it traverses an AST that represents the entire source program. In the context of
a one-pass compiler, the code generator would be structured rather differently: it would
be a collection of methods, which can be called by the syntactic analyzer to generate
code 'on the fly' as the source program is parsed. For a clear account of how to organize
code generation in a one-pass compiler, see Welsh and McKeag (1980).
The sheer diversity of machine architectures is a problem for implementors. A
common practice among software vendors is to construct a family of compilers, trans-
lating a single source language to several different target machine languages. These
compilers will have a common syntactic analyzer and contextual analyzer, but a distinct
code generator will be needed for each target machine. Unfortunately, a code generator
suitable for one target machine might be difficult or impossible to adapt to a dissimilar
target machine. Code generation by pattern matching is an attractive way to reduce the
amount of work to be done. In this method the semantics of each machine instruction is
expressed in terms of low-level operations. Each source-program command is translated
to a combination of these low-level operations; code generation then consists of finding
an instruction sequence that corresponds to the same combination of operations. A
survey of code generation by pattern matching may be found in Ganapathi et al. (1982).
Fraser and Hanson (1995) describe in detail a C compiler with three alternative
target machines. This gives a clear insight into the problems of code generation for
dissimilar register machines.
Exercises
Section 7.1
7.1 The Triangle compiler uses code template (7.8e) for while-commands, but
many compilers use the following alternative code template:

        execute [[while E do C]] =
            g:  evaluate E
                JUMPIF(0) h
                execute C
                JUMP g
            h:

    Convince yourself that the alternative code template is semantically equivalent
    to (7.8e).

    Apply the alternative code template to determine the object code of:

        execute [[while n > 0 do n := n - 2]]

    Compare with Example 7.3, and show that the object code is less efficient.
    Why, do you think, is the alternative code template commonly used?
(b) let D in E

    This is a block expression: the declaration D is elaborated, and the resultant
    bindings are used in the evaluation of E.
Section 7.2
7.4* Implement the visitor/encoding methods visit ...Expression (along the
lines of Example 7.8) for the expressions of Exercise 7.3.
Section 7.3
7.6 Classify the following declarations according to whether they bind identifiers
to known or unknown values, variables, or routines.
(a) Pascal constant, variable, and procedure declarations, and Pascal value,
variable, and procedural parameters.
Section 7.4
7.9* Modify the Mini-Triangle code generator to deal with parameterized
procedures, using the code templates of Example 7.24.
CHAPTER EIGHT

Interpretation
The following method is the emulator proper. Its control structure is a switch-
statement within a loop, preceded by initialization of the registers. Each case of the
switch-statement follows directly from Table 8.1.
public void emulate () {
/ / Initialize ...
PC = 0; ACC = 0; status = RUNNING;
This emulator has been kept as simple as possible, for clarity. But it might behave
unexpectedly if, for example, an ADD or SUB instruction overflows. A more robust
version would set status to FAILED in such circumstances. (See Exercise 8.1.)
□
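For example, the ADD case of the emulator could guard against overflow along these
lines (a sketch: Hypo's exact instruction decoding, accumulator width, and operand
names are assumptions here):

    case ADDop: {
      int result = ACC + data[d];               // widen before adding
      if (result > Short.MAX_VALUE || result < Short.MIN_VALUE)
        status = FAILED;                        // overflow detected
      else
        ACC = (short) result;
      break;
    }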
When we write an interpreter like that of Example 8.1, it makes no difference
whether we are interpreting a real machine code or an abstract machine code. For an
abstract machine code, the interpreter will be the only implementation. For a real
machine code, a hardware interpreter (processor) will be available as well as a software
interpreter (emulator). Of these, the processor will be much the faster. But an emulator
is much more flexible than a processor: it can be adapted cheaply for a variety of
purposes. An emulator can be used for experimentation before the processor is ever
constructed. An emulator can also easily be extended for diagnostic purposes. (Exercises
8.2 and 8.3 suggest some of the possibilities.) So, even when a processor is available, an
emulator for the same machine code complements it nicely.
    print filename number     Print the given number of copies of the named file.
    filename arg1 ... argn    Run the executable program contained in the named
                              file, with the given arguments.
Production rules for Filename and Literal have been omitted here.

In the Mini-Shell interpreter, we can represent commands as follows:

    public class MiniShellCommand {
      public String name;
      public String[] args;
    }
The following class represents the Mini-Shell state:

    public class MiniShellState {
      // File store ...
      public ... ;
      // Registers ...
      public byte status;   // RUNNING or HALTED or FAILED
      public static final byte   // status values
          RUNNING = 0, HALTED = 1, FAILED = 2;
    }
There is no need for either a code store or a code pointer, since each command will be
executed only once, as soon as it is entered.
The following class will implement the Mini-Shell interpreter:

    public class MiniShell extends MiniShellState {

      // Initialize ...
      status = RUNNING;
      do {
        // Fetch and analyze the next instruction ...
        MiniShellCommand com = readAnalyze();
        // Execute this instruction ...
        ...
        else if (com.name.equals("delete"))
          delete(com.args);
        else if (com.name.equals("edit"))
          edit(com.args[0]);
        else if (com.name.equals("list"))
          list();
        else if (com.name.equals("print"))
          print(com.args[0], com.args[1]);
        else if (com.name.equals("quit"))
          status = HALTED;
        else   // executable program
          exec(com.name, com.args);
      } while (status == RUNNING);
(a) Source text: Each command must be scanned and parsed at run-time (i.e., every
time the command is fetched from the code store).
(b) Token sequence: Each command must be scanned at load-time, and parsed at run-
time.
(c) AST: All commands must be scanned and parsed at load-time.
Choice (a), illustrated in Figure 8.3, would slow the interpreter drastically. Choice (c) is
better but would slow the loader somewhat. Choice (b) is a reasonable compromise, so
let us adopt it here:
    class Token {
      byte kind;
      String spelling;
    }

    class ScannedCommand {
      Token[] tokens;
    }
Later we shall define concrete subclasses for particular forms of commands and expres-
sions. These will implement the methods execute and evaluate, which we shall
call interpreting methods.

Note that we must allow the interpreting methods to access the state of the Mini-
Basic abstract machine, hence their argument state. The following class will represent
the abstract machine state:
    public class MiniBasicState {
      public static final short CODESIZE = 4096;
      public static final short DATASIZE = 26;
      // Code store ...
      public ScannedCommand[] code =
          new ScannedCommand[CODESIZE];
      // Data store ...
      public float[] data = new float[DATASIZE];
      // Registers ...
      public short CP;
      public byte status;
      public static final byte   // status values
          RUNNING = 0, HALTED = 1, FAILED = 2;
    }

      // Initialize ...
      CP = 0; status = RUNNING;
      do {
        ...
        // Fetch the next instruction ...
        ScannedCommand scannedCom = code[CP++];
        // Analyze this instruction ...
        Command analyzedCom = parse(scannedCom);
        // Execute this instruction ...
        analyzedCom.execute((MiniBasicState) this);
      } while (status == RUNNING);
    }
Now we must define how to represent and execute analyzed commands. We introduce
a subclass of Command for each form of command in Mini-Basic:

    public class AssignCommand extends Command {
      byte V;         // left-side variable address
      Expression E;   // right-side expression
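A sketch of how the interpreting method of AssignCommand might be completed (the
exact signature of evaluate is an assumption):

      public void execute (MiniBasicState state) {
        // Assign the value of the right-side expression to the
        // left-side variable in the data store.
        state.data[V] = E.evaluate(state);
      }
    }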
The alternative to dynamic method selection would have been to make the interpret-
er test the subclass of each command before executing it, along the following lines:

    // Execute this instruction ...
    if (analyzedCom instanceof AssignCommand) {
      AssignCommand com = (AssignCommand) analyzedCom;
      data[com.V] = evaluate(com.E);
    }
    else if (analyzedCom instanceof GoCommand) {
      GoCommand com = (GoCommand) analyzedCom;
      CP = com.L;
    }
    else ...

But this would not be in the true spirit of object-oriented design!
Examples 1.3 and 1.8. Assume that the analyzed program is to be represented by a
decorated AST. The source program will be subjected to syntactic and contextual
analysis, and also storage allocation, before execution commences.

We must choose a representation of Mini-Triangle values. These include not only
truth values and integers, but also undefined (which is the initial value of a variable).
The following classes represent all these types of values:

    public abstract class Value { }

    public class IntValue extends Value {
      public short i;
    }

    public class BoolValue extends Value {
      public boolean b;
    }

    public class UndefinedValue extends Value {
    }
This Mini-Triangle processor is a visitor object (see Section 5.3.2), in which the visitor
methods act as interpreting methods.
      // Code store ...
      public Instruction[] code =
          new Instruction[CODESIZE];
      // Data store ...
      public short[] data = new short[DATASIZE];
      // Registers ...
      public final short CB = 0;
      public short CT;
      public final short PB = CODESIZE;
      public final short PT = CODESIZE + 28;
      public final short SB = 0;
      public short ST;
      public final short HB = DATASIZE;
      public short HT;
      public short LB;
      public short CP;
      public byte status;
      public static final byte   // status values
          RUNNING = 0, HALTED = 1, FAILED = 2;
    }

The following class implements the TAM interpreter proper:

    public class Interpreter extends State {

      // Initialize ...
      ST = SB; HT = HB; LB = SB; CP = CB;
      status = RUNNING;
      do {
        // Fetch the next instruction ...
        Instruction instr = code[CP++];
        // Analyze this instruction ...
        byte op = instr.op;
        byte r = instr.r;
        byte n = instr.n;
        short d = instr.d;
        // Execute this instruction ...
        switch (op) {
          case LOADop:   ...
          case LOADAop:  ...
          case LOADIop:  ...
          case LOADLop:  ...
          case STOREop:  ...
          case STOREIop: ...
          case CALLop:   ...
          case CALLIop:  ...
          case RETURNop: ...
          case PUSHop:   ...
          case POPop:    ...
          case JUMPop:   ...
          case JUMPIFop: ...
          case HALTop:   status = HALTED; break;
          default:       status = FAILED;
        }
      } while (status == RUNNING);
    }
The fact that TAM is a stack machine gives rise to many differences in detail from
an interpreter for a register machine. Load instructions push values on to the stack, and
store instructions pop values off the stack. For example, the TAM LOADL instruction is
interpreted as follows:
    case LOADLop:
      data[ST++] = d;
      break;

(Register ST points to the word immediately above the stack top, as shown in
Figure C.1.)
Further differences arise from the special design features of TAM (outlined in
Section 6.8).
For example, the LOAD and STORE instructions (on the simplifying assumption that the
length field n is 1) would be interpreted as follows:

    case LOADop: {
      short addr = relative(d, r);
      data[ST++] = data[addr];
      break;
    }

    case STOREop: {
      short addr = relative(d, r);
      data[addr] = data[--ST];
      break;
    }
The operand of a CALL, JUMP, or JUMPIF instruction is also of the form 'd[r]',
where r is generally CB or PB, and d is a constant displacement. As usual, the displace-
ment d is added to the content of register r. The auxiliary method relative also
handles these cases.
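A minimal sketch of relative, assuming register codes SBr, LBr, and PBr analogous
to the Instruction.CBr constant used earlier (the display registers L1, L2, ... are
omitted for brevity):

    private short relative (short d, byte r) {
      // Map an operand of the form d[r] to an absolute address by
      // adding the displacement d to the content of register r.
      switch (r) {
        case Instruction.SBr: return (short) (SB + d);
        case Instruction.LBr: return (short) (LB + d);
        case Instruction.CBr: return (short) (CB + d);
        case Instruction.PBr: return (short) (PB + d);
        default:              status = FAILED; return 0;
      }
    }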
indeed. Its control structures were more typical of a low-level language, making it
unattractive for serious programmers. More recently, 'structured' dialects of Basic have
become more popular, and compilation has become an alternative to interpretation.
Recursive interpretation is less common. However, this form of interpretation has
long been associated with Lisp (McCarthy et al. 1965). A Lisp program is not just
represented by a tree: it is a tree! Several features of the language - dynamic binding,
dynamic typing, and the possibility of manufacturing extra program code at run-time -
make interpretation of Lisp much more suitable than compilation. A description of a
Lisp interpreter may be found in McCarthy et al. (1965). Lisp has always had a devoted
band of followers, but not all are prepared to tolerate slow execution. A more recent
successful dialect, Scheme (Kelsey et al. 1998), has discarded Lisp's problematic
features in order to make compilation feasible.
It is noteworthy that two popular programming languages, Basic and Lisp, both
suitable for interpretation but otherwise utterly different, have evolved along somewhat
parallel lines, spawning structured dialects suitable for compilation!
Another example of a high-level language suitable for interpretation is Prolog. This
language has a very simple syntax, a program being a flat collection of clauses, and it
has no scope rules and few type rules to worry about. Interpretation is almost forced by
the ability of a program to modify itself by adding and deleting clauses at run-time.
Exercises
8.1 Make the Hypo interpreter of Example 8.1 detect the following exceptional
conditions, and set the status register accordingly:
(a) overflow;
(b) invalid instruction address;
(c) invalid data address.
(Assume that Hypo may have less than 4096 words of code store and less than
4096 words of data store, thus making conditions (b) and (c) possible.)

8.2 Make the Hypo interpreter of Example 8.1 display a summary of the machine
state after executing each instruction. Display the contents of ACC and CP, the
instruction just executed, and a selected portion of the data store.

8.3 Write an emulator for a real machine with which you are familiar.
Expressions, operators, and variables are unchanged, but labels are removed.
Write a recursive interpreter for this structured dialect.
8.10** The TAM interpreter (Section 8.3) sacrifices efficiency for clarity. For
example, the fetch/analyze/execute cycle could be combined and replaced by a
single switch-statement of the form:
    switch ((instr = code[CP++]).op) {
case LOADop: ...
Another efficiency gain could be achieved by holding the top one or two stack
elements in simple variables, and possibly avoiding the unnecessary updating
of the stack pointer during a long sequence of arithmetic operations. (This is
effectively turning TAM into a register machine!)
Consider these and other possible improvements to the TAM interpreter, and
develop a more efficient implementation. Compare your version with the origi-
nal TAM interpreter, and measure the performance gain.
CHAPTER NINE
Conclusion
In any case, the language might have to be redesigned, respecified, and reimple-
mented, perhaps several times. This is bound to be costly, i.e., time-consuming and ex-
pensive. It is necessary, therefore, to plan the life cycle in order to minimize costs.
Figure 9.1 illustrates a life cycle model that has much to recommend it. Design is
immediately followed by specification. (This is needed to communicate the design to
implementors and programmers.) Development of a prototype follows, and development
of compilers follows that. Specification, prototyping, and compiler development are
successively more costly, so it makes sense to order them in this way. The designer gets
the fastest possible feedback, and costly compiler development is deferred until the
language design has more or less stabilized.
[Figure 9.1 not reproduced: the language life cycle, in which design is followed by
specification (from which manuals are produced), then by prototypes, and finally by
compilers.]
9.1.1 Design
The essence of programming language design is that the designer selects concepts and
decides how to combine them. This selection is, of course, determined largely by the
intended use of the language. A variety of concepts have found their way into program-
ming languages: basic concepts such as values and types, storage, bindings, and abstrac-
tion; and more advanced concepts such as encapsulation, polymorphism, exceptions,
and concurrency. A single language that supports all these concepts is likely to be very
large and complex indeed (and its implementations will be large, complex, and costly).
Therefore a judicious selection of concepts is necessary.
different contexts (assignment, array indexing, loop parameters); whereas Algol from
the start had just one class of expression, permissible in all contexts.

Similarly, formal specification of semantics tends to encourage semantic simplicity
and regularity. Unfortunately, few language designers yet attempt this. Semantic
formalisms are much more difficult to master than BNF. Even then, writing a semantic
specification of a real programming language (as opposed to a toy language) is a
substantial task. Worst of all, the designer has to specify, not a stable well-understood
language, but one that is gradually being designed and redesigned. Most semantic
formalisms are ill-suited to meet the language designer's requirements, so it is not
surprising that almost all designers content themselves with writing informal semantic
specifications.
The advantages of formality and the disadvantages of informality should not be
underestimated, however. Informal specifications have a strong tendency to be inconsis-
tent or incomplete or both. Such specification errors lead to confusion when the langu-
age designer seeks feedback from colleagues, when the new language is implemented,
and when programmers try to learn the new language. Of course, with sufficient invest-
ment of effort, most specification errors can be detected and corrected, but an informal
specification will probably never be completely error-free. The same amount of effort
could well produce a formal specification that is at least guaranteed to be precise.

The very act of writing a specification tends to focus the designer's mind on aspects
of the design that are incomplete or inconsistent. Thus the specification exercise
provides valuable and timely feedback to the designer. Once the design is completed,
the specification (whether formal or informal) will be used to guide subsequent
implementations of the new language.
9.1.3 Prototypes

A prototype is a cheap low-quality implementation of a new programming language.
Development of a prototype helps to highlight any features of the language that are hard
to implement. The prototype also gives programmers an early opportunity to try out the
language. Thus the language designer gains further valuable feedback. Moreover, since
a prototype can be developed relatively quickly, the feedback is timely enough to make
a language revision feasible. A prototype might lack speed and good error reporting; but
these qualities are deliberately sacrificed for the sake of rapid implementation.

For a suitable programming language, an interpreter might well be a useful
prototype. An interpreter is very much easier and quicker to implement than a compiler
for the same language. The drawback of an interpreter is that an interpreted program
will run perhaps 100 times more slowly than an equivalent machine-code program.
Programmers will quickly tire of this enormous inefficiency, once they pass the stage of
trying out the language and start to use it to build real applications.
A more durable form of prototype is an interpretive compiler. This consists of a
translator from the programming language to some suitable abstract machine code,
together with an interpreter for the abstract machine. The interpreted object program
will run 'only' about 10 times more slowly than a machine-code object program.
Developing the compiler and interpreter together is still much less costly than
developing a compiler that translates the programming language to real machine code.
Indeed, a suitable abstract machine might be available 'off the shelf', saving the cost of
writing the interpreter.
Another method of developing the prototype implementation is to implement a
translator from the new language into an existing high-level language. Such a translation
is usually straightforward (as long as the target language is chosen with care). Clearly
the existing target language must already be supported by a suitable implementation.
This was precisely the method chosen for the first implementation of C++, which used
the cfront translator to convert the source program into C.
Development of the prototype must be guided by the language specification, whether
the specification is formal or informal. The specification tells the implementor which
programs are well-formed (i.e., conform to the language's syntax and contextual
constraints) and what these programs should do when run.
9.1.4 Compilers
A prototype is not suitable for use over an extended period by a large number of
programmers building real applications. When it has served its purpose of allowing
programmers to try out the new language and provide feedback to the language
designer, the prototype should be superseded by a higher-quality implementation. This
is invariably a compiler - or, more likely, a family of compilers, generating object code
for a number of target machines. Such a high-quality implementation is referred to as an
industrial-strength compiler.
The work that went into developing a prototype need not go to waste. If the
prototype was an interpretive compiler, for example, we can bootstrap it to make a
compiler that generates real machine code (see Section 2.6).
Development of compilers must be guided by the language specification. A syntactic
analyzer can be developed systematically from the source language's syntactic specifi-
cation (see Chapter 4). A specification of the source language's scope rules and type
rules should guide the development of a contextual analyzer (see Chapter 5). Finally, a
specification of the source language's semantics should guide the development of a code
specification, which should in turn be used to develop a code generator systematically
(see Chapter 7).
In practice, contextual constraints and semantics are rarely specified formally. If we
compare separately-developed compilers for the same language, we often find that they
are consistent with respect to syntax, but inconsistent with respect to contextual con-
straints and semantics. This is no accident, because syntax is usually specified formally,
and therefore precisely, and everything else informally, leading inevitably to misunder-
standing.
To facilitate error recovery during type checking, it is useful for the type checker to
ascribe a special improper type, error-type, to any ill-typed expression. The type
checker can then ignore error-type whenever it is subsequently encountered. This
technique would avoid both the spurious error reports mentioned in Example 9.2.
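As a rough illustration, here is a minimal Java sketch of the error-type technique; the
class and method names are invented for this sketch and are not the Triangle compiler's
own code:

    class ErrorTypeDemo {
        static final String INT = "int", ERROR = "error-type";

        // Infer the type of 'E1 + E2' without cascading spurious reports.
        static String checkAdd(String left, String right) {
            if (left.equals(ERROR) || right.equals(ERROR))
                return ERROR;                 // already reported: stay quiet
            if (left.equals(INT) && right.equals(INT))
                return INT;
            System.err.println("operands of + must be integers");
            return ERROR;                     // report once, then propagate
        }
    }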
As these examples illustrate, it is easy for a compiler to discover that the source
program is ill-formed, and to generate error reports; but it is difficult to ensure that the
compiler never generates misleading error reports. There is a genuine tension between
the task of compiling well-formed source programs and the need to make some sense of
ill-formed programs. A compiler is structured primarily to deal with well-formed source
programs, so it must be enhanced with special error recovery algorithms to make it deal
reasonably with ill-formed programs.
' If the language is dynamically typed, i.e., a variable can take values of different types at
different times, then type errors also are run-time errors. However, we do not consider
dynamically-typed languages here.
...
end
Assume that characters and integers occupy one word each, and that the addresses of
global variables name and i are 200 and 204, respectively. Thus name occupies words
200 through 203; and the address of name[i] is 200 + i, provided that 0 <= i <= 3.
The Triangle compiler does not currently generate index checks. The assignment
command at (1) will be translated to object code like this (omitting some minor details):

    LOADL 48        - fetch the blank character
    LOAD 204        - fetch the value of i
    LOADL 200       - fetch the address of name[0]
    CALL add        - compute the address of name[i]
    STOREI          - store the blank character at that address
This code is dangerous. If the value of i is out of range, the blank character will be
stored, not in an element of name, but in some other variable - possibly of a different
type. (If the value of i happens to be 4, then i itself will be corrupted in this way.)
We could correct this deficiency by making the compiler generate object code with
index checks, like this:
    LOADL 48         - fetch the blank character
    LOAD 204         - fetch the value of i
    LOADL 0          - fetch the lower bound of name
    LOADL 3          - fetch the upper bound of name
    CALL rangecheck  - check that the index is within range
    LOADL 200        - fetch the address of name[0]
    CALL add         - compute the address of name[i]
    STOREI           - store the blank character at that address

The index check consists of the three instructions that load the bounds and call
rangecheck. The auxiliary routine rangecheck, when called with arguments i, m,
and n, is supposed to return i if m <= i <= n, or to fail otherwise. The space cost of
the index check is three instructions, and the time cost is three instructions plus the
time taken by rangecheck itself.
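The behavior expected of rangecheck can be sketched in Java; this is an assumption
about its semantics based on the description above, not the actual TAM primitive:

    class RuntimeChecks {
        // Return the index when m <= i <= n, otherwise fail at run-time.
        static int rangecheck(int i, int m, int n) {
            if (m <= i && i <= n)
                return i;
            throw new RuntimeException(
                "index " + i + " out of range " + m + ".." + n);
        }
    }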
Software run-time checks are expensive in terms of object-program size and speed.
Without them, however, the object program might overlook a run-time error, eventually
failing somewhere else, or terminating with meaningless results. And, let it be empha-
sized, if a compiler generates object programs whose behavior differs from the language
specification, it is simply incorrect. The compiler should, at the very least, allow the
programmer the option of including or suppressing run-time checks. Then a program's
unpredictable behavior would be the responsibility of the programmer who opts to
suppress run-time checks.
Whether the run-time check is performed by hardware or software, there remains the
problem of generating a suitable error report. This should not only describe the nature of
the error (e.g., 'arithmetic overflow' or 'index out of range'), but should also locate it in
the source program. An error report stating that overflow occurred at instruction address
1234 (say) would be unhelpful to a programmer who is trying to debug a high-level
language program. A better error report would locate the error at a particular line in the
source program.
The general principle here is that error reports should relate to the source program
rather than the object program. Another example of this principle is a facility to display
the current values of variables during or after the running of the program. A simple
storage dump is of little value: the programmer cannot understand it without a detailed
knowledge of the run-time organization assumed by the compiler (data representation,
storage allocation, layout of stack frames, layout of the heap, etc.). Better is a symbolic
dump that displays each variable's source-program identifier, together with its current
value in source-language syntax.
This information is hard to understand, to put it mildly. It is not clear which array
indexing operation failed. There is no indication that some of the words in the data store
constitute an array. There is no distinction between different types of data such as
integers and characters.
The following error report and storage dump are expressed more helpfully in source-
program terms:
Array indexing error at line 45.
Data store at this point:
name = ['J', 'a', 'v', 'a']
i = 10
Here the programmer can tell at a glance what went wrong.
But how can the source-program line number be determined at run-time? One
possible technique is this. We dedicate a register (or storage cell) that will contain the
current line number. The compiler generates code to update this register whenever
control passes from one source-program line to another. Clearly, however, this
technique is costly in terms of extra instructions in the object program.
An alternative technique is as follows. The compiler generates a table relating line
numbers to instruction addresses. If the object program stops, the code pointer is used to
search the table and determine the corresponding line number. This technique has the
great advantage of imposing no time or space overheads on the object program. (The
line-number table can be stored separately from the object program, and loaded only if
required.)
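A minimal Java sketch of this table, assuming the compiler emits (instruction address,
line number) pairs sorted by address; the class is hypothetical, not part of the Triangle
compiler:

    import java.util.TreeMap;

    class LineNumberTable {
        private final TreeMap<Integer, Integer> table = new TreeMap<>();

        void add(int instrAddr, int line) { table.put(instrAddr, line); }

        // Map the code pointer at the moment the program stopped to the
        // source line of the last instruction at or before that address.
        int lineAt(int codePointer) {
            var entry = table.floorEntry(codePointer);
            return (entry == null) ? -1 : entry.getValue();
        }
    }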
The generation of reliable line-number information, however, is extremely difficult
in the presence of heavily-optimized code. In this case, the code generator may have
eliminated some of the original instructions, and substantially re-ordered others, making
it very difficult to identify the line number of a given instruction. In the worst case, a
single instruction may actually be part of the code for several different lines of source
code.
To generate a symbolic storage dump requires more sophisticated techniques. The
compiler must generate a 'symbol table' containing the identifier, type, and address of
each variable in the source program, and the identifier and entry address of each
procedure (and function). If the object program stops, using the symbol table each (live)
variable can be located in the data store. The variable's identifier can be printed along
with its current value, formatted according to its type. If one or more procedures are
active at the time when the program stops, the store will contain one or more stack
frames. To allow the symbolic dump to cover local variables, the symbol table must
record which variables are local to which procedures, and the procedure to which each
frame belongs must be identified in some way. (See Exercise 9.16.)
This problem is compounded on a register machine, where a variable might be
located in a register and not in the store. It is also compounded for heavily-optimized
code, where several variables with disjoint lifetimes may share the same memory
location.
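One plausible shape for such a symbol-table entry, with illustrative field names (not
the book's exact design):

    class DumpSymbol {
        String identifier;   // source-program name, e.g. "name" or "i"
        String type;         // source-language type, e.g. "array 4 of Char"
        int address;         // store address (or frame offset for locals)
        String procedure;    // owning procedure for locals; null for globals
    }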
9.3 Efficiency
When we consider efficiency in the context of a compiler, we must carefully distinguish
between compile-time efficiency and run-time efficiency. They are not the same thing at
all; indeed, there is often a tradeoff between the two. The more a compiler strives to
generate efficient (compact and fast) object code, the less efficient (bulkier and slower)
the compiler itself tends to become.
The most efficient compilers are those that generate abstract machine code, where
the abstract machine has been designed specifically to support the operations of the
source language. Compilation is simple and fast because there is a straightforward trans-
lation from the source language to the target language, with few special cases to worry
about. Such is the Triangle compiler used as a case study in this book. Of course, the
object code has to be interpreted, imposing a significant speed penalty at run-time.
Compilers that generate code for real machines are generally less efficient. They
must solve a variety of awkward problems. There is often a mismatch between the
operations of the source language and the operations provided by the target machine.
The target-machine operations are often irregular, complicating the translation. There
might be many ways of translating the same source program into object code, forcing
the compiler writer to implement lots of special cases in an attempt to generate the best
possible object code.
* The O-notation is a way of estimating the efficiency of a program. Let n be the size of the
program's input. If we state that the program's running time is O(n), we mean that its running
time is proportional to n. (The actual running time could be 100n or 0.01n.) Similarly, O(n log n)
time means time proportional to n log n, O(n^2) time means time proportional to n^2, and so
on. In estimates of algorithmic complexity, the constants of proportionality are generally less
important than the difference between, for example, O(n) and O(n^2).
Suppose that phase A runs in time an, and phase B in time bn (where a and b are constants).
Then the combination of these phases will run in time an + bn = (a + b)n, which is still O(n).
CALL add
CALL sub
STORE a
As we saw in Chapter 7, a simple efficient code generator can easily perform this
translation. The code generator has no registers to worry about.
Now suppose that the target machine has a pool of registers and a typical one-
address instruction set. Now the command might be translated to object code like this:
LOAD R1 b
MULT R1 c
LOAD R2 d
LOAD R3 e
MULT R3 f
ADD R2 R3
SUB R1 R2
STORE R1 a
Although this is comparatively straightforward, some complications are already evident.
The code generator must allocate a register for the result of each operation. It must
ensure that the register is not reused until that result has been used. (Thus R1 cannot be
used during the evaluation of 'd + (e*f)', because at that time it contains the unused
result of evaluating 'b*c'.) Furthermore, when the right operand of an operator is a
simple variable, the code generator should avoid a redundant load by generating, for
example, 'MULT R1 c' rather than 'LOAD R2 c' followed by 'MULT R1 R2'.
The above is not the only possible object code, nor even the best. One improvement
is to evaluate 'd + (e*f)' before 'b*c'. A further improvement is to evaluate '(e*f)
+ d' instead of 'd + (e*f)', exploiting the commutativity of '+'. The combined effect
of these improvements is to save an instruction and a register:
LOAD R1 e
MULT R1 f
ADD R1 d
LOAD R2 b
MULT R2 c
SUB R2 R1
STORE R2 a
The trick illustrated here is to evaluate the more complicated subexpression of a binary
operator first.
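This trick can be made systematic. The following Java sketch shows the standard
Sethi-Ullman labeling (a well-known technique, not this book's own algorithm), which
computes how many registers each subtree needs; a code generator then evaluates the
operand with the larger label first:

    abstract class Node { abstract int label(); }

    class Leaf extends Node {
        int label() { return 1; }    // a simple variable needs one register
    }

    class Op extends Node {
        final Node left, right;
        Op(Node left, Node right) { this.left = left; this.right = right; }
        int label() {
            int l = left.label(), r = right.label();
            // If the labels differ, evaluate the larger side first and hold
            // its result in one register while the smaller side is evaluated.
            return (l == r) ? l + 1 : Math.max(l, r);
        }
    }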
But that is not all. The compiler might decide to allocate registers to selected
variables throughout their lifetimes. Supposing that registers R6 and R7 are thus
allocated to variables a and d, the object code could be further improved as follows:
LOAD R1 e
MULT R1 f
ADD R1 R7
LOAD R6 b
MULT R6 c
SUB R6 R1
Several factors make code generation for a register machine rather complicated.
Register allocation is one factor. Another is that compilers must in practice achieve code
improvements of the kind illustrated above - programmers demand nothing less!
But even a compiler that achieves such improvements will still generate rather
mediocre object code (typically four times slower than hand-written assembly code). A
variety of algorithms have been developed that allow a compiler to generate much more
efficient object code (typically twice as slow as hand-written assembly code). These are
called code transformation (or code optimization') algorithms. Some of the more
common code transformations are:
Constant folding: If an expression depends only on known values, it can be evaluated
at compile-time rather than run-time.
Common subexpression elimination: If the same expression occurs in two different
places, and is guaranteed to yield the same result in both places, it might be possible
to save the result of the first evaluation and reuse it later.
Code movement: If a piece of code executed inside a loop always has the same effect,
it might be possible to move that code out of the loop, where it will be executed fewer
times.
' The more widely used term, code optimization, is actually inappropriate: it is infeasible for a
compiler to generate truly optimal object code.
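As an illustration of the first of these transformations, here is a minimal Java sketch
of constant folding over a tiny expression tree; the node classes are invented for this
sketch and are not the Triangle compiler's AST:

    abstract class Expr {}

    class Lit extends Expr {
        final int value;
        Lit(int value) { this.value = value; }
    }

    class Bin extends Expr {
        final char op; final Expr left, right;
        Bin(char op, Expr left, Expr right) {
            this.op = op; this.left = left; this.right = right;
        }
    }

    class Folder {
        static Expr fold(Expr e) {
            if (!(e instanceof Bin b)) return e;
            Expr l = fold(b.left), r = fold(b.right);
            if (l instanceof Lit x && r instanceof Lit y) {
                switch (b.op) {              // evaluate at compile time
                    case '+': return new Lit(x.value + y.value);
                    case '*': return new Lit(x.value * y.value);
                }
            }
            return new Bin(b.op, l, r);      // not foldable: rebuild
        }
    }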
(assuming that each integer occupies one word). Furthermore, if the compiler decides
that address[[h]] = 20 (relative to SB), then address[[h[2].m]] can be folded into
the constant address 27. This is shown in the following object code:

    LOADL 12
    STORE 27[SB]

Address folding makes field selection into a compile-time operation. It even makes
indexing of a static array by a literal into a compile-time operation.
where i is the value of variable i, and where we have assumed that each value of type T
occupies four words.
The common subexpression 'x - y' could have been eliminated by modifying the
source program. But the common subexpression 'i * 4' can be eliminated only by the
compiler, because it exists only at the target machine level.
(assuming that each character occupies one word). A straightforward translation of this
program fragment will generate code to evaluate address[[name]] + (i * 10) inside the
inner loop. But this code will yield the same address in every iteration of the inner loop,
since the variable i is not updated by the inner loop.
The object program would be more efficient if this code were moved out of the inner
loop. (It cannot be moved out of the outer loop, of course, because the variable i is
updated by the outer loop.)
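Rendered in Java for concreteness (the array and constants here are stand-ins for the
generated address arithmetic, not the book's example code):

    class CodeMovementDemo {
        static final int ROWS = 10, COLS = 10;
        static char[] store = new char[ROWS * COLS];

        static void beforeTransformation() {
            for (int i = 0; i < ROWS; i++)
                for (int j = 0; j < COLS; j++)
                    store[i * COLS + j] = ' ';   // i * COLS recomputed every inner iteration
        }

        static void afterTransformation() {
            for (int i = 0; i < ROWS; i++) {
                int rowBase = i * COLS;          // invariant code moved out of the inner loop
                for (int j = 0; j < COLS; j++)
                    store[rowBase + j] = ' ';
            }
        }
    }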
(a) Find out the object code that would be generated by the Triangle
compiler.
(b) Write down the object code that would be generated by a Triangle com-
piler that performs code transformations such as constant folding,
common subexpression elimination, and code movement.
9.9** Extend Triangle with unary and binary operator declarations of the form:

    func O (I1: T1) : T - E
    func O (I1: T1, I2: T2) : T - E

Operators are to be treated like functions. A unary operator application 'O E' is
to be treated like a function call 'O(E)', and a binary operator application 'E1
O E2' is to be treated like a function call 'O(E1, E2)'.
which creates a new and distinct primitive type with n values, and respectively
binds the identifiers I1, ..., and In to these values. Make the generic operations
of assignment, '=', and '\=' applicable to enumeration types. (They are applicable
to all Triangle types.) Provide new operations of the form 'succ E' (successor)
and 'pred E' (predecessor), where succ and pred are keywords.
Extend Triangle with a new family of types, string n, whose values are
strings of exactly n characters (n >= 1). Provide string-literals of the form
"e1...en". Make the generic operations of assignment, '=', and '\=' applicable
to strings. Provide a new binary operator '<<' (lexicographic comparison). Finally,
provide an array-like string indexing operation of the form 'V[E]',
where V names a string value or variable. (Hint: Represent a string in the same
way as a static array.)
Or:
Extend Triangle with a new type, String, whose values are character strings
of any length (including the empty string). Provide string-literals of the form
"e1...en" (n >= 0). Make the generic operations of assignment, '=', and '\=' applicable
to strings. Provide new binary operators '<<' (lexicographic comparison)
and '++' (concatenation). Finally, provide an array-like string indexing
operation of the form 'V[E]', and a substring operation of the form
'V[E1:E2]', where V names a string value or variable. But do not permit
string variables to be selectively updated. (Hint: Use an indirect representation
for strings. The handle should consist of a length field and a pointer to an array
of characters stored in the heap. In the absence of selective updating, string
assignment can be implemented simply by copying the handle.)
APPENDIX A
Answers to Selected Exercises
Specimen answers to about half of the exercises are given here. Some of the answers are
given only in outline.
Answers 1
1.1 Other kinds of language processor: syntax checkers, cross-referencers, pretty-
printers, high-level translators, program transformers, symbolic debuggers, etc.
1.4 Mini-Triangle expressions: (a) and (e) only. (Mini-Triangle has no functions,
no unary operators, and no operator '>='.)
Commands: (f) and (j) only. (Mini-Triangle procedures have exactly one parameter
each, and there is no if-command without an else-part.)
Declarations: (l), (m), and (o). (Mini-Triangle has no real-literals, and no multiple
variable declarations.)
1.5 AST: [tree diagram, not reproduced here: a WhileCommand whose subtrees are
AssignCommands built from VnameExpressions, IntegerExpressions, SimpleVnames,
Identifiers, and an Integer-Literal]
Answers 3
3.3 The contextual errors are (i) 'Logical' is not declared; (ii) the expression of
the if-command is not of type bool; and (iii) 'yes' is not declared:
[decorated AST diagram, not reproduced here]
3.5 In brief, compile one subprogram at a time. After parsing a subprogram and
constructing its AST, perform contextual analysis and code generation on the
AST. Then prune the AST: replace the subprogram's body by a stub, and retain
only the part(s) of the AST that will be needed to compile subsequent calls to
the subprogram (i.e., its identifier, formal parameters, and result type if any).
The maximum space requirement will be for the largest subprogram's AST,
plus the pruned ASTs of all the subprograms.
Answers 4
4.3 After repeated left factorization and elimination of left recursion:

    Numeral ::= Digits (. Digits | e) (e Sign Digits | e)
    Digits  ::= Digit Digit*
    while (currentChar == '+' || currentChar == '-') {
        char op = currentChar;
        acceptIt();
        int numval = parseNumeral();
        switch (op) {
        case '+': expval += numval; break;
        case '-': expval -= numval; break;
        }
    }
    return expval;
}
private int parseNumeral() {
    int numval = parseDigit();
    while (isDigit(currentChar))
        numval = 10 * numval + parseDigit();
    return numval;
}

private byte parseDigit() {
    if ('0' <= currentChar && currentChar <= '9') {
        byte digval = (byte) (currentChar - '0');
        currentChar = next input character;
        return digval;
    } else
        report a lexical error
}
This is correct if and only if starters[[X]] is disjoint from the set of tokens
that can follow X+ in this particular context.
case Token.IF: {
    acceptIt();
    parseExpression();
    accept(Token.THEN);
    parseSingleCommand();
4.18 This lexical grammar is ambiguous. The scanning procedure would turn out as
follows:
private byte scanToken() {
    switch (currentChar) {
    case 'i':
        takeIt(); take('f');
        return Token.IF;
    case 't':
        takeIt(); take('h'); take('e'); take('n');
        return Token.THEN;
    case 'e':
        takeIt(); take('l'); take('s'); take('e');
        return Token.ELSE;
This method will not compile. Moreover, there is no reasonable way to fix it.
Answers 5
5.2 One possibility would be a pair of subtables, one for globals and one for locals.
(Each subtable could be an ordered binary tree or a hash table.) There would
also be a variable, the current level, set to either global or local. Constructor
IdentificationTable would set the current level to global, and would
empty both subtables. Method enter would add the new entry to the global or
local subtable, according to the current level. Method retrieve would search
the local subtable first, and if unsuccessful would search the global subtable
second. Method openScope would change the current level to local. Method
closeScope would change it to global, and would also empty the local
subtable.
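A minimal Java sketch of this two-level table, with simplified types (the real
compiler stores Declaration attributes rather than plain Objects):

    import java.util.HashMap;
    import java.util.Map;

    class TwoLevelIdTable {
        private final Map<String, Object> globals = new HashMap<>();
        private final Map<String, Object> locals = new HashMap<>();
        private boolean atLocalLevel = false;   // the current level

        void openScope()  { atLocalLevel = true; }
        void closeScope() { atLocalLevel = false; locals.clear(); }

        void enter(String id, Object attr) {
            (atLocalLevel ? locals : globals).put(id, attr);
        }

        // Search the local subtable first, then the global subtable.
        Object retrieve(String id) {
            Object attr = locals.get(id);
            return (attr != null) ? attr : globals.get(id);
        }
    }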
[AST diagrams, not reproduced here: the mutually recursive type declarations give
rise to TypeDeclaration, FieldList, Field, SimpleT., and Ident. nodes for the types
IntList and IntNode]
The AST has been transformed to a directed graph, with the mutually recursive
types giving rise to a cycle.
The complication is that the equals method must be able to compare two
(possibly cyclic) graphs for structural equivalence. It must be implemented
carefully to avoid nontermination.
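One standard way to guarantee termination is to record the pairs of nodes already
under comparison and assume they are equal when revisited. A Java sketch, with a
hypothetical TypeNode class (not the Triangle compiler's own):

    import java.util.HashSet;
    import java.util.Set;

    class TypeNode {
        String kind;                          // e.g. "int" or "record"
        TypeNode[] fields = new TypeNode[0];  // component types

        boolean structurallyEquals(TypeNode other) {
            return equal(this, other, new HashSet<>());
        }

        private static boolean equal(TypeNode a, TypeNode b,
                                     Set<String> seen) {
            String pair = System.identityHashCode(a) + ":"
                        + System.identityHashCode(b);
            if (!seen.add(pair))
                return true;   // pair already under comparison: a cycle
            if (!a.kind.equals(b.kind) || a.fields.length != b.fields.length)
                return false;
            for (int i = 0; i < a.fields.length; i++)
                if (!equal(a.fields[i], b.fields[i], seen))
                    return false;
            return true;
        }
    }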
5.10 Consider the function call 'I(E)'. Check that I has been declared by a
function declaration, say 'func I (I': T') : T - E''. Check that the type of
the actual parameter E is equivalent to the formal parameter type T'. Infer that
the type of the function call is T.
Answers 6
6.3 Advantage of single-word representation:
It is economical in storage.
Advantages of double-word representation:
It is closer to the mathematical (unbounded) set of integers.
Overflow is less likely.
6.5 (a)
    pixel[red]
    pixel[orange]
    pixel[yellow]
    pixel[green]
    ... pixel[blue]
    freq['z']
6.8 Make the handle contain the lower and upper bounds in both dimensions, as
well as a pointer to the elements. Store the elements themselves row by row (as
in Example 6.6). If l, u, l', and u' are the values of E1, E2, E3, and E4, respectively,
then we get:
[diagram, not reproduced here: the handle contains the origin, lower bound 1, upper
bound 1, lower bound 2, and upper bound 2, followed by the rows of elements of
type T_elem]
6.16 Let each frame consist of a static part and a dynamic part. The static part
accommodates variables of primitive type, and the handles of dynamic arrays.
The dynamic part expands as necessary to accommodate elements of dynamic
arrays. The frame containing v would look like this:
[diagram, not reproduced here: the frame contains the link data and the static part,
followed by the dynamic part, which expands as needed]
Since everything in the static part is of constant size, the compiler can
determine each variable's address relative to the frame base. This is not true for
the dynamic part, but the array elements there are always addressed indirectly
through the handles.
6.17 There are three cases of interest. If n = m+1, S is local to the caller. If n = m, S
is at the same level as the caller. If n < m, S encloses the caller.
(a) On call, push S's frame on to the stack. In all cases, set Dn to point to the
base of the new frame. (Note: If n < m, D(n+1), ..., and Dm become
undefined.)
(b) On return, pop S's frame off the stack. If n = m+1, do nothing else. If n =
m, reset Dm to point to the base of the (now) topmost frame. If n < m,
reset the other display registers using the static links: D(m-1) <- content
(Dm); ...; Dn <- content (D(n+1)). (Note: If n = m+1, Dn becomes
undefined.)
There is no need to change D0, D1, ..., D(n-1) at either the call or the return,
since these registers point to the same frames before, during, and after the activation
of S.
Advantages and disadvantages (on the assumption that D0, D1, etc., are all true
registers):
Nonlocal variables can be accessed as efficiently as local or global variables.
(e) execute[[repeat C1 while E do C2]] =
        JUMP h
    g:  execute C2
    h:  execute C1
        evaluate E
        JUMPIF(1) g
7.3 (a) evaluate[[if E1 then E2 else E3]] =
        evaluate E1
        JUMPIF(0) g
        evaluate E2
        JUMP h
    g:  evaluate E3
    h:

(b) evaluate[[let D in E]] =
        elaborate D
        evaluate E
        POP(n) s    if s > 0
    where s = amount of storage allocated by D;
          n = size (type of E)

(c) evaluate[[begin C; yield E end]] =
        execute C
        evaluate E
7.10 (a) Reserve space for the result variable just above the link data in the function's
frame (i.e., at address 3[LB]):

        JUMP g
    e:  PUSH n          where n = size T
        execute C
        RETURN(n) d     where d = size of FP
    g:

    execute[[result E]] =
        evaluate E
        STORE(n) 3[LB]  where n = size (type of E)
(b)
public Object visitFuncDeclaration
        (FuncDeclaration decl,
         Object arg) {
    Frame f = (Frame) arg;
    short i = nextInstrAddr;
    emit(Instruction.JUMPop, 0,
         Instruction.CBr, 0);
    short e = nextInstrAddr;
    decl.entity =
        new KnownRoutine(2, f.level, e);
    Frame f1 = new Frame(f.level + 1, 0);
    short d = shortValueOf(
        decl.FP.visit(this, f1));
    // ... creates a run-time entity for the formal parameter,
    // and returns the size of the parameter.
    short n = shortValueOf(
        decl.T.visit(this, null));
    emit(Instruction.PUSHop, 0, 0, n);
    Frame f2 = new Frame(f.level + 1, 3 + n);
    decl.C.visit(this, f2);
    emit(Instruction.RETURNop, n, 0, d);
    short g = nextInstrAddr;
    patch(i, g);
    return new Short(0);
}
Answers 8
8.3 In outline:

    public abstract class UserCommand {
        public abstract void perform
            (HypoInterpreter interp);
    }
8.8 In outline:

    public class MiniShell extends MiniShellState {

        } else if (com.name.equals("call")) {
            File input = new File(com.args[0]);
            FileInputStream script =
                new FileInputStream(input);
            while (more commands in script) {
                MiniShellCommand subCom =
                    readAnalyze(script);
                execute(subCom);
            }
        } else // executable program
            exec(com.name, com.args);
    }

    public void interpret() {
        // Initialize ...
        status = RUNNING;
Answers 9
9.5 In outline:
Common subexpressions are: 'i < j' at points (1); the address of a[i] at
points (2); the address of a[j] at points (3); the address of a[n] at points (4).

    var a : array ... of Integer
    ...
    i := m - 1; j := n; pivot := a[n];
    while i < j(1) do
    begin
        i := i + 1;
        while a[i](2) < pivot do i := i + 1;
        j := j - 1;
        while a[j](3) > pivot do j := j - 1;
        if i < j(1) then
        begin
            t := a[i](2);
            a[i](2) := a[j](3);
            a[j](3) := t
        end
    end;
    t := a[i](2);
    a[i](2) := a[n](4);
    a[n](4) := t
APPENDIX B
Informal Specification of the Programming Language Triangle
B.1 Introduction
Triangle is a regularized extensible subset of Pascal. It has been designed as a model
language to assist in the study of the concepts, formal specification, and implementation
of programming languages.
The following sorts of entity can be declared and used in Triangle:
A value is a truth value, integer, character, record, or array.
A variable is an entity that may contain a value and that can be updated. Each variable
has a well-defined lifetime.
A procedure is an entity whose body may be executed in order to update variables. A
procedure may have constant, variable, procedural, and functional parameters.
A function is an entity whose body may be evaluated in order to yield a value. A
function may have constant, variable, procedural, and functional parameters.
A type is an entity that determines a set of values. Each value, variable, and function
has a specific type.
Each of the following sections specifies part of the language. The subsection headed
Syntax specifies its grammar in BNF (except for Section B.8 which uses EBNF). The
subsection headed Semantics informally specifies the semantics (and contextual
constraints) of each syntactic form. Finally, the subsection headed Examples illustrates
typical usage.
B.2 Commands
A command is executed in order to update variables. (This includes input-output.)
Syntax
A single-command is a restricted form of command. (A command must be enclosed
between begin ... end brackets in places where only a single-command is allowed.)
Command ::= single-Command
    | Command ; single-Command

single-Command ::=
    | V-name := Expression
    | Identifier ( Actual-Parameter-Sequence )
    | begin Command end
    | let Declaration in single-Command
    | if Expression then single-Command
          else single-Command
    | while Expression do single-Command

(The first form of single-command is empty.)
Semantics
The skip command ' ' has no effect when executed.
The assignment command 'V := E' is executed as follows. The expression E is
evaluated to yield a value; then the variable identified by V is updated with this value.
(The types of V and E must be equivalent.)
The procedure calling command 'I(APS)' is executed as follows. The actual-parameter-sequence
APS is evaluated to yield an argument list; then the procedure
bound to I is called with that argument list. (I must be bound to a procedure. APS
must be compatible with that procedure's formal-parameter-sequence.)
The sequential command 'C1 ; C2' is executed as follows. C1 is executed first; then
C2 is executed.
The bracketed command 'begin C end' is executed simply by executing C.
The block command 'let D in C' is executed as follows. The declaration D is
elaborated; then C is executed, in the environment of the block command overlaid by
the bindings produced by D. The bindings produced by D have no effect outside the
block command.
The if-command ' i f E then C1 else C2' is executed as follows. The expression E
is evaluated; if its value is true, then C1 is executed; if its value is false, then C2 is
executed. (The type of E must be Boolean.)
The while-command 'while E do C' is executed as follows. The expression E is
evaluated; if its value is true, then C is executed, and then the while-command is
executed again; if its value is false, then execution of the while-command is com-
pleted. (The type of E must be Boolean.)
Examples
The following examples assume the standard environment (Section B.9), and also the
following declarations:
var i: Integer;
var s: array 8 of Char;
var t: array 8 of Char;
proc sort (var a: array 8 of Char) - ...
Expressions
An expression is evaluated to yield a value. A record-aggregate is evaluated to construct
a record value from its component values. An array-aggregate is evaluated to construct
an array value from its component values.
Syntax
A secondary-expression and a primary-expression are progressively more restricted
forms of expression. (An expression must be enclosed between parentheses in places
where only a primary-expression is allowed.)
Expression ::= secondary-Expression
    | let Declaration in Expression
    | if Expression then Expression else Expression

secondary-Expression ::= primary-Expression
    | secondary-Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
    | Character-Literal
    | V-name
    | Identifier ( Actual-Parameter-Sequence )
    | Operator primary-Expression
    | ( Expression )
    | { Record-Aggregate }
    | [ Array-Aggregate ]
Semantics
The expression 'IL' yields the value of the integer-literal IL. (The type of the expression
is Integer.)
The expression 'CL' yields the value of the character-literal CL. (The type of the
expression is Char.)
The expression 'V', where V is a value-or-variable-name, yields the value identified
by V, or the current value of the variable identified by V. (The type of the expression
is the type of V.)
The function calling expression 'I(APS)' is evaluated as follows. The actual-parameter-sequence
APS is evaluated to yield an argument list; then the function
bound to I is called with that argument list. (I must be bound to a function. APS must
be compatible with that function's formal-parameter-sequence. The type of the
expression is the result type of that function.)
The expression 'O E' is, in effect, equivalent to a function call 'O(E)'.
The expression 'E1 O E2' is, in effect, equivalent to a function call 'O(E1, E2)'.
The expression '(E)' yields just the value yielded by E.
The block expression 'let D in E' is evaluated as follows. The declaration D is
elaborated; then E is evaluated, in the environment of the block expression overlaid
by the bindings produced by D. The bindings produced by D have no effect outside
the block expression. (The type of the expression is the type of E.)
The if-expression 'if E1 then E2 else E3' is evaluated as follows. The expression
E1 is evaluated; if its value is true, then E2 is evaluated; if its value is false, then E3 is
evaluated. (The type of E1 must be Boolean. The type of the expression is the same
as the types of E2 and E3, which must be equivalent.)
The expression '{RA}' yields just the value yielded by the record-aggregate RA. (The
type of '{I1 - E1, ..., In - En}' is 'record I1: T1, ..., In: Tn end', where the
type of each Ei is Ti. The identifiers I1, ..., In must all be distinct.)
The expression '[AA]' yields just the value yielded by the array-aggregate AA. (The
type of '[E1, ..., En]' is 'array n of T', where the type of every Ei is T.)
The record-aggregate 'I - E' yields a record value, whose only field has the identifier
I and the value yielded by E.
The record-aggregate 'I - E, RA' yields a record value, whose first field has the
identifier I and the value yielded by E, and whose remaining fields are those of the
record value yielded by RA.
The array-aggregate 'E' yields an array value, whose only component (with index 0)
is the value yielded by E.
The array-aggregate 'E, AA' yields an array value, whose first component (with
index 0) is the value yielded by E, and whose remaining components (with indices 1,
2, ...) are the components of the array value yielded by AA.
Examples
The following examples assume the standard environment (Section B.9), and also the
following declarations:
var current: Char;
type Date - record
    y: Integer, m: Integer, d: Integer
end;
var today: Date;
func multiple (m: Integer, n: Integer) : Boolean -
    ...
func leap (yr: Integer) : Boolean - ...

(a) {y - today.y + 1, m - 1, d - 1}
(b) [31, if leap(today.y) then 29 else 28,
     31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
in
    if capital(current)
    then chr(ord(current) + shift)
    else current
Syntax
V-name ::= Identifier
    | V-name . Identifier
    | V-name [ Expression ]
Semantics
The simple value-or-variable-name 'I' identifies the value or variable bound to I. (I
must be bound to a value or variable. The type of the value-or-variable-name is the
type of that value or variable.)
The qualified value-or-variable-name 'V.I' identifies the field I of the record value or
variable identified by V. (The type of V must be a record type with a field I. The type
of the value-or-variable-name is the type of that field.)
The indexed value-or-variable-name 'V[E]' identifies that component, of the array
value or variable identified by V, whose index is the value yielded by the expression
E. If the array has no such index, the program fails. (The type of E must be
Integer, and the type of V must be an array type. The type of the value-or-variable-name
is the component type of that array type.)
Examples
The following examples assume the standard environment (Section B.9), and also the
following declarations:
type Date -
    record
        m : Integer, d : Integer
    end;
const xmas - {m - 12, d - 25};
var easter : Date;
var holiday : array 10 of Date

(a) easter
(b) xmas
form 'func I', and the argument function is the one bound to I. (I must be bound to a
function, and that function must have a formal-parameter-sequence equivalent to FPS
and a result type equivalent to the type denoted by T.)
Examples
The following examples assume the standard environment (Section B.9):
(a) while \eol() do
        begin get(var ch); put(ch) end;
    geteol(); puteol()
(b) proc increment (var count: Integer) -
    count := count + 1
A type-denoter denotes a data type. Every value, constant, variable, and function has a
specified type.
A record-type-denoter denotes the structure of a record type.
Syntax
Type-denoter ::= Identifier
    | array Integer-Literal of Type-denoter
    | record Record-Type-denoter end

Record-Type-denoter ::= Identifier : Type-denoter
    | Identifier : Type-denoter , Record-Type-denoter
Semantics
The type-denoter 'I' denotes the type bound to I.
The type-denoter 'array IL of T' denotes a type whose values are arrays. Each
array value of this type has an index range whose lower bound is zero and whose
upper bound is one less than the integer-literal IL. Each array value has one
component of type T for each value in its index range.
The type-denoter 'record RT end' denotes a type whose values are records. Each
record value of this type has the record structure denoted by RT.
The record-type-denoter 'I : T' denotes a record structure whose only field has the
identifier I and the type T.
The record-type-denoter 'I : T, RT' denotes a record structure whose first field has
the identifier I and the type T, and whose remaining fields are determined by the
record structure denoted by RT. I must not be a field identifier of RT.
(Type equivalence is structural:
Two primitive types are equivalent if and only if they are the same type.
The type record ..., Ii: Ti, ... end is equivalent to record ..., Ii': Ti', ...
end if and only if each Ii is the same as Ii' and each Ti is equivalent to Ti'.
The type array n of T is equivalent to array n' of T' if and only if n = n' and T
is equivalent to T'.)
Examples
(a) Boolean
(b) array 80 of Char
(Note: The symbols space, tab, and end-of-line stand for individual characters that
cannot stand for themselves in the syntactic rules.)
Semantics
The value of the integer-literal dn...d1d0 is dn*10^n + ... + d1*10 + d0.
The value of the character-literal ' c ' is the graphic character c.
Every character in an identifier is significant. The cases of the letters in an identifier
are also significant.
Every character in an operator is significant. Operators are, in effect, a subclass of
identifiers (but they are bound only in the standard environment, to unary and binary
functions).
Examples
(a) Integer-literals: 0 1 9 87
(b) Character-literals: '%' 'z' ' '
(c) Identifiers: x pi vlOl Integer get gasFlowRate
(d) Operators: + * <= \/
Programs
A program communicates with the user by performing input-output.
Syntax
Program ::= Command
Semantics
The program 'C' is run by executing the command C in the standard environment.
Standard environment
The standard environment includes the following constant, type, procedure, and
function declarations:
type Boolean - ... ; ! truth values
proc puteol () -
    ... ; ! write an end-of-line to output

In addition, the following functions are available for every type T:

func = (val1: T, val2: T) : Boolean -
    ... ; ! true iff val1 is equal to val2
func \= (val1: T, val2: T) : Boolean -
    ... ! true iff val1 is not equal to val2
APPENDIX C
TAM is an abstract machine whose design makes it especially suitable for executing
programs compiled from a block-structured language (such as Algol, Pascal, or Trian-
gle). All evaluation takes place on a stack. Primitive arithmetic, logical, and other
operations are treated uniformly with programmed functions and procedures.
(4) The return instruction 'RETURN(n) d' pops the topmost frame and replaces the d
words of arguments by the n-word result. LB is reset using the dynamic link, and
control is transferred to the instruction at the return address.
Since R's arguments lie immediately below its frame, R can access the arguments
using negative displacements relative to LB. For example:

    LOAD(1) -d[LB]   - for R to load its first argument (1 word)
    LOAD(1) -1[LB]   - for R to load its last argument (1 word)
A primitive routine is one that performs an elementary arithmetic, logical, input-
output, heap, or general-purpose operation. The primitive routines are summarized in
Table C.3. Each primitive routine has a fixed address in the primitive segment. TAM
traps every call to an address in that segment, and performs the corresponding operation
directly.
[Table C.3 (fragmentary): PB + 23 geteol - read past an end-of-line; PB + 24 puteol -
write an end-of-line; a routine that reads an integer-literal (optionally preceded by
blanks and/or signed) and stores its value at address a; a routine that writes an
integer-literal whose value is i; a routine that sets a' = address of a newly allocated
n-word object]
This appendix uses class diagrams to summarize the structure of the Triangle compiler,
which is available from our Web site (see Preface, page xv).
The Triangle compiler has broadly the same structure as the Mini-Triangle compiler
used throughout the text of this book. It is discussed in more detail in Sections 3.3, 4.6,
5.4, and 7.5.
The class diagrams are expressed in UML (Unified Modeling Language). UML is
described in detail in Booch et al. (1999). However, the following points are worth
noting. The name of an abstract class is shown in italics, whereas the name of a concrete
class is shown in bold. Private attributes and methods are prefixed by a minus sign (-),
whereas public attributes and methods are prefixed by a plus sign (+). The definition of
a class attribute or method is underlined. The name of a method parameter is omitted
where it is of little significance.
D.1 Compiler
The following diagram shows the overall structure of the compiler, including the
syntactic analyzer (scanner and parser), the contextual analyzer, and the code generator:
[Class diagram, not reproduced here: Triangle::ErrorReporter (+ <<constructor>>
ErrorReporter(), + reportError(: String, : String, : SourcePosition) : void,
+ reportRestriction(: String) : void), together with
Triangle::AbstractSyntaxTrees::Visitor, Triangle::ContextualAnalyzer::Checker,
Triangle::StdEnvironment, and the AST classes ActualParameter,
ActualParameterSequence, ArrayAggregate, Command, Declaration, Expression,
FormalParameter, FormalParameterSequence, Program, ...]
D.2.1 Commands
The following diagram shows the individual concrete classes for each form of
command:
[Class diagram, not reproduced here: Command with concrete subclasses
AssignCommand, CallCommand, IfCommand, LetCommand, ...]
D.2.2 Expressions
The following diagram shows the individual concrete classes for each form of
expression:
[Class diagram, not reproduced here: Expression with concrete subclasses
EmptyExpression, VnameExpression, ...]
The following diagram shows the individual concrete subclasses for a record
aggregate: [diagram not reproduced here]
The following diagram shows the individual concrete subclasses for an array
aggregate: [diagram, not reproduced here: SingleArrayAggregate, ...]
D.2.4 Declarations
The following diagram shows the individual concrete classes for each form of
declaration:
[Class diagram, not reproduced here: Declaration with concrete subclasses
BinaryOperatorDeclaration, ConstDeclaration, FuncDeclaration, ProcDeclaration,
SequentialDeclaration, TypeDeclaration, UnaryOperatorDeclaration, ..., and
FormalParameter with subclasses FuncFormalParameter, ProcFormalParameter, ...]
D.2.6 Type-denoters
The following diagram shows the individual concrete subclasses for each form of type-
denoter:
[Class diagram, not reproduced here: TypeDenoter with concrete subclasses
AnyTypeDenoter, ArrayTypeDenoter, BoolTypeDenoter, CharTypeDenoter,
ErrorTypeDenoter, IntTypeDenoter, SimpleTypeDenoter, ..., and FieldTypeDenoter
with subclasses MultipleFieldTypeDenoter and SingleFieldTypeDenoter]
D.2.7 Terminals
The following diagram shows the individual concrete subclasses for each form of
terminal node:
[Class diagram, not reproduced here: Terminal with concrete subclasses
CharacterLiteral, Identifier, IntegerLiteral, Operator]
[Class diagram, not reproduced here:

IdentificationTable
  - level : int
  - latest : IdEntry
  + <<constructor>> IdentificationTable()
  + openScope() : void
  + closeScope() : void
  + enter(: String, : Declaration) : void
  + retrieve(: String) : Declaration

IdEntry
  + attr : Declaration
  + level : int
  + previous : IdEntry
  + <<constructor>> IdEntry(: String, : Declaration, : int, : IdEntry)

Checker
  - idTable : IdentificationTable
  + <<constructor>> Checker(: ErrorReporter)
  + check(: Program) : void]
The above declarations of the standard environment are not syntactically valid in
Mini-Triangle, and so cannot be introduced by processing a normal input file. In fact,
these declarations are entered into the identification table using a method called
establishStandardEnvironment, which the contextual analyzer calls before checking
the source program.
Once the standard environment is entered in the identification table, the source
program can be checked for any type errors. At every applied occurrence of an
identifier, the identification table will be searched in exactly the same way (regardless of
whether the identifier turns out to be in the standard environment or the source
program), and its corresponding attribute used to determine its type.
Type checking
The second task of the contextual analyzer is to ensure that the source program contains
no type errors. The key property of a statically-typed language is that the compiler can
detect any type errors without actually running the program. In particular, for every
expression E in the language, the compiler can infer either that E has some type T or
that E is ill-typed. If E does have type T, then evaluating E will always yield a value of
that type T. If E occurs in a context where a value of type T' is expected, then the
compiler can check that T is equivalent to T', without actually evaluating E. This is the
task that we call type checking.
Here we shall focus on the type checking of expressions. Bear in mind, however,
that some phrases other than expressions have types, and therefore also must be type-
checked. For example, a variable-name on the left-hand side of an assignment command
has a type. Even an operator has a type. We write a unary operator's type in the form
T1 -> T2, meaning that the operator must be applied to an operand of type T1, and will
yield a result of type T2. We write a binary operator's type in the form T1 x T2 -> T3,
meaning that the operator must be applied to a left operand of type T1 and a right
operand of type T2, and will yield a result of type T3.
For most statically-typed programming languages, type checking is straightforward.
The type checker infers the type of each expression bottom-up (i.e., starting with literals
and identifiers, and working up through larger and larger subexpressions):
Literal: The type of a literal is immediately known.
Identifier: The type of an applied occurrence of identifier I is obtained from the
corresponding declaration of I.
Unary operator application: Consider the expression 'O E', where O is a unary
operator of type T1 -> T2. The type checker ensures that E's type is equivalent to T1,
and thus infers that the type of 'O E' is T2. Otherwise there is a type error.
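A minimal Java sketch of this rule, with invented names (not the book's own checker
code): the operator's declared type T1 -> T2 is compared against the operand's
inferred type:

    class UnaryCheckDemo {
        record OperatorType(String argType, String resultType) {}

        // Returns the inferred type of 'O E', or "error-type" on a mismatch.
        static String checkUnary(OperatorType opType, String operandType) {
            if (!operandType.equals(opType.argType())) {
                System.err.println("operand has wrong type for this operator");
                return "error-type";
            }
            return opType.resultType();
        }
    }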