(Texts and Monographs in Computer Science) Thomas W. Reps, Tim Teitelbaum - The Synthesizer Generator_ A System for Constructing Language-Based Editors-Springer (1989)
Editor
David Gries
Advisory Board
F. L. Bauer
S. D. Brookes
C. E. Leiserson
F. B. Schneider
M. Sipser
Texts and Monographs in Computer Science
Suad Alagic
Object-Oriented Database Programming
Suad Alagic
Relational Database Technology
S. Thomas Alexander
Adaptive Signal Processing: Theory and Applications
Kaare Christian
The Guide to Modula-2
Edsger W. Dijkstra
Selected Writings on Computing: A Personal Perspective
Nissim Francez
Fairness
David Gries
The Science of Programming
Micha Hofri
Probabilistic Analysis of Algorithms
E.V. Krishnamurthy
Error-Free Polynomial Matrix Computations
Thomas W. Reps
Tim Teitelbaum
The
Synthesizer
Generator
A System for Constructing
Language-Based Editors
With 75 Illustrations
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer-Verlag, 175 Fifth Avenue, New York, NY 10010, USA),
except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with
any form of information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed is forbidden.
Appendices A, B, C, and D, © 1988 by Thomas W. Reps and Ray (Tim) Teitelbaum.
Chapters 1 and 3 and parts of Chapter 10 are based on material that originally appeared in Computer
20, 11 (November 1987), © 1987 by the Institute of Electrical and Electronics Engineers, Inc. Used
with permission.
Quoted material on page 266 originally appeared in Reps, Teitelbaum, and Demers, "Incremental
context-dependent analysis for language-based editors," ACM Trans. Program. Lang. Systems 5,
3 (July 1983). © 1983 Association for Computing Machinery. Reprinted with permission.
The use of general descriptive names, trade names, trademarks, etc. in this publication, even if the
former are not especially identified, is not to be taken as a sign that such names, as understood by
the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
This book is a detailed account of the Synthesizer Generator, a system for creat-
ing specialized editors that are customized for editing particular languages. The
book is intended for those with an interest in software tools and in methods for
building interactive systems. It is a must for people who are using the Syn-
thesizer Generator to build editors because it provides extensive discussions of
how to write editor specifications. The book should also be valuable for people
who are building specialized editors "by hand," without using an editor-
generating tool.
The need to manage the development of large software systems is one of the
most pressing problems faced by computer programmers. An important aspect
of this problem is the design of new tools to aid interactive program develop-
ment. The Synthesizer Generator permits one to create specialized editors that
are tailored for editing a particular language. In program editors built with the
Synthesizer Generator, knowledge about the language is used to continuously
assess whether a program contains errors and to determine where such errors
occur. The information is then displayed on the terminal screen to provide feed-
back to the programmer as the program is developed and modified.
The knowledge incorporated in editors generated with the Synthesizer Gen-
erator takes several forms. One form is knowledge of the language's syntax,
which is used to detect and prevent syntax errors. Other forms of language
knowledge encompass translation, transformation, and analysis of an object
being edited. Knowledge of such aspects can be harnessed to check objects for
inconsistencies, to prompt the user of the editor with legal alternatives, or to
impose constraints on how the user can proceed.
vi Preface
The feature that makes the Synthesizer Generator unique is its use of an
immediate-computation paradigm to perform analysis, translation, and error
reporting while an object is being edited. In the immediate mode of computa-
tion, each modification to data has instantaneous effect, as in an electronic
spreadsheet. For example, with program editors generated using the Synthesizer
Generator, each modification to a program causes all affected analysis, error
messages, and generated code to be immediately updated. Errors are detected as
soon as they occur, and the delay for compilation that is necessary with tradi-
tional program-development tools is eliminated.
The Synthesizer Generator creates a language-specific editor from a
specification of a language's abstract syntax, context-sensitive relationships,
display format, concrete input syntax, and transformation rules for restructuring
objects. The treatment of language syntax by the Synthesizer Generator is of
particular importance. The editor-designer's specification of the language's
syntax addresses not only context-free syntax but also such context-sensitive
conditions as the absence of type mismatches; as the user creates and modifies
objects, the generated editor incrementally checks for violations of context
conditions that have been specified.
The Synthesizer Generator is of utility in a wide range of applications, includ-
ing program editors, document-preparation systems, and verification tools. It
has been used to create program editors for several different programming
languages, including ones with such varying features as context-sensitive
"pretty-printing," incremental code generation, detection of program anomalies,
and detection of type violations. It has been used to create editors for text and
mathematical formulas; in these editors, formatting is performed interactively on
a character-by-character basis so that at all times the screen resembles the final
page layout. It has also been used to create editors that verify the correctness of
mathematical proofs (for several different varieties of logic).
The Synthesizer Generator's specification language (called SSL, for Syn-
thesizer Specification Language) is based on the notion of an attribute grammar,
which is a very general notation for expressing syntax-directed translations. A
number of innovations in SSL make SSL specifications different from
specifications written for other systems based on attribute grammars. The pri-
mary innovations are the way SSL merges the concepts of abstract-syntax
definitions and user-defined attribute types, the introduction of notation that per-
mits specifications to be factored into separate modules, and the manner in
which the parser is incorporated into the system.
The Synthesizer Generator has been under development since 1981. It is
implemented in C and runs under the UNIX operating system. The system has
1For more complete documentation of SSL and the Synthesizer Generator, the reader should consult
The Synthesizer Generator Reference Manual [Reps88].
Thomas Reps's Ph.D. dissertation; it was published by The M.I.T. Press after it
was selected as the winner of the 1983 ACM Doctoral Dissertation Award. The
two books are complementary: Generating Language-Based Environments
describes the algorithmic foundations that underlie the Synthesizer Generator
and presents a number of technical results about attribute grammars; the current
book provides a thorough discussion of the actual system that we developed for
generating language-based environments. In the current book, we have aimed
for a mix of theory and practice in an attempt to make the ideas embodied in the
system accessible to as wide an audience as possible.
We are indebted to a great number of individuals for their part in the develop-
ment of the Synthesizer Generator system. Very considerable contributions
have been made by the following individuals:
Countless discussions with our user community have guided us in refining the
design and implementation of the system. The assistance of this group is appre-
ciatively acknowledged. The Synthesizer Generator is distributed with many of
the demonstration editors that these users have prepared:
CHAPTER 1
Introduction
1.1 Using Structure Editing to Ensure that Programs Are Syntactically Correct 3
1.2 Using Immediate Computation to Locate Errors in Programs 6
1.3 Using Incremental Code Generation to Support Program Testing 9
1.4 Supporting Program-Development Methodologies 11
1.5 The Need for Incremental Algorithms 12
1.6 Adapting Specifications for Immediate Computation 14
1.7 Generating Language-Based Programming Environments 16
1.8 The Synthesizer Generator 18
CHAPTER 2
Demonstration of a Sample Editor 20
CHAPTER 3
The Attribute-Grammar Model of Editing 39
CHAPTER 4
Specification of a Sample Editor 45
4.1 Abstract Syntax 45
4.2 Attributes and Attribute Equations 50
4.3 Unparsing Schemes 59
4.4 Input Interfaces 62
4.5 Templates and Transformations 65
CHAPTER 5
Lists, Optional Elements, and Placeholders 68
5.1 Transient Placeholders 69
5.2 Specifying Lists and Optional Elements in SSL 77
5.3 Sublist Manipulations 84
5.4 Selections of Singleton Sublists Versus Selections of List Elements 87
5.5 Parsing Lists 92
5.6 Attribution Rules for a List's Completing Term and Placeholder Term 93
CHAPTER 6
Defining Hybrid Editors with the Synthesizer Generator 95
6.1 Defining a Language's Underlying Abstract Syntax 97
6.2 Integration of Text Editing and Structure Editing 100
6.3 Defining Computed Display Representations 130
6.4 Context-Sensitive Translations and Transformations 136
CHAPTER 7
Performing Static Inferences with Attributes 143
7.1 Aggregation and Information-Passing Strategies 146
7.2 Using the Attribution Mechanism to Perform Type Inference 151
CHAPTER 8
Practical Advice 160
8.1 How to Begin Developing an Editor 162
8.2 Modular Construction of Editor Specifications 172
8.3 Problems That Frequently Arise 179
CHAPTER 9
Generating Code Using Attributes 195
9.1 Approaches to Incremental Recompilation 196
9.2 Incremental Recompilation Using Attributes 198
CHAPTER 10
Interactive Program Verification 210
10.1 An Introductory Example 211
10.2 Generating Verification Conditions 214
10.3 Checking Proofs of Verification Conditions 220
10.4 Automatic Deductive Capabilities 226
CHAPTER 11
The Implementation 231
11.1 Basic Organization of the Implementation 231
11.2 Finiteness of Completing Terms 237
11.3 Generating Copy Rules for Upward Remote Attribute Sets 238
11.4 Deferred Reference Counting 243
CHAPTER 12
Incremental Attribute Evaluation for Ordered Attribute Grammars 246
12.1 Greedy Evaluation 247
12.2 Distributed-Control Evaluation 248
12.3 Evaluation of Ordered Attribute Grammars by Visit-Sequence Evaluators 250
12.4 Construction of a Visit-Sequence Evaluator 254
12.5 Incremental Updating by Visit-Sequence-Driven Change Propagation 265
12.6 Optimizations for One-to-One Functions 271
12.7 What to Do When a Grammar Fails the Orderedness Test 273
APPENDIX A
Syntax of SSL 278
APPENDIX B
Invoking the Synthesizer Generator 285
APPENDIX C
Abbreviated List of Editor Commands 290
C.1 Getting Into and Out of an Editor 290
C.2 Changing the Structural Selection by Traversal of the Abstract Syntax Tree 291
C.3 Executing Commands 291
C.4 Structural Editing 292
C.5 Moving the Object with Respect to the Window 292
C.6 Using the Locator 293
C.7 Textual Editing 294
C.8 Changing the Character Selection by Textual Traversal of the Text Buffer 294
C.9 Buffers, Selections, and Files 295
C.10 Creating and Deleting Windows 296
APPENDIX D
Keyboards, Displays, Window Systems, and Mice 297
D.1 Keyboards 297
D.2 Displays and Window Systems 298
D.3 Mice 300
Bibliography 302
Index 309
CHAPTER 1
Introduction
1The names "Cornell Program Synthesizer" and "Synthesizer Generator" often cause some
misunderstanding about the relationship between the two systems. The Synthesizer Generator is a successor
system to the Cornell Program Synthesizer, but it is actually a completely separate system. Despite
its name, the Synthesizer Generator was not used to create the Cornell Program Synthesizer,
although the Synthesizer Generator could now be used to create a system that is functionally
equivalent to it.
<exp> and <statement> are placeholders that identify where additional inser-
tions may be made.
During editing, the current selection (i.e., the insertion point) can only be moved
from one template to another, or from one template to its constituents, not from
character to character nor from one line of text to another. (In editors generated
with the Synthesizer Generator, the selection is indicated on the display screen
by highlighting the selected region; in our examples, the highlighted characters
on a line of the display are enclosed in a box.)
Templates are inserted into the program by special commands, and the system
checks whether the insertion is legal. For example, if the currently selected ele-
ment is <statement>, a menu command can be invoked to insert an if-
statement into the program, automatically indented in the body of the loop, as
follows: 2
while <exp> do
   |if <exp> then|
   |   <statement>|
   |else <statement>|
The menu of insertion commands need not provide the same choices in all
contexts; restricting the choices offered to the insertions that are legal in the
context of the current selection forbids the user from making inappropriate
insertions. For example, because an if-statement is not an appropriate derivation
of an <exp>, the if-statement choice would not be offered in the menu when the
loop-condition is selected.
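The context-dependent menu described above can be sketched in a few lines of Python. This is only an illustration, not the mechanism used by editors built with the Synthesizer Generator: the table `LEGAL_TEMPLATES`, the template names listed for `exp`, and both function names are hypothetical.

```python
# Hypothetical table: the templates that may replace each kind of placeholder.
# The statement entries echo templates named in the text; the exp entries are
# invented for illustration.
LEGAL_TEMPLATES = {
    "statement": ["assign", "if", "while", "begin"],
    "exp": ["sum", "less_than", "identifier"],
}


def menu_for(placeholder_kind):
    """Return the insertion commands offered for the selected placeholder."""
    return list(LEGAL_TEMPLATES.get(placeholder_kind, []))


def insert_template(placeholder_kind, template):
    """Accept an insertion only if it is legal in the current context."""
    if template not in menu_for(placeholder_kind):
        raise ValueError(f"{template} is not legal at <{placeholder_kind}>")
    return template
```

With this table, `menu_for("exp")` does not contain `"if"`, so an if-statement cannot be requested while the loop condition is selected.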
With a structure editor, changes to a program are accomplished by removal
and insertion of entire, well-formed, program fragments. This highly disci-
plined mode of modification guarantees the syntactic integrity of the program at
every step.
Transformation operations in a structure editor provide a mechanism for mak-
ing controlled changes in a single step. Construct-to-construct transformation
operations emphasize the abstract computational meaning of program units.
Consider, for example, the following Pascal function that multiplies two integers
by repeated addition:
2Note that the three boxes shown in the example constitute one selection, not three separate selec-
tions.
In one operation, this fragment can be transformed into the equivalent imple-
mentation:
function multiply (a, b: integer): integer;
var
   y, z: integer;
begin
   y := b;
   z := 0;
   |if y > 0 then|
   |   repeat|
   |      y := y - 1;|
   |      z := z + a|
   |   until y <= 0|;
   multiply := z
end;
Here the ellipses ( ... ) indicate portions of the program whose display represen-
tation has been suppressed.
Structure editing has a number of attractive properties. The integrated
behavior of templates and the editor's selection enforces the view that a pro-
gram is a hierarchy of nested components. Placeholders in templates serve both
as prompts and as syntactic constraints, by identifying places that can or must be
refined, as well as by restricting the range of choices to legitimate insertions.
Templates eliminate mundane tasks of program development and let the pro-
grammer focus on the intellectually challenging aspects of programming. Each
template insertion is syntactically correct because template commands are valid
only in appropriate contexts. Indentation is automatic, both when a template is
introduced and when it is moved. Typographical errors in structural units are
impossible; the templates are predefined and immutable, so after a template has
been inserted, errors cannot be introduced by subsequently modifying it. Thus,
a program developed with a structure editor is always well formed, regardless of
whether it is complete.
Templates correspond to abstract computational units. Because they are
inserted and manipulated as units, the process of programming begins and con-
tinues at a high level of abstraction.
Do not get the idea that text editing and structure editing are incompatible.
Structure editors can be augmented with text-editing facilities to create hybrid
tree-and-text editors that have the advantages of both. While permitting both
text editing and structure editing, a hybrid design can still preclude the creation
of syntactically incorrect programs; all components that are inserted or modified
textually (e.g. character-by-character) must be validated by the editor. A good
strategy is to parse the text of the current selection (and only the text of the
current selection) as soon as the programmer selects some other element of the
program to work on; if the phrase contains an error, a message can be printed
and an indicator positioned at (or close to) the site of the error.
.pp
This is a right justified paragraph containing
\f(HOitalicized\fP and \f(HBboldface\fP words.
In batch mode, it is difficult to tell from
the input file what the final page will look like.
.ip "1)"
This is an indented paragraph containing
the formula $x sup 2 - y sup 2$.
(a)
(b)
program p;
const
size = 10;
type
index = 1 .. size;
list = array [index] of integer;
var
A: list;
begin
A[10] := 0
end.
(a)
program p;
const
size = x {CONSTANT IDENTIFIER NOT DECLARED} ;
type
index = 1 .. size;
list = array [index] of integer;
var
A: list;
begin
A[10] := 0
end.
(b)
program p;
const
size = -1;
type
index = 1 .. size { EMPTY RANGE NOT ALLOWED} ;
list = array [index] of integer;
var
A: list;
begin
A[10] := 0
end.
(c)
Figure 1.3. The three screen images shown above, taken from a Pascal program editor
generated with the Synthesizer Generator, illustrate the editor's immediate error-analysis
capability. In (b), after size is redefined from 10 to x, an error message appears because
the name x is not defined in the program. Changing x to -1 removes this error, but intro-
duces a different one, as shown in (c).
For example, when programs were developed using the Cornell Program Syn-
thesizer, execution could be initiated at any time and began immediately,
without any delay for compilation. The programmer could interrupt an execut-
ing program, modify it, and, as long as the program still contained the structure
associated with the point of interruption, execution could be resumed. It was
even possible to run incomplete programs: execution was suspended when a
missing program element was encountered, but could be resumed after the
required code was inserted.
Incremental translation is advantageous both for generating intermediate code
for interpretive systems like the Cornell Program Synthesizer and for generating
machine code, as in GANDALF's Incremental Programming Environment
[Medina-Mora81]. Another situation where it can be of great benefit is in a
cross-development environment. Incremental translation can be used as the
basis for a cross-debugger where programs are developed on a host machine and
executed on a slave machine; the editor on the host machine ships code to the
slave machine in small increments, thereby avoiding long delays for download-
ing a program that has been modified only slightly [Fritzson84, Fritzson84a].
to wrap around to succeeding lines, the propagation of changes will die out if
there is enough space on the last line of the paragraph to prevent the addition of
an extra line; the remainder of the document will be unaffected. This is quies-
cence. An incremental formatting algorithm can exploit independence and
quiescence to minimize the amount of reanalysis performed.
We can distinguish between two approaches to incremental algorithms: selec-
tive recomputation and differential evaluation. In selective recomputation,
values independent of changed data are never recomputed. Such values may be
either intermediate results of scalar computations or individual components of
vector calculations. Values that are dependent on changed data are recomputed;
but after each partial result is obtained, the old and new values of that part are
compared, and when changes die out, no further recomputations take place. In
differential evaluation, rather than recomputing f(x') in terms of the new data
x', the old value f(x) is updated by some difference Δf computed as a function
of x, x', and f(x).
The spreadsheet example of Figure 1.2 can be used to illustrate the two
approaches.
To illustrate selective recomputation, suppose that UnitPrice_pen and
Quantity_pen are changed to $1.00 and 1 respectively. Dependence information
can be used to determine that Amount_pad need not be recomputed, since it
cannot change. Although Amount_pen must be recomputed, it turns out to be
unchanged, so Total need not be recomputed.
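A minimal Python sketch of selective recomputation on this spreadsheet follows. It is an illustration under stated assumptions, not the algorithm used by the Synthesizer Generator. The cell names, the function `selective_update`, and the pad's price and quantity are hypothetical. The pen's old values (a unit price of $0.50 and a quantity of 2) are assumed so that its amount is $1.00 both before and after the change. Amounts are held in cents to avoid floating-point noise.

```python
def selective_update(cells, formulas, changed):
    """Recompute only cells whose inputs changed; stop where values die out.

    formulas maps a cell name to (input names, function) and must be listed
    in dependency order. Returns the names of the cells that were recomputed.
    """
    recomputed = []
    dirty = set(changed)
    for name, (inputs, fn) in formulas.items():
        if not dirty.intersection(inputs):
            continue  # independent of every changed value: never recomputed
        recomputed.append(name)
        new_value = fn(*(cells[i] for i in inputs))
        if new_value != cells[name]:
            cells[name] = new_value
            dirty.add(name)  # the change propagates to dependents
    return recomputed


formulas = {
    "amount_pen": (["unit_price_pen", "quantity_pen"], lambda p, q: p * q),
    "amount_pad": (["unit_price_pad", "quantity_pad"], lambda p, q: p * q),
    "total": (["amount_pen", "amount_pad"], lambda a, b: a + b),
}
cells = {
    "unit_price_pen": 50, "quantity_pen": 2, "amount_pen": 100,
    "unit_price_pad": 100, "quantity_pad": 3, "amount_pad": 300,  # assumed
    "total": 400,
}

# The edit from the text: the pen's unit price becomes $1.00, its quantity 1.
cells["unit_price_pen"] = 100
cells["quantity_pen"] = 1
recomputed = selective_update(
    cells, formulas, {"unit_price_pen", "quantity_pen"}
)
# recomputed == ["amount_pen"]; amount_pad and total were never touched
```

Only `amount_pen` is re-evaluated. Because its new value equals its old value, the change dies out there and `total` is left alone.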
To illustrate differential evaluation, suppose that UnitPrice_pen is changed to
$1.00 and Quantity_pen is left unchanged. Then the differences
   ΔUnitPrice_pen = $0.50
   ΔAmount_pen = ΔUnitPrice_pen × Quantity_pen = $1.00
   ΔTotal = ΔAmount_pen = $1.00
can be computed and used for updating Amount_pen and Total. Note that with
differential evaluation, even if there are hundreds of lines of data, Total can be
updated with a single addition.
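The same update can be sketched in Python as follows. This is a hedged illustration only. The dictionary layout and the function `update_unit_price` are hypothetical, as are the pad's values. The pen's old price of $0.50 and quantity of 2 are inferred from the differences quoted above. Amounts are in cents.

```python
def update_unit_price(sheet, item, new_price):
    """Apply a unit-price change by propagating differences, not by re-summing."""
    delta_price = new_price - sheet["unit_price"][item]
    delta_amount = delta_price * sheet["quantity"][item]
    sheet["unit_price"][item] = new_price
    sheet["amount"][item] += delta_amount
    sheet["total"] += delta_amount  # one addition, however many rows exist
    return delta_amount


sheet = {
    "unit_price": {"pen": 50, "pad": 100},  # pad values are assumed
    "quantity": {"pen": 2, "pad": 3},
    "amount": {"pen": 100, "pad": 300},
    "total": 400,
}
delta_total = update_unit_price(sheet, "pen", 100)
# delta_total == 100 cents, matching the $1.00 difference in the text
```

The cost of the update does not depend on the number of rows, which is the point the text makes about totals over hundreds of lines.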
Sophisticated incremental algorithms are not always needed because exhaus-
tive recomputation can be fast enough for small problems. But for language-
based tools to have a major effect on the productivity of software production,
they must apply to large software systems. The larger the application, the more
crucial the need for incremental methods.
The Cornell Program Synthesizer's editor did not employ an incremental
algorithm because exhaustive recomputation was fast enough for small student
programs. The system whetted appetites, but did not incorporate methods that
would scale up to meet professional requirements. Thus, in 1981 we turned our
Changing data is, in effect, changing some of the equations, after which those
equations and, perhaps, other equations must be re-solved.
The attribute grammar formalism adopted by the Synthesizer group for
defining immediate error-analysis in language-based editors is such a declarative
specification language. For each object that a user may create, the attribute
grammar defines a corresponding set of simultaneous equations whose solution
expresses the deductions of the editor about errors in the object. Each unknown
variable in these equations represents a deduction relevant to a particular point
in the object. During editing, each modification to the object causes a related
change to the set of equations and their solution. Error messages that appear
and disappear on the screen (as in Figure 1.3) are merely the values of textual
variables that change from time to time as the equations are re-solved.
For example, suppose the + operator of the language denotes addition of
integer operands (and is not overloaded for adding other kinds of operands).
Then the expression 1 + 5 would be associated with the following set of equa-
tions:
typeOf+ = integer
typeOfleftOperand = integer
typeOfrightOperand = integer
error+ = if typeOfleftOperand = integer and typeOfrightOperand = integer
         then ""
         else "TYPE MISMATCH" fi
In the solution of these equations, variable error+ has the value "" (the empty
string). If, however, the expression were changed to 1 + "a string", then the
equations would be:
typeOf+ = integer
typeOfleftOperand = integer
typeOfrightOperand = string
error+ = if typeOfleftOperand = integer and typeOfrightOperand = integer
then ""
else "TYPE MISMATCH" fi
In the solution of the latter equations, variable error+ has the value
"TYPE MISMATCH".
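The two equation sets can be mimicked directly in Python. The sketch below is not SSL and not the generator's evaluator; it simply solves the four equations for a given pair of literal operands. The function names and the dictionary keys are assumptions made for the illustration, and Python `int` and `str` values stand in for integer and string literals.

```python
def type_of_literal(value):
    """Model the type of a literal operand: ints are integers, anything else a string."""
    return "integer" if isinstance(value, int) else "string"


def solve_plus(left, right):
    """Solve the equations attached to the expression `left + right`."""
    solution = {
        "typeOf+": "integer",
        "typeOfleftOperand": type_of_literal(left),
        "typeOfrightOperand": type_of_literal(right),
    }
    both_integer = (
        solution["typeOfleftOperand"] == "integer"
        and solution["typeOfrightOperand"] == "integer"
    )
    solution["error+"] = "" if both_integer else "TYPE MISMATCH"
    return solution


ok = solve_plus(1, 5)
bad = solve_plus(1, "a string")
# ok["error+"] == "" and bad["error+"] == "TYPE MISMATCH"
```

Editing the right operand changes one equation, and re-solving changes the value of `error+`, which is what the user sees as a message appearing on the screen.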
Attribute grammars have several desirable qualities as a notation for specify-
ing language-based editors. A language is specified in a modular fashion by an
attribute grammar: syntax is defined by a context-free grammar; attribution is
defined in an equally modular fashion, because the arguments to each attribute
equation are local to one production. Propagation of attribute values through the
derivation tree is not specified explicitly in an attribute grammar; rather, it is
implicitly defined by the equations of the grammar and the form of the tree.
The benefit of using attribute grammars to handle the problem of incremental
change in language-based editors is that the repropagation of consistent attri-
bute values after a modification to an object is implicit in the formalism. Thus,
there is no need for the notions of "undoing a semantic action" or "reversing the
side-effects of a previous analysis," which would otherwise be necessary.
When an object is modified, consistent relationships among the attributes can be
reestablished automatically by incrementally re-solving the system of attribute
equations. Consequently, when an editor is specified with an attribute grammar,
the method for achieving a consistent state after an editing modification is not
part of the specification.
Apart from its use to specify name analysis and type checking, the attribute-
grammar formalism provides a basis for specifying a large variety of other com-
putations on tree-structured data, including type inference (as distinct from type
checking), code generation, proof checking, and text formatting (including
filling and justification, as well as equation formatting).
varieties of mathematical logic, and program editors for several different pro-
gramming languages (including an editor for full Pascal with complete static-
semantic checking).
The Synthesizer Generator is written in C and runs under Berkeley UNIX.
Editors can be generated for the X Window System and for SunView, as well as
for video display terminals. Editors generated for X or SunView support multiple
overlapping windows and the use of the mouse to make selections in edited
objects, to select commands and transformations from menus, and to scroll,
resize, and iconify windows.
CHAPTER 2
|main|
|program <identifier>;|
|var|
|<identifier> : <type>;|
|begin|
|<statement>| ~
|end.|
Positioned at program
The top line of the screen has a highlighted title bar displaying the name of the
current buffer. The rest of the screen is divided into three regions: the command
line, the help pane, and the object pane. The command line, just below the title
bar, is where commands are echoed and also where system messages are
displayed. The help pane, which takes up the last few lines at the bottom, pro-
vides information about what constituent is currently selected. The object pane,
displaying the buffer's program fragment, covers the remaining portion of the
screen.
The editor is a screen-oriented structure editor. The object being edited has
hierarchical structure; it is not just a sequence of characters and lines. The ini-
tial content of the object pane consists of a template for a program. A template
is a formatted pattern of keywords and placeholders, where the placeholders
identify locations at which additional components can be inserted. Programs are
created top down by inserting new statements and expressions within the skele-
ton of previously entered templates. Operations on the object pane are defined
in terms of this structure and the current selection - an individual component of
the program that is denoted in the display by highlighting. In the example
shown above, the entire program is selected, which is indicated by highlighting
the entire program.
However, the editor is not just a structure editor; it also supports character-
and line-oriented operations to insert and delete text. During text editing, the
selection contains within it a character selection, denoted on the screen as an
unhighlighted character or I-beam within the highlighted selection. Text editing
is not initiated until the user types or erases a character; thus, at the start of an
editing session there is no character selection within the selection.
The structural selection and the character selection can both be changed by
using the locator to point to a new location in the program. On a workstation
equipped with a mouse, the locator is an arrow or some other graphic icon; in
the screen image shown above, the symbol "~" represents the locator. The
locator's position is changed by moving the mouse, but the selection is not
changed until the user executes the select command by clicking the mouse's
selection button. On a video display terminal, the locator is the terminal's cur-
sor, and it can be moved using one of the commands pointer-up <ESC-p>,
pointer-down <ESC-d>, pointer-left <ESC-b>, and pointer-right <ESC-f>.1
The select command is normally bound to the sequence <ESC-@>.
The editor incorporates knowledge about a language's syntax, which it uses to
prevent syntax errors from being introduced into the program. One way this is
done is by requiring that certain templates be immutable; that is, when certain
components are selected, the user is not permitted to make textual modifications.
For instance, in our example editor, it is impossible to modify the program
template by textual operations. If the user types a character anyway, the charac-
ter is rejected, a warning signal sounds, and the message text entry not permit-
ted here is displayed on the command line, as shown below:
|main|
text entry not permitted here
|program <identifier>;|
|var|
|<identifier> : <type>;|
|begin|
|<statement>| ~
|end.|
Positioned at program
1For the convenience of readers who are experimenting with the editor, the standard key bindings for
the commands mentioned in the chapter are indicated in the text between angle brackets (e.g.
redraw-display <^L>).
In this case, we must move the selection elsewhere before we can modify the
program. The selection can be changed to a different element of the program by
repositioning the locator and invoking the select command <ESC-@>. Sup-
pose we have decided to enter the program's statements first; we select the state-
ment placeholder of the program's body by pointing the locator at any of the
characters of <statement> and invoking select:
|main|
program <identifier>;
var
<identifier> : <type>;
begin
|<statement>|
end.
Positioned at stmtList assign if while begin (begin
The highlighted region changes to indicate the extent of the new selection and
the help pane is updated to provide information about the currently selected
component. The new selection is a stmtList, a list of statements that, for the
moment, consists of just one <statement> placeholder.
When the selection is positioned at a stmtList, we are permitted to enter state-
ments into the program directly as text; for the moment, however, we choose to
modify the program by an alternative method: template insertion. The names of
all templates available are displayed in the help pane. In this case, they
correspond to the different kinds of statements in the language; for example,
assign is bound to the template for an assignment-statement.
There are three ways in which a template may be selected. On a workstation
equipped with a mouse, it can be chosen from a pop-up menu of choices
displayed when the mouse's structure-menu button is depressed, or it can be
inserted by just clicking on its name in the help pane. On a standard video
display terminal, the user escapes to the command line by invoking the
execute-command command <^I or ESC-x>2 and then types some unambiguous
prefix of the template name.
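Resolving a typed prefix against the templates available at the current selection can be sketched as below. This is an assumption-laden illustration rather than the editor's actual command processor: the function `resolve_command` is hypothetical, the wording of the ambiguity message is invented, and the sketch considers only template names, whereas the real command line also accepts system commands.

```python
def resolve_command(prefix, available):
    """Return the single available name that the typed prefix identifies."""
    matches = [name for name in available if name.startswith(prefix)]
    if prefix in matches:
        return prefix  # an exact name wins even if it prefixes another name
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"{prefix} is unknown command")
    raise LookupError(f"{prefix} is ambiguous: {', '.join(matches)}")


# Templates listed in the help pane when a stmtList is selected.
at_stmt_list = ["assign", "if", "while", "begin"]
chosen = resolve_command("as", at_stmt_list)
# chosen == "assign"
```

When the selection offers no templates, the list passed in is empty and every name is reported as unknown, so what counts as a valid command depends on the current selection.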
The execute-command command echoes the prompt COMMAND: on the
command line to signify command mode, after which subsequent characters are
directed to the command line. After typing the template name assign, the
screen appears as:
|main|
COMMAND: assign
program <identifier>;
var
<identifier> : <type>;
begin
|<statement>|
end.
Positioned at stmtList assign if while begin (begin
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
I<identifier> I := <exp>
end.
Positioned at identifier
An assignment-statement template, such as the one that has just been inserted,
has two placeholders: one for the left-hand-side identifier and one for the right-
hand-side expression. Because the terminating forward-with-optionals com-
mand is applied after the template is inserted, the selection has been moved to
the identifier placeholder.
Note that when positioned at <identifier>, the help pane does not list any
template commands, indicating that no templates are provided in this context.
As an experiment, however, let us try using the same command again. We
invoke the execute-command command <^I or ESC-x> and type assign.
Imain I
COMMAND: assign
program <identifier>;
var
<identifier> : <type>;
begin
I<identifier> I := <exp>
end.
Positioned at identifier
Imain I
assign is unknown command
program <identifier>;
var
<identifier> : <type>;
begin
I<identifier> I := <exp>
end.
Positioned at identifier
Ihelp I
advance-after-transform (none)
advance-after-parse (none)
apropos <ESC ?>
ascend-to-parent <ESC ">
The help buffer can be removed from the screen with the delete-other-windows
command <^X1>. For additional information about manipulating buffers and
windows, consult The Synthesizer Generator Reference Manual. (The Syn-
thesizer Generator Reference Manual also provides complete documentation for
each of the other system commands.)
26 Chapter 2. Demonstration of a Sample Editor
Imain 1
program <identifier>;
var
<identifier> : <type>;
begin
11234511 := <exp>
end.
Positioned at identifier
Just as the commands that move the buffer's selection initiate a template
insertion, they also initiate the processing of a textual entry. Because text is
typed by the user, it can be syntactically incorrect in the context of the current
selection. To check syntactic correctness, a parser is invoked when text entry is
terminated. If an error is detected, a warning signal sounds, the message syntax
error is displayed on the command line, and the character selection is located at
the last character of the leftmost error. Thus, after execution of
forward-with-optionals <^M>, the screen appears as:
Imain 1
syntax error
program <identifier>;
var
<identifier> : <type>;
begin
11234151 := <exp>
end.
Positioned at identifier
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
[[J := <exp>
end.
Positioned at identifier
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i {NOT DECLARED} := I<exp> I
end.
Positioned at exp
However, although the text is syntactically correct, the program does not contain
a declaration for variable i; the editor announces the undeclared-variable error
by attaching the comment { NOT DECLARED} to the undeclared identifier.
The existence of such an error does not place any constraints on the user. The
error can either be corrected immediately by moving to the declaration place-
holder and creating a declaration for i, or it can be ignored for the time being.
We choose to do the latter and proceed to enter the expression for the right-hand
side of the assignment-statement by typing 1.
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := ®
end.
Positioned at exp
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := 1;
I<statement> I
end.
Positioned at stmtList assign if while begin (begin
Imain I
COMMAND:w
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := 1 ;
I<statement> I
end.
Positioned at stmtList assign if while begin (begin
The template is inserted and the selection is advanced to the next insertion point
as soon as forward-with-optionals <^M> is executed:
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := 1;
while I<exp> I do
<statement>
end.
Positioned at exp
The loop's condition is expanded by typing the inequality i<>100 in place,
followed by forward-with-optionals <^M>:
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := 1 ;
while (i { NOT DECLARED} <> 100) do
I<statement> I
end.
Positioned at stmt assign if while begin (begin
The expression i<>100 parses successfully and the selection advances to the
next placeholder where an insertion can be made. As before, the system
attaches a comment to identifier i to indicate the absence of a declaration for i.
Note that the expression has been enclosed within parentheses. The sample edi-
tor uses a straightforward rule for generating parentheses: each subexpression
whose outermost operator is binary is enclosed in a single pair of parentheses.
However, it is possible to specify more elaborate rules using the Synthesizer
Generator. (See Section 6.3, "Defining Computed Display Representations.")
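The straightforward parenthesization rule just described can be sketched in a few lines. The class names Id, IntConst, and BinOp and the function unparse are hypothetical names chosen for this illustration; they are not part of the sample editor's specification.

```python
from dataclasses import dataclass


@dataclass
class Id:
    name: str


@dataclass
class IntConst:
    value: int


@dataclass
class BinOp:
    op: str        # display symbol of the binary operator, e.g. "<>" or "+"
    left: object
    right: object


def unparse(e):
    """Display an expression; every subexpression whose outermost
    operator is binary gets exactly one pair of parentheses."""
    if isinstance(e, BinOp):
        return f"({unparse(e.left)} {e.op} {unparse(e.right)})"
    if isinstance(e, IntConst):
        return str(e.value)
    return e.name


print(unparse(BinOp("<>", Id("i"), IntConst(100))))
```

With this rule the typed text i<>100 is displayed as (i <> 100), matching the screen above; a nested sum would be displayed with one pair of parentheses per binary operator.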
Finally, the loop-body is filled in by typing an assignment-statement i:=i+1.
Earlier, when an assignment-statement was entered into the program, it was
inserted using an assignment template; to illustrate that the editor supports tex-
tual insertion of assignment statements in addition to providing the assignment
template, this time we choose to type in the entire statement i:=i+1 directly, as
shown below:
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := 1 ;
while (i { NOT DECLARED} <> 100) do
Ii:=i+1I I
end.
Positioned at stmt assign if while begin (begin
Then, upon forward-with-optionals <^M>, the text i:=i+1 is parsed, its display
is annotated with error messages, and the selection is advanced to the next
placeholder where an insertion can be made. No further insertions are possible
within the body of the while-statement, because in our example editor the body
of a while-statement is defined to be just a single statement. Thus, the new
selection is automatically advanced to a newly displayed <statement> place-
holder beyond the entire while-statement. The indentation level of the place-
holder reveals that the location is after, rather than within, the while-statement:
Imain I
program <identifier>;
var
<identifier> : <type>;
begin
i { NOT DECLARED} := 1 ;
while (i { NOT DECLARED} <> 100) do
i { NOT DECLARED} := (i { NOT DECLARED} + 1);
I<statement> I
end.
Positioned at stmtList assign if while begin (begin
At this point, we finally add a declaration for variable i. First, we select the
declaration placeholder by pointing the locator at the : within the declaration
and invoking the select command <ESC-@>:
Imain I
program <identifier>;
var
I<identifier> : <type> I;
begin
i { NOT DECLARED} := 1 ;
while (i { NOT DECLARED} <> 100) do
i { NOT DECLARED} := (i { NOT DECLARED} + 1)
end.
Positioned at declList
Imain I
program <identifier>;
var
Ii : booleanI I;
begin
i { NOT DECLARED} := 1;
while (i { NOT DECLARED} <> 100) do
i { NOT DECLARED} := (i { NOT DECLARED} + 1)
end.
Positioned at declList
When we invoke forward-with-optionals <^M>, the new text is parsed and the
selection is advanced. In addition, as a side effect of the insertion, the error
messages throughout the program are revised:
Imain I
program <identifier>;
var
i : boolean;
I<identifier> : <type> I;
begin
i := 1 {INCOMPATIBLE TYPES IN :=};
while (i <> {INCOMPATIBLE TYPES} 100) do
i := (i + { INCOMPATIBLE TYPES} 1) {INCOMPATIBLE TYPES
IN:= }
end.
Positioned at declList
Adding the declaration has corrected the undeclared-variable errors in the pro-
gram, but because i is declared as boolean, it has introduced type errors in a
number of other locations. The left-hand and right-hand sides of the two
assignment-statements now have incompatible types, which the editor has
reported by annotating them with the comment
{INCOMPATIBLE TYPES}.
Now consider what happens when we change the type of variable i from
boolean to integer. First, we select the declaration's type expression by point-
ing the locator at any of the characters in boolean and invoking select <ESC-
@>:
Imain I
program <identifier>;
var
i : Iboolean I;
begin
i := 1 { INCOMPATIBLE TYPES IN :=};
while (i <> { INCOMPATIBLE TYPES} 100) do
i := (i + {INCOMPATIBLE TYPES} 1) {INCOMPATIBLE TYPES
IN :=}
end.
Positioned at typeExp
Imain I
program <identifier>;
var
i : I<type> I;
begin
i := 1;
while (i <> 100) do
i := (i + 1)
end.
Positioned at typeExp integer boolean
Imain I
program <identifier>;
var
i : integer;
I<identifier> : <type> I;
begin
i := 1;
while (i <> 100) do
i := (i + 1)
end.
Positioned at typeExp
Now suppose we want to add additional statements to the body of the while-
loop. The while body has syntactic type stmt, not stmtList, which means we
need to introduce a compound-statement into the loop. Because the body
already contains a statement, the most convenient way to perform the insertion
is to use a command that encloses the existing statement in a begin-end. First, to
select the assignment-statement, we point the locator at the := symbol and
invoke select <ESC-@>. The declaration placeholder disappears as soon as
the selection moves away. Next, we type execute-command <^I or ESC-x> to
escape to the command line, where we type the command (begin:
Imain I
COMMAND: (begin
program <identifier>;
var
i : integer;
begin
i := 1;
while (i <> 100) do
Ii := (i + 1) I
end.
Positioned at stmt (begin
Template-insertion commands, such as the assign and while commands used
earlier, are really transformation commands of a particularly simple form - they
transform a placeholder into a template. (In the previous screen image, they are
not itemized in the help pane, since the selection is not currently a statement
placeholder.) The Synthesizer Generator's transformation commands allow the
editor-designer to specify more general source-to-source transformations. For
instance, (begin is the name of a transformation that encloses a statement in a
compound-statement. We now invoke this transformation by executing
forward-with-optionals <^M>:
Imain I
program <identifier>;
var
i : integer;
begin
i := 1;
while (i <> 100) do
Ibegin I
Ii := (i + 1) I
Iend I
end.
Positioned at stmt (begin )begin
As expected, the loop-body has been enclosed within a begin-end pair. The
selection now encompasses the inserted compound-statement. In addition to the
(begin transformation, which can be used to introduce an additional enclosing
begin-end, a second transformation is now enabled. This transformation, named
)begin, strips off a compound-statement when the body of the begin-end con-
sists of only a single statement.
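The two transformations can be pictured as a pair of inverse tree rewrites. The sketch below is only an illustration under simplifying assumptions: statements are modelled as Python tuples, a statement list as a Python list, and the names wrap_begin and strip_begin are invented for the example. In the editor itself, )begin is simply not offered when it does not apply; here the function returns its argument unchanged in that case.

```python
def wrap_begin(stmt):
    """(begin: enclose a statement in a compound-statement."""
    return ("Compound", [stmt])


def strip_begin(stmt):
    """)begin: remove a begin-end whose body is exactly one statement;
    any other statement is returned unchanged."""
    if stmt[0] == "Compound" and len(stmt[1]) == 1:
        return stmt[1][0]
    return stmt


body = ("Assign", "i", ("Add", "i", 1))   # i := (i + 1)
wrapped = wrap_begin(body)
print(wrapped)
print(strip_begin(wrapped) == body)
```

Applying (begin and then )begin returns the original loop body, which is why both commands appear in the help pane once the selection is a compound-statement with a single statement inside.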
The observant reader may be mystified by the selection that resulted when we
applied the (begin transformation. In particular, why was the selection not
advanced to some placeholder? The following fine point should dispel the con-
fusion. The forward-with-optionals command, when executed alone, advances
the selection forward in preorder through the abstract-syntax tree of the object.
Any optional placeholders encountered en route are materialized and serve as
stopping places. For example, were forward-with-optionals <^M> to be
executed at this moment, the selection would change as follows:
Imain I
program <identifier>;
var
i : integer;
begin
i := 1;
while (i <> 100) do
begin
I<statement> I;
i := (i + 1)
end
end.
Positioned at stmtList assign if while begin (begin
inherited attributes and each production has exactly one attribute equation for
each of the left-hand-side nonterminal's synthesized attribute occurrences and
the right-hand-side nonterminals' inherited attribute occurrences. Finally, we
restrict our attention to the noncircular grammars; a grammar is noncircular if it
is not possible to build a derivation tree in which attributes are defined circularly.
Example. As a running example to illustrate these concepts, we will use a
simple programming language with declaration, assignment, conditional, and
iteration statements. The language is essentially the one supported by the editor
described in Chapter 2, but without type expressions in variable declarations.
The concrete syntax of the language is defined by the context-free grammar
given in Figure 3.1. (For brevity, we have not shown the productions that can
be derived from the nonterminals identifier and exp.) We define a scheme to
compute the set of declared names by attaching the attributes id and env to
certain nonterminals of the grammar. Attribute id is a synthesized attribute of the
nonterminal identifier; its value is an identifier name. Attribute env is a
synthesized attribute of declList and an inherited attribute of stmtList and stmt; its
value is a set of identifier names.
Attribute equations define how the values of attributes are related to the
values of other attributes. A collection of equations that define the propagation
of declaration information through a program of the language is presented in
Figure 3.2. (In the equations in Figure 3.2, we have used conventionally
accepted notation to express operations on sets; we have used "." as the operator
for selecting an attribute of a nonterminal; and we have used subscripts to
distinguish among multiple occurrences of the same nonterminal.) Rules (2) and (3)
of Figure 3.2 describe how a set of declared names is generated from the
declarations of a program; rule (1) causes the set of declared names to be passed
from the declarations to the statements of the program; rules (4)-(9) describe
how the set of declared names is propagated to the individual statements of the
program.
Figure 3.2. Attribute equations that express the propagation of declaration information
using the attributes id and env.
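The flow of declaration information described above can be imitated with two small functions. This is a sketch of the idea only, not a transcription of the numbered rules of Figure 3.2: the function names declared_names and pass_down, and the representation of a statement as a pair (kind, list of sub-statements), are assumptions made for the example.

```python
def declared_names(decls):
    """Synthesized env of a declaration list (a Python list of names):
    empty for the empty list, otherwise the first name joined with the
    env of the rest of the list."""
    if not decls:
        return frozenset()
    return frozenset([decls[0]]) | declared_names(decls[1:])


def pass_down(stmt, env):
    """Attach the inherited env to a statement and, unchanged, to all of
    its sub-statements."""
    kind, body = stmt
    return (kind, env, [pass_down(s, env) for s in body])


env = declared_names(["q", "r"])
tree = pass_down(("while", [("assign", [])]), env)
print(sorted(tree[1]), sorted(tree[2][0][1]))
```

The set is built bottom-up from the declarations and then handed down, unchanged, to every statement, so each statement can consult the same set of declared names.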
A derivation-tree node that is an instance of symbol X has an associated set of
attribute instances corresponding to the attributes of X. An attributed tree is a
derivation tree together with an assignment of either a value or the special token
null to each attribute instance of the tree. To analyze a program according to its
attribute-grammar specification, first construct its derivation tree with an assign-
ment of null to each attribute instance; then evaluate as many attribute instances
as possible, using the appropriate attribute equation as an assignment-statement.
The latter process is termed attribute evaluation.
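A deliberately naive evaluator makes the procedure concrete. In this sketch, which is an illustration rather than the evaluator used by the Synthesizer Generator, the attribute instances of a tree are given as a dictionary from instance name to a pair (function, argument names); Python's None plays the role of null, so attribute functions are assumed never to return None. All instance names are invented for the example.

```python
def evaluate(equations):
    """Start with null everywhere, then repeatedly use an equation as an
    assignment whenever all of its arguments already have values."""
    values = {name: None for name in equations}
    progress = True
    while progress:
        progress = False
        for name, (fn, args) in equations.items():
            ready = all(values[a] is not None for a in args)
            if values[name] is None and ready:
                values[name] = fn(*(values[a] for a in args))
                progress = True
    return values


equations = {
    "stmts.env": (lambda e: e, ["decls.env"]),
    "decls.env": (lambda a, b: {a, b}, ["id_q", "id_r"]),
    "id_q": (lambda: "q", []),
    "id_r": (lambda: "r", []),
}
print(sorted(evaluate(equations)["stmts.env"]))
```

The order in which the equations are listed does not matter; an instance is evaluated as soon as its arguments are available. Instances that depend on each other circularly would simply remain null, which is one way to see why attention is restricted to noncircular grammars.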
Functional dependences among attribute occurrences in a production p (or
42 Chapter 3. The Attribute-Grammar Model of Editing
program p
var
declare q;
declare r
begin
stmt ;
stmt ;
stmt
end
The nonterminals of the derivation tree are connected by dashed lines; the
dependence graph consists of the instances of the attributes env and id, linked
by their functional dependences, shown as solid arrows. (The solid arrows
emanating from the tree's identifier leaves indicate dependences on tree com-
ponents; strictly speaking, they are not part of the dependence graph.)
Attribute grammars have several desirable qualities as a notation for specify-
ing language-based editors. A language is specified in a modular fashion by an
attribute grammar. Syntax is defined by a context-free grammar; attribution is
defined in an equally modular fashion, because the arguments to each attribute
equation are local to one production. Propagation of attribute values through the
derivation tree is not specified explicitly in an attribute grammar; rather, it is
implicitly defined by the equations of the grammar and the form of the tree.
When an editor-designer creates an editor with the Synthesizer Generator,
part of the editor specification consists of attribute equations that express
1 A directed graph G = (V, E) consists of a set of vertices V and a set of edges E, where E ⊆ V × V.
Figure 3.3. A partial derivation tree and its associated dependence graph.
on the screen provides the user with feedback about new errors introduced and
old errors corrected.
Fundamental to this approach is the idea of an incremental attribute evalua-
tor, an algorithm to produce a consistent, fully attributed tree after each restruc-
turing operation. Of course, any non-incremental attribute evaluator could be
applied to reevaluate the tree completely, but the goal is to minimize work by
confining the scope of reevaluation.
After each modification to a program tree, only a subset of attributes, denoted
by AFFECTED, requires new values. It should be understood that when updat-
ing begins, it is not known which attributes are members of AFFECTED;
AFFECTED is determined as a result of the updating process itself. In
[Demers81, Reps82, Reps83, Reps84], we have described algorithms that iden-
tify attributes in AFFECTED and recompute their values. Some of these algo-
rithms have cost proportional to the size of AFFECTED, which means that they
are asymptotically optimal in time, because, by definition, the work needed to
update the tree can be no less than |AFFECTED|.
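The role of AFFECTED can be illustrated with a simple change-propagation sketch. It is not one of the cited algorithms: it walks all attribute instances in topological order and therefore does work proportional to the size of the tree, whereas the optimal algorithms avoid that scan. What it does share with them is that an instance is re-evaluated only when one of its arguments actually changed, and that propagation stops wherever a recomputed value equals the old one. The function name update and all instance names are assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def update(equations, values, changed):
    """equations: instance name -> (function, argument names).
    values: current values, consistent except that the instances in
    `changed` have just been given new values.
    Returns the set of instances whose value differs from before."""
    graph = {name: set(args) for name, (_, args) in equations.items()}
    affected = set(changed)
    for name in TopologicalSorter(graph).static_order():
        fn, args = equations[name]
        if name in changed or not affected.intersection(args):
            continue  # no argument changed, so the old value is still valid
        new_value = fn(*(values[a] for a in args))
        if new_value != values[name]:
            values[name] = new_value
            affected.add(name)
    return affected


equations = {
    "decl.type": (lambda: "integer", ()),
    "env": (lambda t: {"i": t}, ("decl.type",)),
    "assign.error": (
        lambda env: "" if env["i"] == "integer" else "{ INCOMPATIBLE TYPES IN := }",
        ("env",),
    ),
    "while.error": (lambda: "", ()),
}
values = {
    "decl.type": "integer",  # just changed from "boolean"
    "env": {"i": "boolean"},
    "assign.error": "{ INCOMPATIBLE TYPES IN := }",
    "while.error": "",
}
print(sorted(update(equations, values, {"decl.type"})))
```

In this toy setting, changing the declared type affects the environment and the error attribute that depends on it, while the unrelated attribute is never re-evaluated. Note that the set of affected instances is discovered during the update itself, as the text says.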
The chief drawback to the use of attribute grammars for creating program edi-
tors that perform immediate computation is the question of whether certain
efficiency problems prevent the approach from scaling up. The problem is that
attribute grammars have strictly local dependences among attribute values, and,
at least conceptually, attributed trees have a large number of intermediate attri-
bute values that must be updated. By contrast, imperative methods for imple-
menting checks on a language's context conditions can, by using auxiliary data
structures to record nonlocal dependences in the tree, skip over arbitrarily large
sections of the tree that attribute-updating algorithms visit node by node.
For example, suppose we want to enforce the constraint that the declarations
and uses of identifiers in a program be consistent. An imperative approach can
implement this constraint with a symbol table for each block, in which the entry
for an identifier i points to a chain of all uses of i in that block. When a
declaration is deleted or inserted, the use chains are employed to immediately
access uses of variables that were formerly declared. With the attribute gram-
mar approach, when a declaration is deleted or inserted, the incremental attri-
bute evaluator traverses the entire scope of the declaration.
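The imperative alternative can be sketched as follows. The class SymbolTable and its methods are hypothetical, and use sites are represented by arbitrary integers; the point is only that a use chain lets the editor jump directly to the places that must be re-checked.

```python
from collections import defaultdict


class SymbolTable:
    """One table per block: declared names plus a chain of use sites
    for every identifier that occurs in the block."""

    def __init__(self):
        self.declared = set()
        self.uses = defaultdict(list)

    def add_use(self, name, site):
        self.uses[name].append(site)

    def declare(self, name):
        """Insert a declaration; return exactly the sites to re-check."""
        self.declared.add(name)
        return list(self.uses[name])

    def undeclare(self, name):
        """Delete a declaration; return exactly the sites to re-check."""
        self.declared.discard(name)
        return list(self.uses[name])


table = SymbolTable()
for site in (5, 7, 8):
    table.add_use("i", site)
table.add_use("j", 9)
print(table.declare("i"))
```

Only the uses of the affected identifier are visited, however large the block is. The attribute-grammar formulation has no such shortcut, because every dependence is local to a production.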
Some recent work on attribute-grammar extensions, including [Demers85],
[Johnson85], [Reps86], and [Hoover87], has been directed towards abandoning
the restriction to purely local attribute dependences. This matter remains a topic
for additional research; incorporation of such techniques into the Synthesizer
Generator is planned for future releases.
CHAPTER 4
root program;
program : Prog(identifier declList stmtList);
list declList;
declList : DeclListNil()
         | DeclListPair(decl declList)
         ;
identifier : IdentifierNull()
           | Identifier(IDENTIFIER)
           ;
list stmtList;
stmtList : StmtListNil()
         | StmtListPair(stmt stmtList)
         ;
stmt : EmptyStmt()
     | Assign(identifier exp)
     | IfThenElse(exp stmt stmt)
     | While(exp stmt)
     | Compound(stmtList)
     ;
exp : EmptyExp()
    | IntConst(INTEGER)
    | True()
    | False()
    | Id(identifier)
    | Equal(exp exp)
    | NotEqual(exp exp)
    | Add(exp exp)
    ;
bra).1 The phylum associated with a given nonterminal is the set of derivation
trees that can be derived from it. These derivation trees are known as terms.
With the exception of the operators, whose purpose is to identify the production
instances in a derivation tree, the SSL grammar rule acts exactly as the context-
free production
Although several rules may have structurally identical right-hand sides, the
unique operator name differentiates among the various alternatives that would
otherwise have been indistinguishable. For example, in Figure 4.1, the operator
names Equal, NotEqual, and Add distinguish between three kinds of expres-
sion pairs.
Because terms are a focal point of our discussion, we introduce the following
notation: a term is denoted by an expression in which a k -ary operator is applied
to k constants of the appropriate phyla; for terms of nullary operators, the
parentheses may be omitted.2 For example,
Equal(Id(IdentifierNull), Id(IdentifierNull))
( <identifier> = <identifier> )
as will be explained in Section 4.3, "Unparsing Schemes."
Some expressions that do not denote terms are Equal(IdentifierNull, True),
because IdentifierNull denotes a term that is not of an appropriate phylum to be
used as the first argument of Equal, and Equal(True), because the binary
operator Equal is applied to the wrong number of argument terms.
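The two ways in which an expression can fail to denote a term, a wrong argument phylum and a wrong number of arguments, can be checked mechanically. The sketch below encodes a subset of the operators of Figure 4.1 in a Python table; the table name SIGNATURES, the function phylum_of, and the representation of a term as a tuple (operator, argument terms...) are assumptions of the example, and operators with primitive arguments such as IDENTIFIER and INTEGER are left out.

```python
# operator -> (result phylum, argument phyla); a subset of Figure 4.1
SIGNATURES = {
    "IdentifierNull": ("identifier", ()),
    "True": ("exp", ()),
    "False": ("exp", ()),
    "Id": ("exp", ("identifier",)),
    "Equal": ("exp", ("exp", "exp")),
    "NotEqual": ("exp", ("exp", "exp")),
    "Add": ("exp", ("exp", "exp")),
}


def phylum_of(term):
    """Return the phylum of a well-formed term, or None if the
    expression does not denote a term."""
    op, *args = term
    if op not in SIGNATURES:
        return None
    result, arg_phyla = SIGNATURES[op]
    if len(args) != len(arg_phyla):
        return None  # operator applied to the wrong number of arguments
    for arg, expected in zip(args, arg_phyla):
        if phylum_of(arg) != expected:
            return None  # argument is not of an appropriate phylum
    return result


good = ("Equal", ("Id", ("IdentifierNull",)), ("Id", ("IdentifierNull",)))
print(phylum_of(good))
print(phylum_of(("Equal", ("IdentifierNull",), ("True",))))
print(phylum_of(("Equal", ("True",))))
```

The first expression is accepted as a term of phylum exp; the other two are rejected for the two reasons given in the text.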
1More precisely, definitions of alternative productions can be separated by vertical bars, without
repeating the left-hand-side phylum name, as is done in Figure 4.1.
2The empty parentheses may not be omitted from the declaration of the nullary operator, however.
48 Chapter 4. Specification of a Sample Editor
Imain I
Iprogram <identifier>; I
Ivar I
I<identifier> : <type>; I
Ibegin I
I<statement> I
Iend. I
Positioned at program
which is actually the unparsing of the following ten-node term of phylum pro-
gram
Prog(IdentifierNull,
DeclListPair(Declaration(IdentifierNull, EmptyTypeExp), DeclListNil),
StmtListPair(EmptyStmt, StmtListNil)
)
The first operator declared for each phylum, such as the operator Prog of
phylum program and the operator IdentifierNull of phylum identifier, is called
the completing operator and plays a special role in the editor specification. The
completing operator is used to construct a default representative for the phylum,
called the completing term. The completing term is created by applying the
completing operator to the completing terms of its argument phyla.3 For
example, the completing term for phylum decl defined in Figure 4.1 is the term
Declaration(IdentifierNull, EmptyTypeExp)
i.e. the completing operator for decl applied to the completing terms of phylum
identifier and phylum typeExp.
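The rule for building a completing term is a short recursion. The sketch below covers only ordinary phyla; list and optional phyla are omitted because they are an exception to the rule. The table GRAMMAR and the functions completing_term and show are names invented for the example, and the order of the typeExp operators after EmptyTypeExp is an assumption.

```python
# phylum -> operators in declaration order, each with its argument phyla
GRAMMAR = {
    "decl": [("Declaration", ("identifier", "typeExp"))],
    "identifier": [("IdentifierNull", ()), ("Identifier", ("IDENTIFIER",))],
    "typeExp": [("EmptyTypeExp", ()), ("IntTypeExp", ()), ("BoolTypeExp", ())],
}


def completing_term(phylum):
    """Apply the first declared operator of the phylum (its completing
    operator) to the completing terms of its argument phyla."""
    op, arg_phyla = GRAMMAR[phylum][0]
    return (op, *(completing_term(p) for p in arg_phyla))


def show(term):
    """Write a term in the notation of the text, omitting the
    parentheses of nullary operators."""
    op, *args = term
    return op if not args else f"{op}({', '.join(show(a) for a in args)})"


print(show(completing_term("decl")))
```

Under these assumptions the result for phylum decl is the term quoted above, built from the completing terms of identifier and typeExp.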
An instance of the appropriate completing term is used at each unexpanded
occurrence of a phylum in a derivation tree. For instance, the placeholders in
3Phyla declared to be lists or optional are an exception to this rule, as described below.
4.1. Abstract Syntax 49
i.e. DeclListPair applied to the completing term of phylum decl and to the term
DeclListNil.
Let us now be more specific about the list-insertion points that the editor
automatically supplies (and removes) before and after each element of a list.
For brevity, let us use P to stand for the completing term of phylum decl, i.e.
the term
Declaration(IdentifierNull, EmptyTypeExp).
For brevity, we abbreviate list expressions with the same notation that is used in
SSL: the right-associative infix operator :: denotes the concatenation operation;
it attaches a single element to the head of a list, where the element and list are of
appropriate phyla. For example, if A and B are both terms of phylum decl, the
term A :: B :: DeclListNil is the declList term
A :: B :: DeclListNil
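Because :: groups to the right and attaches one element to the head of a list, a chain of :: operations can be read as nested applications of the binary list operator. The sketch below assumes that, for phylum declList, :: corresponds to DeclListPair; the helper names cons and from_elements are invented for the example.

```python
def cons(head, tail):
    """head :: tail, modelled as an application of DeclListPair."""
    return ("DeclListPair", head, tail)


def from_elements(*elements):
    """Build e1 :: e2 :: ... :: DeclListNil, grouping to the right."""
    term = ("DeclListNil",)
    for element in reversed(elements):
        term = cons(element, term)
    return term


print(from_elements("A", "B"))
```

Read this way, A :: B :: DeclListNil stands for DeclListPair applied to A and to the list B :: DeclListNil, which in turn is DeclListPair applied to B and DeclListNil.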
4Syntactic references and upward remote attribute sets, described later in the section, extend the
class of permissible arguments to attribute-definition functions.
4.2. Attributes and Attribute Equations 53
The second argument of Prog, the declaration list of the program, is itself a
term of type declList. It is, in fact, exactly the value desired and can be used
directly in the equation that defines env. Note that the name declList is used in
two distinct ways in the rules for operator Prog:
program : Prog {
    local declList env;
    env = declList;
    }
In the declaration of local attribute env, declList specifies the type of the attri-
bute. In the equation defining the value of env, declList is the name of a value,
in particular, the second argument of Prog.
The use, in an attribute definition function, of a part of the term being edited
is called a syntactic reference. This extension to conventional attribute gram-
mars is provided in the Synthesizer Generator in recognition of the fact that
often a piece of the abstract-syntax tree is itself a sufficiently convenient
representation of a value needed for attribute computations. In a system such as
GAG [Kastens82], where a syntax tree is a different sort of object from an attri-
bute value, one must resort to replicating the syntactic tree in the attribute
domain. However, in the Synthesizer Generator, an attribute's type is a phylum
that is defined with the same kind of rules that are used to define syntactic
objects; a program's attribute values and the program itself are all elements of
either primitive phyla or phyla defined in the editor specification. Thus, we
were able to use phylum declList as both the type of the declaration list of a
program and the type of the Prog.env attribute.
In general, any of a production's phylum occurrences can be used as a value
in the expression on the right-hand side of an attribute equation. A second
example of a syntactic reference occurred, without our remarking on it, in the
equation
identifier: Identifier(IDENTIFIER);
Thus, when the value of IDENTIFIER is "i", for example, LookupType will
actually be passed Identifier("i") as its first argument.
This completes our discussion of Figure 4.2. The env and type attributes
defined there are used in the next part of the specification, Figure 4.3, to define
the STR-valued attributes that describe type incompatibilities and undeclared
variables in the program.
First, note the liberal scattering of local attribute declarations in Figure 4.3.
Observe that not every operator of decl, stmt, and exp has such attributes. The
chief virtue of local attributes is that they permit defining a computation in one
production of a phylum without requiring the computation in all, as would be
the case, for example, if the attributes were synthesized.
decl : Declaration {
        local STR error;
        error = (identifier != IdentifierNull
                 && NumberOfDecls(identifier, {prog.env}) > 1)
            ? " { MULTIPLY DECLARED}" : "";
        }
    ;
stmt : Assign {
        local STR assignError;
        local STR error;
        assignError = IncompatibleTypes(identifier.type, exp.type)
            ? " { INCOMPATIBLE TYPES IN :=}" : "";
        error = (identifier == IdentifierNull || IsDeclared(identifier, {prog.env}))
            ? "" : "{ NOT DECLARED }";
        }
    | IfThenElse, While {
        local STR typeError;
        typeError = IncompatibleTypes(exp.type, BoolTypeExp)
            ? " { BOOLEAN EXPRESSION NEEDED }" : "";
        }
    ;
exp : Id {
        local STR error;
        error = (identifier == IdentifierNull || IsDeclared(identifier, {prog.env}))
            ? "" : " { NOT DECLARED }";
        }
    | Equal, NotEqual {
        local STR error;
        error = IncompatibleTypes(exp$2.type, exp$3.type)
            ? " { INCOMPATIBLE TYPES} " : "";
        }
    | Add {
        local STR leftError;
        local STR rightError;
        leftError = IncompatibleTypes(exp$2.type, IntTypeExp)
            ? " { INT EXPRESSION NEEDED}" : "";
        rightError = IncompatibleTypes(exp$3.type, IntTypeExp)
            ? "{ INT EXPRESSION NEEDED} " : "";
        }
    ;
Each local error attribute - error, assignError, typeError, leftError, and
rightError - is declared to have type STR, the built-in phylum of strings in
SSL. Each attribute will be either the null string, if there is no error, or the
appropriate error message.
Each error attribute is defined by a conditional expression. In SSL, condi-
tional expressions have the form:
where function is the name of the function, phylum is the result type, and expres-
sion is the function body. Each formal parameter name is declared with a type,
and the expression that is the function body must have type phylum. For
};
};
};
/* Return true iff neither t1 nor t2 is EmptyTypeExp and t1 is not equal to t2. */
BOOL IncompatibleTypes(typeExp t1, typeExp t2) {
    (t1 != EmptyTypeExp) && (t2 != EmptyTypeExp) && (t1 != t2)
    };
Figure 4.4. Function declarations that define the auxiliary functions LookupType,
IsDeclared, NumberOfDecls, and IncompatibleTypes.
with ( expression_0 ) (
    pattern_1 : expression_1,
    pattern_2 : expression_2,
    ...
    pattern_n : expression_n
    )
The first pattern is simply the constant DeclListNil. This matches the declList e
precisely when e is the term DeclListNil, i.e. an empty declList. If it matches,
the value of LookupType is EmptyTypeExp. If e is not DeclListNil, it is
guaranteed to have the form
DeclListPair(Declaration(*, *), *)
the pattern variables are id, t, and dl. When this pattern matches, id, t, and dl
will be bound to the corresponding subterms in e. Thus, the value of
LookupType can be computed as
That is, if id is i, the identifier whose type we are looking for, return t, its type;
otherwise, continue looking for i in dl, the rest of the declList e.
The recursive functions IsDeclared and NumberOfDecls behave similarly
and need not be discussed further.
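For readers more familiar with conventional languages, the behaviour of LookupType and IncompatibleTypes described above can be mirrored in Python. This is a paraphrase under stated assumptions, not SSL: terms are modelled as tuples, identifiers and type expressions as plain strings, and the Python names lookup_type and incompatible_types are invented for the example.

```python
EMPTY = "EmptyTypeExp"


def lookup_type(i, e):
    """e is a declList term: ("DeclListNil",) or
    ("DeclListPair", ("Declaration", id, t), dl)."""
    if e == ("DeclListNil",):
        return EMPTY                      # first pattern: empty list
    _, (_, ident, t), dl = e              # second pattern binds id, t, dl
    return t if ident == i else lookup_type(i, dl)


def incompatible_types(t1, t2):
    """True iff neither type is EmptyTypeExp and the two types differ."""
    return t1 != EMPTY and t2 != EMPTY and t1 != t2


env = ("DeclListPair", ("Declaration", "i", "BoolTypeExp"),
       ("DeclListPair", ("Declaration", "j", "IntTypeExp"),
        ("DeclListNil",)))
print(lookup_type("j", env))
print(incompatible_types(lookup_type("i", env), "IntTypeExp"))
```

An undeclared identifier yields EmptyTypeExp, and since incompatible_types never reports an error when either argument is EmptyTypeExp, an undeclared variable does not additionally trigger a type-incompatibility message.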
Taken together, the definitions given in Figures 4.1 through 4.4 specify the
language's underlying abstract syntax and define how a program's abstract-
syntax tree is annotated with STR-valued attributes that indicate the presence or
absence of type errors and undeclared variables. We next describe how display
representations are specified.
The unparsing scheme between the square brackets should be seen as an image
of the corresponding production. The choice of the symbol : or ::= determines
whether or not a production can be edited as text. The symbol ::= indicates that
it is permitted to edit the production's text; the symbol : indicates that the
production is (customarily) treated only as an indivisible structural unit. Note that
not all operators of a given phylum have to use the same symbol. For example,
EmptyStmt and Assign use ::=, while IfThenElse, While, and Compound
use :.
The display is generated by a left-to-right traversal of the tree that interprets
the unparsing schemes. Indentation and line breaks are controlled by control
characters that can be included in the strings of an unparsing scheme. In partic-
ular, the characters %t, %b, and %n have the following meanings:
Each selection symbol, either @ or ^, represents the position of one of the
phylum occurrences in the production's mix-fix display representation. An
additional selection symbol for the left-hand-side occurrence is separated from
the rest of the unparsing declaration by the : or ::=; the remaining selection
symbols match up with the right-hand-side phylum occurrences in the order in
which they occur in the abstract-syntax definition.
A selection symbol defines the selectability property for the corresponding
node in the tree. Selections may only be made at resting places, which are
determined by the selection symbols of the unparsing declarations. When the user
selects an item on the display, the selection is moved to the nearest resting place
that encloses that item. The selection symbol @ specifies that a phylum
occurrence is a resting place; the symbol ^ specifies that it is not. Note that a
syntax-tree node is an instance of two phylum occurrences: a left-hand-side
occurrence in one production and a right-hand-side occurrence in another. A
node is a resting place if either of its two corresponding occurrences is specified
with @.
With the unparsing declarations given above, a selection in an expression will
select the smallest subexpression whose display representation includes the
selected character. If all of the @ selection symbols in the Equal, NotEqual,
and Add productions were changed to ^ symbols, then a selection anywhere in
an expression would select the entire expression.
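The rule that a selection moves to the nearest enclosing resting place amounts to a walk up the tree. In the following sketch the class Node, its resting flag, and the function select are all hypothetical; the flag stands for the outcome of combining the selection symbols of a node's two phylum occurrences.

```python
class Node:
    def __init__(self, label, resting, parent=None):
        self.label = label
        self.resting = resting   # True if the node is a resting place
        self.parent = parent


def select(node):
    """Return the nearest resting place that encloses the node
    (possibly the node itself), or None if there is none."""
    while node is not None and not node.resting:
        node = node.parent
    return node


outer = Node("(i + 1) + 2", resting=True)
inner = Node("i + 1", resting=True, parent=outer)
leaf = Node("1", resting=False, parent=inner)
print(select(leaf).label)

inner.resting = False            # as if the inner occurrence were not a resting place
print(select(leaf).label)
```

With the inner node marked as a resting place, pointing at the character 1 selects the smallest enclosing subexpression; once that mark is removed, the same gesture selects the whole expression.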
Selections in lists are handled somewhat specially, so care should be taken to
define the selection symbols as has been done above for DeclListPair and
StmtListPair; that is, the two occurrences of the list phylum should be defined
with @ and the element-phylum occurrence, in the first argument position,
should be defined with ^. This avoids having an extra resting place at the
element-phylum occurrence, yet allows selecting a sublist by pointing and
dragging. (See Section 5.3, "Sublist Manipulations.")
The unparsing declarations for DeclListPair and for StmtListPair illustrate
conditional unparsing of list separators. The unparsing declaration of the binary
production of a list phylum may have items enclosed in square brackets; these
items are suppressed from the display of production instances that occur at the
end of a list. Thus, the [";%n"] in the unparsing declarations for DeclListPair
and StmtListPair causes the lists to be printed with a semicolon/line-feed
separator.
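As a rough illustration of conditional separators (not the generator's actual unparser), the Python sketch below walks a right-recursive list and suppresses the bracketed item for the production instance at the end of the list. The function name unparse_list and the use of ";\n" for ";%n" are our assumptions.

```python
def unparse_list(items, separator=";\n"):
    """Unparse a right-recursive list of already-unparsed elements.

    The separator plays the role of the bracketed [";%n"] item: it is
    emitted after every element except the one whose tail is the empty list.
    """
    def pair(elements):
        if not elements:  # nullary operator: contributes nothing
            return ""
        head, tail = elements[0], elements[1:]
        sep = separator if tail else ""  # suppressed at the end of the list
        return head + sep + pair(tail)

    return pair(list(items))


print(unparse_list(["i := 0", "j := 1", "k := 2"]))
```

Because the rule is attached to the binary production alone, the empty list, a singleton list, and longer lists need no separate cases.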
Attribute occurrences in the unparsing declaration cause the display represen-
tation of the attribute's value to be included in the display. In the example
above, the unparsing declarations use this facility to indicate the location of type
errors, undeclared variables, and multiply-declared variables. The attribute
equations of the specification define these attributes to contain a warning
message when an error exists at a location, and the null string otherwise. The
incremental reevaluation of attributes that is performed after each program
modification guarantees that the program's display reflects appropriate error
messages.
5 In the case of identifier, the name Identifier was already used for another purpose, so we have
called the input phylum Ident.
4.4. Input Interfaces 63
declList   ~ DeclList.t;
decl       ~ Decl.t;
typeExp    ~ TypeExp.t;
identifier ~ Ident.t;
stmtList   ~ StmtList.t;
stmt       ~ Stmt.t;
exp        ~ Exp.t;
Figure 4.6. Rules that define the association between the abstract syntax and the input
syntax.
p ~ P.t;
which indicates that when the currently selected component of the program is a
member of phylum p , input is to be parsed to see if it is a member of phylum P ,
and, if so, attribute t is to be inserted in the abstract-syntax tree, replacing the
current selection. The parse tree is a transient structure - it is discarded as soon
as the attribute t is extracted.
The grammar rules shown in Figure 4.7 begin with definitions of lexical
phyla. There is one rule for each keyword and other multi-character token of
the language. The token WHITESPACE has special significance: every
occurrence of a WHITESPACE token is ignored during a parse.
The remaining rules of Figure 4.7 define the syntax of the concrete input
language. The phylum declarations that define the concrete input language, or
parsing declarations, are distinguished from the other phylum/operator declara-
tions of an SSL specification by the use of the symbol ::= to separate the left-
hand-side phylum name from the symbols on the right-hand side. Another
difference between parsing declarations and ordinary phylum declarations is that
tokens - single characters enclosed in quotes - may be interspersed among the
phylum symbols on the right-hand side of a parsing declaration. For example,
the token '=' appears in the operator declaration for parsing an equality expres-
sion:
In this rule's attribute equation, which defines attribute Exp$1.t, the right-
Figure 4.7. Rules that define the input syntax and its translation to a term of the abstract
syntax.
The sample specification of this chapter has used only synthesized attributes
in the translation of concrete syntax to abstract syntax. Chapter 6 illustrates how
inherited attributes may also be used to advantage.
The mechanism for translating input text to an abstract-syntax tree provides
an editor-designer with the ability to define textual and structural interfaces in
whatever balance is desired. This ability is discussed in detail in Chapter 6,
along with many examples that illustrate how different combinations of text-
editing and structure-editing interfaces can be defined.
transform typeExp
on "integer" <typeExp>: IntTypeExp,
on "boolean" <typeExp>: BoolTypeExp
transform stmt
on "assign" <stmt>: Assign(<identifier>, <exp>),
on "if" <stmt>: IfThenElse(<exp>, <stmt>, <stmt>),
on "while" <stmt>: While(<exp>, <stmt>),
on "begin" <stmt>: Compound(<stmtList>)
;
transform stmt
on "(begin" s: Compound(StmtListPair(s,StmtListNil)),
on ")begin" Compound(StmtListPair(s,StmtListNil)): s
Suppose the current selection is a term s of the given phylum; then the transfor-
mation is said to be enabled if the pattern matches s. The transformation-names
of all enabled transformations are listed in the help pane of the current window
and are updated whenever the selection is moved. The effect of invoking the
transformation is to replace s with the value of the expression. As in the case of
with-expressions, the pattern contains pattern variables that are bound to sub-
terms of s. By using these pattern variables in the expression, it is possible to
restructure the selection.
Consider the two source-to-source transformations of Figure 4.8. The pattern
of the transformation named (begin is just the pattern variable s. A pattern
variable, by itself, matches anything. Thus, whenever the selection is a stmt,
(begin is enabled. Invoking this transformation will replace s by the term
Compound(StmtListPair(s,StmtListNil))
This transformation was illustrated at the end of Chapter 2.
The transformation )begin is the inverse of (begin. Its pattern is
Compound(StmtListPair(s,StmtListNil))
which matches only a stmt that is a Compound containing a singleton
stmtList. Whenever this pattern matches, s is bound to that single statement.
Invoking )begin will replace the selection with s .
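To make the matching behavior concrete, here is a small Python sketch of the two transformations, with terms modeled as nested tuples. The tuple encoding and the function names wrap_begin and unwrap_begin are our own assumptions for illustration; they are not SSL.

```python
NIL = ("StmtListNil",)


def wrap_begin(s):
    """'(begin': the pattern is the bare variable s, so it matches any stmt."""
    return ("Compound", ("StmtListPair", s, NIL))


def unwrap_begin(term):
    """')begin': matches only Compound(StmtListPair(s, StmtListNil)).

    Returns the bound statement s, or None when the pattern does not match
    (that is, when the transformation would not be enabled).
    """
    if (len(term) == 2 and term[0] == "Compound"
            and len(term[1]) == 3 and term[1][0] == "StmtListPair"
            and term[1][2] == NIL):
        return term[1][1]
    return None


stmt = ("Assign", "i", ("Int", 0))
wrapped = wrap_begin(stmt)
print(unwrap_begin(wrapped) == stmt)  # the two transformations are inverses
```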
We turn now to the template-insertion transformations of Figure 4.8. These
declarations make use of a notation that has not been previously introduced:
each expression in Figure 4.8 of the form <phylum> denotes the placeholder
term of the given phylum. The concept of a phylum's placeholder term is simi-
lar to that of the phylum's completing term - both are default representatives of
the phylum. In general, the two concepts are not identical; however, we will not
explain the difference between them at this point. For now, it suffices to say
that in our sample specification the placeholder term of each phylum is exactly
the same as its completing term. A discussion of the difference between a
phylum's placeholder term and its completing term can be found in Section 5.2,
"Specifying Lists and Optional Elements in SSL."
In each of the template transformations, the pattern denotes the appropriate
placeholder term. Therefore, the template transformations are enabled only
when the selection is at a placeholder. It is thus not possible to accidentally
replace an existing statement with a template, thereby wiping out a large part of
a developed program.
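The guard that protects developed code can be sketched in a few lines of Python. This is an illustration under our own assumptions: the tuple encoding of terms and the names STMT_PLACEHOLDER, TEMPLATES, enabled_templates, and apply_template are hypothetical.

```python
# Hypothetical stand-ins for the placeholder terms <stmt>, <exp>, <stmtList>.
STMT_PLACEHOLDER = ("StmtPlaceholder",)
EXP_PLACEHOLDER = ("ExpPlaceholder",)
STMTLIST_PLACEHOLDER = ("StmtListPlaceholder",)

TEMPLATES = {
    "while": ("While", EXP_PLACEHOLDER, STMT_PLACEHOLDER),
    "begin": ("Compound", STMTLIST_PLACEHOLDER),
}


def enabled_templates(selection):
    """Templates are offered only when the selection is the placeholder term."""
    return sorted(TEMPLATES) if selection == STMT_PLACEHOLDER else []


def apply_template(name, selection):
    """Replace a placeholder by a template; refuse to overwrite anything else."""
    if selection != STMT_PLACEHOLDER:
        raise ValueError("template is not enabled at this selection")
    return TEMPLATES[name]


print(enabled_templates(STMT_PLACEHOLDER))
print(enabled_templates(("Assign", "i", ("Int", 0))))
```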
4.5. Templates and Transformations 67
A final subtlety will have escaped the attention of all but the most astute
readers. Review of the sample session of Chapter 2 reveals that the transforma-
tions listed for stmt are also enabled when the selection is a singleton sublist of
a stmtList term. This is true in general: whenever the selection is a singleton
sublist, transformations of both the list phylum and the item phylum are candi-
dates. This feature of the Synthesizer Generator supports the goal of minimiz-
ing distinctions between the singleton sublist and the list item itself; it is dis-
cussed further in Section 5.4, "Selections of Singleton Sublists Versus Selec-
tions of List Elements."
This completes the specification of the sample editor demonstrated in Chapter
2. Figures 4.1 through 4.8 constitute a complete definition of that editor and
illustrate the most important features of SSL.
CHAPTER 5
Lists and optional elements are such an important aspect of editor specifications
that they deserve to be discussed in more detail. Although iterated elements
(lists) and optional elements are concepts that are commonly found in many
extended BNF notations, the form that they take in the Synthesizer Specification
Language is different from the familiar notions. The chief motivation for SSL's
notation for lists and optional elements is to provide a uniform notation for
specifying the attribution of terms and their unparsing. The attribute equations
and, with one small exception, the unparsing declarations that one writes for list
phyla and optional phyla are no different in form from those that are used for
ordinary phyla. For example, because all lists are terminated by an instance of
the list phylum's nullary operator, there is no need for special-case rules cover-
ing the cases of an empty list, a singleton list, and a list of length two or more, as
would otherwise have been necessary.
This chapter describes the behavior of lists and optional elements in generated
editors and discusses the issues that arise when writing the parts of editor
specifications that concern them. Sections 5.1 and 5.2 describe how certain edit-
ing operations in generated editors automatically supply placeholders before and
after list elements and at unexpanded optional elements. Section 5.3 discusses
operations for manipulating sublists. Section 5.4 concerns text-entry definitions
for lists and transformations of list elements. Section 5.5 describes how to write
parsing and translation rules that translate concrete text into the correct list
structure of the underlying abstract syntax. Section 5.6 concerns attribute equa-
tions for the completing terms of lists and optional elements.
5.1. Transient Placeholders 69
consist of the selection commands and some of the traversal commands. All
selection commands and traversal commands, even ones that are not
placeholder-inserting commands, are placeholder-deleting commands; that is,
all commands that move the selection will eliminate a transient placeholder that
would be left behind when the selection moves to an element not contained in
the placeholder.
Figure 5.1. Snapshots of the display that illustrate the effect of repeated use of forward-
sibling-with-optionals. Transient placeholders are inserted before and after list elements
((b), (d), (f), and (g)), but are deleted when advancing the selection would leave behind an
empty placeholder ((c), (e), and (g)).
Figure 5.2. Snapshots of the display that illustrate the effect of repeated use of forward-
sibling.
(a)                     (b)
begin                   begin
while <exp> do          while <exp> do
i := i + 1;             i := i + 1;
|<statement>|;          |j := i|
j := i                  end.
end.
Figure 5.3. If the command forward-sibling is invoked when the currently selected con-
stituent is a transient placeholder (a), the placeholder is removed from the program (b).
begin                   begin
while |<exp>| do        while <exp> do
<statement>             <statement>;
end.                    |<statement>|
                        end.
(b1) (b2) (b3)
begin                   begin
while |<exp>| do        while <exp> do;
end.                    |<statement>|
                        end.
Figure 5.4. Snapshots of the display that illustrate the effect of repeated use of forward-
with-optionals in empty lists. Sequence (a1) - (a3) illustrates an editor in which state-
ment lists always have at least one element, although the element may be a placeholder.
Sequence (b1) - (b3) illustrates a different version of the editor in which statement lists
are allowed to have zero elements.
Many placeholders are simply leaves of the derivation tree, with no substruc-
ture themselves, such as the placeholders represented by <identifier>, <exp>,
and <statement> in Figures 5.1 - 5.4. This is not always the case, for it is pos-
sible to have placeholders (including transient placeholders) that have subcom-
ponents. Typically, some of a placeholder's subcomponents are placeholders
themselves. In fact, we have already seen an example of a placeholder with
substructure in the editor specification presented in Chapter 4. The placeholder
for the declList phylum looks like
<identifier> : <type>
(a) (b)
end. end.
Figure 5.5. Snapshots of the display illustrating that the selection may be moved from a
transient placeholder (a) to one of its subordinate components (b).
<identifier> : <type>
In Figure 5.6(b), the declList placeholder has been removed because the selec-
tion was moved out of its scope.
(a) (b)
Figure 5.6. Effect of moving the selection from a subcomponent of a transient placehold-
er. Initially, the selection is positioned at a subcomponent of a transient placeholder (a).
The placeholder is deleted when the selection is moved outside of its scope (b).
As shown in Figure 5.7, when a selection command is invoked with the locator
positioned at a semicolon that is a stmtList separator, a transient stmtList place-
holder is inserted between the immediately adjacent list elements.
(a) (b)
begin
while <exp> do
i := i + 1;
j := i
end.
1 At present, selection commands cause transient placeholders to be inserted only between list ele-
ments; selection commands do not insert placeholders before the first list element or after the last list
element. These placeholders appear only through the actions of traversal commands, such as
forward-with-optionals <^M> or backward-with-optionals <^H>.
76 Chapter 5. Lists, Optional Elements, and Placeholders
Besides inserting transient placeholders before and after list elements, the
traversal commands that are placeholder-inserting commands insert transient
placeholders at optional elements that happen to be unexpanded. A placeholder
is inserted into the tree when an unexpanded optional element is encountered,
and the new placeholder immediately becomes the new selection. The place-
holder is subsequently deleted if the user moves the selection before making an
insertion at the placeholder.
To illustrate this behavior, we again use program fragments based on the edi-
tor specification presented in Chapter 4; however, in these examples we assume
that the definition of the While operator has been changed to
where loopname is the phylum of loop names. The full details of how the
loopname phylum is specified will be presented in the next section; here it
suffices to know that an unexpanded loopname has the property of being an
optional program element.
Example. Suppose that the initial situation is as shown in Figure 5.8(a), in
which the currently selected constituent is the entire while-loop. When the
selection is advanced with the traversal command forward-with-optionals, a
transient placeholder is inserted for the unexpanded loopname component of
the while-loop (see Figure 5.8(b)). At this point, the user could insert a name
for the loop. However, if the user leaves the placeholder empty and chooses
begin                              begin
i := 0;                            i := 0;
|<loopname>| : while <exp> do      while |<exp>| do
i := i + 1;                        i := i + 1;
j := i                             j := i
end.                               end.
instead to advance the selection to the position shown in Figure 5.8(c), the
placeholder is removed from the program.
Recall that in Chapter 4 we defined the first operator declared for each phylum
to be its completing operator. For an ordinary phylum, the completing term is
constructed by applying the completing operator to the completing terms of its
argument phyla; the placeholder term for an ordinary phylum is identical to the
completing term.
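The recursive definition of the completing term of an ordinary phylum can be sketched as follows. This Python fragment is an illustration only: the GRAMMAR table and its operator names (EmptyStmt, EmptyExp, and so on) are hypothetical, and the sketch assumes that following first operators never leads back to the same phylum, since otherwise the recursion would not terminate.

```python
# phylum -> list of (operator name, argument phyla); the first entry is the
# completing operator. All names are hypothetical, chosen for illustration.
GRAMMAR = {
    "identifier": [("IdentifierNull", [])],
    "typeExp": [("EmptyTypeExp", []), ("IntTypeExp", []), ("BoolTypeExp", [])],
    "decl": [("Declaration", ["identifier", "typeExp"])],
    "exp": [("EmptyExp", []), ("Add", ["exp", "exp"])],
    "stmt": [("EmptyStmt", []), ("While", ["exp", "stmt"])],
}


def completing_term(phylum):
    """Apply the completing operator to the completing terms of its arguments."""
    op, arg_phyla = GRAMMAR[phylum][0]
    return (op, *[completing_term(p) for p in arg_phyla])


print(completing_term("decl"))
```

A completing term built this way can itself have substructure, as the result for decl shows: its two components are again completing terms.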
Example. Suppose that phylum phy0 is defined by the following abstract-
syntax rules:
The completing term of phy0, as well as its placeholder term, is the term
list phy0;
phy0 : op1()
     | op2(phy1 phy0)
Both the placeholder term and the completing term of phylum phy0 are the sin-
gleton list formed by concatenating the completing term of phy1 with the nullary
operator of phy0:
[phy1] :: op1
For list phyla, placeholders are instances of placeholder terms that have been
spliced into an existing list. When a placeholder-inserting command inserts a
phy0 placeholder between elements A and B in the list A :: B :: C :: op1, phy0's
placeholder term is spliced into the list. This operation creates the list
A :: [phy1] :: B :: C :: op1
Thus far, there has been no need to distinguish between a phylum's placeholder
term and its completing term; for ordinary phyla and non-optional list phyla,
they are identical. We now turn to phyla declared with the optional property:
optional phyla and optional list phyla. To discuss how their transient placehold-
ers are defined, we must now distinguish between the phylum's placeholder
term and its completing term.
An optional (non-list) phylum can have any number of operators, but one of
them must be a nullary operator; the completing term is the term constructed
from the first nullary operator of the phylum that is listed in the editor
specification. The placeholder term is constructed by "completing" the first
optional phy0;
phy0 : op1(phy1,1 phy1,2 ... phy1,k1)
The completing term of phy0 is the term opj, which is constructed from the first
nullary operator listed. The placeholder term is the term
Note that the completing term and placeholder term are defined in terms of the
first nullary operator listed and the first operator that is not the completing-term
operator. Thus, phy0's placeholder term and completing term would still be the
same if we listed the operators in the following order:
phy0 : opj()
     | op1(phy1,1 phy1,2 ... phy1,k1)
optional loopname;
loopname : LoopnameNull()
         | LoopnamePrompt()
         | Loopname(IDENTIFIER)
While(LoopnameNull, E, S)
While(LoopnamePrompt, E, S)
The abstract-syntax rules for an optional list phylum are like the ones for a non-
optional list: there must be exactly two operators, one of them nullary, the other
binary and right recursive. The completing term of an optional list phylum is
the constant term constructed from the nullary operator of the phylum. The
placeholder term of an optional list is exactly the same as the completing term
had the phylum not been declared to be optional; that is, the placeholder term is
the singleton list constructed by applying the list's binary operator to the com-
pleting term of its left son and to the list's nullary term.
Example. An optional list phylum phy0 would be defined by abstract-syntax
rules of the form
optional list phy0;
phy0 : op1()
     | op2(phy1 phy0)
The completing term of phy0 is the term op1. The placeholder term is
[phy1] :: op1.
As is the case for non-optional list phyla, when a placeholder-inserting com-
mand inserts a phy0 placeholder between elements A and B in the list
A :: B :: C :: op1, phy0's placeholder term is spliced into the list to create
A :: [phy1] :: B :: C :: op1
The essential difference between an optional list phylum and its non-optional
counterpart is that an optional list phylum behaves as a list of length zero or
more, whereas a non-optional list phylum behaves as a list of length one or
more. The qualifier optional specifies that whenever the current selection is
moved outside the scope of an instance of the phylum's placeholder term, the
placeholder is automatically replaced by an instance of the phylum's completing
term. For example, if phy0 is an optional list phylum, the term [phy1] :: op1
would be replaced by the term op1. In contrast, if phy0 were a non-optional list
phylum, the subterm would be left as it was.
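The two behaviors just described, splicing a placeholder into a list and collapsing an unfilled placeholder term of an optional list, can be modeled with ordinary Python lists. This is a sketch under our own assumptions: a Python list stands for a right-recursive list term with the nullary operator left implicit, the string "[elem]" stands for the completing term of the element phylum, and both function names are hypothetical.

```python
ELEMENT_COMPLETING_TERM = "[elem]"  # stand-in for the element's completing term


def splice_placeholder(elements, index):
    """Splice the list's placeholder term in before elements[index]."""
    return elements[:index] + [ELEMENT_COMPLETING_TERM] + elements[index:]


def on_selection_leaves(term, optional):
    """Model what happens when the selection leaves an unfilled placeholder term.

    For an optional list phylum the singleton placeholder term is replaced by
    the completing term (the empty list); for a non-optional list phylum the
    term is left as it was.
    """
    placeholder_term = [ELEMENT_COMPLETING_TERM]
    if optional and term == placeholder_term:
        return []
    return term


print(splice_placeholder(["A", "B", "C"], 1))
print(on_selection_leaves(["[elem]"], optional=True))
print(on_selection_leaves(["[elem]"], optional=False))
```

The last two calls show the difference in minimum length: zero elements for the optional list, one (possibly unfilled) element for the non-optional list.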
5.2. Specifying Lists and Optional Elements in SSL 83
For expository purposes, in several places we have used the special notation that
SSL provides for denoting placeholder terms and completing terms. A
phylum's completing term is denoted by the phylum name enclosed in square
brackets; its placeholder term is denoted by the phylum name enclosed in angle
brackets. This notation can be used in patterns of transformation declarations as
well.
Example. Consider the use of the placeholder-term notation in the transfor-
mation declarations that declare the template commands for statements in
Chapter 4:
transform stmt
on "assign" <stmt>: Assign(<identifier>, <exp>),
on "if" <stmt>: IfThenElse(<exp>, <stmt>, <stmt>),
on "begin" <stmt>: Compound(<stmtList>)
Now consider how to define a template command for our revised definition of
the While operator, which includes an optional loopname component. One
way to define the While template is as follows:
This definition causes the loopname element to expand to its completing term
when the template is instantiated; because phylum loopname is an optional
phylum, the loopname component will be initially hidden from the user.
Using loopname's completing term on the right-hand side of a template is
not mandatory. The alternative is to use its placeholder term, as is done in the
following declaration:
The only difference between the revised template definition and the one given
previously is that the revised one causes a transient placeholder for loopname to
appear immediately, when a while-loop is initially created. That is, because the
loopname component is expanded to the term LoopnamePrompt when a tem-
plate is instantiated, the template looks like
Thereafter, the loopname placeholder behaves no differently than it does with
the original template declaration: once the loopname placeholder becomes the
We now present two tables that summarize property declarations and the tran-
sient placeholders they define. Figure 5.10 lists canonical abstract-syntax rules
for ordinary phyla, list phyla, optional phyla, and optional list phyla, and sum-
marizes the definitions of their completing terms and placeholder terms. Figure
5.11 summarizes the special actions for automatically inserting and deleting
transient placeholders that are enabled by SSL property declarations.
Figure 5.9. A template is instantiated that has a transient placeholder for the loopname
component (b). When the selection is advanced to another placeholder, the loopname
placeholder is removed from the program (c).
5.3. Sublist Manipulations 85
Figure 5.10. Property declarations, abstract-syntax rules, completing terms, and place-
holder terms.
Figure 5.11. Property declarations and their characteristics vis-à-vis transient placehold-
ers.
Figure 5.12. Snapshots of the display that illustrate selection of sublists by dragging.
Starting with the locator pointing to the equal sign of the second assignment-statement in
a four-element list (a), a select-start is issued (b). The locator is moved to the third state-
ment, and a select-stop is issued, causing a two-element sublist to be selected (c).
5.4. Selections of Singleton Sublists Versus Selections of List Elements 87
begin
i:= 0;
while <exp> do
[GIl
~;
m :=3
end.
begin
i:= 0;
j := 1 ;
|k := 2|;
m :=3
end.
The question is whether the selected constituent is an object of phylum stmt or
In this example, the selected constituent is a stmtList in (a) and (c) but is a stmt
in (b) and (d). Not only is this undesirable, because the display is ambiguous,
but it is also frustrating to have the extra stopping places when advancing the
selection through the program with traversal commands.
Because of these considerations, the resting-place definitions in editor
specifications are generally defined so that the nodes for individual list elements
are invisible to the editor user. That is, they are defined so that they are not rest-
ing places in the abstract-syntax tree. With such definitions, the selection never
consists of an individual list element; instead, the selection of a single item in a
list actually causes a one-element (singleton) sublist to be selected.
The resting-place definitions for lists generally follow the pattern found in the
unparsing declarations for the stmtList and stmt phyla shown in Figure 5.14.
(These unparsing declarations come from the editor specification presented in
Chapter 4.) As used in Figure 5.14, the resting-place symbols prevent the selec-
tion of stmt nodes that are elements of stmtLists.
The solution presented above introduces certain problems for other aspects of
editor specifications, including transformation declarations, text entry, and cut-
and-paste operations. In all three cases, it seemed unnatural to make a distinc-
tion between singleton sublists and individual list elements. Consequently, the
Synthesizer Generator has a number of features that minimize the distinction
between a singleton sublist and an individual list element.
Figure 5.14. Unparsing declarations for the phyla stmt and stmtList. These declarations
illustrate resting-place definitions in which the nodes for the individual list elements are
invisible to an editor user.
Transformations
(a) (b)
begin begin
i:= 0; i:= 0;
j := 1; j := 1 ;
k:= 2; k:= 2;
|<statement>|; while |<exp>| do
m :=3 <statement>;
end. m :=3
end.
Text entry
Ordinarily, the text-entry property from the unparsing declaration for the
selection's root determines whether text-entry is permitted. However, when the
selection is a singleton sublist, the text-entry property for the individual list item
is also used (i.e. text entry is permitted at a singleton sublist if it is permitted by
the text-entry property associated with the list item or the list's binary operator).
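The rule reduces to a small Boolean function. The Python sketch below is our own illustration; the function name and its parameters are hypothetical and do not correspond to anything in the generated editor.

```python
def text_entry_permitted(root_allows, singleton_item_allows=None):
    """Decide whether text entry is allowed at the current selection.

    root_allows: text-entry property of the selection's root operator
        (for a sublist, the list's binary operator).
    singleton_item_allows: text-entry property of the lone list item when
        the selection is a singleton sublist; None for any other selection.
    """
    if singleton_item_allows is None:
        return root_allows
    return root_allows or singleton_item_allows


# Singleton sublist holding an item that permits text entry, under a list
# operator that forbids it:
print(text_entry_permitted(False, True))
# Singleton sublist holding an item that forbids text entry:
print(text_entry_permitted(False, False))
```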
Example. In the declarations given in Figure 5.14, text entry is forbidden at
StmtListPair operators and at While operators, but is permitted at Assign
operators. In the two situations shown below, where in both cases the selection
is a singleton sublist, the user will be permitted to enter text in situation (a) but
not in (b).
(a) (b)
begin begin
j := 1; j := 1 ;
|k := 2|; k := 2;
m := 3 |while <exp> do|
end. |<statement>|;
m :=3
end.
Note that matters would be slightly different if the text-entry property for
StmtListPair were specified with the following unparsing declaration:
The latter declaration will allow text entry in both cases (a) and (b) above. (It
will also allow text entry when a sublist is selected whose length is greater than
one, whereas the original one will not.)
Cut-and-paste operations
To further hide the distinction between a singleton sublist and an individual list
element, the paste operation is prepared to make automatic conversions from
singleton lists to list elements, and vice versa.
To illustrate this, note that in Figure 5.14, even though the resting-place
definitions hide all stmt nodes that are elements of stmtList lists, it is still possi-
ble to select the stmt objects that appear in the second and third components of
IfThenElse statements. In the unparsing declaration for IfThenElse, the sym-
bol @ in the operator's second and third components makes these nodes resting
places in the tree. Consequently, it will be possible to select stmt nodes in
conditional-statements and invoke cut-to-clipped to cut them from the program
to the clip buffer. However, when paste-from-clipped is invoked, certain
automatic conversions will be performed, if appropriate. For example, a stmt
cut from an IfThenElse operator can be pasted into a stmtList, as shown below:
These rules define how the input language represented by phylum StmtList is
translated into an element of abstract-syntax phylum stmtList.
Unfortunately, because of the way the Synthesizer Generator is implemented,
these rules have a serious flaw. The parser that is used in a generated editor is a
LALR(1) parser generated using the yacc parser generator [Johnson78]. The
problem lies in the second of the two parse rules given above. Because the pro-
duction is right recursive, there will be a limit on the number of statements that
can be parsed. As described in [Johnson78], right-recursive grammar rules
cause the parser to scan and push (shift) all list elements onto the stack before
any portion of the list can be reduced. After some fixed number of list elements
are shifted, the stack will overflow.
To avoid stack overflow, parse rules for lists should be written left recursively
to allow reductions to be made on-the-fly as new list elements are encountered.
However, in the Synthesizer Generator, this solution leads to a new problem
because of the Generator's requirement that the abstract-syntax rules for a list
phylum be written right recursively. The problem now is how to build up a
right-recursive abstract list in the attribute equations of left-recursive parse
rules. We could (naively) build up the abstract list with the following rules:
However, this has the undesired effect of building up the abstract-syntax list
term in the wrong order - the list's elements would be in the reverse order from
the order in which the input elements were typed by the user!
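The reversal is easy to reproduce. The Python sketch below is an illustration under our own assumptions (tuple-encoded terms and the helper names cons, flatten, and naive_translate are hypothetical): it mimics a left-recursive parse in which each reduction places the newly parsed statement in front of the list built so far.

```python
NIL = ("StmtListNil",)


def cons(head, tail):
    """Build the right-recursive abstract list term StmtListPair(head, tail)."""
    return ("StmtListPair", head, tail)


def flatten(term):
    """Read the elements of an abstract list term from front to back."""
    out = []
    while term != NIL:
        _, head, term = term
        out.append(head)
    return out


def naive_translate(stmts):
    """Mimic the naive rules: reductions occur left to right, and each one
    conses the new statement onto the translation of the list to its left."""
    result = cons(stmts[0], NIL)
    for s in stmts[1:]:
        result = cons(s, result)
    return result


print(flatten(naive_translate(["a", "b", "c"])))
```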
To build an abstract list in a manner that respects the order of the input ele-
ments, we must arrange for the attribution rules to collect the elements of the
abstract list from right to left in the left-recursive input-syntax parse tree. The
solution is to write the attribution rules of the parsing grammar with attributes
5.5. Parsing Lists 93
whose dependencies are threaded from right to left, as is done in the following
input grammar specification:
StmtList {
    inherited stmtList tail;
    synthesized stmtList reversed;
    };
Stmt { synthesized stmt t; };
stmtList ~ StmtList.reversed { StmtList.tail = StmtListNil; };
StmtList ::= (Stmt) { StmtList.reversed = (Stmt.t :: StmtList.tail); }
    | (StmtList ';' Stmt) {
        StmtList$2.tail = (Stmt.t :: StmtList$1.tail);
        StmtList$1.reversed = StmtList$2.reversed;
        }
    ;
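The flow of values in this specification can be simulated in Python. The sketch is only an illustration under our own assumptions: the tuple encoding of the parse tree and of abstract terms, and the function names, are hypothetical; the parameter tail and the result of reversed_attr stand for the two attributes of the same names.

```python
NIL = ("StmtListNil",)


def cons(head, tail):
    return ("StmtListPair", head, tail)


def flatten(term):
    out = []
    while term != NIL:
        _, head, term = term
        out.append(head)
    return out


def parse_left_recursive(stmts):
    """Build the left-recursive parse tree that the parser would produce."""
    tree = ("single", stmts[0])          # StmtList ::= (Stmt)
    for s in stmts[1:]:
        tree = ("append", tree, s)       # StmtList ::= (StmtList ';' Stmt)
    return tree


def reversed_attr(node, tail):
    """Compute the synthesized value for a node, given its inherited tail."""
    if node[0] == "single":
        return cons(node[1], tail)
    _, sublist, stmt = node
    # The inner StmtList inherits this statement consed onto our own tail;
    # its synthesized result is passed up unchanged.
    return reversed_attr(sublist, cons(stmt, tail))


def translate(stmts):
    # The entry declaration supplies the empty list as the root's tail.
    return reversed_attr(parse_left_recursive(stmts), NIL)


print(flatten(translate(["a", "b", "c"])))
```

In the SSL specification itself, the same flow is expressed declaratively by the attribute equations rather than by explicit recursion.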
Here the attribute equations are written so that phylum StmtList has two attri-
butes, named tail and reversed, both of phylum stmtList. Inherited attribute
tail collects the abstract translation of the concrete list elements from right to
left; the leftmost production of the parse tree passes the abstract list up as the
synthesized attribute reversed. Note the attribute equation associated with the
entry declaration:
phylum has one inherited and one synthesized attribute, and these attributes are
used to pass information through the tree from left to right. It is often the case
that a list's completing term or placeholder term can pass the information
through unchanged. In such a case, the cost of inserting or deleting a place-
holder is essentially zero.
CHAPTER 6
Because the Synthesizer Generator is a tool for creating editors, one of our
goals is to offer editor-designers the ability to make their own design decisions
about the proper balance of text-editing and structure-editing facilities. Conse-
quently, the Synthesizer Generator allows one to define editors that range
between the extremes of a pure structure editor and a pure text editor. In this
regard, the Synthesizer Generator's capabilities are in contrast to the capabilities
of other editor generators, which offer only pure structure editors [Medina-
Mora81] or only one particular (although quite powerful) combination of textual
and structural operations that is fixed for all generated editors [Bahlke86].
Some of the Synthesizer Specification Language's mechanisms for defining
hybrid editors were illustrated previously in the editor specifications used as
examples throughout Chapter 4 and Chapter 5. In this chapter, we discuss these
mechanisms in more detail and introduce some additional facilities of SSL that
were not utilized in previous examples. In addition, we discuss the rationale for
the design decisions that shaped SSL.
We focus on how to write the parts of editor specifications that relate
specifically to editing functions of generated editors; we are concerned with how
an editor-designer uses SSL to specify operations for creating and manipulating
objects and with how he or she uses SSL to define a language's display format.
Because SSL's mechanisms for defining the concrete input language and the
display format are based on the attribution mechanism, we will also discuss cer-
tain aspects of writing attribution specifications. The use of attributes and attri-
bute equations for defining static inferences, such as name analysis and type
checking of programs, as illustrated in Chapter 4, is the subject of Chapter 7.
Section 6.1 addresses matters that must be considered when defining the
underlying abstract structure of objects being edited. Section 6.2 discusses the
SSL constructs for specifying text-editing and structure-editing operations, as
well as how to combine them to specify hybrid editors. Section 6.3 concerns
SSL's display language and presents examples of how to use attribute values to
specify computed display rules. In SSL, the translation of concrete input to
abstract syntax is defined by attribution rules; Section 6.4 discusses an extension
of the basic mechanism that permits defining the translation of input text to
abstract structure so that the translation varies depending on the context in
which the object's currently selected constituent is situated.
6.1. Defining a Language's Underlying Abstract Syntax
may be grafted into the tree as an instance of any of the other phyla contain-
ing that operator.
Although we employ the vocabulary of phyla and operators throughout the
book, the Synthesizer Generator currently supports only one of the two features
that make the phylum/operator formalism novel. The Synthesizer Generator
permits operator names to be supplied for all productions, but does not permit
the use of intersecting phyla.
To see how the restriction to disjoint phyla can complicate matters, consider
the abstract-syntax rules for the phylum exp, a phylum of integer arithmetic
expressions. It is clear that the arithmetic operators should be defined by rules
like
exp : Null()
    | Plus, Minus, Times, Div(exp exp)
    | Uminus(exp)
null   ->
plus   -> EXP EXP;
minus  -> EXP EXP;
times  -> EXP EXP;
div    -> EXP EXP;
uminus -> EXP;
The Metal convention is that operator names are in lower case and phylum names
in upper case. Operator names appear to the left of the arrow; the operator's
arguments and arity are indicated by the list of phyla on the right-hand side.
Phylum EXP is defined by the following rule, which names the operators that
generate members of EXP:
One of the features of Metal that is used in this rule is that a phylum name phy
may be used on the right-hand side of a phylum definition as a shorthand for the
set of operators that are members of phy. In the rule for EXP, the phylum name
INT refers to the predefined, countably-infinite set of 0-ary operators:
Thus, phylum EXP is generated by the set of all of the INT phylum's operators
together with the operators null, plus, minus, times, div, and uminus. Another
way of putting it is that in a phylum definition, phylum names represent sets of
operators, whereas operator names represent singleton sets; the phylum being
defined is the set of terms generated by the union of all these operator sets. The
definition of EXP is really the statement
exp : Null()
    | Plus, Minus, Times, Div(exp exp)
    | Uminus(exp)
    | Const(INT)
prevents that component from ever being selected by the user; the absence of a
resting place at component INT effectively conceals its presence.
A final word on terminology: because the phylum/operator formalism coin-
cides with the nonterminal/production formalism when all phyla are disjoint,
and because the nonterminal/production vocabulary is more widely known, we
shall occasionally refer to phyla as "nonterminals," and refer to an occurrence of
a given operator in a phylum as a "production."
Because these four kinds of facilities are specified by independent parts of SSL
specifications, it is possible to create editors that furnish different combinations
of hybrid-editing facilities in different contexts. For example, text-editing
operations may be permitted for entering a new component at a phylum's place-
holders but not for reediting the component, and vice versa. A similar statement
can be made about template-insertion operations and transformational restruc-
turing operations.
The table presented in Figure 6.1 summarizes the SSL features that govern
which kinds of editing facilities will be furnished in generated editors. In Figure
6.1, the mechanisms for specifying entry operations and modification operations
have been listed in separate columns.
It is only fair to point out that there does exist another way, not supported by
the Synthesizer Generator, of incorporating text editing in language-based edi-
tors: make the editor behave as much like an ordinary text editor as possible, but
use an incremental parser to do syntax checking so that syntax errors can be
reported to the user.
In editors that use an incremental parser, text is entered character by
character, as with an ordinary editor. After completion of each transaction, the
incremental parser is applied to determine the syntactic correctness of the
modification, in the context given by the rest of the object.¹ One incarnation of
this idea is based on an incremental parser that assesses the correctness of the
entire object. (Algorithms for this kind of incremental parsing have been
described in [Wegman80], [Ghezzi79], [Ghezzi80], and [Jalili82].) A second
            Entry                              Modification
Structure   Transformations whose patterns     Construct-to-construct
            are placeholder terms              transformations
Text        a) Parse rules                     a) Parse rules
            b) Entry declarations              b) Entry declarations
            c) Text-entry property of the      c) Text-entry properties of the
               completing operator's              operators' unparsing
               unparsing declaration              declarations
Figure 6.1. Components of editor specifications that govern text-editing and structure-
editing operations for entry and modification.
¹The scope of each transaction varies from system to system. In some systems, each insertion or
deletion of an individual character is a transaction; in others, a transaction consists of several
insertions and deletions.
6.2. Integration of Text Editing and Structure Editing
root exp_list;
list exp_list;
exp_list : ExpListNil()
         | ExpListPair(exp exp_list) [ @ : ^ [", "] @ ]
exp      : Null()
         | Plus, Minus, Times, Div(exp exp)
         | Uminus(exp)
         ;
WHITESPACE: Whitespace< [\ \t\n] >;
Figure 6.2. Abstract-syntax rules and unparsing declarations that are common to all of
the versions of the specification of an editor for expression lists that is used in examples
throughout the chapter.
WHITESPACE is defined, as usual, to consist of the characters blank, tab, and newline.
(The definition of WHITESPACE is not needed for the first variant discussed
below because it does not allow text to be entered at any of the grammar's
phyla, but WHITESPACE will be needed for all the remaining variants.)
The parts of the specification that vary from example to example consist of
(a) declarations for the integer constants, (b) unparsing declarations, which con-
trol the way expressions are displayed and specify which operators can be
modified with text-editing operations, (c) transformation declarations, and
(d) (in some cases) lexical and parsing declarations.
Structure editing is carried out using a set of editor commands that restructure
objects. New objects, or new fragments of objects, are created using a subset of
the structure-editing commands, called template commands. Templates are
predefined fragments of objects that may contain additional placeholders. A
template's placeholders provide a framework for the insertion of additional
elements in an object. Templates permit the user to create objects by deriving them
top down.
Structure-editing operations are specified by transformation declarations of
the form
This defines a command, with the given transformation-name, that means, "If
the current selection of the abstract-syntax tree matches pattern, replace it by
the result of evaluating expression."
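The meaning of such a declaration can be modelled outside SSL. The following Python sketch is only an illustration and is not part of the Synthesizer Generator: the names Term, Transformation, and apply_command are hypothetical, and the pattern is simplified to a predicate on the current selection.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Term:
    """A node of the abstract-syntax tree: an operator applied to subterms."""
    op: str
    args: Tuple["Term", ...] = ()


NULL = Term("Null")  # stands for the placeholder <exp>


@dataclass(frozen=True)
class Transformation:
    """Hypothetical model of one transformation declaration."""
    name: str                        # the transformation-name (command name)
    matches: Callable[[Term], bool]  # the pattern, simplified to a predicate
    build: Callable[[Term], Term]    # the expression that yields the replacement


def apply_command(name: str, selection: Term,
                  rules: Tuple[Transformation, ...]) -> Optional[Term]:
    """Return the replacement for the selection, or None if no rule applies."""
    for rule in rules:
        if rule.name == name and rule.matches(selection):
            return rule.build(selection)
    return None


def is_placeholder(term: Term) -> bool:
    return term == NULL


# Template commands are transformations whose pattern is the placeholder term.
TEMPLATES = (
    Transformation("+", is_placeholder, lambda t: Term("Plus", (NULL, NULL))),
    Transformation("u-", is_placeholder, lambda t: Term("Uminus", (NULL,))),
)

result = apply_command("+", NULL, TEMPLATES)
```

In this model, invoking "+" at a placeholder yields a Plus term with two new placeholders, whereas invoking it at an expanded expression yields nothing, which corresponds to the command not being applicable there.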
The first variant of the specification for the expression-list editor that we con-
sider defines the editor so that its only entry operations are template commands:
no text-editing operations are allowed at all. Expressions and integer constants
must be derived top down, using template commands. To define this editor, the
core specification of Figure 6.2 is extended in two ways:
1) Template commands and unparsing declarations are defined for the arith-
metic operators.
2) Abstract-syntax rules, template commands, and unparsing declarations are
defined for the integer constants.
These different aspects are described in detail below.
The template commands for the five arithmetic operators are defined with the
following transformation declarations:
transform exp
on "+" <exp> : Plus(<exp>, <exp>),
on "-" <exp> : Minus(<exp>, <exp>),
on "*" <exp> : Times(<exp>, <exp>),
on "/" <exp> : Div(<exp>, <exp>),
on "u-" <exp> : Uminus(<exp>)
list digitList;
digitList : DigitListNull()
          | DigitListPair(digit digitList)
transform digit
on "0" <digit> : Zero,
on "1" <digit> : One,
on "2" <digit> : Two,
on "3" <digit> : Three,
on "4" <digit> : Four,
on "5" <digit> : Five,
on "6" <digit> : Six,
on "7" <digit> : Seven,
on "8" <digit> : Eight,
on "9" <digit> : Nine
It is also necessary to add a new template to permit a Const term with an empty
digitList to be inserted at an unexpanded expression. The template is defined by
the following transformation declaration:
The editor specified by the rules given above is a rather simple structure
editor.² It provides template commands for entering each arithmetic operator and
each individual digit of integer constants. An example showing the insertion of
the integer 12 is presented in the six snapshots in Figure 6.3.
The only editing operations permitted at interior nodes of the abstract-syntax
tree are the basic cut-and-paste operations. To see how to define text- and
structure-editing operations for modifying expression lists, refer to Section
6.2.3, "Textual modifications," and Section 6.2.4, "Structural modifications,"
respectively.
The expression-list editor defined in the previous section is a pure structure edi-
tor. Anyone who used it would probably be frustrated by it because entering an
integer constant is very awkward. As Figure 6.3 illustrates, the editor user is
required to derive integer constants one digit at a time, using a template com-
mand to insert each digit. Between insertions, the user has to invoke the traver-
sal command forward-with-optionals to advance the selection to the next
placeholder. We now describe a number of different variations on this
expression-list editor.
For the variant discussed next, as well as for the ones presented in the rest of
the chapter, it is desirable to change the specification's abstract-syntax rules that
define the integer constants. Rather than representing the integer constants by a
list of digits (i.e. phylum digitList), we alter the declaration of the Const
operator so that an integer constant is represented by a value of the built-in phylum
INT:
[Display snapshots, window main, positioned at digitList with the template commands 0 1 2 3 4 5 6 7 8 9 available: (1) the placeholder <digit> is selected in (<digit> + <exp>); (2) COMMAND: 1 is echoed in the command line; (3) the display reads (1 + <exp>); (4) a new <digit> placeholder is selected after the 1; (5) COMMAND: 2 is echoed in the command line; (6) the display reads (12 + <exp>).]
Figure 6.3. Snapshots of the display that illustrate the use of template commands to enter
the integer 12.
window's command line, in the top line of the window, whereas input text is
echoed in place, at the position of the window's current selection.
We now discuss how to define five different kinds of hybrid editors. For each
case, we describe how to specify an expression-list editor that incorporates the
desired features.
The first variant of the expression-list editor permits text entry of integer
constants: integers can be entered directly as text whenever the selection is
positioned at an expression placeholder.
To enable text entry at unexpanded expressions, we change the text-entry
property associated with the operator Null from : to ::=. The unparsing declaration
for Null now looks like
This apparently disables the ability to enter text at list selections; however, when
the selection is a singleton sublist, the text-entry property of the root of the list
element is also taken into account when the editor determines whether text
editing is permitted. (For further details, see Section 5.4, "Selections of Singleton
Sublists Versus Selections of List Elements.") Consequently, the declarations
given above allow the user to enter text when the currently selected constituent
is a singleton sublist that consists of an expression placeholder; text editing is
forbidden for all other kinds of sublist selections.
We now turn to the lexical and parsing declarations that govern the form of
the input text that is allowed to be entered. The declaration of token INTEGER
defines the syntax of integer constants; INTEGER is specified to be a string of
numerals:
The parsing rules and their attribution declarations define how Exp-valued and
ExpList-valued parse trees are translated to values of phyla exp and exp_list,
respectively.
Note that the call on the built-in function STRtoINT has the effect of converting
the text of the INTEGER token to a value of phylum INT.
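The translation from token text to an abstract term can be pictured with a small Python sketch. It is an analogy only: the function parse_integer and the tuple encoding ("Const", n) are assumptions made for illustration, and Python's int() merely plays the role that STRtoINT plays in SSL.

```python
import re

# INTEGER is specified to be a string of numerals.
INTEGER = re.compile(r"[0-9]+")


def parse_integer(text):
    """Translate input text to the term ("Const", n); return None on a syntax error."""
    if INTEGER.fullmatch(text) is None:
        return None
    # int() stands in for the built-in function STRtoINT.
    return ("Const", int(text))
```

For instance, parse_integer("123") produces a Const term holding the integer 123, while text that is not a string of numerals is rejected.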
To complete the specification of the editor, transformation declarations define
templates for the five arithmetic operators:
transform exp
on "+" <exp> : Plus(<exp>, <exp>),
on "-" <exp> : Minus(<exp>, <exp>),
on "*" <exp> : Times(<exp>, <exp>),
on "/" <exp> : Div(<exp>, <exp>),
on "u-" <exp> : Uminus(<exp>)
These rules are identical to the template declarations used in the pure structure
editor specified in Section 6.2.1, "Structural entry," except that because integers
are entered directly as text, there is no need to have a template for operator
Const.
The version of the expression-list editor specified by the rules given above is
a hybrid of a text editor and a structure editor. Arithmetic operators are entered
top down using template commands; integer constants are entered by typing the
text of the integer directly. Template commands are echoed in the command
line, whereas text entered for integer constants is echoed in-place at the position
of the window's current selection. Figure 6.4 shows the integer 123 being
entered. From top to bottom, the snapshots portray the state of the screen as the
characters 1, 2, and 3 are entered, followed by the command forward-with-
optionals, which terminates entry and advances the selection to the next com-
ponent of the tree.
These rules specify the concrete syntax by ambiguous parsing rules, but prece-
dence declarations are supplied to disambiguate them. (In the example above,
the precedence declarations are the lines that begin with the keyword left.)
The LALR(1) parser generator yacc [Johnson78], which is used by the
Synthesizer Generator to create an editor's parser, makes use of precedence
declarations to resolve ambiguities in the grammar rules. Ambiguities are of two types,
corresponding to the following kinds of conflicts that would arise if there were
no disambiguation mechanism:
1) A shift/reduce conflict arises if there is no way to decide, based on the
contents of the stack and the next token α, whether to apply a reduction rule to
the stack or to shift α onto the stack, thereby deferring the reduction.
2) A reduce/reduce conflict arises when more than one reduction can be
applied to the stack.
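The way yacc uses precedence and associativity to settle a shift/reduce conflict can be sketched in Python. This is an illustration under stated assumptions: the table of levels below is hypothetical (two left-associative levels, with * and / binding more tightly than + and -), and the function resolve is not part of yacc or SSL. A rule is taken to have the precedence of the operator token it contains.

```python
# Assumed precedence declarations: later declarations get higher levels.
LEVEL = {"+": 1, "-": 1, "*": 2, "/": 2}
ASSOC = {"+": "left", "-": "left", "*": "left", "/": "left"}


def resolve(rule_operator, next_token):
    """Choose between reducing by a rule and shifting the next token."""
    rule_level = LEVEL[rule_operator]
    token_level = LEVEL[next_token]
    if token_level > rule_level:
        return "shift"    # the incoming operator binds more tightly
    if token_level < rule_level:
        return "reduce"   # the rule on the stack binds more tightly
    # Equal levels: associativity decides.
    if ASSOC[next_token] == "left":
        return "reduce"
    if ASSOC[next_token] == "right":
        return "shift"
    return "error"        # non-associative operators may not be chained
```

With Exp + Exp on the stack and * as the next token, the sketch shifts, so 2 + 3 * 4 groups as 2 + (3 * 4); with + as the next token it reduces, so 2 + 3 + 4 groups to the left.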
[Display snapshots, window main, positioned at exp: (1) the placeholder <exp> is selected in (<exp> + 3), with the template commands + - * / u- available; (2) to (4) the selection shows 1, 12, and 123 as the characters are typed in place; (5) the display reads (123 + 3), with the selection advanced to 3.]
Figure 6.4. Snapshots of the display that illustrate text entry to input the integer 123.
From top to bottom: the characters 1, 2, and 3 are entered, followed by the command
forward-with-optionals, which terminates entry and advances the selection to the next
component of the tree.
³It is possible to assign a different level to a parsing rule using a prec directive. For example, to
declare that the rule for parsing unary minus has the same precedence level as the '*' token, one could
use the following declaration:
[Display snapshots, window main, positioned at exp: (1) the placeholder <exp> is selected in (<exp> + 3), with the template commands + - * / u- available; (2) to (4) the selection shows the text typed in place as the characters 2, +, and 4 are entered; (5) the display reads ((2 + 4) + 3), with the selection advanced to 3.]
Figure 6.5. Snapshots of the display that illustrate text entry to input the expression 2 +
4. From top to bottom: the characters 2, +, and 4 are entered, followed by the command
forward-with-optionals, which terminates entry and advances the selection to the next
component of the tree.
are depicted show the characters 2, +, and 4 being entered, followed by a com-
mand to terminate text entry and advance the selection to the next component.
The editor defined by this version of the editor specification will exhibit the
hybrid-editing characteristics of the Interlisp and MENTOR editors; input that
the user types as text gets turned into structure that can only be re-edited by
structural cut-and-paste operations.
In our opinion, it is usually a mistake to define such an editing interface, for it
gives an inconsistent treatment of entry and modification operations. The object
is entered as text, not according to its structure, so it is not obvious what the
underlying structure is when the user reedits it. Although this is not really a
problem in the Interlisp editor because Lisp has a trivial abstract syntax, the
inconsistency is likely to be a real problem for a language with a more compli-
cated syntax, such as languages in the Algol family.
One of our design goals was to give editor-designers as much freedom as possible
to design whatever user interface they deem desirable. Thus, although we
felt the user interface incorporated in the version of the expression-list editor
that is described above is not a particularly desirable combination of text editing
and structure editing, we felt that SSL had to be expressive enough to allow its
specification.
ExpList {
    inherited exp_list tail;
    synthesized exp_list reversed;
    };
exp_list ~ ExpList.reversed { ExpList.tail = ExpListNil; };
ExpList ::= (Exp) { ExpList.reversed = Exp.abs :: ExpList.tail; }
    | (ExpList ',' Exp) {
        ExpList$2.tail = Exp.abs :: ExpList$1.tail;
        ExpList$1.reversed = ExpList$2.reversed;
        }
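The interplay of the inherited attribute tail and the synthesized attribute reversed can be traced with a small Python sketch. It is a hand-written simulation, not generated from SSL: the function reversed_list and the tuple encoding of parse trees, ("single", exp) for ExpList ::= (Exp) and ("pair", sublist, exp) for ExpList ::= (ExpList ',' Exp), are assumptions made for illustration.

```python
def reversed_list(parse_tree, tail=()):
    """Compute ExpList.reversed, given the inherited value ExpList.tail.

    The default tail=() corresponds to ExpList.tail = ExpListNil at the root.
    """
    if parse_tree[0] == "single":
        _, exp = parse_tree
        # ExpList.reversed = Exp.abs :: ExpList.tail
        return (exp,) + tail
    _, sublist, exp = parse_tree
    # ExpList$2.tail = Exp.abs :: ExpList$1.tail
    # ExpList$1.reversed = ExpList$2.reversed
    return reversed_list(sublist, (exp,) + tail)


# Left-recursive parse tree for the input text "e1,e2,e3".
tree = ("pair", ("pair", ("single", "e1"), "e2"), "e3")
elements = reversed_list(tree)
```

Each element is prepended to the tail on the way down the left-recursive spine, so the value delivered at the root lists the expressions in the order in which they were typed.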
Figure 6.6 shows the last step of entering the list 2 + 3, 4 + 5; command
forward-with-optionals terminates text entry and advances the selection to a
new expression placeholder.
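[Display snapshots, window main, positioned at exp_list with the template commands + - * / u- available: (1) the text 2+3,4+5 has been typed in place; (2) the display reads (2 + 3), (4 + 5), <exp>, with the new placeholder <exp> selected.]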
Figure 6.6. Snapshots of the display that illustrate text entry to input the expressions 2 +
3 and 4 + 5. From top to bottom: the character string 2+3,4+5 is entered, followed by
command forward-with-optionals, which terminates entry and advances the selection to
a new expression placeholder.
Yet another version of the expression-list editor implements expressions that are
phrases, or units that can be selected and modified only as an entire unit. This
version can be obtained by replacing the unparsing declarations that were given
previously by
The use of ^ for the resting-place properties of the phylum occurrences in these
In the editors illustrated thus far, whenever a command is entered, the characters
of the command are echoed in the command line (the top line of the window)
rather than "in-place" at the location of the currently selected constituent. In
contrast, when generated editors permit the user to enter fragments of objects
directly as text, the characters that the user types are echoed in-place. (To see
examples that illustrate the differences in the two modes of entry, refer back to
Figures 6.3 - 6.6.)
In this section, we show how to define an editor that acts like an ordinary
structure editor, except that the user is able to type commands in-place in the
window rather than in the window's command line.
The use of the attribution mechanism for translating input text to an abstract-
syntax tree provides an editor-designer with a powerful formalism that can be
used to define textual and structural interfaces in whatever balance is desired.
The editor-designer is free to define parsing rules to recognize whatever input
Because these rules are parsing rules, and not transformation declarations,
they apply only to input text and not to command text. Obviously, we also have
to modify Variation 2 to allow input text to be entered at exp placeholders. This
is governed by the unparsing declaration for the Null operator, which must make
use of the ::= text-entry property. The appropriate unparsing declaration, shown
earlier, is repeated for emphasis:
The rules given above define a set of editing commands that implement a
structural interface; it differs from the structural interface incorporated in Varia-
tion 2 in that the user is allowed to type his commands as input text, which
means that they are echoed in-place, at the currently selected constituent, rather
than in the window's command line.
Figure 6.7 shows a sequence of snapshots of the display that illustrate a tem-
plate being inserted using Variation 6. Note that the input text .+ is echoed in-
place. In the last snapshot, the text has parsed successfully by the rules for phy-
lum ExpCommand; the resulting template term, Plus(Null, Null), computed by
the attribution rules of the ExpCommand grammar, has been inserted in the
tree and the selection has been advanced to the next placeholder.
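[Display snapshots, window main, positioned at exp: (1) the placeholder <exp> is selected in (<exp> + 3); (2) the input text .+ is echoed in place; (3) the display reads ((<exp> + <exp>) + 3), with the first placeholder selected.]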
Figure 6.7. Snapshots of the display that, from top to bottom, illustrate text input that is
interpreted as a command to insert a template.
In all the versions of the expression-list editor discussed in the previous two sec-
tions, there are only two ways to modify an existing expression: either reorgan-
ize it using the basic cut-and-paste operations or replace it entirely by first delet-
ing the old expression and then entering the new one in its entirety.
SSL also allows the editor-designer to define operations for reediting objects.
In this section, we discuss how to change the specification of the expression-list
editor to allow text editing of expressions. Section 6.2.4 discusses how to define
construct-to-construct transformations that permit the user to make structural
modifications of expressions.
Text editing is controlled by the text-entry symbol that appears in an
operator's unparsing declaration. In the previous section, which discussed how
to specify textual operations for creating new objects, the relevant text-entry
symbol was the one for the operator Null because Null is the operator that
appears at the root of the placeholder term. To allow textual modifications of
terms other than the placeholder term, the unparsing declarations for the other
operators of the specification are changed to use the text-entry symbol ::= in
place of:.
In the unparsing declarations given below, the text-entry property of each opera-
tor is ::=, rather than :, as in the unparsing declarations that have appeared ear-
lier.
An editor specification that permits textual reediting really only makes sense
if text can be reparsed, and in particular only if slightly modified text can be
reparsed. What this means for the editor-designer is that the display image of an
object, generated from the unparsing declarations, should be a subset of the
input language that is defined by the parsing rules. For example, for this version
of the editor, we want to incorporate parsing rules that allow arbitrary expres-
sions to be parsed. The rules for Exp have been given earlier, but are repeated
below for emphasis; the rules for ExpList are written so that only a singleton
sublist can be parsed:
Note that the parsing declarations and the unparsing declarations given in this
section are mutually compatible in the sense that the display image generated by
the unparsing declarations will always be a subset of what is parsable by the
parsing rules.
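The compatibility requirement can be checked mechanically: unparsing a term and parsing the resulting text should give back the same term. The Python sketch below illustrates the idea for fully parenthesized expressions over non-negative integer constants. It is an assumption-laden stand-in, not the SSL specification: the functions unparse and parse, the tuple encoding of terms, and the recursive-descent parser are hypothetical, and the parser accepts exactly the display language, whereas the parsing rules in the text accept a larger input language.

```python
import re

SYMBOL = {"Plus": "+", "Minus": "-", "Times": "*", "Div": "/"}
OPERATOR = {symbol: op for op, symbol in SYMBOL.items()}


def unparse(term):
    """Display image: constants as numerals, binary terms fully parenthesized."""
    if term[0] == "Const":
        return str(term[1])
    op, left, right = term
    return "(" + unparse(left) + " " + SYMBOL[op] + " " + unparse(right) + ")"


def parse(text):
    """Parse fully parenthesized input text; raise ValueError on a syntax error."""
    tokens = re.findall(r"[0-9]+|[-+*/()]", text)
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise ValueError("unexpected character")
    term, rest = _exp(tokens)
    if rest:
        raise ValueError("trailing input")
    return term


def _exp(tokens):
    """Parse one expression from the front of tokens; return (term, remaining)."""
    if not tokens:
        raise ValueError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head.isdigit():
        return ("Const", int(head)), rest
    if head != "(":
        raise ValueError("expected an integer or '('")
    left, rest = _exp(rest)
    if not rest or rest[0] not in OPERATOR:
        raise ValueError("expected an operator")
    op, rest = OPERATOR[rest[0]], rest[1:]
    right, rest = _exp(rest)
    if not rest or rest[0] != ")":
        raise ValueError("expected ')'")
    return (op, left, right), rest[1:]


original = ("Plus", ("Plus", ("Const", 2), ("Const", 3)), ("Const", 4))
image = unparse(original)                   # the text the user sees and edits
edited = parse(image.replace("3", "6"))     # reparse after a small textual change
```

Because every display image is accepted by the parser, a slightly modified image such as the one produced by replacing 3 with 6 can be reparsed into a new term.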
Figure 6.8 shows a sequence of snapshots of the display that illustrate the
expression ((2 + 3) + 4) being reedited. From top to bottom, they portray the
state of the screen as the character 3 is replaced by 6. In the final image, the
modified text has been successfully re-parsed, the new term Plus(Plus(2, 6), 4)
has been inserted, and the selection has been advanced to the next placeholder.
This version of the expression-list editor defines essentially the same mix of
hybrid-editing operations that was permitted by the Cornell Program
Synthesizer's editor. By permitting only one editing style for each syntactic
[Display snapshots, window main, positioned at exp_list: four images of the selected expression ((2 + 3) + 4) as it is reedited.]
Figure 6.8. Snapshots of the display that, from top to bottom, illustrate textual reediting
of the expression ((2 + 3) + 4).
ExpList {
    inherited exp_list tail;
    synthesized exp_list reversed;
    };
exp_list ~ ExpList.reversed { ExpList.tail = ExpListNil; };
ExpList ::= (Exp) { ExpList.reversed = Exp.abs :: ExpList.tail; }
    | (ExpList ',' Exp) {
        ExpList$2.tail = Exp.abs :: ExpList$1.tail;
        ExpList$1.reversed = ExpList$2.reversed;
        }
Minus
Times [^ ::= "(" @ " * " @ ")"]
II
transform exp
on "factor-left" Plus(Times(a,b),Times(a,c)) : Times(a,Plus(b,c)),
on "factor-left" Minus(Times(a,b),Times(a,c)) : Times(a,Minus(b,c)),
on "factor-right" Plus(Times(b,a),Times(c,a)) : Times(Plus(b,c),a),
on "factor-right" Minus(Times(b,a),Times(c,a)) : Times(Minus(b,c),a),
on "distribute-left" Times(a,Plus(b,c)) : Plus(Times(a,b),Times(a,c)),
on "distribute-left" Times(a,Minus(b,c)) : Minus(Times(a,b),Times(a,c)),
on "distribute-right" Times(Plus(b,c),a) : Plus(Times(b,a),Times(c,a)),
on "distribute-right" Times(Minus(b,c),a) : Minus(Times(b,a),Times(c,a)),
on "commute" Plus(a,b) : Plus(b,a),
on "commute" Times(a,b) : Times(b,a)
main
((3 * 4) + (3 * 2))
main
(3 * (4 + 2))
For this term, the only transformations that apply are distribute-left and
commute. Next, we apply the commute transformation.
main
((4 + 2) * 3)
main
((4 * 3) + (2 * 3))
We could get back to the original expression by selecting each of the summands
individually and applying the commute transformation to them.
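The sequence of restructurings in this example can be replayed with a small term rewriter. The Python sketch below is a model for illustration only: the tuple encoding of terms and the functions match, build, applicable, and invoke are hypothetical, only six of the transformations are included, and it assumes that a pattern variable occurring twice (such as a in the factoring patterns) must be bound to equal subterms.

```python
def match(pattern, term, env):
    """Return env extended with bindings if pattern matches term, else None."""
    if isinstance(pattern, str):              # a pattern variable such as "a"
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if (not isinstance(term, tuple) or len(term) != len(pattern)
            or term[0] != pattern[0]):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        env = match(sub_pattern, sub_term, env)
        if env is None:
            return None
    return env


def build(expression, env):
    """Instantiate the right-hand side of a transformation."""
    if isinstance(expression, str):
        return env[expression]
    return (expression[0],) + tuple(build(e, env) for e in expression[1:])


def P(a, b): return ("Plus", a, b)
def T(a, b): return ("Times", a, b)
def C(n): return ("Const", n)


RULES = [
    ("factor-left",      P(T("a", "b"), T("a", "c")), T("a", P("b", "c"))),
    ("factor-right",     P(T("b", "a"), T("c", "a")), T(P("b", "c"), "a")),
    ("distribute-left",  T("a", P("b", "c")), P(T("a", "b"), T("a", "c"))),
    ("distribute-right", T(P("b", "c"), "a"), P(T("b", "a"), T("c", "a"))),
    ("commute",          P("a", "b"), P("b", "a")),
    ("commute",          T("a", "b"), T("b", "a")),
]


def applicable(term):
    """Names of the commands whose pattern matches the selected term."""
    names = []
    for name, pattern, _ in RULES:
        if name not in names and match(pattern, term, {}) is not None:
            names.append(name)
    return names


def invoke(name, term):
    """Apply the first transformation called name whose pattern matches term."""
    for rule_name, pattern, expression in RULES:
        if rule_name == name:
            env = match(pattern, term, {})
            if env is not None:
                return build(expression, env)
    return term  # no transformation applies: the selection is left unchanged


t0 = P(T(C(3), C(4)), T(C(3), C(2)))   # ((3 * 4) + (3 * 2))
t1 = invoke("factor-left", t0)          # (3 * (4 + 2))
t2 = invoke("commute", t1)              # ((4 + 2) * 3)
t3 = invoke("distribute-right", t2)     # ((4 * 3) + (2 * 3))
```

In this model, applicable(t1) reports only distribute-left and commute, in agreement with the discussion of the factored term.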
It is possible to have more than one transformation with the same command
name. For example, the two transformations defined above for commuting
expressions are both associated with the command name commute:
transform exp
on "commute" Plus(a,b) : Plus(b,a),
on "commute" Times(a,b) : Times(b,a),
Because the root operator is different for the patterns that appear in the two
transformations, it happens that they can never both match the current selection;
thus, at most one of the transformations ever applies. In general, however, the
current selection may match the pattern of several transformations that have the
same command name. In this case, invoking the command causes the transfor-
mation that appears earliest in the specification to be applied.
transform exp
on "form-subtraction" Plus(a,Uminus(b)) : Minus(a,b),
on "form-subtraction" Plus(a,b) : Minus(a,Uminus(b)),
on "form-addition" Minus(a,Uminus(b)) : Plus(a,b),
on "form-addition" Minus(a,b) : Plus(a,Uminus(b))
main
(3 + 2)
main
(3 - -2)
In contrast, if the initial situation is the one pictured below, the current selec-
tion, (3 + -2), corresponds to the term Plus(Const(3), Uminus(Const(2))).
This term is of the form Plus(a, Uminus(b)), which matches the pattern of the
first of the two form-subtraction transformations.
main
(3 + -2)
main
(3 - 2)
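The effect of declaration order on the two form-subtraction transformations can be sketched in Python. The encoding is hypothetical: each transformation is written as a function that returns the replacement term or None, terms are tuples such as ("Plus", a, b), and invoke tries the functions in the order given, which models the rule that the earliest matching transformation is applied.

```python
def form_subtraction_1(term):
    """Plus(a, Uminus(b)) : Minus(a, b)"""
    if term[0] == "Plus" and term[2][0] == "Uminus":
        return ("Minus", term[1], term[2][1])
    return None


def form_subtraction_2(term):
    """Plus(a, b) : Minus(a, Uminus(b))"""
    if term[0] == "Plus":
        return ("Minus", term[1], ("Uminus", term[2]))
    return None


def invoke(transformations, term):
    """Apply the earliest transformation in the list whose pattern matches."""
    for transformation in transformations:
        result = transformation(term)
        if result is not None:
            return result
    return None


three, two = ("Const", 3), ("Const", 2)
declared_order = [form_subtraction_1, form_subtraction_2]

plain = invoke(declared_order, ("Plus", three, two))                 # (3 + 2)
negated = invoke(declared_order, ("Plus", three, ("Uminus", two)))   # (3 + -2)
swapped = invoke(declared_order[::-1], ("Plus", three, ("Uminus", two)))
```

With the declared order, (3 + -2) becomes (3 - 2); if the two declarations were swapped, the more general pattern Plus(a, b) would match first and the same selection would become (3 - --2), so the more specific pattern has to appear earlier.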
⁴Additional material concerning the use of attribute equations in editor specifications also appears in
later chapters of the book.
6.3. Defining Computed Display Representations
Suppose we wish to display expressions in infix format, but using only the
minimal number of parentheses needed to represent the underlying expression
tree. In determining an expression's minimal parenthesization, we assume the
standard precedence rules for the arithmetic-operation symbols: Times and
Div have higher precedence than Plus and Minus, and unparenthesized terms at
the same level associate to the left. For instance, the term Times(Plus(a, b), c)
should be displayed as (a + b) * c, and Plus(a, Times(b, c)) should be
displayed as a + b * c. In addition, there should be no redundant parentheses:
i.e. ((a + b)) * c and (a + b * c) are forbidden. Figure 6.9 lists the minimally
parenthesized infix representations for all terms of the form Op1(a, Op2(b, c))
and Op2(Op1(a, b), c).
Note that the use here of the term "precedence level" is separate and distinct
from its use in Section 6.2. In Section 6.2, "precedence level" was used in con-
nection with precedence declarations for disambiguating parsing rules. The pre-
cedence levels discussed in this section are introduced for computing the
minimal use of parentheses in an expression's infix display representation. The
latter precedence levels are defined with SSL's attribution mechanism; different
precedence levels are represented by different values of certain attributes. In
contrast, precedence declarations used for disambiguating parsing rules are a
Figure 6.9. Minimally parenthesized infix representations for all terms of the form
Op1(a, Op2(b, c)) and Op2(Op1(a, b), c).
feature that is built into SSL; tokens are assigned different precedence levels
depending on the order in which the precedence declarations are encountered.
The equations that define the values of attributes lp and rp, given below, define
them to be either the null string or a parenthesis-valued string - ( in the case of
lp, ) in the case of rp. The lp and rp values are incorporated as components of
the unparsing declarations, thereby placing parentheses or null strings around
each subexpression, as shown below:
In the equations given above, note how the precedence value passed down to
the right child of a binary operator is made one greater than the local precedence
level. Artificially raising the precedence level in this fashion causes parentheses
to be generated for terms like Times(a, Div(b, c)) and Div(a, Times(b, c)),
which are displayed as a * (b / c) and a / (b * c), respectively. It also causes
parentheses to be supplied around the second component of a term that involves
associative operators but is not a "left association." For example, the term
Plus(a, Plus(b, c)) is displayed with parentheses to indicate that the term is
grouped to the right: a + (b + c); by contrast, the term Plus(Plus(a, b), c) is
displayed without any parentheses: a + b + c.
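The scheme of passing a precedence value down the tree can be sketched in Python. This is a reconstruction under assumptions rather than the SSL equations themselves: the function display, the tuple encoding of terms, and the particular level numbers (1 for Plus and Minus, 2 for Times and Div, 0 at the root) are chosen for illustration, and unary minus is omitted. The parameter context plays the role of the inherited precedence attribute; lp and rp are computed as described in the text.

```python
LEVEL = {"Plus": 1, "Minus": 1, "Times": 2, "Div": 2}
SYMBOL = {"Plus": "+", "Minus": "-", "Times": "*", "Div": "/"}


def display(term, context=0):
    """Minimally parenthesized infix image of term in the given context."""
    if isinstance(term, str):          # a leaf such as "a"
        return term
    op, left, right = term
    local = LEVEL[op]
    # Parentheses are needed only if the context binds more tightly.
    lp, rp = ("(", ")") if local < context else ("", "")
    return (lp
            + display(left, local)           # left child: the local level
            + " " + SYMBOL[op] + " "
            + display(right, local + 1)      # right child: one level higher
            + rp)
```

For example, display(("Plus", "a", ("Plus", "b", "c"))) yields a + (b + c), while display(("Plus", ("Plus", "a", "b"), "c")) yields a + b + c.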
The equations given above would be appropriate for an editor that allows
structural editing of subexpressions. For example, the resting-place symbols in the
unparsing declarations for Variation 11 make it possible to select subexpressions
and edit the substructure of an expression directly. Under these conditions, it is
desirable to generate parentheses that indicate the true grouping of terms
involving associative operators. For example, the parentheses in a + (b + c) but not in
a + b + c allow one to determine that the former corresponds to the term
Plus(a, Plus(b, c)) and the latter to Plus(Plus(a, b), c).
However, an editor specification could use an alternative set of unparsing
declarations to specify that expressions have no internal resting places, such as
The absence of any resting-places in these declarations means that the internal
structure of an expression is inaccessible, which forces expressions to be treated
as textual phrases. For these editing conditions, if a term can associate either
way without affecting the expression's value, we would like to make an
expression's internal structure completely invisible to the user. In particular, the
three associative laws
(a + b) + c = a + (b + c)
(a * b) * c = a * (b * c)
(a + b) - c = a + (b - c)
say that the respective terms for these expressions have the same value no
matter what the association. Accordingly, we would like parentheses to be left
out of the display representation of such terms. For example, the terms
Plus(Plus(a, b), c) and Plus(a, Plus(b, c)) should both be displayed as
a + b + c. (We are assuming that the symbol / denotes the operation of integer
division, which means that * and / are not associative; in general, for these
operations (a * b) / c ≠ a * (b / c). Consequently, the term Div(Times(a, b), c)
should be displayed as a * b / c, but Times(a, Div(b, c)) should be displayed as
a * (b / c).)
Figure 6.10 lists our revised notion of the minimally parenthesized infix
representation for the terms of the form Op1(a, Op2(b, c)) and
Op2(Op1(a, b), c). Note the absence of parentheses in the lines marked with a
dagger (†).
One way to express the parenthesization scheme listed in Figure 6.10 is given
in Figure 6.11. The declarations in Figure 6.11 use four precedence levels to
capture the revised parenthesization requirements, rather than just three as in the
previous version. The main difference is that the local level of operator Div is
2, whereas Times has precedence level 3; the use of different levels for Div and
Times makes it possible to write rules that distinguish between them.
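One way to see how four levels can capture the revised requirements is the following Python sketch. It is an assumption-laden illustration, not a transcription of Figure 6.11: the tables LEVEL and CHILD_LEVELS and the function unparse are hypothetical, and the particular numbers are merely one assignment that is consistent with giving Div level 2 and Times level 3.

```python
# Terms are nested tuples (op, left, right); variables are plain strings.
# Assumed local levels: Div is 2 and Times is 3, as in the text.
LEVEL = {"+": 1, "-": 1, "/": 2, "*": 3}
# Assumed levels passed to the (left, right) children of each operator.
CHILD_LEVELS = {"+": (1, 1), "-": (1, 2), "*": (2, 3), "/": (2, 4)}


def unparse(term, inherited=0):
    """Return infix text, omitting parentheses where association cannot change the value."""
    if isinstance(term, str):
        return term
    op, left, right = term
    left_level, right_level = CHILD_LEVELS[op]
    text = f"{unparse(left, left_level)} {op} {unparse(right, right_level)}"
    return f"({text})" if inherited > LEVEL[op] else text


print(unparse(("+", "a", ("+", "b", "c"))))   # a + b + c
print(unparse(("*", "a", ("/", "b", "c"))))   # a * (b / c)
print(unparse(("/", ("*", "a", "b"), "c")))   # a * b / c
```

Because Times passes level 3 to its right child, a nested Times (level 3) is left bare while a nested Div (level 2) is parenthesized, which is exactly the distinction that a shared level for the two operators could not express.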
6.4. Context-Sensitive Translations and Transformations 135
Figure 6.10. Minimally parenthesized infix representations for all terms of the form
Op1(a, Op2(b, c)) and Op2(Op1(a, b), c). Note the absence of any parentheses in the
lines marked with a dagger (†).
Figure 6.11. Attribute equations that express the minimal parenthesization scheme given
in Figure 6.10.
Recall that any declared phylum, including the phyla used to represent the
abstract syntax of editable objects, is a valid attribute type. Hence, attribute
computations can construct terms of the phyla that represent the abstract syntax
of editable objects. By making available the full power of the attribution
mechanism for defining the translation of input text to a term of the abstract syn-
Second, we define phylum identifier and give rules for translating textual input
to an identifier.
list ENV;
ENV : NullEnv( )
    | EnvConcat(BINDING ENV) [@ : ^ [",%n"] @]
    ;
The recursive function lookup returns the binding for a given IDENTIFIER if it
exists in a given environment, or returns the term Null if the IDENTIFIER does
not exist.
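The behavior of lookup can be illustrated with a short Python sketch. This is not the SSL definition: the environment is modeled as a list of (identifier, expression) pairs, Python's None stands in for the term Null, and the assumption that the first matching binding wins is ours.

```python
def lookup(identifier, env):
    """Return the expression bound to identifier in env, or None if it is unbound.

    env is a list of (name, expression) pairs; None plays the role of the term Null.
    """
    if not env:                      # corresponds to the empty environment
        return None
    (name, expression), rest = env[0], env[1:]
    if name == identifier:
        return expression
    return lookup(identifier, rest)  # recurse on the remaining bindings


env = [("a", "2 + 5")]
print(lookup("a", env))   # 2 + 5
print(lookup("b", env))   # None
```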
It is also necessary to supply an attribute equation for the exp.env attribute in the Binding operator,
although this attribute is never actually given a value:
ExpCommand {
    inherited ENV env;
    synthesized exp abs;
    };
exp ~ ExpCommand.abs { ExpCommand.env = exp.env; };
ExpCommand ::= ('.' IDENTIFIER) {
    ExpCommand.abs = lookup(IDENTIFIER, ExpCommand.env);
    };
main
define a = 2 + 5 in [<exp>] ni
[The remaining snapshots of this figure are not recoverable from the source.]
Figure 6.12. Snapshots of the display that, from top to bottom, illustrate the use of an ab-
breviation. By a context-sensitive translation, the input text .a is translated to the expres-
sion to which the identifier a is bound, i.e. (2 + 5).
synthesized or inherited attribute a, then the expression p.a is legal within the
transformation's expression and denotes the value of the corresponding attribute
instance.
Example. To illustrate the definition of a context-sensitive transformation, let
us extend the expression-list editor to permit the use of abbreviation names in
expressions:
exp : Use(IDENTIFIER);
Exp ::= (IDENTIFIER) { Exp.abs = Use(IDENTIFIER); };
Note the use of e.env as the second argument of the call on lookup; e.env
denotes the value of the env attribute of the root of the term to which pattern
variable e is bound. In other words, e.env refers to the value of the env
attribute of the root of the currently selected expression.
Figure 6.13 shows a sequence of snapshots of the display that illustrate the
effect of a context-sensitive transformation. The invocation of the
expand-abbreviation transformation in Figure 6.13 causes identifier a to be replaced by
the expression 2 + 5.
main
define a = 2 + 5 in [a] ni
main
COMMAND: expand-abbreviation
define a = 2 + 5 in [a] ni
main
define a = 2 + 5 in [2 + 5] ni
Figure 6.13. Snapshots of the display that illustrate the expansion of an identifier to the
expression to which it is bound in an enclosing abbreviation. This transformation is
defined with a context-sensitive transformation rule.
CHAPTER 7
main
program <identifier>;
var
    [a] : integer;
    a : boolean;
    c : integer;
begin
    b := c;
    while a do
        <statement>
end.
Positioned at identifier
the loop-condition has the wrong type. Each error is signaled to the user by
attaching an appropriate comment.
main
program <identifier>;
var
    [a] { MULTIPLY DECLARED } : integer;
    a { MULTIPLY DECLARED } : boolean;
    c : integer;
begin
    b { NOT DECLARED } := c;
    while a { BOOLEAN EXPRESSION NEEDED } do
        <statement>
end.
Positioned at identifier
tion needs to be aggregated and in what manner these aggregates will be passed
to the locations that need them. At the same time, it is often possible to formu-
late different attribution schemes to implement the same aggregation strategy.
For example, to collect information that is distributed throughout an object,
left-to-right threading and right-to-left threading are sometimes both possible.
Additional considerations arise when implementing a given design. There are
implementation choices to be made similar to those that arise with any program-
ming language, such as the choice among different implementations of a given
abstract data type (e.g. the choice between using a list versus using an AVL-tree
to implement a dictionary). The data types available in SSL and the techniques
for manipulating them will be familiar to those acquainted with other functional
programming languages. 1
These issues are illustrated using two examples that represent alternatives to
the design presented in Chapter 4 for an editor that performs name analysis and
type checking. Instead of annotating each occurrence of an undeclared variable
with a warning, the editor discussed in Section 7.1 collects the names of unde-
clared variables and lists them at the beginning of the declaration section.
Instead of requiring an identifier to be used in a way that is consistent with its
declaration, the editor discussed in Section 7.2 infers a type for each variable
from its uses and generates an appropriate declaration.
1 We anticipate that future releases of the Synthesizer Generator will have some additional built-in
data types for aggregates in order to provide more efficient updating of aggregate-valued attributes
using the techniques described in [Hoover87].
7.1. Aggregation and Information-Passing Strategies 147
The editor described in Chapter 4 uses the first of these strategies, although it
uses an indirect method of aggregation in which the aggregate is "created" by a
syntactic reference. Recall that a syntactic reference permits attribute equations
to refer to and to perform computations on syntactic components. In Chapter 4,
the program's declList component serves as the aggregated declaration
information. The declarations given below, which are repeated from Chapter 4,
establish a local attribute, named env, whose value is the program's declList
component:
The expression {Prog.env} in the second equation refers to the instance of attri-
bute env attached to the nearest enclosing Prog operator; in this case there can
only be one such operator, the one at the root of the program tree.
Direct use of component declList via a syntactic reference is by no means the
only way of aggregating information from the declarations. For example, it
would be possible to attach explicit env attributes to each phylum and write
appropriate attribute equations to define their values, as done in Chapter 3 in
Figure 3.2.
In the editor described in Chapter 4, violations of the constraint that a declara-
tion be supplied for each variable used in a program are reported by annotating
each use of an undeclared variable with a warning message. An alternative way
to report that a program contains undeclared variables would be to collect the
names of the undeclared variables and list them at the beginning of the
program's declaration section. For instance, in the example program shown
below, the fact that a declaration has not been provided for variables a and b is
reported on the second line:
148 Chapter 7. Performing Static Inferences with Attributes
main
program <identifier>;
var a, b : UNDECLARED;
    [<identifier>] : integer;
begin
    a := b
end.
Positioned at identifier
When a declaration for variable a is inserted, the message on the second line
would change to reflect the fact that variable b is the lone remaining variable
without a declaration:
main
program <identifier>;
var b : UNDECLARED;
    [a] : integer;
begin
    a := b
end.
Positioned at identifier
We now describe how to write an attribution scheme that achieves this effect.
The attribute equations use the third aggregation strategy outlined above: the
equations aggregate information from the declarations, aggregate additional
information from the statements, and compare the two aggregates. In particular,
the attribution scheme collects the set of declared variables, the set of used vari-
ables, and takes their set difference.
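The set operations at the heart of this scheme can be sketched in Python. The fragment below is an illustration only: it assumes each set is kept as a sorted Python list without duplicates, and the names id_set_union and id_set_difference are hypothetical stand-ins rather than the SSL functions themselves.

```python
def id_set_union(s1, s2):
    """Merge two sorted, duplicate-free lists into one sorted, duplicate-free list."""
    if not s1:
        return s2
    if not s2:
        return s1
    i1, i2 = s1[0], s2[0]
    if i1 < i2:
        return [i1] + id_set_union(s1[1:], s2)
    if i1 == i2:
        return [i1] + id_set_union(s1[1:], s2[1:])
    return [i2] + id_set_union(s1, s2[1:])


def id_set_difference(s1, s2):
    """Return the elements of sorted list s1 that do not occur in sorted list s2."""
    if not s1 or not s2:
        return s1
    i1, i2 = s1[0], s2[0]
    if i1 < i2:
        return [i1] + id_set_difference(s1[1:], s2)
    if i1 == i2:
        return id_set_difference(s1[1:], s2[1:])
    return id_set_difference(s1, s2[1:])


# For the program "var c : integer; begin a := b; c := a end":
used = id_set_union(["a", "b"], ["a", "c"])
declared = ["c"]
undeclared = id_set_difference(used, declared)
print(undeclared)   # ['a', 'b']
```

Keeping both lists sorted lets each operation finish in a single pass over its arguments.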
The two collections of identifiers are represented by phylum IdSet, which is
defined in Figure 7.1 along with several operations on IdSets. An IdSet is a list
of identifiers; the IdSet operations given in Figure 7.1 implement the data type
"set of identifiers," maintaining each IdSet value as a sorted list.
list IdSet;
IdSet : IdSetNil( ) [@ : ]
      | IdSetPair(identifier IdSet) [@ : ^ [", "] @]
      ;
IdSet NullIdSet( ) { IdSetNil };
BOOL IsNull(IdSet s) { s == IdSetNil };
BOOL IsElement(identifier i, IdSet s) {
    with (s) (
        IdSetNil: false,
        IdSetPair(id, t): i == id ? true : IsElement(i, t)
        )
    };
IdSet SingletonIdSet(identifier i) { i :: IdSetNil };
IdSet IdSetUnion(IdSet s1, IdSet s2) {
    with (s1) (
        IdSetNil: s2,
        IdSetPair(i1, t1): with (s2) (
            IdSetNil: s1,
            IdSetPair(i2, t2): i1 < i2 ? i1 :: IdSetUnion(t1, s2)
                : i1 == i2 ? i1 :: IdSetUnion(t1, t2)
                : i2 :: IdSetUnion(s1, t2)
            )
        )
    };
IdSet IdSetDifference(IdSet s1, IdSet s2) {
    with (s1) (
        IdSetNil: s1,
        IdSetPair(i1, t1): with (s2) (
            IdSetNil: s1,
            IdSetPair(i2, t2): i1 < i2 ? i1 :: IdSetDifference(t1, s2)
                : i1 == i2 ? IdSetDifference(t1, t2)
                : IdSetDifference(s1, t2)
            )
        )
    };
Figure 7.1. Definition of the module IdSet, which implements the data type "set of
identifiers." The operations given above maintain an identifier set as a sorted list.
To determine which identifiers lack a declaration, we use two synthesized
IdSet attributes, named declared and used. The declared attribute is a
synthesized attribute of phyla declList and decl; it is used to compute the set of
identifiers declared in the program. The used attribute is a synthesized attribute
of phyla exp, stmtList, and stmt; it is used to compute the set of identifiers that
occur in the program's statements and expressions. The attribute declarations
and equations that define declared and used are given in Figure 7.2.
The set of undeclared identifiers that are used in a program is determined by
taking the set difference of the declList.declared and stmtList.used attribute
occurrences of the Prog operator. The following attribute equations and
Figure 7.2. Attribution scheme to determine which identifiers lack a declaration. The
declared attribute computes the set of identifiers declared in a program; the used attri-
bute computes the set of identifiers that occur in the program's statements and expres-
sions.
program : Prog {
    local IdSet undeclared;
    local STR error;
    undeclared = IdSetDifference(stmtList.used, declList.declared);
    error = IsNull(undeclared) ? "" : " : UNDECLARED;";
    }
    [ @ : "program" @ ";%n"
          "var" undeclared error "%t%n"
          @ ";" "%b%n"
          "begin" "%t%n"
          @ "%b%n"
          "end."
    ]
    ;
The only other changes that must be made to the unparsing declarations
presented in Figure 4.5 are to the ones for operators Assign and Id so that they
no longer display a message indicating the presence of an undeclared variable:
main
program <identifier>;
var
    a, b : <type>;
    c : integer;
begin
    a := b;
    c := 1;
    while [<exp>] do
        <statement>
end.
Positioned at exp
2 The language's abstract syntax is redesigned to eliminate the declaration component from programs
entirely:
main
program <identifier>;
var
    a, b, c : integer;
begin
    a := b;
    c := 1;
    while [(b = c)] do
        <statement>
end.
Positioned at exp
By changing the one statement that forces the type of variable c to be integer
from c := 1 to c := true, the inferred type for all three variables changes to
boolean:
main
program <identifier>;
var
    a, b, c : boolean;
begin
    a := b;
    c := [true];
    while (b = c) do
        <statement>
end.
Positioned at exp
A program has a type conflict when the program's constituents require a vari-
able to have more than one type. For example, there are type conflicts for vari-
ables a, b, and c if we insert the statement a := 1, which requires that a be of
type integer (hence b and c are also of type integer). The type conflicts are
reported by generating a declaration that assigns them the type inconsistent.
main
program <identifier>;
var
    a, b, c : inconsistent;
begin
    a := b;
    c := true;
    while (b = c) do
        [a := 1]
end.
The four operators of typeExp are treated as the elements of a four-element lat-
tice. The top element of the lattice, EmptyTypeExp, stands for the unknown,
or most general, type; the bottom element of the lattice, NoTypeExp, stands for
the inconsistent type.
         EmptyTypeExp
          /        \
  IntTypeExp      BoolTypeExp
          \        /
          NoTypeExp
one context and as a boolean in another, the final type inferred for the variable is
NoTypeExp, the meet of IntTypeExp and BoolTypeExp. NoTypeExp
represents the inconsistent type, which indicates a conflict in how the variable is
used in the program.
The meet operation for the type lattice is implemented by the function Meet,
given in Figure 7.3.
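As an informal companion to Figure 7.3, the meet operation on this four-element lattice can be sketched in Python. The string constants and the function name meet are assumptions made for this illustration; they mirror the operator names of typeExp but are not the SSL definition.

```python
# The four lattice elements, modeled as strings named after the typeExp operators.
EMPTY, INT, BOOL, NO = "EmptyTypeExp", "IntTypeExp", "BoolTypeExp", "NoTypeExp"


def meet(t1, t2):
    """Return the greatest lower bound of t1 and t2 in the four-element lattice."""
    if t1 == EMPTY:          # the top element is the identity for meet
        return t2
    if t2 == EMPTY:
        return t1
    if t1 == t2:             # meet is idempotent
        return t1
    return NO                # any two distinct non-top elements meet at the bottom


print(meet(EMPTY, INT))   # IntTypeExp
print(meet(INT, BOOL))    # NoTypeExp
```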
Assignment-statements of the form a := b do not necessarily cause further
narrowing of the types of a and b. However, assuming coercion is not a feature
of the language, the presence of such a statement requires that a and b be of the
same type. Thus, in addition to associating variables with types, the type
environment must partition variables into equivalence classes of variables with
the same types.
Program variables in the same type class are represented by phylum
TypedSet, which consists of an IdSet together with a typeExp, representing
the type common to all of the variables in the IdSet.
Figure 7.3. The function Meet implements the meet operation for the four element lat-
tice of types.
list TypeEnv;
TypeEnv : TypeEnvNil( ) [@ : ]
        | TypeEnvPair(TypedSet TypeEnv) [@ : ^ [";%n"] @]
        ;
TypeEnv NullTypeEnv( ) { TypeEnvNil };
TypeEnv TypeEnvUnion(TypeEnv c, identifier i1, identifier i2) {
    with (c) (
        TypeEnvNil:
            TypedSetOp(IdSetUnion(SingletonIdSet(i1), SingletonIdSet(i2)),
                       EmptyTypeExp) :: TypeEnvNil,
        TypeEnvPair(ts as TypedSetOp(s, tp), tail):
            IsElement(i1, s)
                ? IsElement(i2, s)
                    ? c
                    : TypeEnvUnion2(tail, s, tp, i2)
                : IsElement(i2, s)
                    ? TypeEnvUnion2(tail, s, tp, i1)
                    : ts :: TypeEnvUnion(tail, i1, i2)
        )
    };
TypeEnv TypeEnvUnion2(TypeEnv c, IdSet s1, typeExp tp1, identifier i2) {
    with (c) (
        TypeEnvNil:
            TypedSetOp(IdSetUnion(s1, SingletonIdSet(i2)), tp1) :: TypeEnvNil,
        TypeEnvPair(ts2 as TypedSetOp(s2, tp2), tail):
            IsElement(i2, s2)
                ? TypedSetOp(IdSetUnion(s1, s2), Meet(tp1, tp2)) :: tail
                : ts2 :: TypeEnvUnion2(tail, s1, tp1, i2)
        )
    };
TypeEnv MeetType(TypeEnv c, identifier i, typeExp t) {
    with (c) (
        TypeEnvNil: TypedSetOp(SingletonIdSet(i), t) :: TypeEnvNil,
        TypeEnvPair(ts, tail): with (ts) (
            TypedSetOp(s, tp): IsElement(i, s)
                ? TypedSetOp(s, Meet(t, tp)) :: tail
                : ts :: MeetType(tail, i, t)
            )
        )
    };
contain i1 and i2. In place of two separate TypedSets, in the new TypeEnv i1
and i2 are grouped in the same TypedSet; the IdSet element is the union of the
IdSets for i1 and i2 and the typeExp element is the meet of the typeExps for i1
and i2.
The operation MeetType(TypeEnv c, identifier i, typeExp t) creates a
TypeEnv value that is identical to c except that the TypedSet that contains i,
which in c is associated with some type tp, is associated with type Meet(t, tp) in
the new TypeEnv.
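The effect of the two operations can be sketched in Python. This is a simplified illustration under stated assumptions, not the SSL code: a type environment is modeled as a list of (frozenset of identifiers, type) pairs, the helper _split is hypothetical, and, unlike the SSL version, the sketch moves the updated class to the end of the list instead of preserving its position.

```python
EMPTY, INT, BOOL, NO = "EmptyTypeExp", "IntTypeExp", "BoolTypeExp", "NoTypeExp"


def meet(t1, t2):
    """Meet of two types in the four-element lattice."""
    if t1 == EMPTY:
        return t2
    if t2 == EMPTY:
        return t1
    return t1 if t1 == t2 else NO


def _split(env, ident):
    """Return (class containing ident, env without it); a fresh class if ident is new."""
    for k, (ids, tp) in enumerate(env):
        if ident in ids:
            return (ids, tp), env[:k] + env[k + 1:]
    return (frozenset([ident]), EMPTY), env


def type_env_union(env, i1, i2):
    """Group i1 and i2 in one class whose type is the meet of their former types."""
    (s1, t1), rest = _split(env, i1)
    (s2, t2), rest = _split(rest, i2)
    return rest + [(s1 | s2, meet(t1, t2))]


def meet_type(env, i, t):
    """Narrow the type of the class containing i by taking its meet with t."""
    (s, tp), rest = _split(env, i)
    return rest + [(s, meet(t, tp))]


env = type_env_union([], "a", "b")     # a := b
env = meet_type(env, "c", INT)         # c := 1
env = type_env_union(env, "b", "c")    # (b = c)
print(env == [(frozenset("abc"), INT)])   # True
```

Narrowing any member of the class afterwards, for example with meet_type(env, "a", BOOL), drives the whole class to NoTypeExp, which corresponds to the "inconsistent" declaration shown earlier.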
The attribute computation for inferring types in a program is expressed using
two attributes, named typeEnvBefore and typeEnvAfter, which are threaded
left to right through the instances of phyla exp, stmtList, and stmt in a program
tree as shown in Figures 7.5 and 7.6.
The equations for the operators of stmt and exp make use of two additional
functions on phylum TypeEnv:
and
exp {
    inherited TypeEnv typeEnvBefore;
    synthesized TypeEnv typeEnvAfter;
    };
exp : EmptyExp, IntConst, True, False, Id {
        exp.typeEnvAfter = exp.typeEnvBefore;
        }
    | Equal, NotEqual {
        exp$2.typeEnvBefore =
            PossibleMeetType(exp$1.typeEnvBefore, exp$2, exp$3.type);
        exp$3.typeEnvBefore =
            PossibleMeetType(exp$2.typeEnvAfter, exp$3, exp$2.type);
        exp$1.typeEnvAfter = PossibleUnion(exp$3.typeEnvAfter, exp$2, exp$3);
        }
    | Add {
        exp$2.typeEnvBefore =
            PossibleMeetType(exp$1.typeEnvBefore, exp$2, IntTypeExp);
        exp$3.typeEnvBefore =
            PossibleMeetType(exp$2.typeEnvAfter, exp$3, IntTypeExp);
        exp$1.typeEnvAfter = exp$3.typeEnvAfter;
        }
    ;
Figure 7.5. Attribution rules for inferring types in expressions. The two attributes
typeEnvBefore and typeEnvAfter are threaded left to right to collect a TypeEnv that is
consistent with the ways variables are used in the expression.
stmtList, stmt {
    inherited TypeEnv typeEnvBefore;
    synthesized TypeEnv typeEnvAfter;
    };
program : Prog { stmtList.typeEnvBefore = NullTypeEnv; };
stmtList : StmtListNil { stmtList.typeEnvAfter = stmtList.typeEnvBefore; }
    | StmtListPair {
        stmt.typeEnvBefore = stmtList$1.typeEnvBefore;
        stmtList$2.typeEnvBefore = stmt.typeEnvAfter;
        stmtList$1.typeEnvAfter = stmtList$2.typeEnvAfter;
        }
    ;
Figure 7.6. Attribution rules for inferring types in a program. The two attributes
typeEnvBefore and typeEnvAfter are threaded left to right to collect a TypeEnv that is
consistent with the ways variables are used in the program.
These functions, defined in Figure 7.7, are used in the equations given in Figures
7.5 and 7.6 to create a TypeEnv that is a narrowing of TypeEnv c when the
arguments of phylum exp are identifiers (as opposed to other kinds of
expressions). For example, PossibleMeetType is used in the equations for an
7.2. Using the Attribution Mechanism to Perform Type Inference 159
the TypedSet for variable x gets narrowed by taking the meet of its type with
BoolTypeExp. When the condition of a conditional-statement consists of
something other than a lone identifier (e.g. x <> 100), additional narrowing at
the IfThenElse operator is unnecessary.
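The conditional narrowing described here, combined with left-to-right threading, can be sketched in Python. The sketch makes several simplifying assumptions that are not in the original: the environment is a plain dictionary from identifier to type (no equivalence classes), a statement is modeled only by its condition, an identifier is a Python string while any other expression is a tuple, and the names possible_meet_type and thread_conditions are hypothetical.

```python
EMPTY, INT, BOOL, NO = "EmptyTypeExp", "IntTypeExp", "BoolTypeExp", "NoTypeExp"


def meet(t1, t2):
    """Meet of two types in the four-element lattice."""
    if t1 == EMPTY:
        return t2
    if t2 == EMPTY:
        return t1
    return t1 if t1 == t2 else NO


def possible_meet_type(env, exp, t):
    """Narrow env only when exp is a lone identifier; otherwise return env unchanged."""
    if not isinstance(exp, str):
        return env
    narrowed = dict(env)
    narrowed[exp] = meet(t, env.get(exp, EMPTY))
    return narrowed


def thread_conditions(conditions, env_before=None):
    """Thread an environment left to right through a list of conditions."""
    env = dict(env_before or {})
    for condition in conditions:
        # The value computed after one statement is the value before the next.
        env = possible_meet_type(env, condition, BOOL)
    return env


print(thread_conditions(["x", ("<>", "y", 100)]))   # {'x': 'BoolTypeExp'}
print(thread_conditions(["x"], {"x": INT}))         # {'x': 'NoTypeExp'}
```

In the full scheme the expression x <> 100 would itself narrow x through the equations for the comparison operator; the sketch deliberately leaves that part out and shows only the step taken at the conditional.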
The value of attribute stmtList.typeEnvAfter in operator Prog that is
produced by the equations in Figure 7.6 is a TypeEnv that is consistent with the
ways variables are used in the program. The following unparsing declaration
causes it to be displayed as the program's (generated) declarations:
Practical Advice
<-DIVISION BY ZERO->
appears in the display and the value of the dividend is used as the value of the
quotient. For example:
Chapter 8. Practical Advice 161
main
(1 + 2)
VALUE = 3;
Positioned at calc
main
let a = 1 in
let b = 2 + a in
let a = a + a in
a + b
ni
ni
ni
VALUE = 5
Positioned at calc
<--UNDEFINED SYMBOL
is printed after the name, and the value associated with the name is 0.
The first step in implementing the editor is to define the abstract syntax of the
objects to be edited.
A context-free or BNF grammar for the given language, if available, may pro-
vide a convenient starting point. Bear in mind, however, that such grammars
have usually been designed to serve a different purpose. Concrete syntax
defines the strings of a language, and has often been fashioned to serve as input
to a parser generator. In order to make the grammar unambiguous, extra syntac-
tic categories may have been introduced; in order that left-to-right parsing be
deterministic, the grammar may have been factored unnaturally.
Such parsing considerations are not relevant in designing abstract syntax for
an editor. What is essential is that the structure of the abstract-syntax trees be
convenient for editing. The hierarchical decomposition of an edited object must
make sense to editor users, for the subtrees of this hierarchy, and not arbitrary
substrings of the object's textual representation, will be selected, edited, cut, and
pasted.
As a rule, minimize unnecessary syntactic distinctions. Generated editors
enforce context-free syntactic correctness by restricting the insertion of subtrees
with root symbol X to contexts where an X is permitted. However, a poorly
written abstract syntax can turn this advantage into a weakness; inappropriate
categories or excessive refinement will impede rather than assist editing. For
8.1. How to Begin Developing an Editor 163
root calc;
list calc;
calc : CalcNil( )
     | CalcPair(exp calc)
     ;
exp  : Null( )
     | Sum, Diff, Prod, Quot(exp exp)
     | Const(INT)
     ;
In designing abstract syntax, omit all terminal characters that are merely syn-
tactic sugar; the operator of a production is sufficient to distinguish it from the
other alternatives of the left-hand-side phylum. The only terminal symbols that
should be retained in abstract-syntax trees are identifiers, numerals, and other
lexemes that carry semantic information.
The use of abstract-syntax trees as the sole intermediate representation of
objects necessarily results in textually distinct expressions being represented by
one and the same tree. Because the display of an object is generated from this
tree alone, the abstract syntax imposes a canonical form not only on the struc-
ture of objects being edited, but on their textual representation as well.
In the abstract syntax, one should retain precisely the distinctions worth keep-
ing and discard inessential differences in favor of a canonical representation and
display format. Note that a construct that is omitted from the abstract syntax,
such as redundant parentheses, may be included in the parsing grammar. In this
case, one has the effect that the construct is translated to canonical form on
input.
Example. In the desk calculator, only the number represented by a numeral is
stored, which results in some loss of information from the original input text.
For instance, the numerals 007 and 7 will both be translated to the INT value 7,
which is displayed as the numeral 7. If numerals are to be displayed exactly as
they are entered, they should appear in the abstract-syntax tree as lexemes, i.e.
strings. In other words, if the distinction between 007 and 7 is to be preserved,
then the last alternative of exp's phylum declaration should have been
exp : Const(INTEGER);
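The trade-off between the two representations can be made concrete with a small Python sketch. The two functions are hypothetical illustrations of what each abstract-syntax choice retains; they are not part of the editor specification.

```python
def display_from_value(numeral_text):
    """Model Const(INT): only the number is stored, so the display is canonical."""
    stored = int(numeral_text)        # "007" and "7" both become the integer 7
    return str(stored)


def display_from_lexeme(numeral_text):
    """Model Const(INTEGER): the lexeme itself is stored and redisplayed unchanged."""
    stored = numeral_text
    return stored


print(display_from_value("007"))    # 7
print(display_from_lexeme("007"))   # 007
```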
The second step in implementing the editor is to provide an initial set of unpars-
ing declarations. It is pointless to devote much effort to fancy pretty-printing
until the abstract syntax has been firmly settled. At this stage, define only
enough syntactic sugar to debug the abstract syntax. Defer consideration of
alternative unparsing schemes, optional line breaks, context-dependent display
formats, and special fonts. However, line breaks and simple indentation rules
are advisable at this stage. At locations where attribute values will ultimately be
displayed, either provide some temporary indication of the value or omit the
reference entirely.
Unparsing declarations also specify permissible resting places for the selec-
tion. Start with a maximal number of resting places, since this will allow full
exploration of syntax trees while debugging the abstract syntax and the attribute
equations. Plan to eliminate undesirable resting places later.
Occurrences of primitive phyla are an exception to the rule - they should not
be resting places. To see why, imagine that the selection is positioned on an
instance of INT in one of the desk calculator's expressions. Executing delete-
selection would replace the selection with 0, the placeholder term of INT. In
contrast, by making INT a non-resting place the selection is forced one node
higher in the tree, to the entire Const(INT) subterm. In this case, executing
delete-selection would replace the selection with Null, the placeholder term of
exp, whose display representation is the string <exp>. In general, the textual
representation of a placeholder term of a primitive phylum is not mnemonic,
whereas that of a user-defined phylum can be made so.
Example. The following unparsing declarations specify that the desk
calculator's expressions are displayed fully parenthesized, on separate lines,
followed by their values (temporarily represented by ??), and separated by
semicolons:
The third step in creating the editor specification is to provide a set of transfor-
mations that will permit the top-down derivation of an object. Complex
transformations for restructuring objects may be added later; for now, define
only transformations that correspond directly to productions of the grammar.
Such transformations are called template transformations, because invoking one
has the effect of providing a template into which additional components may be
inserted.
The template transformation associated with a production of the form
X0 : operator(X1 ... Xn)
Any of the <Xi> in the replacement expression may instead be [Xi], with the
choice affecting whether or not a placeholder for Xi appears in the resulting
template. (When Xi is neither an optional phylum nor an optional list phylum,
the distinction between <Xi> and [Xi] is irrelevant, since the placeholder and
completing terms for Xi are the same.)
Example. The following transformations permit the derivation of expression
trees in the desk calculator:
transform exp
    on "+" <exp> : Sum(<exp>, <exp>),
    on "-" <exp> : Diff(<exp>, <exp>),
    on "*" <exp> : Prod(<exp>, <exp>),
    on "/" <exp> : Quot(<exp>, <exp>)
    ;
Template transformations are normally not required for list phyla and optional
phyla, since adequate transformations are already built into some of the com-
mands that move the selection (e.g. forward-with-optionals and forward-
sibling-with-optionals). Template transformations are also not appropriate for
productions corresponding to lexemes; the mechanism for inserting lexemes will
be defined later when the concrete syntax for textual input is defined.
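The effect of a template transformation can be sketched in Python. The sketch rests on assumptions of our own: terms are tuples headed by an operator name, the placeholder of exp is modeled as the tuple ("Null",), and apply_template is a hypothetical helper that mimics a rule whose pattern matches only the placeholder.

```python
PLACEHOLDER = ("Null",)   # stands for the placeholder term of exp, displayed as <exp>

# Transformation name -> operator of the template it inserts (desk calculator).
TEMPLATES = {"+": "Sum", "-": "Diff", "*": "Prod", "/": "Quot"}


def apply_template(selection, name):
    """Replace a placeholder selection with a template; return None if not applicable."""
    if selection != PLACEHOLDER:
        return None                       # the pattern <exp> matches only the placeholder
    operator = TEMPLATES[name]
    return (operator, PLACEHOLDER, PLACEHOLDER)


tree = apply_template(PLACEHOLDER, "+")
print(tree)   # ('Sum', ('Null',), ('Null',))
```

Applying the same step again to one of the new placeholders is what yields a top-down derivation of an arbitrary expression tree.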
The partial editor specification that has been written up to this point defines
enough of the editor's structure-editing functions that it is usually worthwhile to
create an editor from the specification and test the editor's characteristics. Gen-
erate an editor by invoking the sgen command on the specification files; by
default, the editor that sgen creates is placed in file syn.out. This editor should
be able to derive any term, except for token leaves.
The next step in implementing the editor is to define the concrete syntax for text
input. As a first cut, provide only enough rules to permit lexemes and simple
expressions to be entered. The parser can be elaborated later to permit addi-
tional language constructs to be entered as text. Ultimately it may be desirable
to implement a parser for the entire language so that users of the editor can read
in objects from existing text files.
Example. The following rules permit the user of the desk calculator to enter
an integer constant at an expression. The rules designate phylum Exp to be an
entry point to the parser when the selection is positioned at an exp in the
abstract-syntax tree; the rules define the concrete syntax for expression input to
be the strings of phylum Exp, which, by the declarations shown below, consists
only of the INTEGER lexemes:
Note that, as in the declaration of the INTEGER phylum, a blank must precede
the closing angle bracket of a lexeme declaration.
Do not forget to provide appropriate entry declarations; if an entry declaration
is lacking for a phylum X, the editor has no way to parse text entered when the
selection is positioned at an X -node. (In such a case, the user is forbidden from
even entering text at an X -node.)
It is a good idea to put in the rule for scanning whitespace now, so that it is
not left out later, by accident, when additional rules are added to the editor
specification to permit more elaborate phrases to be parsed.
Once we have provided rules for parsing lexemes, the specification defines an
editor that can be used to derive any term of the language. In many cases these
rules are sufficient for the purposes of development. However, some editor-
designers may find it desirable to provide a few additional rules to permit simple
phrases to be entered as text.
Example. The declarations given below augment the previous definition of
phylum Exp to consist of the arithmetic expressions with optional parentheses;
the precedence declarations specify that * and / take precedence over + and -.
With these rules, the user can now enter all arithmetic expressions when the
selection is positioned at an exp node of the abstract-syntax tree.
For now, write the productions for parsing lists as right-recursive rules rather
than as left-recursive rules. Although right-recursive rules have the disadvan-
tage that it will not be possible to parse more than some fixed number of list ele-
ments (because the parser stack overflows at some point), right-recursive rules
are somewhat more straightforward to write. The rules can be changed later to
use left recursion. (See Section 5.5, "Parsing Lists".)
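The stack behavior behind this advice can be illustrated with a hand-written simulation in Python. It is a sketch under assumptions: the two functions imitate the shift and reduce steps a bottom-up parser would take for the grammars named in their docstrings, they are not generated by yacc, and both assume a list of at least one element.

```python
def max_depth_right_recursive(n):
    """List ::= Elem List | Elem. Every element is shifted before the first reduction."""
    stack, deepest = [], 0
    for _ in range(n):
        stack.append("Elem")              # shift
        deepest = max(deepest, len(stack))
    stack[-1] = "List"                    # reduce by List ::= Elem
    while len(stack) > 1:
        stack.pop()                       # reduce by List ::= Elem List
        stack[-1] = "List"
    return deepest


def max_depth_left_recursive(n):
    """List ::= List Elem | Elem. A reduction follows every shifted element."""
    stack, deepest = [], 0
    for _ in range(n):
        stack.append("Elem")              # shift
        deepest = max(deepest, len(stack))
        if len(stack) == 2:
            stack.pop()                   # reduce by List ::= List Elem
        stack[-1] = "List"                # the stack now holds a single List
    return deepest


print(max_depth_right_recursive(1000))   # 1000
print(max_depth_left_recursive(1000))    # 2
```

The right-recursive form needs stack space proportional to the length of the list, which is why a fixed-size parser stack limits the number of elements, whereas the left-recursive form needs only constant space.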
Example. The following right-recursive rules permit the user to enter a list of
expressions when the current selection is positioned at a calc node:
calc ~ Calc.abs;
Phylum calc in the example above is a (non-optional) list phylum. The list
property is used when a phylum is to be treated as a sequence whose minimum
length is one; consequently, the rules for parsing the corresponding concrete
phylum (i.e. phylum Calc) should be written, as above, so that a sequence of
one or more elements is accepted. By contrast, the parsing rules that should be
supplied for a concrete phylum that corresponds to an optional-list phylum are
slightly different; they should be written to accept a sequence of zero or more
elements.
Example. If calc were declared to be an optional list, the rules for the desk
calculator's concrete input syntax would be written as follows:
At this point, a new version of the editor should be generated to test and
debug the rules for parsing textual input.
Once the (preliminary version of the) editor's concrete input syntax is settled,
it is a good idea to use sgen's -r option whenever a new version of the editor is
generated. The -r option directs sgen to attempt to bypass unnecessary calls on
some of the subsidiary tools that are invoked as part of editor generation, such
as yacc, lex, and cc. (When the -r option is employed, sgen compares the new
versions of the intermediate files it generates with any previously generated
intermediate files that are in the local directory; if certain files are identical,
some of the steps of generating an editor are skipped. The new intermediate
files are retained for future comparisons.)
The fifth step in implementing the editor is to define the attribution schemes that
will be incorporated in the editor. Decide what information needs to be passed
around the syntax tree, and write the attribute equations that cause the right
information to be passed to the right locations. Ultimately, some of the informa-
tion collected in attributes will be used to annotate the display, for example, to
inform the user of inconsistencies and possible errors or to perform context-
sensitive pretty-printing. However, during development it is often better to con-
centrate on the attribute equations by themselves and to postpone modifying the
unparsing declarations to incorporate attributes in the display.
170 Chapter 8. Practical Advice
<-DIVISION BY ZERO->
and the value of the quotient is taken to be the value of the dividend; if the value
of the divisor is not 0, the value of error is the empty string.
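The rule just described for quotients can be sketched in a few lines of Python. This is an illustration only: the function name quot and the pairing of the value with the error string are our assumptions, and Python's floor division stands in for the calculator's integer division, so the two may differ for negative operands.

```python
DIVISION_BY_ZERO = "<-DIVISION BY ZERO->"

def quot(dividend, divisor):
    """Return (value, error) for a quotient node."""
    if divisor == 0:
        # The quotient takes the dividend's value and the error is flagged.
        return dividend, DIVISION_BY_ZERO
    # Otherwise the error attribute is the empty string.
    return dividend // divisor, ""
```

Keeping a defined value for the quotient even in the error case means that attributes further up the tree can still be evaluated.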
At this point, a new version of the editor should be generated to test and
debug the specification's attribute declarations and attribute equations.
To examine a syntax tree's attribute values when debugging the attribution
rules, use the commands dump-on, show-attribute, and write-attribute.¹
Because all three commands generate the textual representation of one or more
attribute values, it is worth spending the effort to provide unparsing declarations
for the phyla used to represent attribute values. This should be done for all attri-
bute phyla, not just the ones whose members will annotate the display in the
finished editor.
¹The dump-on command creates a textfile buffer that is thereafter associated with the buffer that is
the current buffer at the time of the command. In the new buffer, the editor displays the current
values of all attribute instances associated with the apex of the editing buffer's selection; the dump-
buffer's contents are updated whenever the selection in the editing buffer changes position. The
show-attribute command writes to a given buffer the textual representation of the value of an attri-
bute instance of the current selection's apex. The write-attribute command writes to a file the textu-
al representation of the value of an attribute instance of the current selection's apex.
8.1.7. Putting the finishing touches on (an initial version of) the editor
The unparsing declaration for operator Quot needs to be altered so that each
quotient is displayed together with the value of the local attribute error. This
change causes occurrences of operator Quot where the denominator has the
value 0 to be annotated with the message <-DIVISION BY ZERO->.
list listType;
listType : ListTypeNull( )
         | ListTypePair(listElementType listType) [@ ::= ^ @]
One last step needs to be taken to complete an initial version of the editor:
check that the editing-mode symbols that appear in the specification's unparsing
declarations (i.e. the ::= and : symbols) specify the desired text-editing proper-
ties for the various operators. For consistency, we recommend that textual
reediting be permitted for exactly those elements that can be entered as text.
Example. Suppose the desk calculator's parsing rules permit text entry only
of lexemes, but not full expressions, as outlined at the beginning of Section
8.1.5. The entry-mode symbols of the operators of phylum exp should be
changed to forbid textual reediting of the arithmetic operators:
ported only by two rather simple facilities that permit different parts of an editor
specification to be contained in different source files:
1) All argument files passed to the sgen command (i.e. files with suffix .ssl)
are concatenated together in the order found on the command line.
2) SSL permits new attributes, operators, and equations to be attached to an
existing phylum and new attribute equations to be attached to existing opera-
tors.
Despite the primitiveness of these mechanisms, there are significant benefits to
be gained by taking advantage of them to help organize a specification.
There are two main advantages of a modular organization. First, it enhances
the comprehensibility of editor specifications. Second, it permits the same
"module" to be used in several editors; in particular, it facilitates the creation of
collections of related editors - both upward-compatible editors for extensions of
a base language as well as different editors for the same language that perform
different kinds of static inferencing or display the language in different formats.
Organizing an editor specification or editor extension in a modular fashion
involves placing closely related aspects of the specification in the same file and
less related aspects of the specification in different files. We recommend organ-
izing each layer of a specification in six files containing the following elements:
1) abstract-syntax declarations,
2) lexical declarations,
3) concrete-input syntax declarations,
4) attribute-domain declarations and operations on the attribute domain,
5) attribute declarations and equations, and
6) unparsing declarations.
In the remainder of this section, we describe and illustrate some guidelines for
organizing the specification's various files. To illustrate the conventions that we
have found to be useful, we augment the desk-calculator specification of Section
8.1 to extend the editor with additional constructs for let-expressions and the use
of bound names in expressions. The declarations from Section 8.1 are extended
by the declarations presented below.
Example. Two new operators, Let and Use, are added to the existing phylum
exp to represent let-expressions and uses of bound names, respectively. A new
phylum, symb, is also required, to represent the <name> component in a let-
expression.
symb : DefNull( )
     | Def(ID)
Some phyla are used exclusively to represent attribute values and never to
represent a subterm of an abstract-syntax tree. That is, they are not deriv-
able from the root phylum, so their members can never appear as editable com-
ponents of a buffer. The declarations for such a phylum, together with associ-
ated operations on the phylum, should be encapsulated in a file as an abstract
data type.
Example. The scoping of names in let-expressions is block-structured, so
each exp must be evaluated in an environment of appropriate local name bind-
ings provided by an inherited environment attribute. One possible representa-
tion for such environment attributes is as a list of identifier-value pairs, as
defined by the following declarations for phyla BINDING and ENV:
list ENV;
ENV : NullEnv( ) [@ :]
    | EnvConcat(BINDING ENV) [@ : ^ [",%n"] @]
The attribution rules of the existing arithmetic operators must also be extended
by adding attribute equations to pass the inherited environment to both the left
and right operands.
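The effect of passing an inherited environment down the tree can be sketched in Python. The sketch is illustrative only and is not SSL: expressions are encoded as tuples, the operator names Const, Sum, and Prod are assumed here (only Let and Use are named in the text), and raising NameError for an unbound name is our stand-in for whatever error attribute the editor would compute.

```python
def lookup(name, env):
    """Search a tuple of (identifier, value) pairs, innermost binding first."""
    for bound, value in env:
        if bound == name:
            return value
    return None

def evaluate(exp, env=()):
    """Evaluate an expression in the environment inherited from its parent."""
    kind = exp[0]
    if kind == "Const":
        return exp[1]
    if kind == "Use":
        value = lookup(exp[1], env)
        if value is None:
            raise NameError("unbound name: " + exp[1])
        return value
    if kind == "Let":
        _, name, bound, body = exp
        # The body sees the new binding in front of the inherited ones.
        return evaluate(body, ((name, evaluate(bound, env)),) + env)
    if kind in ("Sum", "Prod"):
        # Both operands receive the same inherited environment.
        left, right = evaluate(exp[1], env), evaluate(exp[2], env)
        return left + right if kind == "Sum" else left * right
    raise ValueError("unknown operator: " + kind)
```

Because each new binding is placed at the front of the list, an inner let-expression shadows an outer binding of the same name, which gives the block-structured scoping described above.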
The remaining component contains the unparsing declarations for the new pro-
ductions declared in the abstract-syntax specification file.
Example. The unparsing declarations for let-expressions specify that all four
kinds of operators are editable and that operator Use is annotated by the value
of its error attribute.
# define ERROR
8.3. Problems That Frequently Arise 179
8.3.1. Comments
The chief problem with comments stems from the fact that, in many program-
ming languages, comments serve no linguistic purpose (other than to delimit
tokens) and consequently are permitted to appear in arbitrary places in program
text. In a conventional compiler, the scanner skips over the comment-delimiter
characters and the comment text at the same time that it skips over the "white-
space" characters that can also appear between tokens (e.g. blanks, tabs, and
newline characters).
By contrast, in editors generated with the Synthesizer Generator, comments
can appear only at certain selected places designated by the editor-designer. A
basic requirement is that the comment text must be preserved among the com-
ponents of a program's underlying abstract-syntax tree. Thus, comments can
only appear in the program display at the places where the editor's unparsing
declarations specify that these components appear. Even in user-typed input,
comments can only appear at certain selected places; any editor-designer would
find it very tedious work to try to create an input grammar that permitted com-
ments to appear between each pair of tokens.
Because the existing features of the Synthesizer Generator do not support com-
ments located in arbitrary places in the text of a program, an editor-designer is
restricted to providing more controlled ways of annotating programs with com-
ments. To illustrate some of the kinds of constructions that can be supported,
we describe below how to extend the programming-language editor from
Chapter 4 so that programs can be annotated with several classes of comments.
In these examples, a comment consists of a list of text lines enclosed in curly
braces, such as the following one:
list commentLines;
commentLines : CommentLinesNil( ) [@ :]
             | CommentLinesPair(commentLine commentLines)
                   [@ ::= ^ ["%n "] @]
When using the editor generated with this declaration, exactly one comment
appears in each buffer at the head of the program, as in the following example:
|main|
Positioned at commentLines
optional optionalComment;
optionalComment : OptionalCommentNil( ) [^ :]
|main|
{<comment>}
program <identifier>;
var
<identifier> : <type> {|Now is the time|
|for all good men|
|to come to the aid|
|of the party.|};
begin
<statement>
end.
Positioned at commentLines
|main|
COMMAND: comment-part
{<comment>}
program <identifier>;
var
|a : integer|;
begin
<statement>
end.
|main|
{<comment>}
program <identifier>;
var
a : integer |{<comment>}|;
begin
<statement>
end.
Positioned at optionalComment
|main|
{<comment>}
program <identifier>;
var
<identifier> : <type>;
begin
{|Now is the time|
|for all good men|
|to come to the aid|
|of the party.|}
<statement>
end.
Positioned at commentLines
The two components commentLines and stmt are part of one operator. The
intention is that stmt be the implementation of the specification provided in the
comment part; to indicate the subordinate status of the implementation, the stmt
part is indented one level deeper than the comment text.
In conjunction with the Synthesizer Generator's alternate-unparsing facility,
statement-comments can be used to provide a mechanism for suppressing details
of a file. By furnishing an alternate unparsing scheme that elides the stmt com-
ponent of a StmtComment, an editor-designer can provide a mechanism for
hiding the refinement of a comment. For example, we could define the follow-
ing alternate unparsing scheme for the StmtComment operator, which would
display the body of a comment as the string ... :
By switching to the alternate unparsing scheme, the user could hide implementa-
tion details and, at the same time, cause more of the program to be displayed on
the screen.
One way of inserting statement-comments would be to have a template-
insertion command for them, which would be declared as follows:
For example, consider what happens when the comment command is invoked
in the program shown below:
|main|
COMMAND: comment
{<comment>}
program <identifier>;
var
a : integer;
begin
a := 1;
|a := 2;|
|a := 3;|
|a := 4|;
a := 5
end.
|main|
{<comment>}
program <identifier>;
var
a : integer;
begin
a := 1;
{|<comment>|}
begin
a := 2;
a := 3;
a := 4
end;
a := 5
end.
Positioned at commentLines
Clearly, comment text itself (as opposed to an entire comment in the context of
some larger program construct) has to be entered and reedited textually, not
structurally. In the Synthesizer Generator, all text that is typed in by the user is
parsed to determine whether it is syntactically correct in the context of the
current selection, so comment text is parsed, even though it has a trivial syntac-
tic structure. (For most programming languages, comment text is a regular set.)
This requirement causes some slight inconveniences for the editor-designer,
because to parse comment text properly, it is necessary to treat the input text in
such a way that the ordinary whitespace lexemes defined by phylum WHITE-
SPACE go unrecognized. In particular, it is necessary to override the normal
consumption of whitespace that is enabled by the usual declaration of phylum
WHITESPACE. This is accomplished by means of (left-)context-sensitive
scanning, specified with directives to shift the scanner to a different scanning
state.
The regular expression of a lexeme declaration is enabled only when the
current scanning state of the scanner matches the start-state of the declaration; if
a declaration has no start-state or has the start-state INITIAL, the declaration is
enabled in all scanner states. At the beginning of parsing, the scanner is in a state
that only recognizes the INITIAL-state lexemes. The scanner's state can be
changed in two ways: by a start-state directive in an entry declaration and by a
final-state directive in a lexeme declaration.
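The interplay of start-states and final-states can be sketched as a small table-driven scanner in Python. Everything in the sketch is an assumption made for illustration: the rule table, the token names LBRACE, RBRACE, and WORD, the regular expressions, and the lex-style policy of choosing the longest match with ties going to the earlier rule. It is not the scanner that the Synthesizer Generator produces.

```python
import re

# (start-state or None, token name, regular expression, final-state or None)
RULES = [
    ("NO_WHITESPACE", "CLINEBREAK", r"\n", None),
    ("NO_WHITESPACE", "CLINE", r"[^\n}]+", None),
    (None, "LBRACE", r"\{", "NO_WHITESPACE"),
    (None, "RBRACE", r"\}", "INITIAL"),
    (None, "WHITESPACE", r"[ \t\n]+", None),
    (None, "WORD", r"[A-Za-z]+", None),
]

def scan(text):
    """Return the (name, lexeme) pairs of text, discarding whitespace."""
    state, pos, tokens = "INITIAL", 0, []
    while pos < len(text):
        best = None
        for start, name, pattern, final in RULES:
            # A rule is enabled if it has no start-state, has start-state
            # INITIAL, or its start-state is the current scanner state.
            if start not in (None, "INITIAL", state):
                continue
            match = re.compile(pattern).match(text, pos)
            if match and (best is None or match.end() > best[0].end()):
                best = (match, name, final)
        if best is None:
            raise ValueError("no lexeme matches at position %d" % pos)
        match, name, final = best
        if name != "WHITESPACE":
            tokens.append((name, match.group()))
        pos = match.end()
        if final is not None:
            state = final          # a final-state directive switches state
    return tokens
```

In this sketch the blanks inside the braces survive as part of a CLINE lexeme, while blanks outside the braces are consumed as whitespace.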
Example. We now illustrate how to cause comment text to be treated as a
sublanguage separate from the rest of the editor's concrete-input language,
shielded by the scanner state NO_WHITESPACE. We will suppose that phy-
lum WHITESPACE is defined in the usual way:
The two kinds of lexemes that make up comment text - individual newline char-
acters and single lines of comment text, which are represented by phyla CLINE-
BREAK and CLINE, respectively - are declared to be recognizable only when
the scanner is in scanner state NO_WHITESPACE:
The use of the scanner state in these declarations ensures that these tokens,
which are used only in the limited context of parsing comment text, do not
conflict with any of the other tokens employed in the editor's concrete input
syntax. However, it is also necessary to ensure the converse, namely that other
yCommentLines {
inherited commentLines tail;
synthesized commentLines reversed;
};
yCommentLines
::= (yCommentLine) {
yCommentLines.reversed =
yCommentLine.a :: yCommentLines.tail;
commentLines ~ <NO_WHITESPACE>yCommentLines.reversed
yCommentLines.tail = CommentLinesNil;
};
A second place where it is necessary to have the scanner state change from
INITIAL to NO_WHITESPACE is when a left curly bracket is recognized; this
change is triggered by the final-state component of the token declaration. The
state then stays as NO_WHITESPACE until a right curly bracket is recognized; its final-
state component triggers the resetting of the scanner state to the INITIAL state:
In conjunction with the above declarations, the final change needed is to alter
the parsing rules for textual input of declarations so that the user can type in a
comment with each declaration:
Thus, although the program shown below contains a number of errors, none are
reported because the messages about them are suppressed by the principal
unparsing scheme.
|main|
COMMAND: alternate-unparsing-on
program |<identifier>|;
var
a: integer;
a: boolean;
c : integer;
begin
b :=c;
while a do
<statement>
end.
Positioned at identifier
|main|
program |<identifier>|;
var
a { MULTIPLY DECLARED} : integer;
a { MULTIPLY DECLARED} : boolean;
c : integer;
begin
b { NOT DECLARED} := c;
while a { BOOLEAN EXPRESSION NEEDED} do
<statement>
end.
Positioned at identifier
Some problems arise with generated messages in connection with textual reedit-
ing. In particular, if a generated message can appear in a selection that is edit-
able as text, it is desirable that the user not have to explicitly delete the message
text for the modified text to (re-)parse correctly. One possibility is to anticipate
in the parsing grammar all possible messages that may occur in various loca-
tions throughout the program.
A much simpler approach is to choose a pair of special delimiters that enclose
all message text and treat all character sequences enclosed in these delimiters as
whitespace. What makes this approach possible is the fact that message values
(i.e. attribute values) are derived from the underlying abstract-syntax tree. Mes-
sage values reflect the state of the tree before the user's input causes the tree to
be modified; messages consistent with the state of the altered tree will be gen-
erated after change propagation of attribute values quiesces. Consequently, we
do not care about the old value of any message and are free to discard from the
user's input all the characters of the message's old display representation.
For example, suppose we choose the convention that error messages be
enclosed in the strings {-- and } so that they would appear as in the following
example:
|main|
program |<identifier>|;
var
a {-- MULTIPLY DECLARED} : integer;
a {-- MULTIPLY DECLARED} : boolean;
c : integer;
begin
b {-- NOT DECLARED} := c;
while a {-- BOOLEAN EXPRESSION NEEDED} do
<statement>
end.
Positioned at identifier
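The effect of treating delimited messages as whitespace can be sketched in Python. This is an illustration only, not the mechanism used by generated editors: the function name strip_messages is hypothetical, and the pattern assumes that message text never contains a right curly bracket.

```python
import re

# A message starts with "{--" and runs to the next "}".
MESSAGE = re.compile(r"\{--[^}]*\}")

def strip_messages(text):
    """Replace each delimited message by a blank before the text is reparsed."""
    return MESSAGE.sub(" ", text)
```

Applied to a line such as `a {-- MULTIPLY DECLARED} : integer;`, only the tokens a, :, and integer; remain, so the user can reedit the line without first deleting the stale message.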
Error phyla
In the examples given above, the error messages are string constants (i.e.
members of phylum STR). However, there are several good reasons for
defining a phylum of error values whose members unparse as the error mes-
sages. For example, the programming-language editor from Chapter 4 would
use phylum Error defined as follows:
Error : NoError( )
      | MultiplyDeclared( )   [^ : " {-- MULTIPLY DECLARED }"]
      | NotDeclared( )        [^ : " {-- NOT DECLARED }"]
      | BooleanNeeded( )      [^ : " {-- BOOLEAN EXPRESSION NEEDED }"]
      | IntNeeded( )          [^ : " {-- INT EXPRESSION NEEDED }"]
      | IncompatibleTypes( )  [^ : " {-- INCOMPATIBLE TYPES }"]
      | IncompatibleAssign( ) [^ : " {-- INCOMPATIBLE TYPES IN := }"]
The attribute equations that establish error messages would be changed to use
these values, as in the following equation:
Working with error values instead of error strings allows error messages to be
formatted better. By defining a phylum of error values whose members unparse
as the error messages, the editor-designer can use formatting commands in the
messages' unparsing declarations to control how error messages are formatted.
With STR-valued error messages, the editor-designer would (usually) not have
such control; ordinarily, when a term t is displayed, any percent signs in STR
subterms of t and in STR-valued attributes of t are not interpreted as formatting
commands. (Although it is possible to change this behavior using an SSL direc-
tive - see The Synthesizer Generator Reference Manual - this is not generally
done.)
Example. The following unparsing declaration uses the %{, %o, and %}
directives to specify that if it is necessary to break the line in the middle of the
error message, preference is given to breaking it between the two words:
Error : MultiplyDeclared
    [^ : "%{{-- MULTIPLY %oDECLARED }%}"];
Creating a phylum of error values also allows one to define error values that
have substructure, which is useful for defining parameterized error messages.
For example, we could parameterize the NotDeclared operator with an IDEN-
TIFIER whose value is displayed as part of the error message.
Error : NotDeclared(IDENTIFIER)
    [^ : "{-- ID" ^ "NOT DECLARED }"];
exp : Id {
    local Error error;
    error = with(identifier) (
        IdentifierNull: NoError,
        Identifier(i) :
            IsDeclared(identifier, {Prog.env})
                ? NoError
                : NotDeclared(i)
        );
    };
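The idea of error values with substructure can be mimicked in Python. The sketch below is an analogy under stated assumptions and is not SSL: the classes NoError and NotDeclared echo the operator names above, while the unparse method, the helper id_error, and the use of a Python set for the environment are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NoError:
    def unparse(self):
        return ""

@dataclass(frozen=True)
class NotDeclared:
    identifier: str              # the parameter carried by the error value

    def unparse(self):
        return "{-- ID " + self.identifier + " NOT DECLARED }"

def id_error(identifier, declared):
    """Error value for a use of identifier (None models an unexpanded placeholder)."""
    if identifier is None or identifier in declared:
        return NoError()
    return NotDeclared(identifier)
```

Because the message text is produced only when the value is displayed, the same error value can be compared, passed through other attributes, or formatted differently by an alternate unparsing scheme.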
CHAPTER 9
TestEqualInt(LINK)            [^ : "TestEqualInt" ..]
TestNotEqualInt(LINK)         [^ : "TestNotEqualInt" ..]
AddInt(LINK)                  [^ : "AddInt" ..]
PushBool(BOOL LINK)           [^ : "PushBool" ^ ..]
TestEqualBool(LINK)           [^ : "TestEqualBool" ..]
TestNotEqualBool(LINK)        [^ : "TestNotEqualBool" ..]
BranchOnCondition(LINK LINK)  [^ : "BranchOnCondition" .. ..]
The construction of the code graph is specified with two sets of attributes
named entry and next, which carry information used to link the fragmented
code together. The entry attribute at the root of a given program segment is a
link to the first instruction to be executed for that segment. The next attribute at
the root of a given program segment is a link to the first instruction to be exe-
cuted after the code for the given segment has been completed.
For the moment, assume that the type of the entry and next attributes is
LINK:
The values of the entry and next attributes are incorporated into instructions
to establish the code graph. Each stmt incorporates the LINK stmt.next into its
code fragment, and passes up a LINK to the first instruction of this fragment in
the attribute stmt.entry. Intermediate nodes in the abstract-syntax tree, such as
each StmtListPair node in a statement list, merely pass on linking information
in their entry and next attributes and do not contribute additional instructions to
the code.
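The threading of entry and next through a statement list can be sketched in Python. The sketch is a loose analogy, not the SSL equations themselves: each statement is reduced to a single instruction, the class Instr and the function stmt_list_entry are hypothetical, and a direct object reference stands in for a LINK.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Instr:
    op: str
    link: Optional["Instr"]      # the instruction executed afterwards

def stmt_list_entry(ops, next_link):
    """Entry point of a statement list whose successor is next_link."""
    if not ops:
        # An empty list contributes no code: its entry is simply its next.
        return next_link
    # A pair node passes its own next to the tail of the list ...
    tail_entry = stmt_list_entry(ops[1:], next_link)
    # ... and the head statement's next is the entry point of the tail.
    return Instr(ops[0], tail_entry)
```

Note that the list nodes themselves create no instructions; only the statements do, and the list structure merely determines which link each statement receives.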
Example. Recall that phylum stmtList is defined by the following abstract
syntax rules:
list stmtList;
stmtList : StmtListNil( )
         | StmtListPair(stmt stmtList)
Attributes entry and next are passed through the operators of phylum stmtList
as follows:
The equation that defines the entry point of the conditional-statement also illus-
trates how intermediate nodes in the abstract-syntax tree merely pass on linking
information:
This equation defines the entry point of the statement to be the entry point of the
expression; whatever statement precedes the IfThenElse will receive a link to
the expression, allowing the interpreter to jump to the expression directly rather
than making a jump to the conditional-statement and then another jump to the
expression.
The semantics of the individual control constructs of the source language are
expressed in terms of CODE values as follows. Each construct has an associ-
ated code fragment, which is defined as the value of a local CODE-valued attri-
bute. The LINK constructed from this fragment is passed to other (appropriate)
components of the program using the entry and next attributes.
9.2. Incremental Recompilation Using Attributes 201
Example. The Prog operator has one code fragment represented by the local
attribute named code. The code fragment consists of a single Quit instruction,
which would cause the CODE interpreter to terminate execution. The LINK
formed from attribute code is passed down the abstract-syntax tree, providing
access to the final instruction to be executed when the program's body has com-
pleted:
program : Prog {
    local CODE code;
    code = Quit;
    stmtList.next = MakeLink(code);
    }
Halt instructions are generated at all places where the CODE interpreter
should abort execution, such as at unexpanded placeholders or at locations of
type errors.
Example. Consider the attribute equations that define the code for the Emp-
tyStmt and Assign operators. EmptyStmt represents an unexpanded state-
ment, and its code is defined to contain a Halt instruction.
stmt : EmptyStmt {
    local CODE code;
    code = Halt;
    stmt.entry = MakeLink(code);
    }
The abstract-syntax tree contains error attributes that indicate the presence or
absence of errors in the program. Thus, the Assign operator's code is a Halt if
either of two error conditions holds, indicated by particular values of the attri-
butes named assignError and identifier.type:
202 Chapter 9. Generating Code Using Attributes
stmt : Assign {
    local CODE code;
    code = (assignError != "" || identifier.type == EmptyTypeExp)
        ? Halt : Store(identifier, stmt.next);
    stmt.entry = exp.entry;
    exp.next = MakeLink(code);
    }
exp : Equal {
    local CODE code;
    code = (error != "") ? Halt
        : exp$2.type == IntTypeExp ? TestEqualInt(exp$1.next)
        : TestEqualBool(exp$1.next);
    exp$1.entry = exp$2.entry;
    exp$2.next = exp$3.entry;
    exp$3.next = MakeLink(code);
    }
The code for this program is distributed among eleven local code fragments,
which are displayed in the table given below. (In the table, the symbol ... is
used wherever the instruction's LINK field is the instruction on the next line.
stmt : EmptyStmt {
        local CODE code;
        code = Halt;
        stmt.entry = MakeLink(code);
        }
    | Assign {
        local CODE code;
        code = (assignError != "" || identifier.type == EmptyTypeExp)
            ? Halt : Store(identifier, stmt.next);
        stmt.entry = exp.entry;
        exp.next = MakeLink(code);
        }
    | IfThenElse {
        local CODE code;
        code = (typeError != "") ? Halt
            : BranchOnCondition(stmt$2.entry, stmt$3.entry);
        stmt$1.entry = exp.entry;
        exp.next = MakeLink(code);
        stmt$2.next = stmt$1.next;
        stmt$3.next = stmt$1.next;
        }
    | While {
        local CODE code;
        code = (typeError != "") ? Halt
            : BranchOnCondition(stmt$2.entry, stmt$1.next);
        stmt$1.entry = exp.entry;
        exp.next = MakeLink(code);
        stmt$2.next = exp.entry;
        }
    | Compound {
        stmt.entry = stmtList.entry;
        stmtList.next = stmt.next;
        }
Figure 9.1. Code-generation equations for the five kinds of statements of the source
language.
Figure 9.2. Code-generation equations for the expressions of the source language.
For the two cases where the LINK field is not the next instruction in the table, the
LINK field is indicated by a symbolic label.)
To break the circularity, we introduce one level of indirection into the generated
code by making use of the built-in, primitive phylum ATTR to implement
list environ;
environ : EnvironNil( )
        | EnvironPair(binding environ) [^ : ^ ["%n"] ^]
};
val ValOfId(identifier id, environ s) {
    with(s) (
        EnvironNil: Uninitialized,
        EnvironPair(Binding(i, v), tail): id == i ? v : ValOfId(id, tail)
        )
    };
exitStatus : TerminatedNormally(environ)
                 [^ : "Terminated normally.%nFinal state:%n" ^]
           | ExecutionAborted( )
                 [^ : "Execution aborted"]
Figure 9.3. Definitions for auxiliary phyla and operations used in the interpreter.
LINKs. For example, we could add the following pre-processor definition to the
above editor specification:
In essence, these definitions break the circularity because some of the edges
of the dependence cycle are replaced by attribute references. Ordinarily, every
attribute occurrence a that appears on the right-hand side of a defining equation
for attribute occurrence b introduces a dependence from a to b; however, an
attribute equation that defines attribute occurrence b in terms of &&a does not
introduce a dependence from a to b. It is this property that makes attribute
references useful for representing links. For example, in the (revised) equations
of the While operator, there is no dependence:
While.code → exp.next
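Why one level of indirection suffices can be seen in a Python analogy. The sketch is not SSL and does not model attribute evaluation: the mutable class Link merely plays the role that an attribute reference such as &&a plays in the text, and the names Instr, compile_while, and the "Test" instruction standing for the loop condition are hypothetical.

```python
class Link:
    """A cell that can be created before the instruction it refers to exists."""
    def __init__(self, target=None):
        self.target = target

class Instr:
    def __init__(self, op, *links):
        self.op, self.links = op, links

def compile_while(body_ops, exit_link):
    """Build the cyclic code graph of a loop; return a Link to its entry."""
    back = Link()                       # will refer to the test, once built
    entry = back
    for op in reversed(body_ops):       # the last body instruction jumps back
        entry = Link(Instr(op, entry))
    branch = Instr("BranchOnCondition", entry, exit_link)
    test = Instr("Test", Link(branch))
    back.target = test                  # close the cycle through the cell
    return Link(test)
```

The branch instruction can be constructed without knowing the test instruction, because it holds only a cell that is filled in later; this corresponds to the absence of a dependence noted above.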
¹The Synthesizer Generator does not provide a dereferencing operation on ATTR values, so the
operation CodeForLink cannot be written directly in SSL; however the Synthesizer Generator does
provide a mechanism for calling functions written in C (see The Synthesizer Generator Reference
Manual). To implement CodeForLink it is necessary to declare its type with a foreign-function de-
claration:
One school holds that a program should be developed hand-in-hand with a proof
that the program satisfies its specification [Dijkstra76, Gries81]. Support for this
programming methodology can be provided in the form of a proof editor that
permits a programmer to create and modify program proofs. The editor pro-
vides the programmer with feedback about errors that exist in a proof as it is
developed, using knowledge embedded within the editor of the programming
logic's inference rules [Reps84].
The subject of this chapter is the design of an editor for creating and modify-
ing proofs of programs; in particular, we discuss how the Synthesizer Generator
can be used to build such an editor. The chapter is organized into four sections.
Section 10.1 presents an example of a program and proof being manipulated.
This example motivates the formulation of interactive proof checking described
in Section 10.2; following a brief introduction to a few basic concepts from
logic, Section 10.2 describes how the generation of a program's verification
conditions can be expressed with an attribute grammar. Section 10.3 discusses
how the checking of predicate-logic proofs can be expressed with an attribute
grammar. Section 10.4 describes some enhancements to the basic approach
aimed at making the user's task less tedious when proofs are created and
modified.
10.1. An Introductory Example 211
|main|
{b >= 0}
y := b
z := 0
while y >= 0 with invariant y >= -1 & a*y+z = a*b do
y := y - 1
z := z + a
od /* Exit obligation not established */
{|z = a*b|}
Positioned at form
Initialization obligation:
{b >= 0}
y := b
z := 0
{y >= -1 & a*y+z = a*b}
Invariance obligation:
{y >= 0 & y >= -1 & a*y+z = a*b}
y := y - 1
z := z + a
{y >= -1 & a*y+z = a*b}
Exit obligation:
{¬(y >= 0) & y >= -1 & a*y+z = a*b}
{z = a*b}
The initialization obligation expresses the requirement that the (claimed) invari-
ant of the loop be established when execution of the loop is begun. The role of
the invariance obligation is to show that the predicate given as the loop-invariant
actually is an invariant for the loop: execution of the loop-body starting in a
state that satisfies the conjunction of the condition and the invariant reestablishes
the invariant. The role of the exit obligation is to show that the conjunction of
the invariant and the negation of the loop-condition implies the post-condition of
the loop.
In a conventional program-verification system, a verification-condition gen-
erator reduces the question of consistency between the proof outline's state-
ments and assertions to that of the validity of formulae in the underlying logic.
One drawback of conventional verification tools is that the formulae generated
in this process are divorced from the program context in which they arise. By
contrast, a language-based proof editor can provide information about mistakes
in a proof outline by determining which proof obligations are not satisfied and
displaying warning messages to indicate the location of unsatisfied obligations.
For example, the comment /* Exit obligation not established */ is a warning
supplied by the editor, indicating that the while-loop's exit obligation fails to be
satisfied.
Now suppose that, in an attempt to correct the proof outline, the first clause of
the invariant assertion is changed to y >= 0.
|main|
{b >= 0}
y := b
z := 0
while y >= 0 with invariant |y >= 0| & a*y+z = a*b do
/* Invariance obligation not established */
y := y - 1
z := z + a
od
{z = a*b}
Positioned at prop
With the change, the message about the failure to establish the exit condition
disappears. However, the loop now carries a new message, /* Invariance obli-
gation not established */, indicating that the loop-body fails to re-establish the
invariant on each iteration of the loop.
The modification of the proof outline changed the proof obligations from the
previous ones to the following ones:
Initialization obligation:
{b >= 0}
y := b
z := 0
{y >= 0 & a*y+z = a*b}
Invariance obligation:
{y >= 0 & y >= 0 & a*y+z = a*b}
y := y - 1
z := z + a
{y >= 0 & a*y+z = a*b}
Exit obligation:
{¬(y >= 0) & y >= 0 & a*y+z = a*b}
{z = a*b}
In operational terms, the error in the program is that the loop-condition per-
mits the loop to execute one too many times. The problem can be corrected by
changing the loop-condition to be y > 0. With this change, the program's state-
ments are now consistent with the assertions that annotate it, and all warnings in
the program disappear:
214 Chapter 10. Interactive Program Verification
main
{b >= 0}
y := b
z := 0
while [y > 0] with invariant y >= 0 & a*y+z = a*b do
    y := y - 1
    z := z + a
od
{z = a*b}
Positioned at cond
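The three proof obligations of the example can be explored by brute force. The following is a minimal sketch, not the editor's decision procedure: it tests each obligation over a small range of integers, so it can find counterexamples but cannot prove validity. The function names and the `bound` parameter are assumptions made for illustration.

```python
from itertools import product

def invariant(a, b, y, z):
    # Loop invariant of the example: y >= 0 & a*y+z = a*b
    return y >= 0 and a * y + z == a * b

def body(a, y, z):
    # Loop body of the example: y := y - 1; z := z + a
    return y - 1, z + a

def check_obligations(cond, bound=4):
    """Return the names of obligations violated by some small integer state."""
    failed = set()
    rng = range(-bound, bound + 1)
    for a, b in product(rng, rng):
        # Initialization: {b >= 0} y := b; z := 0 {invariant}
        if b >= 0 and not invariant(a, b, b, 0):
            failed.add("initialization")
    for a, b, y, z in product(rng, repeat=4):
        if not invariant(a, b, y, z):
            continue
        if cond(y):
            # Invariance: the body must re-establish the invariant
            y2, z2 = body(a, y, z)
            if not invariant(a, b, y2, z2):
                failed.add("invariance")
        elif z != a * b:
            # Exit: invariant and negated condition must imply z = a*b
            failed.add("exit")
    return failed

with_ge = check_obligations(lambda y: y >= 0)
with_gt = check_obligations(lambda y: y > 0)
print(with_ge, with_gt)
```

With the loop-condition y >= 0 the search finds a state (for instance y = 0) from which the body breaks the invariant; with y > 0 no violation is found in the tested range.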
10.2.1. Background
A logic consists of sentences together with rules that permit one to demonstrate
that certain sentences are theorems. The logic's axioms specify a collection of
sentences that are theorems without further demonstration. The logic's infer-
ence rules specify how one sentence (the conclusion) may be deduced from
other sentences (the premises).
A sentence is a theorem if and only if it can be deduced from the axioms by
successive applications of inference rules. A proof is a demonstration of
theoremhood; it is a systematic listing of sentences, inference rules, and axioms
that indicate how the given sentence is deduced from the axioms.
The axioms and inference rules of a logic can be presented using a notation
due to Gentzen in which the premises and conclusion are listed above and below
a horizontal bar:
Inference rule:
\[ \frac{\mathit{premise}_1 \quad \mathit{premise}_2 \quad \cdots \quad \mathit{premise}_k}{\mathit{conclusion}} \]
An axiom is an inference rule that has no premises:
10.2. Generating Verification Conditions 215
Axiom:
\[ \frac{}{\mathit{formula}} \]
Axioms and inference rules may be parameterized by meta-variables, which can
be instantiated by objects of the appropriate kind. For example, the reflexive
property of equality might be specified by the axiom:
Reflexivity:
\[ \frac{}{I = I} \]
In a logic for integer arithmetic, for instance, the meta-variable I stands for
any arithmetic expression.
It is convenient to consider a proof as an upward-branching tree, where the
root is the sentence to be proven and the leaves are invocations of axioms. For
example, suppose a logic contains the following axioms and inference rules:
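\[ \frac{}{\mathit{true}} \qquad \frac{A}{(A \mid B)} \qquad \frac{A \quad B}{(A \;\&\; B)} \]
Then a proof of the sentence (true & (true | false)) would be represented by the
following tree of deductions:
\[ \frac{\dfrac{}{\mathit{true}} \qquad \dfrac{\dfrac{}{\mathit{true}}}{(\mathit{true} \mid \mathit{false})}}{(\mathit{true} \;\&\; (\mathit{true} \mid \mathit{false}))} \]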
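The view of a proof as a tree whose leaves are axiom invocations can be made concrete with a small checker. This is a sketch under assumptions: the tuple encoding of formulas and the rule names `true-axiom`, `or-intro`, and `and-intro` are labels invented here for the three rules of the example logic.

```python
# Formulas: "true", "false", ("or", A, B), ("and", A, B).
# A proof node is a triple (rule_name, conclusion, list_of_subproofs).

def conclusion(proof):
    return proof[1]

def check(proof):
    """Return True if every node is a correct instance of one of the rules."""
    rule, concl, subs = proof
    if not all(check(s) for s in subs):
        return False
    premises = [conclusion(s) for s in subs]
    if rule == "true-axiom":            # no premises, conclusion true
        return concl == "true" and premises == []
    if rule == "or-intro":              # from A infer (A | B)
        return (len(premises) == 1 and isinstance(concl, tuple)
                and concl[0] == "or" and concl[1] == premises[0])
    if rule == "and-intro":             # from A and B infer (A & B)
        return (len(premises) == 2 and isinstance(concl, tuple)
                and concl[0] == "and"
                and concl[1] == premises[0] and concl[2] == premises[1])
    return False

true_leaf = ("true-axiom", "true", [])
disjunction = ("or", "true", "false")
goal = ("and", "true", disjunction)
proof = ("and-intro", goal, [true_leaf, ("or-intro", disjunction, [true_leaf])])
bad_proof = ("or-intro", ("or", "false", "true"), [true_leaf])

print(check(proof), check(bad_proof))
```

The second tree is rejected because the premise true is not the left disjunct of its conclusion.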
In a Hoare-style logic for reasoning about programs, the sentences are triples
of the form {P} S {Q}, where P and Q are logical formulae and S is a program
fragment (i.e. a list of statements). A triple is an assertion that a terminating
execution of S begun in a state that satisfies P terminates in a state satisfying Q.
The axioms and inference rules of the logic are rules for manipulating triples;
they permit inferring new triples from old triples. For example, straight-line
code can be verified using the axiom of assignment and the rule of composition
[Hoare69]:
\[ \frac{}{\{P^{id}_{exp}\}\; id := exp \;\{P\}} \qquad \frac{\{P\}\; S_1 \;\{Q\} \quad \{Q\}\; S_2 \;\{R\}}{\{P\}\; S_1 ; S_2 \;\{R\}} \]
(The notation $P^{id}_{exp}$ denotes the result of substituting exp for all occurrences of
id in P.)
In addition to the rules for manipulating triples, there are also rules for rea-
soning about pre- and post-conditions; it is possible to strengthen a pre-
condition and weaken a post-condition:
The proof editor treats (purported) proofs as objects with constraints on them;
modifying one part of a proof may introduce an inconsistency in some other part
of the proof, yet simultaneously correct an inconsistency in a third part of the
proof. Providing feedback to the user involves checking a collection of con-
straints on the components of the proof. The editor keeps the user informed of
errors and inconsistencies in a proof by reexamining the proof's constraints after
each modification and annotating the proof with warning messages to indicate
locations where constraints are violated.
Because SSL permits expressing constraints on a language, it is a suitable for-
malism for creating the proof-checking editor illustrated in Section 10.1. The
axioms and inference rules of the programming logic are expressed as produc-
tions and attribute equations of the editor's defining attribute grammar. Depen-
dences among attributes, as defined in the attribute equations of such a grammar,
express dependences among parts of a proof.
(The programming language discussed in this chapter has a slightly different
abstract syntax from the language used in earlier chapters; the phylum
definitions for the language's abstract syntax are presented in Figure 10.1.)
root program;
program : Prog(form stmtList form);
list stmtList;
stmtList : StmtListNil()
         | StmtListPair(stmt stmtList)
         ;
stmt : EmptyStmt()
     | Assign(identifier exp)
     | IfThenElse(cond stmtList stmtList)
     | While(cond form stmtList)
     ;
Figure 10.1. Phylum definitions for the abstract syntax of programs and statements.
The phyla exp, prop, cond, and form represent arithmetic expressions, arith-
metic propositions, boolean formulae for statement conditions, and logical for-
mulae, respectively; their definitions are given in Figures 10.2 and 10.3.
Generation of verification conditions for a program in this language can be
expressed with an attribute grammar using two attributes pre and post. Attri-
bute pre is a synthesized attribute of nonterminals stmt and stmtList whose
value is a formula in the language of assertions (i.e. an element of phylum
form); attribute post is an inherited attribute of stmt and stmtList whose value
is also a formula in the language of assertions. The relationships among these
attributes that express partial correctness of programs are given by the
attribution equations presented in Figure 10.4 and Figure 10.5, which are adapted
from an attribute grammar given in [Gerhart75].
The attribute equations of the grammar treat statements as backward predicate
transformers [Dijkstra76]. For example, in an assignment-statement the rela-
tionship between attributes pre and post is that attribute pre is defined as attri-
bute post with the expression on the right-hand side of the assignment substi-
tuted for all occurrences of the left-hand-side identifier. For while-loops, the
post-condition of the loop-body and the pre-condition of the parent statement
are both defined in terms of the loop-invariant.
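The backward flow from post to pre can be sketched outside SSL. The code below is an illustration only: it assumes a tuple encoding of formulas, and the names `subst`, `pre_stmt`, and `pre_list` are invented here. It covers assignment and while-loops as described in the paragraph above; other statement forms are omitted.

```python
def subst(term, ident, exp):
    """Replace every occurrence of variable `ident` in `term` by `exp`."""
    if isinstance(term, str):
        return exp if term == ident else term
    if isinstance(term, tuple):
        op, *args = term                  # the operator tag is never substituted
        return (op, *(subst(a, ident, exp) for a in args))
    return term                           # integer literal

def pre_stmt(stmt, post):
    """Compute attribute pre of a statement from its attribute post."""
    kind = stmt[0]
    if kind == "assign":                  # ("assign", ident, exp)
        _, ident, exp = stmt
        return subst(post, ident, exp)
    if kind == "while":                   # ("while", cond, invariant, body)
        return stmt[2]                    # pre of the loop is its invariant
    raise ValueError(f"unsupported statement: {kind}")

def pre_list(stmts, post):
    """Thread post backward through a statement list, last statement first."""
    for stmt in reversed(stmts):
        post = pre_stmt(stmt, post)
    return post

invariant = ("and", (">=", "y", 0),
             ("=", ("+", ("*", "a", "y"), "z"), ("*", "a", "b")))
loop_body = [("assign", "y", ("-", "y", 1)), ("assign", "z", ("+", "z", "a"))]
body_pre = pre_list(loop_body, invariant)     # the body's post is the invariant
init_pre = pre_list([("assign", "y", "b"), ("assign", "z", 0)], invariant)
print(body_pre)
print(init_pre)
```

Here `body_pre` is the formula that the invariance obligation requires to follow from the invariant and the loop-condition, and `init_pre` is the formula that the program's pre-condition must imply.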
Inconsistencies between a proof outline's statements and its assertions are
detected according to the values of the local attributes proof_oblig, invar_oblig,
and exit_oblig, defined by the equations given in Figure 10.5. In these attribute
equations, function IsTheorem is a decision procedure - a procedure that
Figure 10.3. Phylum definitions for statement conditions and logical formulae. Function
CondToForm converts between phylum cond and phylum form.
stmt, stmtList {
    synthesized form pre;
    inherited form post;
    };
program : Prog { stmtList.post = form$2; };
stmtList : StmtListNil { stmtList.pre = stmtList.post; }
    | StmtListPair {
        stmtList$1.pre = stmt.pre;
        stmtList$2.post = stmtList$1.post;
        stmt.post = stmtList$2.pre;
        }
Figure 10.4. Attribute equations to generate pre- and post-conditions for the statements
of the language.
ERROR : NoError()
    | InvarErr()
        [@ : "%t%t%n/* Invariance obligation not established */%b%b"]
    | ExitErr()
        [@ : " /* Exit obligation not established */"]
    | AssertErr()
        [@ : " /* Assertion not established */"]
    ;
program : Prog {
    local ERROR proof_oblig;
    proof_oblig = IsTheorem(Implies(form$1, stmtList.pre))
        ? NoError : AssertErr;
    };
stmt : While {
    local ERROR invar_oblig;
    invar_oblig = IsTheorem(Implies(And(form, CondToForm(cond)),
        stmtList.pre)) ? NoError : InvarErr;
    local ERROR exit_oblig;
    exit_oblig = IsTheorem(Implies(And(form, Not(CondToForm(cond))),
        stmt.post)) ? NoError : ExitErr;
    };
\[ \{A_1, A_2, \ldots, A_m\} \rightarrow \{B_1, B_2, \ldots, B_n\} \tag{10.1} \]
The set $\{A_1, A_2, \ldots, A_m\}$, on the left, is called the antecedent; the set
$\{B_1, B_2, \ldots, B_n\}$, on the right, the succedent. A sequent is a theorem in the
sequent calculus if it can be derived from the system's axioms and rules of
inference.
It can be shown that a sequent is a theorem in the sequent calculus if and only
if, in one of the more familiar forms of the predicate calculus, such as a natural-
deduction calculus or a Hilbert-style calculus, a formula in the succedent can be
demonstrated taking the formulae in the antecedent as assumptions [Gentzen69].
Informally then, one can think of the formulae of the antecedent as known facts
and the formulae of the succedent as goals, one of which is to be demonstrated;
thus, the informal meaning of the sequent (10.1) is no different from asserting
the formula
\[ (A_1 \wedge A_2 \wedge \cdots \wedge A_m) \Rightarrow (B_1 \vee B_2 \vee \cdots \vee B_n) \]
The inference rules of sequent calculus allow us to infer new sequents from
old sequents. For each logical operator there are two inference rules: an
analysis rule and a synthesis rule. The analysis rule for a (logical) operator $\oplus$
expresses how a formula of the form $A \oplus B$ may be introduced into an
antecedent; the synthesis rule for $\oplus$ expresses how $A \oplus B$ may be introduced
into a succedent.
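Implication analysis:
\[ \frac{\Gamma \rightarrow \Delta \cup \{A\} \qquad \Gamma \cup \{B\} \rightarrow \Delta}{\Gamma \cup \{A \Rightarrow B\} \rightarrow \Delta} \tag{10.2a} \]
Implication synthesis:
\[ \frac{\Gamma \cup \{A\} \rightarrow \Delta \cup \{B\}}{\Gamma \rightarrow \Delta \cup \{A \Rightarrow B\}} \tag{10.2b} \]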
The implication analysis rule (10.2a) says (roughly) that to prove a goal under
the assumption $A \Rightarrow B$, we must demonstrate A, and we must demonstrate our
goal assuming B. The implication synthesis rule (10.2b) says (again roughly)
that to demonstrate $A \Rightarrow B$, we must show that by assuming A we can
demonstrate B.
An attribute grammar can be used to express the rules of sequent calculus as
follows. A sequent is represented by an instance of the nonterminal proof
together with its two inherited attributes: ant (for antecedent) and suc (for
succedent). Each of the operators for proof represents a rule of inference or an
axiom scheme (see below). The operators that correspond to inference rules
contain additional proof nonterminals whose ant and suc attributes are defined
in terms of attributes ant and suc of the parent proof nonterminal. The context
in which the inference rule is invoked is checked to ensure that it represents an
appropriate deductive step.
For example, the productions corresponding to the implication inference rules
are shown in Figure 10.6. In each production in Figure 10.6, there are two form
nonterminals on the right-hand side that determine how an inference rule is
instantiated. The formulae derived from them determine the components of the
formula being analyzed or synthesized, as well as the antecedents and
succedents of the right-hand-side proof nonterminals. The equations for the error
attribute ensure that the formula being analyzed (synthesized) really is in the
left-hand-side proof nonterminal's ant (suc) attribute.
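The axioms of the sequent calculus are expressed in three schemes:
\[ \Gamma \cup \{\mathit{false}\} \rightarrow \Delta \tag{10.3a} \]
\[ \Gamma \cup \{A\} \rightarrow \Delta \cup \{A\} \tag{10.3b} \]
\[ \Gamma \rightarrow \Delta \cup \{\mathit{true}\} \tag{10.3c} \]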
10.3. Checking Proofs of Verification Conditions 223
proof {
    inherited condList ant;
    inherited condList suc;
    synthesized ERROR error;
    };
proof : ImpAnalysis(form form proof proof)
    [@ : "analyze (" @ " => " @ ") by implication-analysis" proof$1.error
        "%nshow" form$1 "%t%n"
        @ "%b%n"
        "assume" form$2 "%t%n"
        @ "%b%I"]
    {
    proof$1.error = InList(Implies(form$1, form$2), proof$1.ant)
        ? NoError : AnErr;
    proof$2.ant = proof$1.ant;
    proof$2.suc = form$1 :: proof$1.suc;
    proof$3.ant = form$2 :: proof$1.ant;
    proof$3.suc = proof$1.suc;
    }
    | ImpSynthesis(form form proof)
    [@ : "synthesize (" @ " => " @ ") by implication-synthesis" proof$1.error
        "%nassume " form$1 "%n"
        "show" form$2 "%t%n"
        @ "%b%I"]
    {
    proof$1.error = InList(Implies(form$1, form$2), proof$1.suc)
        ? NoError : SynErr;
    proof$2.ant = form$1 :: proof$1.ant;
    proof$2.suc = form$2 :: proof$1.suc;
    }
Axiom scheme (10.3a) says that if false is assumed, then any formula can be
demonstrated; (10.3b) says that if formula A is assumed, then A is
demonstrated; (10.3c) says that true is demonstrated, no matter what the assumptions
are. These three axiom schemes can be combined into a single production
whose local attribute error indicates whether an application of the production
completes a branch of the proof. The SSL specification for this production is
shown in Figure 10.7.
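The way ant and suc flow down a proof tree while error checks flow up can be imitated in a few lines. The sketch below is not the SSL specification: it assumes a tuple encoding of proofs, uses Python lists for the antecedent and succedent, and the rule tags and error strings are invented for illustration. It covers only the two implication rules and the combined axiom production.

```python
def implies(a, b):
    return ("=>", a, b)

def check(proof, ant, suc):
    """Return the list of errors found when `proof` is offered for ant -> suc."""
    rule = proof[0]
    if rule == "axiom":
        # Combined test for axiom schemes (10.3a), (10.3b), and (10.3c)
        ok = "false" in ant or "true" in suc or any(f in suc for f in ant)
        return [] if ok else ["axiom does not apply"]
    if rule == "imp-analysis":       # ("imp-analysis", A, B, proof, proof)
        _, a, b, p1, p2 = proof
        errs = [] if implies(a, b) in ant else ["formula not in antecedent"]
        # First subproof: show A. Second subproof: assume B.
        return errs + check(p1, ant, [a] + suc) + check(p2, [b] + ant, suc)
    if rule == "imp-synthesis":      # ("imp-synthesis", A, B, proof)
        _, a, b, p1 = proof
        errs = [] if implies(a, b) in suc else ["formula not in succedent"]
        return errs + check(p1, [a] + ant, [b] + suc)
    raise ValueError(f"unknown rule: {rule}")

axiom = ("axiom",)
modus_ponens = ("imp-analysis", "p", "q", axiom, axiom)
good = check(modus_ponens, ["p", implies("p", "q")], ["q"])
identity = check(("imp-synthesis", "p", "p", axiom), [], [implies("p", "p")])
bad = check(("imp-synthesis", "p", "q", axiom), [], ["r"])
print(good, identity, bad)
```

An empty error list corresponds to a tree in which every node's error attribute is NoError.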
};
Figure 10.7. Definitions for the operator corresponding to axiom schemes (10.3a) -
(10.3c). The attribute equation of the rule checks whether an application of the
production completes a branch of the proof.
The axioms and inference rules of a sequent calculus for first-order predicate
logic are listed in Figure 10.8.
We also need to provide inference rules for terms of phyla prop and exp. For
example, rules for equality must be incorporated in our logical system. The
rules for equality can be expressed as follows, using the meta-variables J, K,
and L to represent expressions (e.g. arithmetic expressions):
\[ \frac{\Gamma \cup \{J = J\} \rightarrow \Delta}{\Gamma \rightarrow \Delta} \tag{10.4a} \]
Axiom schema.
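$\wedge$ analysis:
\[ \frac{\Gamma \cup \{A, B\} \rightarrow \Delta}{\Gamma \cup \{A \wedge B\} \rightarrow \Delta} \]
$\wedge$ synthesis:
\[ \frac{\Gamma \rightarrow \Delta \cup \{A\} \qquad \Gamma \rightarrow \Delta \cup \{B\}}{\Gamma \rightarrow \Delta \cup \{A \wedge B\}} \]
$\neg$ analysis:
\[ \frac{\Gamma \rightarrow \Delta \cup \{A\}}{\Gamma \cup \{\neg A\} \rightarrow \Delta} \]
$\neg$ synthesis:
\[ \frac{\Gamma \cup \{A\} \rightarrow \Delta}{\Gamma \rightarrow \Delta \cup \{\neg A\}} \]
$\exists$ analysis:
\[ \frac{\Gamma \cup \{A(b)\} \rightarrow \Delta}{\Gamma \cup \{\exists x.A(x)\} \rightarrow \Delta} \]
where b is a variable not occurring free in A(x)
$\exists$ synthesis:
\[ \frac{\Gamma \rightarrow \Delta \cup \{A(t)\} \cup \{\exists x.A(x)\}}{\Gamma \rightarrow \Delta \cup \{\exists x.A(x)\}} \]
where t is a term free for x in A(x)
$\forall$ analysis:
\[ \frac{\Gamma \cup \{A(t)\} \cup \{\forall x.A(x)\} \rightarrow \Delta}{\Gamma \cup \{\forall x.A(x)\} \rightarrow \Delta} \]
where t is a term free for x in A(x)
$\forall$ synthesis:
\[ \frac{\Gamma \rightarrow \Delta \cup \{A(b)\}}{\Gamma \rightarrow \Delta \cup \{\forall x.A(x)\}} \]
where b is a variable not occurring free in A(x)
Figure 10.8. Axioms and inference rules of a sequent calculus for first-order predicate
logic.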
Figure 10.9. Grammar rules corresponding to the inference rules for equality.
with decision procedures - algorithms for deciding simple theories - and proof
tactics - procedures that attempt to construct a proof tree but may leave some
nodes unexpanded for the user to fill in later [Gordon79].
The editor can be extended with decision procedures by making use of known
algorithms for deciding simple theories. For example, an algorithm for deciding
the theory of equality with uninterpreted function symbols can be used as the
basis of a procedure for propositional inference [Nelson81, Johnson82]. Such
algorithms can be incorporated into the editor using the production
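requires the use of the implication synthesis rule to establish the formula $c \Rightarrow d$,
as in the starred branch of the following proof tree:
\[ \frac{\{a,b,d\} \rightarrow \{a\} \qquad \dfrac{\{a,b,d\} \rightarrow \{b\} \qquad \dfrac{\{a,b,c,d\} \rightarrow \{d\} \;\; (*)}{\{a,b,d\} \rightarrow \{c \Rightarrow d\}}}{\{a,b,d\} \rightarrow \{b \wedge (c \Rightarrow d)\}}}{\{a,b,d\} \rightarrow \{a \wedge (b \wedge (c \Rightarrow d))\}} \]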
Cut rule (10.6) says that to prove some goal, if we can demonstrate some
formula A, then A can be used as an assumption in the proof of the goal. The cut
rule permits a user to isolate a formula rather easily by using automatic
inferences to skip over the easy intermediate steps.
Example. Returning to the example above, if we choose $c \Rightarrow d$ as the cut
formula A, the proof branches into two subproofs whose sequents are
and
We are then able to apply implication synthesis directly to (10.7a), and the
automatic inference rule can be used to establish (10.7b) because a proof can be
found that makes no use of implication synthesis:
A proof tactic is a method for applying inference rules repeatedly and
recursively until none is applicable [Gordon79]. In proof editors, a proof tactic may
be employed to automatically construct a proof fragment; the tactic may
sometimes construct a complete proof tree, but in general it will leave some
unexpanded proof nodes for the user to fill in later [Bates85].
Example. Given an unexpanded proof nonterminal representing the sequent
given above as equation (10.5), a proof tactic could apply the and-synthesis rule
twice to produce the attributed derivation tree that corresponds to the inference:
\[ \frac{\{a,b,d\} \rightarrow \{a\} \qquad \dfrac{\{a,b,d\} \rightarrow \{b\} \qquad \{a,b,d\} \rightarrow \{c \Rightarrow d\}}{\{a,b,d\} \rightarrow \{b \wedge (c \Rightarrow d)\}}}{\{a,b,d\} \rightarrow \{a \wedge (b \wedge (c \Rightarrow d))\}} \]
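A tactic of this kind can be sketched as a recursive function that expands conjunctions and reports what is left over. The code is an illustration under assumptions: sequents are simplified to a set of assumptions and a single goal formula, leaves that match an assumption are closed by axiom scheme (10.3b), and the names `and_synthesis_tactic` and `open_goals` are invented here.

```python
def and_synthesis_tactic(ant, goal):
    """Apply and-synthesis repeatedly; leave other goals unexpanded."""
    if isinstance(goal, tuple) and goal[0] == "and":
        left = and_synthesis_tactic(ant, goal[1])
        right = and_synthesis_tactic(ant, goal[2])
        return ("and-synthesis", goal, [left, right])
    if goal in ant:
        return ("axiom", goal, [])          # goal is among the assumptions
    return ("unexpanded", goal, [])         # left for the user to fill in

def open_goals(tree):
    """Collect the goals of all unexpanded leaves, left to right."""
    rule, goal, subs = tree
    if rule == "unexpanded":
        return [goal]
    return [g for s in subs for g in open_goals(s)]

ant = {"a", "b", "d"}
goal = ("and", "a", ("and", "b", ("=>", "c", "d")))
tree = and_synthesis_tactic(ant, goal)
remaining = open_goals(tree)
print(remaining)
```

For the sequent of the example, and-synthesis is applied twice and the only node left unexpanded is the one whose goal is the implication.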
10.4. Automatic Deductive Capabilities 229
Figure 10.10. Specification of a proof tactic. Input of the form !A (respectively !S) is
interpreted as a command to insert, at the current proof nonterminal, an analysis production
(synthesis production) that is appropriate for the outermost operator of the first formula
listed in the current proof nonterminal's ant (suc) attribute.
CHAPTER 11
The Implementation
• The editing kernel, consisting of the four subcomponents that are the back-
bone of all generated editors. Although each different SSL specification
generates an editor with distinct characteristics, all share the common opera-
tions and user interface of the editing kernel.
• The generator proper, which is the part of the system that generates editors
from editor specifications. This is split into two parts: the SSL translator, the
part that actually processes the SSL source, and a shell program, named
sgen, that coordinates the activity of the SSL translator with that of several
other UNIX utilities employed in the task of creating an editor.
Grammar tables
An editor's grammar tables contain information about the editor's defining attri-
bute grammar. Among them are six tables that contain properties of phyla, phy-
lum occurrences, operators, operator occurrences (productions), attributes, and
attribute occurrences. The general rule followed in these tables is that properties
are contained in the most general structure that applies. For example, properties
of phylum occurrences (respectively, operator occurrences and attribute
occurrences) that are common to all occurrences of the given phylum (respec-
tively, operator and attribute) are factored out and stored in the phylum's
(operator's, attribute's) entry in the table of phylum (operator, attribute) proper-
ties.
¹In fact, there are some slight differences between a PROD_INSTANCE that represents a syntactic
value and one that represents (a component of) an attribute value; however, both are subsumed under
the type PROD_INSTANCE.
The terminal's screen is divided into windows, with each window divided into
three regions, named (from top to bottom) the command line, the object pane,
and the help pane. Any command names or transformation names that the user
types are echoed on the command line. The command line is also used to
display system messages. The window's buffer is displayed in the object pane.
The collection of transformations that are enabled for the buffer's current selec-
tion are displayed in the help pane.
In editors generated for ordinary video-display terminals, windows are tiled
on the screen, arranged in horizontal slabs. In editors generated for
high-resolution, bitmapped workstations, resizable and overlapping windows are
supported. For machines with a mouse, the object pane is equipped with a
horizontal scroll bar along its bottom edge and a vertical scroll bar along its right
edge.
The SSL translator takes an SSL specification as input and produces the editor's
grammar tables, the byte-code sequences to be used by the interpreter for
evaluating the specification's SSL expressions, and several files that define the
editor's scanner and parser. The translator creates the grammar tables as an
intermediate file of C source code consisting of a number of initialized arrays. It
generates a second intermediate file consisting of the stack code for evaluating
the specification's expressions. It also generates lex source for producing the
editor's scanner and yacc source for producing the parser.
These files are processed further by other tools to create the final executable
image. Most of the tools involved are standard UNIX tools, including lex
[Lesk75], yacc [Johnson78], the C preprocessor, and the C compiler. The
activities of the different tools that participate in creating an editor are coordi-
nated by the sgen shell program.
The SSL translator performs several tests and normalizations on the declara-
tions that appear in an editor specification. For example, the specification's
phylum/operator definitions are tested to see if any phylum's completing term is
defined circularly. In addition, copy rules are generated for uses of upward
remote attribute sets. These normalizations are described in Section 11.2 and
Section 11.3, respectively.
To detect errors in editor specifications, the SSL translator checks for type
errors in the specification's expressions, which appear in attribute equations,
function declarations, transformation declarations, and entry declarations. Our
experience has been that the SSL type checker detects a high percentage of
errors in editor specifications.
The SSL translator also performs other analyses aimed at improving the per-
formance of the incremental attribute evaluation algorithm. In particular, the
translator applies a test that determines whether the specification's attribute
equations define an ordered attribute grammar [Kastens80]; if the grammar
passes the test, the translator generates evaluation plans for a visit-sequence
evaluator. The orderedness test is described in Chapter 12, "Incremental Attri-
bute Evaluation for Ordered Attribute Grammars."
A second analysis involves examining each expression that appears on the
right-hand side of an attribute equation to determine whether the expression
procedure TestFinitenessOfCompletingTerms(Syntax)
declare
    Syntax: a set of phylum/operator declarations
    p, q: phyla of Syntax
    G: a directed graph <vertex set, edge set>
    i: an integer
    Op: an operator of Syntax
    Phy_{Op,i}: the phylum of Op's i-th argument
begin
    G := <{the phyla in Syntax}, ∅>
    for each phylum p in Syntax do
        if p is a non-optional list phylum then
            q := the phylum of the first argument of p's binary operator
            Insert edge (p, q) into the edge set of G
        else if p is an optional phylum then skip
        else
            Op := the completing operator of phylum p
            for i := 1 to NumberOfArguments(Op) do
                Insert edge (p, Phy_{Op,i}) into the edge set of G
            od
        fi
    od
    if G contains a directed cycle then
        error: "The completing terms of Syntax are not finite"
    fi
end
Figure 11.1. An algorithm to test whether the completing terms of an editor specification
are finite.
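The test of Figure 11.1 amounts to building a graph and searching it for a directed cycle. A minimal Python sketch follows. The dictionary encoding of phylum declarations (keys `kind`, `element`, and `completing_args`) and the sample declarations are assumptions made for illustration, and the cycle search uses an ordinary depth-first traversal, which the figure does not prescribe.

```python
def completing_term_edges(syntax):
    """Build the edge lists of graph G from the phylum declarations."""
    edges = {p: [] for p in syntax}
    for p, decl in syntax.items():
        if decl["kind"] == "list":          # non-optional list phylum
            edges[p].append(decl["element"])
        elif decl["kind"] == "optional":    # optional phylum: no edges
            continue
        else:                               # arguments of the completing operator
            edges[p].extend(decl["completing_args"])
    return edges

def has_cycle(edges):
    """Depth-first search for a directed cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in edges}

    def visit(v):
        color[v] = GREY
        for w in edges.get(v, []):
            state = color.get(w, WHITE)
            if state == GREY:               # back edge: cycle found
                return True
            if state == WHITE and visit(w):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in list(edges))

# Hypothetical declarations; which operator completes each phylum is assumed.
finite = {
    "program": {"kind": "plain", "completing_args": ["form", "stmtList", "form"]},
    "stmtList": {"kind": "list", "element": "stmt"},
    "stmt": {"kind": "plain", "completing_args": []},
    "form": {"kind": "plain", "completing_args": []},
}
circular = {"exp": {"kind": "plain", "completing_args": ["exp", "exp"]}}

finite_result = has_cycle(completing_term_edges(finite))
circular_result = has_cycle(completing_term_edges(circular))
print(finite_result, circular_result)
```

The second declaration set is rejected because the completing term of exp would have to contain itself.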
may be used on the right-hand side of an attribute equation to denote the value
of an attribute instance that occurs higher up in the derivation tree. Each $id_i$ is
either a phylum name or an operator name. If $id_i$ is a phylum name, then $attr_i$
must be a synthesized or inherited attribute of that phylum; if $id_i$ is an operator
name, then $attr_i$ must be a local attribute of a production with that operator as its
right-hand side.
Within the attribute equations of a production p, an upward remote attribute
set refers to an attribute that is not local to p; it refers to an attribute of a
different production, one that necessarily occurs above any instance of production
p. By "above," we mean "between any instance of the production p and the
root of the tree." The value of an upward remote attribute set in a given instance
of production p is the value of the $id_i.attr_i$ that occurs first on the path from the
given production instance to the root of the tree, not including the given
production instance or its left-hand-side phylum.
For an upward remote attribute set to be well formed, at least one of the
$id_i.attr_i$ must be guaranteed to occur in every conceivable tree that can contain
the given production. In different instances of p, the upward remote attribute
set can be resolved to different $id_i.attr_i$ of the set. Note that each $id_i$ must be
unique in a given upward remote attribute set. If $id_i$ is a phylum and $id_j$ is an
operator of phylum $id_i$, then $id_j.attr_j$ takes precedence over $id_i.attr_i$.
More specifically, suppose an upward remote attribute set
Now consider a particular instance of p in some derivation tree. Let $Op_0, Op_1,
Op_2, \ldots, Op_k$ be the sequence of operator instances on the path from the given
instance of p to the root of the tree. Let j be the least integer, $1 \le j \le k$, such that
either some $id_i$ is $Op_j$ (where $1 \le i \le n$), or else some $id_i$ is the left-hand-side
phylum occurrence of $Op_j$. Then, in this instance of p, $\{id_1.attr_1, id_2.attr_2, \ldots,
id_n.attr_n\}$ denotes the value of $id_i.attr_i$ in $Op_j$. An occurrence of an upward
remote attribute set is improper if there exists a derivation tree such that none of
the $id_i$ appear on the path to the root.
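The meaning of an upward remote attribute set can be illustrated by a direct walk toward the root. This sketch shows the semantics only, not how generated editors evaluate the notation. The `Node` class is hypothetical: each node stands for one production instance and, as a simplification, keeps the attributes of its left-hand-side phylum and its local attributes in a single dictionary.

```python
class Node:
    """One production instance: its left-hand-side phylum and its operator."""
    def __init__(self, phylum, operator, attrs=None, parent=None):
        self.phylum = phylum
        self.operator = operator
        self.attrs = attrs or {}
        self.parent = parent

def resolve(node, remote_set):
    """Resolve {id_1.attr_1, ..., id_n.attr_n} for the given production instance.

    `remote_set` maps each id (a phylum or operator name) to its attribute name.
    The search starts at the parent, so the given instance itself is excluded.
    """
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.operator in remote_set:   # an operator takes precedence
            return ancestor.attrs[remote_set[ancestor.operator]]
        if ancestor.phylum in remote_set:
            return ancestor.attrs[remote_set[ancestor.phylum]]
        ancestor = ancestor.parent
    raise LookupError("improper upward remote attribute set")

root = Node("A", "OpA", {"a": 1})
middle = Node("B", "OpB", {"b": 2}, parent=root)
leaf = Node("C", "OpC", {}, parent=middle)
remote = {"A": "a", "B": "b"}
from_leaf = resolve(leaf, remote)       # nearest match is B.b
from_middle = resolve(middle, remote)   # B itself is excluded, so A.a is used
print(from_leaf, from_middle)
```

Because the ids in a set must be unique, a dictionary keyed by id is sufficient to represent the set.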
Example. Consider the upward remote attribute set that appears on the right-
hand side of the attribute equation in the following declarations:
In each instance p of B : Op(C, D), the instance of C.c receives the value of
the instance of A.a, B.b, or D.d that occurs closest to p on the path from p's
left-hand-side phylum instance to the root of the derivation tree. Note that SSL
does allow B.b and D.d to occur in the upward remote attribute set, as
illustrated above; they refer to the closest instance of these attributes not including
the attributes of the current production instance.
To accommodate upward remote attribute sets, the SSL translator applies an
algorithm to translate an editor specification that contains upward remote attri-
bute sets to one without such abbreviations. In the process, it introduces new
inherited attributes and defining equations that propagate appropriate attribute
values, as necessary, to each instance of an upward remote attribute set in the
derivation tree. The algorithm also detects improper uses of the notation.
The basic idea underlying the algorithm is as follows: for each occurrence of
an upward remote attribute set, a search of the grammar is initiated to determine
the set of productions deriving the phyla on the left-hand sides of the produc-
tions whose equations refer to upward remote attribute sets. Any production
thus found whose left-hand-side phylum is not among those to which a remote
reference has been made constitutes a potential "conduit" through which the
value of one of the attributes can flow. Under the assumption that a "source"
production defining the value of one of the attributes in an upward remote attri-
bute set will eventually be found, we add to each production encountered a new
attribute and equation that will cause the value of one of the remote attributes
sought, if found, to be propagated.
A branch of the search for a given upward remote attribute set S may ter-
minate in one of three ways:
1) A production is encountered that contains one of the members of S. In this
case, an attribute equation is generated to pass the value down the tree.
2) A production is encountered whose left-hand-side phylum already has an
attribute that passes the set S down the tree. In this case, a previous branch
of the search has already encountered the production, and there is no need to
repeat the search.
3) A production is encountered whose left-hand-side phylum symbol is the start
symbol, but the start symbol is not among the phyla of S. This indicates that
the notation has been used improperly, because otherwise this branch of the
search would have terminated at a production that defines one of S's attri-
butes.
The fact that productions are effectively "marked" by the addition of attribute
equations to pass a specific upward remote attribute set ensures that the algo-
rithm terminates, since there are only a finite number of productions in an editor
specification.
A unique attribute identifier must be provided for each set of attributes to
which a remote reference is made. This identifier is used to label the newly
introduced attributes of any production through which a value among those in
the corresponding attribute set must be propagated. Note that even if the
attribute sets in several different upward remote attribute sets overlap, we must, in
general, provide a separate attribute to propagate a value in each set because the
values being passed down are not always the same.
We use the following notation for newly introduced attribute identifiers: for
upward remote attribute set S, we denote the corresponding attribute identifier
as [S]. For example, $[id_1.attr_1, \ldots, id_n.attr_n]$ represents the (inherited)
attribute that is used to pass down the value of one of the attributes in the set
$\{id_1.attr_1, id_2.attr_2, \ldots, id_n.attr_n\}$.
To distinguish among multiple occurrences of the same phylum on the right-
hand side of some production, we make the following definitions.
• A phylum X has context <P, j> if X occurs on the right-hand side of
production P as the j-th occurrence of an X phylum in P.
• For a given grammar, each phylum Z occurs in some bounded number of
contexts. NumContexts(Z) denotes the number of contexts in which Z
occurs.
• Each phylum's contexts can be assigned a position in the range
1 .. NumContexts(Z). The notation Context(Z, i) denotes the i-th context in
which Z occurs, for 1 ≤ i ≤ NumContexts(Z).
Using this notation, procedure GenerateCopyRules for generating copy rules to
resolve uses of upward remote attribute sets is given in Figure 11.2.
The procedure of Figure 11.2 can be improved by detecting cases in which
the values that will be propagated by attributes corresponding to different upward
remote attribute sets are guaranteed to be the same. This will be the case in
regions of the grammar where all the attributes that occur in the (set) difference
of two upward remote attribute sets cannot exist on any path from the root to the
given production. The algorithm of Figure 11.2 will generate separate new attri-
butes for the two upward remote attribute sets. This can be avoided by using
attribute-identifier labels that correspond to the subset of the upward remote
attribute set's elements whose phylum or operator component can occur on
some derivation path to the given production.
Assuming the sets of phylum names and operator names are disjoint, for each
phylum X in grammar G , we define CanDerive(X) to be the following set:
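\[
\begin{aligned}
\mathit{CanDerive}(X) ={} & \{\, Y \mid \exists \text{ derivations in } G \text{ such that } \mathit{ROOT} \Rightarrow^{*} \alpha Y \beta \text{ and } Y \Rightarrow^{+} \gamma X \delta \,\} \\
{}\cup{} & \{\, Op \mid \exists \text{ derivations in } G \text{ such that } \mathit{ROOT} \Rightarrow^{*} \alpha Y \beta, \\
& \quad Y : Op(\gamma Z \delta) \text{ is a production, and } Z \Rightarrow^{*} \rho X \pi \,\}.
\end{aligned}
\]
For phylum Z and attribute set $S = \{id_1.attr_1, \ldots, id_n.attr_n\}$, define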
procedure GenerateCopyRules
declare
    worklist: a set of attributes
    S: a set of attributes
    X, W, Z: phylum names
    P: a production
    i, j: integers
begin
    worklist := ∅
    for each occurrence of {id_1.attr_1, ..., id_n.attr_n} in any production W → α do
        S := {id_1.attr_1, ..., id_n.attr_n}
        Replace {id_1.attr_1, ..., id_n.attr_n} with a new attribute W.[S]
        Insert W.[S] into worklist
    od
    while worklist ≠ ∅ do
        Z.[S] := RemoveElement(worklist)
        for i := 1 to NumContexts(Z) do
            <P, j> := Context(Z, i)
            X := the left-hand-side phylum of P
            if for any local attribute x of P, P.x ∈ S then
                Add the equation "Z$j.[S] := x" to the equations of P
            else if for any attribute x of X, X.x ∈ S then
                Add the equation "Z$j.[S] := X.x" to the equations of P
            else if X is the grammar's start symbol then
                error: "improper upward remote reference"
            else if X has attribute [S] then
                Add the equation "Z$j.[S] := X.[S]" to the equations of P
            else
                Attach a new inherited attribute [S] to X
                Insert X.[S] into worklist
                Add the equation "Z$j.[S] := X.[S]" to the equations of P
            fi
        od
    od
end
Figure 11.2. An algorithm for generating copy rules to resolve uses of upward remote
attribute sets.
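The worklist structure of Figure 11.2 carries over to Python almost line for line. The following is a sketch under assumptions, not the translator's code: the grammar encoding, the function name `generate_copy_rules`, the sample grammar, and the textual form of the generated equations are all invented for illustration. The two branches of the figure that add the equation "Z$j.[S] := X.[S]" are merged through the helper `attach`.

```python
def generate_copy_rules(productions, phylum_attrs, start, uses):
    """Generate copy rules for upward remote attribute sets.

    productions:  {operator: (lhs_phylum, [rhs phyla], {local attribute names})}
    phylum_attrs: {phylum: {attribute names}}
    uses:         iterable of (W, S); W is the left-hand-side phylum of a
                  production that mentions S, a frozenset of (id, attr) pairs
    Returns (equations, carries); carries holds the pairs (phylum, S)
    for which a new attribute [S] was introduced.
    """
    carries, worklist, equations = set(), [], []

    def attach(phylum, s):
        if (phylum, s) not in carries:      # each pair is processed only once
            carries.add((phylum, s))
            worklist.append((phylum, s))

    for w, s in uses:
        attach(w, s)
    while worklist:
        z, s = worklist.pop()
        label = "[" + ", ".join(sorted(f"{i}.{a}" for i, a in s)) + "]"
        for op, (x, rhs, local_attrs) in productions.items():
            j = 0
            for phylum in rhs:
                if phylum != z:
                    continue
                j += 1                      # context <P, j> of Z
                target = f"{z}${j}.{label}"
                local = sorted(a for i, a in s if i == op and a in local_attrs)
                of_lhs = sorted(a for i, a in s
                                if i == x and a in phylum_attrs.get(x, set()))
                if local:
                    equations.append((op, f"{target} := {local[0]}"))
                elif of_lhs:
                    equations.append((op, f"{target} := {x}.{of_lhs[0]}"))
                elif x == start:
                    raise ValueError("improper upward remote reference")
                else:
                    attach(x, s)
                    equations.append((op, f"{target} := {x}.{label}"))
    return equations, carries

# Hypothetical grammar: A : Top(B), B : Mid(C), C : Leaf(); Leaf uses {A.a}.
grammar = {"Top": ("A", ["B"], set()), "Mid": ("B", ["C"], set()),
           "Leaf": ("C", [], set())}
S = frozenset({("A", "a")})
equations, carries = generate_copy_rules(grammar, {"A": {"a"}}, "A", [("C", S)])
for op, eq in equations:
    print(op, ":", eq)
```

In this small grammar the value of A.a reaches C through one conduit attribute on B; the set `carries` plays the role of the marking that guarantees termination.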
procedure GenerateCopyRules
declare
    worklist: a set of attributes
    S, T: a set of attributes
    X, W, Z: phylum names
    P: a production
    i, j: integers
begin
    worklist := ∅
    for each occurrence of {id_1.attr_1, ..., id_n.attr_n} in any production W → α do
        S := {id_1.attr_1, ..., id_n.attr_n}
        T := Project(S, W)
        Replace {id_1.attr_1, ..., id_n.attr_n} with a new attribute W.[T]
        Insert W.[T] into worklist
    od
    while worklist ≠ ∅ do
        Z.[S] := RemoveElement(worklist)
        for i := 1 to NumContexts(Z) do
            <P, j> := Context(Z, i)
            X := the left-hand-side phylum of P
            if for any local attribute x of P, P.x ∈ S then
                Add the equation "Z$j.[S] := x" to the equations of P
            else if for any attribute x of X, X.x ∈ S then
                Add the equation "Z$j.[S] := X.x" to the equations of P
            else if X is the grammar's start symbol then
                error: "improper upward remote reference"
            else if X has attribute [T] such that T = Project(S, X) then
                Add the equation "Z$j.[S] := X.[T]" to the equations of P
            else
                T := Project(S, X)
                Attach a new inherited attribute [T] to X
                Insert X.[T] into worklist
                Add the equation "Z$j.[S] := X.[T]" to the equations of P
            fi
        od
    od
end
Figure 11.3. An improved algorithm for generating copy rules to resolve uses of upward
remote attribute sets.
for every shadow reference to a node. At the end of Stage 1, each node's
reference-count field reflects the total number of references to the node.
Stage 2:
The second stage empties the zero-count set; every node in zc_set whose
reference count is still zero is immediately reclaimed, together with all attri-
butes and descendants whose reference count drops to zero as a result. By
virtue of Stage 1, no value referenced from the SSL stack will be reclaimed.
Stage 3:
The third stage re-establishes the invariant condition: "All references
emanating from the stack are shadow references." The actual reference
count of every node referred to from the SSL stack is decremented; that is,
all references from the stack are converted back into shadow references. If,
in the process, a node's reference count drops back to zero, the node is again
inserted into the zc_set.
Operation rc_alloc(n) performed on a newly-created node n merely sets the
actual reference count of the node to zero. For a node created by the SSL inter-
preter, the node is immediately pushed onto the expression stack, thereby creat-
ing a shadow reference. At the same time, the newly created node is inserted
into the zero-count set so that, in the event that it is popped off the stack without
ever being referenced, it may eventually be reclaimed.
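The three stages and the rc_alloc operation can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not the actual storage manager: the class names `Node` and `Heap`, the `live` flag that stands in for returning storage to the free list, and the treatment of child pointers as the only actual references are all invented for the illustration.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.rc = 0        # actual reference count; shadow references excluded
        self.live = True   # False once the node has been reclaimed


class Heap:
    def __init__(self):
        self.stack = []        # references from the stack are shadow references
        self.zc_set = set()    # the zero-count set

    def rc_alloc(self, name, children=()):
        node = Node(name, children)       # actual reference count starts at zero
        for child in node.children:
            child.rc += 1                 # a child pointer is an actual reference
        self.stack.append(node)           # pushing creates a shadow reference
        self.zc_set.add(node)
        return node

    def collect(self):
        # Stage 1: count every shadow reference as an actual reference.
        for node in self.stack:
            node.rc += 1
        # Stage 2: empty the zero-count set, reclaiming nodes still at zero
        # together with descendants whose count drops to zero as a result.
        pending = list(self.zc_set)
        self.zc_set.clear()
        while pending:
            node = pending.pop()
            if node.rc == 0 and node.live:
                node.live = False
                for child in node.children:
                    child.rc -= 1
                    if child.rc == 0:
                        pending.append(child)
        # Stage 3: convert stack references back into shadow references.
        for node in self.stack:
            node.rc -= 1
            if node.rc == 0:
                self.zc_set.add(node)


heap = Heap()
leaf = heap.rc_alloc("leaf")
parent = heap.rc_alloc("parent", [leaf])
heap.stack.pop()            # parent is popped without ever being referenced
heap.collect()
after_first = (parent.live, leaf.live, leaf in heap.zc_set)
heap.stack.pop()            # now leaf leaves the stack as well
heap.collect()
after_second = leaf.live
```

After the first collection the parent has been reclaimed, while the leaf survives because it is still referenced from the stack; it goes back into the zero-count set in Stage 3 and is reclaimed by the second collection.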
CHAPTER 12
Incremental Attribute Evaluation for
Ordered Attribute Grammars
By far the most involved of the algorithms employed in the Synthesizer Genera-
tor are its algorithms for incremental attribute updating. In generated editors,
incremental attribute updating is carried out by one of several change-propagation
algorithms. One of these algorithms, which works for arbitrary noncircular
attribute grammars, is described in [Reps82], [Reps83], and Chapter 5 of
[Reps84]. Alternatively, when the grammar falls into the class of
ordered attribute grammars [Kastens80], a much more efficient algorithm spe-
cialized to that class can be used.
The ordered attribute grammars are a subclass of the noncircular grammars,
and almost all grammars arising in practice are ordered. However, if a
specification fails the orderedness test, an editor can still be generated that uses
the more general change-propagation algorithm. Any grammar that is circular
will fail to be ordered. In this case, the generated editor may loop endlessly as it
follows a cycle of attribute dependences in the abstract-syntax tree. To detect
this possibility in advance, Release 3 will incorporate a general test of circular-
ity.
Editors generated from ordered specifications are far more efficient than edi-
tors generated from unordered specifications. First, the change-propagation
algorithm runs far more quickly; second, the version of the attributed-tree
module tailored for ordered grammars uses a much more compact representation
of tree nodes in which certain extra information needed for the more general
change-propagation algorithm is eliminated.
procedure Evaluate(T)
declare
    T: an unattributed derivation tree
    S: a set of production instances
    p, p': production instances
begin
    S := the set of deficient production instances of T
    while S ≠ ∅ do
        Select and remove a production instance p from S
        Update(p)
        for each production instance p' that is a neighbor of p do
            if p' is deficient then Insert p' into S fi
        od
    od
end
set of p's attributes are available. A transition of the machine represents either
an instruction to evaluate an attribute or an instruction to transfer control to a
neighboring machine.
To make it easier to discuss distributed-control evaluation, we introduce the
following terminology: if production instance t0 → t1 ... tn is an instance of
production p, we say that p applies at node t0; the machine for the production
that applies at node t0 will be referred to as the machine that applies at t0.
Conceptually, a distributed-control evaluator is made up of instances of the
machines that apply at each node of the tree. At a given stage of evaluation, a
single machine of the distributed-control evaluator is active. Initially, the active
machine is the machine that applies at the root of the tree. During the course of
evaluation, control passes from machine to machine; an important feature of the
evaluation pattern of a distributed-control evaluator is that such transfers of con-
trol are exclusively transfers from neighbor to neighbor. Evaluation continues
until all machines reach their final state, at which point the tree is fully attri-
buted.
During evaluation, the distributed-control evaluation technique is time-
efficient because, instead of determining the evaluation order at run-time, the
evaluator uses an evaluation sequence that has been pre-computed by statically
analyzing the grammar at construction-time. An additional advantage is that to
keep track of the run-time situation at a production instance t, it is only neces-
sary to know the state of the machine that applies at t. The distributed-control
evaluation technique is space-efficient because at run-time it is not necessary to
provide an actual copy of the machine for each production instance. The label
on each node n that indicates what production applies at n, denoted by
n .ruleIndicator, completely determines which machine description applies.
The use of the term "distributed-control evaluator" is not meant to convey the
impression that the attributes of a tree are evaluated in parallel by separate pro-
cessors; on the contrary, a tree's attributes are evaluated by a single procedure
that interprets descriptions of the machines that make up the distributed-control
evaluator for the tree.
Four varieties of distributed-control evaluators have been discussed in the
literature: tree-walk evaluators [Warren75, Kennedy76], coroutine evaluators
[Warren76], local-control automata [Cohen79], and visit-sequence evaluators
[Kastens80]. The feature that distinguishes the visit-sequence evaluator from
the others is its simplicity. The transition diagram of each machine of a visit-
sequence evaluator is a single sequence of instructions, with no branching. For
the other three kinds of distributed-control evaluators, a machine's transition
diagram is, in general, a directed acyclic graph.
actions that affect x directly.) Thus, the attributes of x get evaluated in alternat-
ing groups of inherited attributes and synthesized attributes.
The interaction pattern described above holds for any pair of adjacent
machines. Therefore, in deriving a grammar's plans, we must take into account
the interactions between every possible pair of adjacent machines. For example,
different instances of the production p: X → α may be adjacent to production
q: Y → β X γ and to production r: Z → W X i. Moreover, the plan for p cannot
vary depending on whether it is adjacent to q or to r; in the plan for p , the inter-
face through X must be compatible with both.
The interface between any two plans must reflect the dependences that hold
among X's attributes. Visit-sequence evaluators apply to the class of partitionable
attribute grammars,¹ which are characterized by the following property of
the dependences among the nonterminals' attributes:
... for each symbol a partial order over the associated attributes can be
given, such that in any context of the symbol the attributes are evaluable
in an order which includes that partial order.
[Kastens80]
The ordered attribute grammars are a subclass of the partitionable grammars for
which there is a particularly efficient way to determine the plans for a visit-
sequence evaluator.
The adjective "partitionable" refers to the use that is made of the above-
mentioned partial order in the construction of the evaluator's plans. The partial
order is used to partition each nonterminal's attribute set into a sequence of dis-
joint subsets that alternate between one that consists entirely of inherited attri-
butes and one that consists entirely of synthesized attributes. In addition, the
partitions respect the evaluation requirements of each nonterminal's attributes;
that is, the partitions A_k(X) associated with nonterminal X's attributes are
defined so that for each tree node x, if x.a ∈ A_i(X), x.b ∈ A_j(X), and i > j, then
x.a can be evaluated before x.b.
These partitions define the interface between adjacent machines. The plan for
each production is created using the production's local attribute dependences
and the partitions for each of the production's nonterminals. For ordered attri-
bute grammars, this construction is described in more detail in Section 12.4,
"Construction of a Visit-Sequence Evaluator." The reader may also wish to
consult [Kastens80] or [Waite83].
¹The term is used in [Kastens82] and [Waite83]; in [Kastens80], the class is called the arranged
orderly grammars.
procedure VSEvaluate(T)
declare
    T: an unattributed derivation tree
    currentNode: the node of T whose machine is currently active
    planIndex, r, i: integers
    a: an attribute name
    currentNode_i: the i-th node of the production instance at currentNode
begin
    currentNode := root(T)
    planIndex := 1
    forever do
        if Plan[currentNode.ruleIndicator][planIndex] has the form EVAL(i, a) then
            Execute the attribute-definition function to evaluate currentNode_i.a
            planIndex := planIndex + 1
        else if Plan[currentNode.ruleIndicator][planIndex] has the form VISIT(i, r) then
            planIndex := MapVisitToPlanIndex(currentNode_i.ruleIndicator, 0, r)
            currentNode := currentNode_i
        else if Plan[currentNode.ruleIndicator][planIndex] has the form SUSPEND(r) then
            if currentNode = root(T) then return fi
            planIndex := MapVisitToPlanIndex(currentNode.parent.ruleIndicator,
                                             currentNode.sonNumber,
                                             r)
            currentNode := currentNode.parent
        fi
    od
end
In either case, after currentNode is assigned the node that is receiving control,
the new machine resumes execution at the instruction given by
Plan[currentNode.ruleIndicator].
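To make the control flow of VSEvaluate concrete, here is a small Python sketch of the interpreter loop together with hand-written tables for a toy grammar. It rests on several assumptions: the three productions (top, pair, leaf), their plans, and the `VISIT_INDEX` table were written by hand for this example rather than produced by the plan-construction algorithm, and plan indices are 0-based here whereas the figure counts from 1.

```python
class Node:
    def __init__(self, rule, children=()):
        self.rule = rule                  # plays the role of ruleIndicator
        self.children = list(children)
        self.parent, self.son_number = None, 0
        self.attrs = {}
        for k, child in enumerate(self.children, start=1):
            child.parent, child.son_number = self, k


def vs_evaluate(root, plan, visit_index, equations):
    current, pc = root, 0
    while True:
        instance = [current] + current.children   # instance[i] is currentNode_i
        op = plan[current.rule][pc]
        if op[0] == "EVAL":
            _, i, a = op
            instance[i].attrs[a] = equations[current.rule][(i, a)](instance)
            pc += 1
        elif op[0] == "VISIT":
            _, i, r = op
            child = instance[i]
            pc = visit_index[(child.rule, 0, r)]   # start of the child's r-th visit
            current = child
        else:                                      # SUSPEND(r)
            _, r = op
            if current is root:
                return
            parent = current.parent
            pc = visit_index[(parent.rule, current.son_number, r)]
            current = parent


# Toy grammar: depth is inherited, val is synthesized (sum of leaf depths).
PLAN = {
    "top":  [("EVAL", 1, "depth"), ("VISIT", 1, 1),
             ("EVAL", 0, "val"), ("SUSPEND", 1)],
    "pair": [("EVAL", 1, "depth"), ("VISIT", 1, 1),
             ("EVAL", 2, "depth"), ("VISIT", 2, 1),
             ("EVAL", 0, "val"), ("SUSPEND", 1)],
    "leaf": [("EVAL", 0, "val"), ("SUSPEND", 1)],
}
VISIT_INDEX = {
    ("top", 0, 1): 0, ("top", 1, 1): 2,
    ("pair", 0, 1): 0, ("pair", 1, 1): 2, ("pair", 2, 1): 4,
    ("leaf", 0, 1): 0,
}
EQUATIONS = {
    "top":  {(1, "depth"): lambda n: 0,
             (0, "val"):   lambda n: n[1].attrs["val"]},
    "pair": {(1, "depth"): lambda n: n[0].attrs["depth"] + 1,
             (2, "depth"): lambda n: n[0].attrs["depth"] + 1,
             (0, "val"):   lambda n: n[1].attrs["val"] + n[2].attrs["val"]},
    "leaf": {(0, "val"):   lambda n: n[0].attrs["depth"]},
}

tree = Node("top", [Node("pair", [Node("leaf"),
                                  Node("pair", [Node("leaf"), Node("leaf")])])])
vs_evaluate(tree, PLAN, VISIT_INDEX, EQUATIONS)
```

Note that the interpreter keeps only two pieces of state, the current node and a plan index; no per-node copy of a machine is needed, because the rule label on the node selects the plan.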
The detail that most complicates the plan-construction algorithm is that the
evaluation context of a given production may be different for different instances
of the production in an abstract-syntax tree. By a production instance's evalua-
tion context, we mean the transitive dependences that exist among the produc-
tion instance's attribute instances due to chains of dependences that run
throughout the tree.
Kastens's construction side-steps the problem of a production having dif-
ferent evaluation contexts by making what is essentially a worst-case assump-
tion. The dependences that exist among a nonterminal's attributes must obey
the following restriction: for each nonterminal X, there must exist an acyclic
relation among the attributes of X that covers the actual dependence relation in
every tree. The attributes of each of the grammar's nonterminals are required to
have such a covering dependence relation. We use the notation TDS (X) (stand-
ing for "Transitive Dependences among a Symbol's attributes") to denote the
covering dependence relation for nonterminal X.
Using covering dependence relations in place of a tree's actual dependence
relations is a key part of the construction. The advantage of using the TDS rela-
tions is that they are a precomputed approximation to the actual dependence
relations. By using TDS(X) in place of the actual dependence relation that
holds at X, we are being pessimistic because the edges that occur in TDS(X)
may correspond to combinations of dependences that never actually appear
together in any one tree. In fact, an individual edge in TDS(X) may represent a
dependence that never actually exists in any tree. However, as long as this does
not lead to the introduction of any cycles, using an approximate dependence
relation in place of the actual dependence relation merely places additional
is cyclic. The restrictions of the algorithm are not usually a handicap because
most grammars that arise in practice meet the requirements. Those grammars
that fail the condition can usually be adjusted to meet the requirements (see Sec-
tion 12.7).
In Steps 1 and 2 of Kastens's construction of visit-sequences for ordered attri-
bute grammars, the goal is to determine the transitive dependences among a
nonterminal's attributes. This information is recorded in the relation TDS (X).
Initially, all the TDS relations are empty. The construction that builds them up
involves the auxiliary relation TDP(p) (standing for "Transitive Dependences
in a Production"), which expresses dependences among the attributes of a
production's nonterminal occurrences.
The basic operation used in Steps 1 and 2 is procedure
AddEdgeAndInduce(TDP(p), (a, b)), whose arguments are the TDP graph of
some production p and a pair of attribute occurrences in p. AddEdgeAndInduce
carries out three actions:
1) Edge (a, b) is inserted into graph TDP(p).
2) Any additional edges needed to transitively close TDP(p) are inserted into
TDP(p).
3) In addition, for each edge added to TDP(p) by either action 1 or 2 (i.e.,
either the edge (a, b) itself or some other edge (c, d) added to reclose
TDP(p)), AddEdgeAndInduce may add an edge to one of the TDS graphs.
In particular, for each edge added to TDP(p) of the form (Xi.m, Xi.n),
where Xi is an occurrence of nonterminal X and (X.m, X.n) ∉ TDS(X), an
edge (X.m, X.n) is added to TDS(X).
Each edge of a TDS(X) graph can be marked or unmarked; the edges that
AddEdgeAndInduce adds to the TDS(X) graphs are unmarked.
The TDS graphs are generated by the first two steps of Kastens's algorithm.
The first step, an initialization step, is carried out by procedure Step1, given in
Figure 12.3. Step1 initializes the grammar's TDP graphs with edges representing
all direct dependences that exist between the grammar's attribute
occurrences. Because the edges of the TDP graphs are entered by calls to
AddEdgeAndInduce, Step 1 also serves to initialize the TDS graphs. Through
the side effects of the calls on AddEdgeAndInduce, Step 1 creates unmarked
edges in the TDS graphs due to all direct dependences occurring in the grammar.
procedure Step1
declare
    p: a production
    Xi, Xj: nonterminal occurrences
    a, b: attributes
begin
    for each production p do
        for each attribute occurrence Xj.b of p do
            for each argument Xi.a of Xj.b do
                if (Xi.a, Xj.b) ∉ TDP(p) then
                    AddEdgeAndInduce(TDP(p), (Xi.a, Xj.b)) fi
            od
        od
    od
end
Procedure Step2, given in Figure 12.4, determines a set of induced transitive
dependences by performing a closure operation on the TDP and TDS relations.
In Step2, the invariant for the while-loop is
procedure Step2
declare
    X: a nonterminal
    Xi: a nonterminal occurrence
    p: a production
begin
    while there is an unmarked edge (X.a, X.b) in one of the TDS relations do
        mark (X.a, X.b)
        for each occurrence Xi of X in any production p do
            if (Xi.a, Xi.b) ∉ TDP(p) then AddEdgeAndInduce(TDP(p), (Xi.a, Xi.b)) fi
        od
    od
end
When all edges in TDS are marked, the effects of all direct dependences have
been induced in the TDP and TDS relations. Thus, the TDS (X) graphs com-
puted by Step2 are guaranteed to cover the actual transitive dependences among
the attributes of X that exist at any occurrence of X in any derivation tree.
If any of the TDP relations is circular after Step1 or Step2, then the construction
halts with failure. Failure after Step1 indicates a circularity in the equations
of an individual production; failure after Step2 can indicate that the grammar is
circular, but Step2 can also fail for some noncircular grammars.
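Steps 1 and 2 can be exercised end to end with the Python sketch below. It is a minimal model under assumptions of our own: an attribute occurrence is written as a pair (i, name), where i is the position of the nonterminal occurrence in the production; each TDP graph is a set of edges kept transitively closed; each TDS graph is a dictionary from edges to a marked flag; and circularity shows up as a self-loop in a closed TDP graph. The two small grammars are hypothetical.

```python
def add_edge_and_induce(p, edge, tdp, tds, symbols):
    a, b = edge
    graph = tdp[p]
    # Actions 1 and 2: the graph is already closed, so it suffices to connect
    # every vertex that reaches a with every vertex reachable from b.
    sources = {a} | {x for (x, y) in graph if y == a}
    targets = {b} | {y for (x, y) in graph if x == b}
    added = {(x, y) for x in sources for y in targets} - graph
    graph |= added
    # Action 3: an edge within one occurrence induces an unmarked TDS edge.
    for (i, m), (j, n) in added:
        if i == j:
            tds[symbols[p][i]].setdefault((m, n), False)


def build_tds(symbols, direct):
    tdp = {p: set() for p in symbols}
    tds = {X: {} for syms in symbols.values() for X in syms}
    for p, pairs in direct.items():                       # Step1
        for arg, target in pairs:
            if (arg, target) not in tdp[p]:
                add_edge_and_induce(p, (arg, target), tdp, tds, symbols)
    while True:                                           # Step2
        unmarked = [(X, e) for X, edges in tds.items()
                    for e, marked in edges.items() if not marked]
        if not unmarked:
            return tdp, tds
        X, (a, b) = unmarked[0]
        tds[X][(a, b)] = True                             # mark the edge
        for p, syms in symbols.items():
            for i, sym in enumerate(syms):
                if sym == X and ((i, a), (i, b)) not in tdp[p]:
                    add_edge_and_induce(p, ((i, a), (i, b)), tdp, tds, symbols)


def is_circular(tdp):
    return any(x == y for graph in tdp.values() for (x, y) in graph)


# Hypothetical grammar: depth is inherited, val is synthesized.
symbols = {"top": ["S", "T"], "pair": ["T", "T", "T"], "leaf": ["T"]}
direct = {   # (argument occurrence, defined occurrence)
    "top":  [((1, "val"), (0, "val"))],
    "pair": [((0, "depth"), (1, "depth")), ((0, "depth"), (2, "depth")),
             ((1, "val"), (0, "val")), ((2, "val"), (0, "val"))],
    "leaf": [((0, "depth"), (0, "val"))],
}
tdp, tds = build_tds(symbols, direct)

# A production whose two equations depend on each other.
bad_tdp, _ = build_tds({"bad": ["X"]},
                       {"bad": [((0, "b"), (0, "a")), ((0, "a"), (0, "b"))]})
```

In the first grammar the only direct dependence within a single occurrence is in leaf, yet after Step2 the induced edge from depth to val appears at every occurrence of T, for example between T1.depth and S.val in top.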
procedure Step3
declare
    G: a directed graph
    SynWorklist, InhWorklist: sets of attributes
    k: an integer
    b, c: attributes
begin
    for each nonterminal X do
        G := TDS(X)
        SynWorklist := { b ∈ synthesized attributes of X | out-degree_G(b) = 0 }
        InhWorklist := { b ∈ inherited attributes of X | out-degree_G(b) = 0 }
        k := 1
        while SynWorklist ≠ ∅ ∨ InhWorklist ≠ ∅ do
            if k is odd then
                while SynWorklist ≠ ∅ do
                    Select and remove an attribute b from SynWorklist
                    Partition[X.b] := k
                    for each c that is a predecessor of b in G do
                        Remove edge (c, b) from G
                        if out-degree_G(c) = 0 then
                            if c is a synthesized attribute of X then Insert c into SynWorklist
                            else Insert c into InhWorklist
                            fi
                        fi
                    od
                od
            else
                while InhWorklist ≠ ∅ do
                    Select and remove an attribute b from InhWorklist
                    Partition[X.b] := k
                    for each c that is a predecessor of b in G do
                        Remove edge (c, b) from G
                        if out-degree_G(c) = 0 then
                            if c is a synthesized attribute of X then Insert c into SynWorklist
                            else Insert c into InhWorklist
                            fi
                        fi
                    od
                od
            fi
            k := k + 1
        od
        numberOfVisitsNeeded(X) := max(1, k/2)
    od
end
Figure 12.5. Computation of disjoint partitions for the attributes of each nonterminal X
according to the dependences in TDS (X).
Partition[X.b] := k
The two worklists contain the sets of attributes that are ready to be assigned to
partitions k and k + 1. The parity of k determines whether the active worklist is
SynWorklist or InhWorklist.
G changes as attributes are considered to reflect only the dependences among
attributes that have not yet been assigned to a partition set. The out-degree of
each vertex reflects the number of unassigned attributes that depend on the vertex.
An edge (b, c) in G indicates that attributes b and c are unassigned and
that c depends on b. This means that after consideration of attribute c, all its
incoming edges (b, c) are deleted from G, and if this action causes b to have
out-degree 0, b is inserted in the appropriate worklist because all successors
have already been assigned to a partition set.
Step3 also computes the quantity numberOfVisitsNeeded(X) for each nonter-
minal X. This indicates the number of VISIT/SUSPEND transfer pairs that
must take place across each occurrence of nonterminal X. Note that to ensure
that the visit-sequence evaluator has a chance to evaluate every attribute of the
tree, at a minimum one pair of transfers must take place across each nonterminal
occurrence so that every machine of the tree gets activated at least once.
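The partitioning performed by Step3 for a single nonterminal can be sketched in Python as follows. The function name, the edge-set representation of TDS(X), and the reading of k/2 as integer division are assumptions made for this illustration; the example attributes are hypothetical.

```python
def partition_attributes(tds_edges, synthesized, inherited):
    """tds_edges: set of (b, c) pairs meaning that c depends on b.
    Returns (Partition, numberOfVisitsNeeded) for one nonterminal."""
    g = set(tds_edges)

    def out_degree(v):
        return sum(1 for (x, _) in g if x == v)

    syn_work = {b for b in synthesized if out_degree(b) == 0}
    inh_work = {b for b in inherited if out_degree(b) == 0}
    partition = {}
    k = 1
    while syn_work or inh_work:
        active = syn_work if k % 2 == 1 else inh_work   # parity picks the worklist
        while active:
            b = active.pop()
            partition[b] = k
            for c in [x for (x, y) in g if y == b]:     # predecessors of b
                g.discard((c, b))
                if out_degree(c) == 0:
                    (syn_work if c in synthesized else inh_work).add(c)
        k += 1
    return partition, max(1, k // 2)


# Hypothetical nonterminal whose attributes form the chain i1 -> s1 -> i2 -> s2.
chain = {("i1", "s1"), ("s1", "i2"), ("i2", "s2")}
result = partition_attributes(chain, {"s1", "s2"}, {"i1", "i2"})
empty = partition_attributes(set(), set(), set())
```

For the chain, the attribute that nothing depends on (s2) lands in partition 1 and the attribute needed first (i1) lands in partition 4, so higher-numbered partitions are evaluated earlier and two visits are needed. A nonterminal with no attributes still receives one visit.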
The final two steps of Kastens's construction convert the partition information
into evaluation plans. The step that actually emits the plans (the fifth and final
step) is carried out by what is essentially a topological sort of each of the
grammar's TDP graphs. One problem that arises is that when a graph is topo-
logically sorted there is considerable choice at each step as to what should be
considered next; if we were to sort the TDP graphs computed by Step 2, there is
no guarantee that compatible plans would be created for machines that can be
adjacent in a derivation tree.
The purpose of the fourth step of Kastens's construction is to ensure that the
plans that are created for machines that can be adjacent in a tree are compatible
with the partitions found in Step 3. The role of Step 4 can be stated succinctly
as follows:
When we choose a partition, this choice fixes the order in which certain
attributes may be computed. In this respect the partition acts like a set of
dependencies, and its effect may be taken into account by adding these
dependencies to the ones arising from the attribution rules.
[Waite83]
Thus, the fourth step of Kastens's construction adds additional edges between
attribute occurrences in the grammar's TDP graphs that are in different parti-
tions. This step ensures that compatible plans are generated for machines that
can be adjacent in some derivation tree. This step is carried out by procedure
Step4, given in Figure 12.6.
If all the TDP (P) graphs are acyclic after Step 4, then the grammar is
ordered. One of the problems with Kastens's construction is that Step 4 may
introduce cycles into the TDP graphs. If any cycles are introduced by Step 4,
the construction halts with failure. (This failure is reported by the SSL transla-
tor as a type 3 circularity.) For a discussion of what can be done when this kind
of failure occurs, see Section 12.7, "What to Do When a Grammar Fails the
Orderedness Test."
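The following Python sketch shows how the edges added by Step 4 can introduce a cycle into a TDP graph that was acyclic beforehand. Everything in it is an assumption made for illustration: the production p: A → B B, its two equations (B2.i defined from B1.s and B1.i defined from B2.s), and the partition numbers, which are what one would expect when TDS(B) is empty. The cycle test is an ordinary worklist-based topological sort, not the SSL translator's code.

```python
def step4(tdp, symbols, partition):
    """Add an edge (Xi.a, Xi.b) whenever Partition[X.a] > Partition[X.b]."""
    for p, syms in symbols.items():
        for i, X in enumerate(syms):
            attrs = [a for (Y, a) in partition if Y == X]
            for a in attrs:
                for b in attrs:
                    if partition[(X, a)] > partition[(X, b)]:
                        tdp[p].add(((i, a), (i, b)))


def has_cycle(edges):
    """Topologically sort the graph; a leftover vertex means a cycle."""
    nodes = {v for e in edges for v in e}
    in_degree = {v: 0 for v in nodes}
    for _, y in edges:
        in_degree[y] += 1
    ready = [v for v in nodes if in_degree[v] == 0]
    seen = 0
    while ready:
        v = ready.pop()
        seen += 1
        for (x, y) in edges:
            if x == v:
                in_degree[y] -= 1
                if in_degree[y] == 0:
                    ready.append(y)
    return seen != len(nodes)


symbols = {"p": ["A", "B", "B"]}
tdp = {"p": {((1, "s"), (2, "i")),      # B2.i is defined from B1.s
             ((2, "s"), (1, "i"))}}     # B1.i is defined from B2.s
partition = {("B", "s"): 1, ("B", "i"): 2}   # assumed Step3 result

cyclic_before = has_cycle(tdp["p"])
step4(tdp, symbols, partition)
cyclic_after = has_cycle(tdp["p"])
```

The production by itself is evaluable (for instance in the order B1.s, B2.i, B2.s, B1.i), but once every B is required to receive i before it delivers s, the four attribute occurrences form a cycle.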
The final step of Kastens's construction makes use of the TDP graphs (as
amended by Step 4) to create the actual plan for each production. In this step, a
topological sort of each TDP graph is carried out. At each stage of the topologi-
cal sort of TDP (P), a certain set S of the vertices of TDP (P) have been con-
sidered. Such a state corresponds to a (run-time) evaluation state at a node
where p applies in which the elements of S are available. The plan-creation
step is carried out by procedure Step5, of Figure 12.7.
procedure Step4
declare
    Xi: a nonterminal occurrence
    p: a production
    a, b: attributes
begin
    for each production p do
        for each nonterminal occurrence Xi in p do
            for each pair <Xi.a, Xi.b> of Xi's attribute occurrences do
                if Partition[Xi.a] > Partition[Xi.b] then
                    Insert edge (Xi.a, Xi.b) into TDP(p) fi
            od
        od
    od
end
Figure 12.6. Completion of TDP graphs with edges from higher-numbered partition-set
elements to lower-numbered elements.
procedure Step5
declare
    p: a production
    i, j, k: integers
    Xi, Xj: nonterminal occurrences
    a: an attribute
    numberOfVisitsScheduled(Xi): an integer associated with nonterminal occurrence Xi
    workList, S: sets of vertices of TDP(p)
    v, w: individual vertices of TDP(p)
    V_{k,j}: a condensation vertex of TDP(p)
begin
    for each production p: X0 → X1 ... Xk do
[1]     for j := 0 to k do
            for each partition set S of attributes of Xj that are input attributes of p do
                Condense TDP(p) with respect to S, replacing S with a vertex V_{Partition[S],j}
            od
            numberOfVisitsScheduled(Xj) := 0
        od
        maxPartition := the maximum value of Partition for any partition set of X0
[2]     if maxPartition is even then
            Remove vertex V_{maxPartition,0} from TDP(p) together with all edges emanating from it
        fi
        MapVisitToPlanIndex[p, 0, 1] := 1
        workList := the set of TDP(p)'s vertices with in-degree 0
        while workList ≠ ∅ do
            Select and remove a vertex v from workList
            if v is a condensation vertex of the form V_{k,0} then
                numberOfVisitsScheduled(X0) := numberOfVisitsScheduled(X0) + 1
                Append SUSPEND(numberOfVisitsScheduled(X0)) to Plan[p]
                MapVisitToPlanIndex[p, 0, numberOfVisitsScheduled(X0) + 1] := length(Plan[p]) + 1
            else if v is a condensation vertex of the form V_{k,i}, where i > 0 then
                numberOfVisitsScheduled(Xi) := numberOfVisitsScheduled(Xi) + 1
                Append VISIT(i, numberOfVisitsScheduled(Xi)) to Plan[p]
                MapVisitToPlanIndex[p, i, numberOfVisitsScheduled(Xi)] := length(Plan[p]) + 1
            else /* v corresponds to an output attribute occurrence Xi.a */
                Append EVAL(i, a) to Plan[p]
            fi
            for each vertex w that is a successor of v in TDP(p) do
                Remove edge (v, w) from TDP(p)
                if in-degree(w) = 0 then Insert w into workList fi
            od
        od
[3]     for i := 1 to k do
            if numberOfVisitsNeeded(Xi) ≠ numberOfVisitsScheduled(Xi) then
                numberOfVisitsScheduled(Xi) := numberOfVisitsScheduled(Xi) + 1
                Append VISIT(i, numberOfVisitsScheduled(Xi)) to Plan[p]
                MapVisitToPlanIndex[p, i, numberOfVisitsScheduled(Xi)] := length(Plan[p]) + 1
            fi
        od
[4]     Append SUSPEND(numberOfVisitsScheduled(X0) + 1) to Plan[p]
    od
end
Figure 12.7. The algorithm to create tables Plan and MapVisitToPlanIndex from the
grammar's TDP graphs.
mining the cost, we neglect the true cost of evaluating attribute-definition func-
tions and count them as unit steps when, in fact, they may be a great deal more
expensive. Our analysis of the algorithm's cost is based on the number of attri-
bute reevaluations performed, plus any additional bookkeeping costs associated
with determining the reevaluation order.
In this section, we describe how the visit-sequence evaluation method can be
used for incremental attribute evaluation. We demonstrate how the tables of a
visit-sequence evaluator can be used to determine an optimal reevaluation order
for a change-propagation algorithm. The algorithm described updates a tree in
O(|AFFECTED|) steps after a subtree replacement and is thus asymptotically
optimal.
One way for an incremental attribute evaluator to achieve optimal behavior is
based on the following observations:
If, in the course of propagating new values, an attribute is ever (tem-
porarily) reassigned a value other than its correct final value, spurious
changes are apt to propagate arbitrarily far beyond the boundaries of AF-
FECTED, leading to suboptimal running time. To avoid this possibility,
a change propagator should schedule attribute reevaluations such that
any new value computed is necessarily the correct final value. That is,
an attribute should not be reevaluated until all of its arguments are
known to have their correct final values.
[Reps83]
This last observation describes a sufficient condition on the order in which attri-
butes are reevaluated that will ensure asymptotically optimal behavior of a
change-propagation algorithm.
A reevaluation order that never assigns an attribute a value that is not its
correct final value can only be determined by taking into account both direct and
transitive dependences among attributes. In the optimal updating algorithm for
arbitrary noncircular grammars, called PROPAGATE in [Reps82], [Reps83],
and Chapter 5 of [Reps84], such dependences are represented explicitly by
edges in a scheduling graph. PROPAGATE can be understood as a generaliza-
tion of Knuth's topological-sorting algorithm. As in the topological-sorting
algorithm, PROPAGATE keeps a work-list of attributes that are ready for
reevaluation. (In topological sorting, the work-list consists of nodes ready for
enumeration.) An attribute is placed on the work-list when its in-degree is
reduced to zero in a scheduling graph whose edges reflect dependences among
attributes that have not yet been reevaluated (enumerated).
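The work-list discipline that PROPAGATE generalizes is that of ordinary topological sorting, which can be sketched in a few lines of Python. This is the plain sorting algorithm over a graph that is known in advance, not PROPAGATE itself; the attribute names in the example are hypothetical.

```python
from collections import deque


def topological_order(vertices, edges):
    """edges: (u, v) pairs meaning that v depends on u."""
    in_degree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in edges:
        in_degree[v] += 1
        successors[u].append(v)
    # The work-list holds the vertices that are ready for enumeration.
    work_list = deque(v for v in vertices if in_degree[v] == 0)
    order = []
    while work_list:
        v = work_list.popleft()
        order.append(v)
        for w in successors[v]:
            in_degree[w] -= 1          # the edge (v, w) has been accounted for
            if in_degree[w] == 0:
                work_list.append(w)
    if len(order) != len(in_degree):
        raise ValueError("dependence graph is circular")
    return order


order = topological_order(
    ["a.env", "b.env", "b.val", "a.val"],
    [("a.env", "b.env"), ("b.env", "b.val"), ("b.val", "a.val")],
)
```

A vertex enters the work-list only when all of its predecessors have been enumerated, which is the same condition under which an attribute may safely be reevaluated.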
Whereas in topological sorting the vertices of the scheduling graph are known
a priori, in PROPAGATE the set of vertices of the scheduling graph is gen-
erated dynamically, at the same time as its vertices are being enumerated. Some
12.5. Incremental Updating by Visit-Sequence-Driven Change Propagation 267
These ideas are incorporated into the statement of VSPropagate given in Figure
12.8.
Updating begins at node nodeOfSubtreeReplacement.parent - the parent of
the node at which the subtree replacement was performed. To carry out
reevaluations in an order that respects the visit-sequence evaluation order of T,
the machine that applies at nodeOfSubtreeReplacement.parent must be reac-
tivated in its initial state. As updating progresses, VISIT and SUSPEND
instructions will call for control to be transferred to neighboring machines.
Whereas during non-incremental visit-sequence evaluation, a VISIT(i, r)
instruction always causes control to be transferred to the ith child and a
SUSPEND instruction always causes control to be transferred to the parent, dur-
ing change propagation the goal is to skip over as many unnecessary VISITs and
SUSPENDs as possible. In particular, transfers of control are ignored in the fol-
lowing two situations:
• When the machine at node x reaches a VISIT(i, r) instruction, if the ith
child of x is not a member of Reactivated, x retains control and proceeds to
the next instruction in its plan.
• When the machine at x reaches a SUSPEND instruction, if the parent of x is
not a member of Reactivated, x retains control and continues executing the
plan for x.
Ignoring such instructions allows the algorithm to skip over, in unit-time, arbi-
trarily large sections of the tree in which attribute values cannot possibly
change. When a machine does transfer control to a neighboring machine, Map-
VisitToPlanIndex is used to establish the new machine's execution state whether
or not evaluation activity at the node has previously been suppressed. Thus,
when a machine receives control it performs the rest of its instructions in the
same order as it would during non-incremental evaluation. Because the visit-
sequence evaluation order is a total order on the attributes of T that respects the
partial order given by D (T), omitted VISIT and SUSPEND instructions preced-
ing a subsequent "reactivation" are irrelevant.
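The two situations in which a transfer of control is ignored can be expressed as a small decision function. The Python sketch below covers only VISIT and SUSPEND instructions; the names `next_state`, `visit_index`, and `plan_length`, the 0-based plan indices, and the use of `None` to signal that propagation is finished are assumptions made for the illustration.

```python
class Node:
    def __init__(self, rule, children=()):
        self.rule, self.children = rule, list(children)
        self.parent, self.son_number = None, 0
        for k, child in enumerate(self.children, start=1):
            child.parent, child.son_number = self, k


def next_state(op, current, pc, root, reactivated, visit_index, plan_length):
    """Return the (node, plan index) that follows op, or None at termination."""
    if op[0] == "VISIT":
        _, i, r = op
        child = current.children[i - 1]
        if child in reactivated:
            return child, visit_index[(child.rule, 0, r)]
        return current, pc + 1                 # skip the visit; keep control
    if op[0] == "SUSPEND":
        _, r = op
        if current is root:
            return None
        parent = current.parent
        if parent in reactivated:
            return parent, visit_index[(parent.rule, current.son_number, r)]
        if pc == plan_length[current.rule] - 1:
            return None                        # last instruction of the plan
        return current, pc + 1                 # skip the suspend; keep control
    raise ValueError("EVAL instructions do not transfer control")


leaf = Node("leaf")
top = Node("top", [leaf])
visit_index = {("leaf", 0, 1): 0, ("top", 1, 1): 2}
plan_length = {"top": 4, "leaf": 2}

skipped = next_state(("VISIT", 1, 1), top, 1, top, set(), visit_index, plan_length)
taken = next_state(("VISIT", 1, 1), top, 1, top, {leaf}, visit_index, plan_length)
```

Skipping a VISIT costs one step no matter how large the subtree below the child is, which is where the unit-time claim comes from.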
When the machine that applies at node x executes an EVAL instruction that
changes the value of attribute b, all attributes that use b as an argument may
become inconsistent. These attributes belong to a production instance p' that is
one of the neighbors of the production instance at x. If b is a synthesized attri-
bute of x , p' is the production instance that derives x; if b is an inherited attri-
bute of a child of x , p' is the production instance derived from the child. Attri-
butes in p' will get reevaluated only if p' is a member of Reactivated; thus,
because p' may contain inconsistent attributes, p' is made a member of
            if oldValue ≠ the value of currentNode_i.a then
                if i ≠ 0 then neighborNode := currentNode_i
                else if currentNode ≠ root(T) then neighborNode := currentNode.parent
                else neighborNode := null
                fi
                if neighborNode ≠ null and currentNode_i.a has successors in the
                        production that applies at neighborNode then
                    Insert neighborNode into Reactivated
                fi
            fi
            planIndex := planIndex + 1
        else if Plan[currentNode.ruleIndicator][planIndex] has the form VISIT(i, r) then
            if currentNode_i ∈ Reactivated then
                planIndex := MapVisitToPlanIndex(currentNode_i.ruleIndicator, 0, r)
                currentNode := currentNode_i
            else planIndex := planIndex + 1
            fi
        else if Plan[currentNode.ruleIndicator][planIndex] has the form SUSPEND(r) then
            if currentNode = root(T) then return fi
            if currentNode.parent ∈ Reactivated then
                planIndex := MapVisitToPlanIndex(currentNode.parent.ruleIndicator,
                                                 currentNode.sonNumber, r)
                currentNode := currentNode.parent
            else if Plan[currentNode.ruleIndicator][planIndex] is the last instruction of
                    Plan[currentNode.ruleIndicator] then return
            else planIndex := planIndex + 1
            fi
        fi
    od
end
function's argument has changed value, then the new value computed is
guaranteed to be different from the old value; thus, it is wasteful to perform a
test to see if the attribute has changed value. When the incremental attribute-
evaluation algorithm reevaluates an attribute defined by an identity function, it
should be possible to save the overhead of comparing the attribute's new value
with its old value.
This is an important optimization, because in practice a large proportion of
attribute-definition functions are identity functions [Wilner71]. Moreover, the
principle holds for attributes defined by any one-to-one function, not just the
identity function. Consequently, the SSL translator tests the specification's attri-
bute equations to try to discover which attributes are defined by one-to-one
functions.
If the first attribute instance in a chain of one-to-one functions changes value,
then the rest of the elements in the chain must also change value. It is important
to note, however, that this is only true when all the old values in the chain are
consistent (i.e. identical). Change propagation can terminate without reaching
the end of the chain if, initially, some of the old values are inconsistent with
their predecessor's value.
However, the only possible inconsistent instances of attributes defined by
one-to-one functions are attributes of nonterminals in the region of the tree that
was modified. In the case of a subtree-replacement operation, these inconsisten-
cies are confined to the attributes of two production instances - the one derived
from the node of the subtree replacement and the one that derives the node of
the subtree replacement. Thus, it is necessary to treat the attributes of nontermi-
nals in the initially inconsistent region that are defined by copy rules as if they
were not part of any chain; that is, for such an attribute it is still necessary to test
its new and old values for equality whenever the attribute's value is recomputed.
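The interplay between suppressed equality tests and the initially inconsistent region can be illustrated with a small sketch. This is not the book's algorithm, only a toy model of a single chain of copy rules; the names propagate_chain and must_test are hypothetical, and must_test stands for the positions in the chain that lie in the initially inconsistent region and therefore still need an explicit comparison.

```python
def propagate_chain(old_values, new_head, must_test):
    """Propagate new_head along a chain of copy rules (toy model).

    old_values[i] is the old value of the i-th attribute in the chain.
    must_test holds the indices that lie in the initially inconsistent
    region, where the new and old values must still be compared.
    Returns (values after propagation, number of equality tests).
    """
    values = list(old_values)
    tests = 0
    incoming = new_head  # identity functions pass the value along unchanged
    for i in range(len(values)):
        if i in must_test:
            tests += 1
            if values[i] == incoming:
                break  # value did not change: propagation stops here
        # Outside the inconsistent region a change is guaranteed,
        # so the value is stored without any comparison.
        values[i] = incoming
    return values, tests


# Consistent old values: only the head is tested, the rest must change.
print(propagate_chain([1, 1, 1, 1], 2, {0}))
# Inconsistent old values: propagation stops before the end of the chain.
print(propagate_chain([1, 2, 2, 2], 2, {0, 1}))
```

In the second call the attribute at position 1 already holds the new value, so the comparison there ends propagation early; without that test the remaining attributes would be rewritten needlessly.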
To carry out the optimization of suppressing equality tests for attributes
defined by one-to-one functions, the change-propagation method must have one
additional property. It must be the case that the only attributes that are
reevaluated are ones whose arguments have changed value. (If this property
holds, then the information that an attribute is defined by a one-to-one function
is sufficient to conclude that it changed value when evaluated.) This property
does not hold for the version of VSPropagate presented in Figure 12.8: once a
machine is reactivated in some state, it may actually carry out many "needless
evaluations" because all the machine's subsequent EVAL instructions are car-
ried out, regardless of whether any of the attribute's arguments changed value.
To achieve the property that the only attributes reevaluated are ones whose
arguments change value, we introduce an additional set, named needTo-
BeEvaluated, used as follows:
12.7. What to Do When a Grammar Fails the Orderedness Test
A type 3 circularity indicates that the grammar is definitely not circular but
that Kastens's method for constructing a visit-sequence evaluator has failed,
because the method uses a representation that only approximates the actual
dependences among the grammar's attributes. A common way a type 3 circularity
arises is when a grammar has two disjoint threadings of attribute dependences
through the same productions, one threaded left-to-right and the other threaded
right-to-left. Such a grammar is noncircular but has a type 3 circularity.
Example. The grammar given in Figure 12.10 illustrates the type 3 circularity
that arises in a specification that has both a left-to-right threading and a right-
to-left threading. When sgen processes this specification, it produces the error
message:
root Root;
Root : Top (X);
X : Null ()
  | Pair (X X)
Figure 12.10. A grammar with both a left-to-right threading and a right-to-left threading
of attribute dependences. This grammar has a type 3 circularity.
Consider the origin of the type 3 circularity in this specification: after Step 2 of
the Kastens construction, graph TDS(X) has vertex set {X.a, X.b, X.c, X.d} and
edge set {(X.a, X.b), (X.c, X.d)}. In Step 3 of the construction, which
partitions TDS(X), both of X's synthesized attributes, X.b and X.d, are placed in one
partition, and both inherited attributes, X.a and X.c, are placed in a second
partition, as follows:
Partition[X.a] := 2
Partition[X.b] := 1
Partition[X.c] := 2
Partition[X.d] := 1
2. In this error message, X<v_no=1> refers to X$2. The quantity "v_no" is the phylum-occurrence
number: the phylum-occurrence number of the left-hand-side phylum occurrence is 0; the
right-hand-side occurrences are numbered 1, 2, etc.
The partition of TDS (X) created by Step 3 then puts X.a and X.b in higher-
numbered partition sets than they were in previously:
Partition[X.a] := 4
Partition[X.b] := 3
Partition[X.c] := 2
Partition[X.d] := 1
In effect, the modified equation causes the evaluation of the right-to-left thread
to be delayed until after the left-to-right thread has been evaluated.
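Both sets of partition numbers can be reproduced with a short sketch of the partitioning step. It assumes the usual formulation of Kastens's Step 3: odd-numbered sets hold synthesized attributes, even-numbered sets hold inherited attributes, and an attribute joins the lowest-numbered set of its kind for which all of its successors in TDS(X) are already placed. The function name kastens_partition is hypothetical, and the extra edge (X.b, X.c) used in the second call is an assumption about the dependence that the modified equation induces; it is the simplest edge consistent with the numbers shown.

```python
def kastens_partition(inherited, synthesized, edges):
    """Assign partition numbers to attributes of one phylum (sketch).

    edges is a collection of (source, target) pairs of the TDS graph.
    Odd partitions receive synthesized attributes, even partitions
    receive inherited ones, working backward from the last set visited.
    """
    attrs = set(inherited) | set(synthesized)
    succ = {a: set() for a in attrs}
    for src, dst in edges:
        succ[src].add(dst)

    partition = {}
    k = 1
    empty_rounds = 0
    while len(partition) < len(attrs):
        kind = synthesized if k % 2 == 1 else inherited
        added = False
        changed = True
        while changed:  # fixpoint: successors may fall in the same set
            changed = False
            for a in sorted(kind):
                if a not in partition and all(s in partition for s in succ[a]):
                    partition[a] = k
                    changed = added = True
        empty_rounds = 0 if added else empty_rounds + 1
        if empty_rounds == 2:
            raise ValueError("TDS graph is cyclic")
        k += 1
    return partition


inh, syn = {"X.a", "X.c"}, {"X.b", "X.d"}
original = [("X.a", "X.b"), ("X.c", "X.d")]
print(kastens_partition(inh, syn, original))
print(kastens_partition(inh, syn, original + [("X.b", "X.c")]))
```

With the original edge set the sketch places X.b and X.d in partition 1 and X.a and X.c in partition 2; with the assumed extra edge it yields 4, 3, 2, 1 for X.a, X.b, X.c, X.d.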
There are several other ways to eliminate the type 3 circularity from the
example. For instance, there are two other ways to modify the grammar to force
the right-to-left thread to be delayed until after the left-to-right thread has been
evaluated. One way is to change the equation for X$2.c to
Yet another way to eliminate the type 3 circularity is to force the left-to-right
thread to be delayed until after the right-to-left thread has been evaluated. This
can be accomplished by using any of the following three attribute equations in
place of the corresponding equation in the original specification:
APPENDIX A
Syntax of SSL
In the following grammar, nonterminals are printed in Times italic and terminals
in Helvetica roman (e.g. specification and root). The alternatives of each
nonterminal symbol appear indented below it on consecutive lines. The alternatives
of binary-infix-operator and unary-prefix-operator are exceptions and appear on
the same line, separated by spaces. The characters ⊕, [, ], ]*, and ]+ have
been adopted as meta-symbols. Their meanings are:
α ⊕ β     α or β
[α]       optional occurrence of α
[α]*      zero or more occurrences of α
[α]+      one or more occurrences of α
[α]β*     [α [βα]*]
[α]β+     α [βα]*
specification
[declaration]+
declaration
root phylum ;
list [phylum],+ ;
optional [phylum],+ ;
[phylum],+ { [field]+ } ;
phylum-name : [alternative]|+ ;
phylum-name ::= [alternative]|+ ;
style [style-name],+ ;
store [store-declaration]+ ;
ext_computers [external-computer-name],+ ;
quantified-declaration
phylum : operator-name = base-type-name = ;
phylum [exported] function-name( [phylum parameter-name],* )
{ expression } ;
alternative
lexeme-name < [<state-name>] regular-expression [<state-name>] >
operator-name = base-type-name =
parameters
( [phylum]* )
token
character-constant
attribution
{ [local-field ⊕ attribute-equation ⊕ view-predicate]* }
local-field
[demand] [store-list] local phylum attribute-name ;
attribute-equation
output-attribute = expression ;
view-predicate
in [view-name]]; on expression;
output-attribute
phylum-occurrence .attribute-name
local-attribute-name
unparsing
[[view-name],* [@ ⊕ ^] [: ⊕ ::=]
[resting-place-denoter ⊕ unparsing-item]* ]
resting-place-denoter
@ ⊕ ^ ⊕ ..
unparsing-item
string-constant
phylum-occurrence
phylum-occurrence .attribute-name
local-attribute-name
[ [unparsing-item] * ]
field
[demand] [store-list] synthesized phylum attribute-name;
transform-clause
on string-constant pattern : expression
store-declaration
store-name phylum with (op-name, op-name, op-name, op-name)
store-list
store ( [store-name],+ )
expression
constant
variable
unary-prefix-operator expression
function-name( [expression],* )
operator-name( [expression],* )
expression (expression )
expression [expression ]
expression [expression :]
expression { }. attribute-name
pattern
constant
pattern-variable-name
default
operator-name( [pattern],* )
pattern-variable-name as pattern
constant
[phylum]
<phylum>
false ⊕ true
decimal-constant
octal-constant
real-constant
character-constant
string-constant
nil
nil_attr
variable
phylum-occurrence
phylum-occurrence .attribute-name
local-attribute-name
parameter-name
pattern-variable-name
pattern-variable-name .attribute-name
phylum-occurrence
$$
phylum
phylum $decimal-constant
phylum
phylum-name
phylum-name [ [phylum],+ ]
binary-infix-operator
# * / % + .. @ < <= > >= != & ^ | && ||
unary-prefix-operator
- ! ~ & * &&
regular-expression
character
string-constant
\ character
[ [character ⊕ character-character]+ ]
[^ [character ⊕ character-character]+ ]
^regular-expression
regular-expression$
regular-expression?
regular-expression*
regular-expression+
regular-expression regular-expression
regular-expression | regular-expression
(regular-expression)
regular-expression / regular-expression
decimal-constant
[1 ⊕ 2 ⊕ ... ⊕ 9] [digit]*
octal-constant
0 [digit]*
real-constant
[digit]+ . [digit]+ [e [+ ⊕ -] [digit]+]
character-constant
'character'
string-constant
"[character]*"
character
a ⊕ b ⊕ c ⊕ ...
\octal-constant
\n ⊕ \r ⊕ \b ⊕ \t
digit
0 ⊕ 1 ⊕ ... ⊕ 9
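The lexical rules for numeric constants can be read as ordinary regular expressions. The sketch below transcribes them into Python under two assumptions: the leading bracket of decimal-constant is read as a required nonzero digit, and the exponent of real-constant, including its sign, is optional. The names TOKEN_PATTERNS and classify_constant are hypothetical and are not part of SSL or sgen.

```python
import re

DIGIT = r"[0-9]"

# Patterns follow the grammar literally; note that the octal rule as
# written admits any digit after the leading 0.
TOKEN_PATTERNS = [
    ("real-constant", re.compile(rf"{DIGIT}+\.{DIGIT}+(e[+-]?{DIGIT}+)?")),
    ("decimal-constant", re.compile(rf"[1-9]{DIGIT}*")),
    ("octal-constant", re.compile(rf"0{DIGIT}*")),
]


def classify_constant(text):
    """Return the name of the numeric-constant rule matching text, or None."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None


for sample in ["42", "017", "0", "3.14", "3.14e-2", "1e5"]:
    print(sample, classify_constant(sample))
```

Under this reading a lone 0 is an octal constant, and 1e5 matches no rule because real-constant requires a decimal point.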
APPENDIX B
NAME
sgen - Synthesizer Generator
SYNOPSIS
sgen [ option] ... file ...
DESCRIPTION
Sgen is a system for generating a language-based editor from a language
specification. Editor specifications are written in the Synthesizer
Specification Language (SSL). Arguments to sgen whose names end
with .ssl are taken to be SSL source files. Arguments to sgen whose
names end with .c (.o) are taken to be C source (object) files. They are
compiled and linked into the generated editor. In the absence of the -o
flag, the generated editor is called syn.out.
To use sgen, the location of a directory containing the Synthesizer Gen-
erator must be specified. This can be done in one of two ways: either as
the value of environment variable SYNLOC or as the contents of file
/usr/local/lib/synloc. This location is referred to below as SYNLOC.
The default system location is SYNLOC/sys.
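The two ways of specifying the location can be pictured with a small lookup sketch. It is not part of sgen: the function names are hypothetical, the fallback file is taken to be /usr/local/lib/synloc, and the order of precedence (environment variable first, then the file) is an assumption, since the text does not say which one wins when both are present.

```python
import os


def resolve_synloc(environ=os.environ, fallback_file="/usr/local/lib/synloc"):
    """Return the Synthesizer Generator directory, or None if unspecified.

    Assumption: the SYNLOC environment variable is consulted before the
    fallback file; the manual page does not state the precedence.
    """
    value = environ.get("SYNLOC")
    if value:
        return value
    try:
        with open(fallback_file) as f:
            return f.read().strip() or None
    except OSError:
        return None


def default_system_location(synloc):
    """The default system location is SYNLOC/sys."""
    return synloc + "/sys"


print(resolve_synloc({"SYNLOC": "/opt/syn"}, "/no/such/file"))
print(default_system_location("/opt/syn"))
```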
Options to sgen include:
-a alt_directory
Search for .ssl files in directory (directories) alt_directory if they
are not found in the current directory.
-b
Use the COLLECTIONS implementation of maps instead of the
AVL-tree implementation. Available only with the ATO attribute-
evaluation kernel. (See the -kernel flag.)
-d
Make all attributes demand attributes.
-dbx
Invoke dbx on the SSL language processor. So that dbx can run
ssl with the proper arguments, an appropriate run command is
placed in a file named .rundbx. This command may be invoked by
giving the dbx command: source .rundbx.
-debug
Use a version of the SSL interpreter that includes the SSL
debugger.
-D name
Define name for the macro preprocessor. The .ssl source files are
processed by the C preprocessor, as are any .c files specified as
arguments to sgen.
-G
Force sgen to provide diagnostic output about attribute depen-
dences.
-I name
Append name to the macro preprocessor's search path.
-kernel kernel
Use an alternative attribute-evaluation strategy. Allowable values
are UNORDERED, ORDERED, and ATO. The default strategy
is ATO.
The UNORDERED kernel implements Reps's original algorithm
for incremental attribute evaluation. It works for any noncircular
attribute grammar but has a somewhat high time and space over-
head.
The ORDERED kernel incorporates an incremental version of
Kastens's algorithm for attribute evaluation. It is more efficient
than the UNORDERED kernel but works only for the ordered sub-
class of attribute grammars.
The ATO kernel incorporates Hoover's attribute-updating algo-
rithm based on approximate topological ordering. It works for any
-v
Invoke yacc with the -v flag, so that diagnostic file y.output will
be produced.
-w window_type
Create an editor for a window_type window system. Allowable
values for window_type are VIDEO, SUN, X10, X11 and X. The
default is installation specific. Window_type X denotes the latest
supported version of the X Window System.
The following options are useful primarily for debugging different parts of the
system.
-interp interp
Use an alternative interpreter from directory interp in place of the
default interpreter, which has the same name as argument kernel to
the -kernel option. (See also the -debug option.)
-lang lang
Use an alternative version of the SSL compiler from directory lang.
-misc misc
Use alternative versions of miscellaneous object modules from
directory misc.
-support support
Use alternative versions of supporting object modules from direc-
tory support.
-T testloc
Search the directory structure rooted at testloc for parts of the
system. Any object module found in testloc will be used instead of the
system copy.
AUTHOR
Thomas Reps and Tim Teitelbaum.
FILES
file.ssl                   input file
syn.out                    editor created by sgen
sysloc                     SYNLOC/sys or location given with -S or -s flag
sysloc/sys/LANG/SSL/ssl    SSL language processor
This appendix documents the most frequently used editor commands available
in generated editors. This summary is intended to serve the needs of new users
of the system. Documentation for the rest of the editing commands can be
found in The Synthesizer Generator Reference Manual.
start-command ESC-s
Initiate execution of a command with the parameters contained in the current
form. If not currently editing a parameter form, start-command does noth-
ing.
cancel-command ESC-c
Cancel execution of the command awaiting completion of its parameter
form. If not currently editing a parameter form, cancel-command does
nothing.
Move the object up with respect to the window, effectively moving the view
of the object down one page.
previous-page ESC-v
Move the object down with respect to the window, effectively moving the
view of the object up one page.
next-line ^Z
Move the object up with respect to the window by one line, effectively mov-
ing the view of the object down one line.
previous-line ESC-z
Move the object down with respect to the window by one line, effectively
moving the view of the object up one line.
pointer-long-up
Move the locator eight characters up. If already at the top of the object
pane, scroll the window.
pointer-long-down
Move the locator eight characters down. If already at the bottom of the
object pane, scroll the window.
select ESC-@
Change the selection to the production instance whose unparsing scheme
caused the printing of the character pointed at by the locator.
left ^S
Move the character selection one position to the left.
beginning-of-line ^A
Move the character selection to the beginning of the line.
end-of-line ^E
Move the character selection to the end of the line.
Editors may be generated for many different workstations, each with their own
sort of keyboard and display screen. This appendix describes information that
must be available to a running editor to describe the specific keyboard, display
screen, and window system in use.
D.1. Keyboards
Key-bindings for commands are specified in a keyboard description file. Each
command can have zero or more bindings. Each binding is defined on a
separate line in the format
command-name 000
An editor generated for the X10 Window System must be invoked after an
X10 window manager has been initiated. Editors generated for X10 support the
standard X10 geometry and display specifications on the command line. In
addition, a number of window properties can be defined in a ~/.Xdefaults
resource file. These window properties are:
D.3. Mice
Editors generated for either the X Window System or SunView can make use of
a mouse, while editors generated for video displays cannot. Editors generated
for X10 and SunView make use of three buttons. On some workstations, the
middle button is simulated by chording both buttons.
The leftmost button has the following uses:
• The selection in the object pane can be changed by clicking or dragging
across non-blank characters in the display of an object.
• A transformation can be invoked by clicking on the transformation's name
in a help pane.
• The display of an object can be scrolled within the window by clicking on
one of the arrows in a scroll bar.
The middle button is used to control a pop-up menu containing language-
independent system commands.
The rightmost button is used to control a pop-up menu containing the
currently enabled transformations.
In editors generated for X11, system-command and transformation menus are
combined under the control of the right button.
Actuators on the vertical scroll bar, from top to bottom, are bound to
scroll-to-top
previous-page
(back-by-half-a-page)
previous-line
next-line
(forward-by-half-a-page)
next-page
scroll-to-bottom
Actuators on the horizontal scroll bar, from left to right, are bound to
page-left
(left-by-half-a-page)
column-left
column-right
(right-by-half-a-page)
page-right
The mouse is used to move, resize, raise, lower, and iconify windows in the
fashion of the given window system and its window manager.
Bibliography
Aho86.
Aho, A.V., Sethi, R., and Ullman, J.D., Compilers: Principles, Techniques, and Tools,
Addison-Wesley, Reading, MA (1986).
Alberga81.
Alberga, C.N., Brown, A.L., Leeman, G.B., Mikelsons, M., and Wegman, M.N., "A
program development tool," pp. 92-104 in Conference Record of the Eighth ACM
Symposium on Principles of Programming Languages, (Williamsburg, VA, January 26-28,
1981), ACM, New York, NY (1981).
Bahlke86.
Bahlke, R. and Snelting, G., "The PSG system: From formal language definitions to
interactive programming environments," ACM Trans. Program. Lang. Syst. 8,
4 (October 1986), pp. 547-576.
Bates85.
Bates, J. and Constable, R., "Proofs as programs," ACM Trans. Program. Lang. Syst.
7,1 (January 1985), pp. 113-136.
Burstall69.
Burstall, R.M., "Proving properties of programs by structural induction," Computer
Journal 12, 1 (February 1969), pp. 41-48.
Cohen79.
Cohen, R. and Harry, E., "Automatic generation of near-optimal linear-time translators
for non-circular attribute grammars," pp. 121-134 in Conference Record of the Sixth
ACM Symposium on Principles of Programming Languages, (San Antonio, TX, Jan.
29-31,1979), ACM, New York, NY (1979).
Delisle84.
Delisle, N., Menicosy, D., and Schwartz, M., "Viewing a programming environment as
a single tool," Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering
Gerhart75.
Gerhart, S.L., "Correctness-preserving program transformations," pp. 54-66 in Confer-
ence Record of the Second ACM Symposium on Principles of Programming
Languages, (Palo Alto, CA, Jan. 20-22,1975), ACM, New York, NY (1975).
Ghezzi79.
Ghezzi, C. and Mandrioli, D., "Incremental parsing," ACM Trans. on Prog. Lang. and
Syst. 1, 1 (July 1979), pp. 58-70.
Ghezzi80.
Ghezzi, C. and Mandrioli, D., "Augmenting parsers to support incrementality," Jour-
nal of the ACM 27,3 (October 1980), pp. 564-579.
Gordon79.
Gordon, M., Milner, R., and Wadsworth, C., Edinburgh LCF, Lecture Notes in Com-
puter Science, Vol. 78, Springer-Verlag, New York, NY (1979).
Gries81.
Gries, D., The Science of Programming, Springer-Verlag, New York, NY (1981).
Hansen71.
Hansen, W., "Creation of hierarchic text with a computer display," Ph.D. dissertation,
Dept. of Computer Science, Stanford Univ., Stanford, CA (June 1971).
Henderson84.
Henderson, P., Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering
Symposium on Practical Software Development Environments, (Pittsburgh, PA, April
23-25,1984), ACM SIGPLAN Notices 19, 5 (May 1984).
Henderson85.
Henderson, P. and Weiser, M., "Continuous execution: The Visiprog environment," in
Proceedings of the Eighth International Conference on Software Engineering, (1985).
Henderson87.
Henderson, P., Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering
Symposium on Practical Software Development Environments, (Palo Alto, CA, Dec.
9-11,1986), ACM SIGPLAN Notices 22,1 (January 1987).
Hoare69.
Hoare, C.A.R., "An axiomatic basis for computer programming," Commun. of the
ACM 12,10 (October 1969), pp. 576-580, 583.
Hoover87.
Hoover, R., "Incremental graph evaluation," Ph.D. dissertation and Tech. Rep. 87-836,
Dept. of Computer Science, Cornell University, Ithaca, NY (May 1987).
Jalili82.
Jalili, F. and Gallier, J., "Building friendly parsers," pp. 196-206 in Conference Record
of the Ninth ACM Symposium on Principles of Programming Languages, (Albu-
querque, NM, Jan. 25-27,1982), ACM, New York, NY (1982).
Johnson85.
Johnson, G.F. and Fischer, C.N., "A meta-language and system for nonlocal incremen-
tal attribute evaluation in language-based editors," pp. 141-151 in Conference Record
of the Twelfth ACM Symposium on Principles of Programming Languages, (New Orle-
Morris81.
Morris, J.M. and Schwartz, M.D., "The design of a language-directed editor for block-
structured languages," Proceedings of the ACM SIGPLAN-SIGOA Symposium on Text
Manipulation, (Portland, OR, June 8-10, 1981), ACM SIGPLAN Notices 16, 6 (June
1981), pp. 28-33.
Mughal85.
Mughal, K., "Control-flow aspects of generating runtime facilities for language-based
programming environments," pp. 85-91 in Proceedings of the 1985 IEEE Conference
on Software Tools, (New York, NY, April 15-17, 1985), IEEE Computer Society,
Washington, D.C. (1985).
Nelson81.
Nelson, G., "Techniques for program verification," Tech. Rep. CSL-81-10, Computer
Science Laboratory, Xerox Palo Alto Research Center, Palo Alto, CA (June 1981).
Notkin85.
Notkin, D., Ellison, R.J., Staudt, B.J., Kaiser, G.E., Kant, E., Habermann, A.N.,
Ambriola, V., and Montangero, C., Special issue on the GANDALF project, Journal of
Systems and Software 5,2 (May 1985).
Reppy84.
Reppy, J. and Kintala, C.M.R., "Generating execution facilities for integrated program-
ming environments," Tech. Mem. 59545-84, A.T.& T. Bell Laboratories, Murray
Hill, NJ (March 1984).
Reps82.
Reps, T., "Optimal-time incremental semantic analysis for syntax-directed editors," pp.
169-176 in Conference Record of the Ninth ACM Symposium on Principles of Pro-
gramming Languages, (Albuquerque, NM, January 25-27, 1982), ACM, New York,
NY (1982).
Reps83.
Reps, T., Teitelbaum, T., and Demers, A., "Incremental context-dependent analysis for
language-based editors," ACM Trans. Program. Lang. Syst. 5,3 (July 1983), pp. 449-
477.
Reps84b.
Reps, T. and Alpern, B., "Interactive proof checking," pp. 36-45 in Conference Record
of the Eleventh ACM Symposium on Principles of Programming Languages, (Salt Lake
City, UT, Jan. 15-18, 1984), ACM, New York, NY (1984).
Reps84.
Reps, T., Generating Language-Based Environments, The M.I.T. Press, Cambridge,
MA (1984).
Reps84a.
Reps, T. and Teitelbaum, T., "The Synthesizer Generator," Proceedings of the ACM
SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software
Development Environments, (Pittsburgh, PA, Apr. 23-25, 1984), ACM SIGPLAN Notices 19,
5 (May 1984), pp. 42-48.
Reps86.
Reps, T., Marceau, C., and Teitelbaum, T., "Remote attribute updating for
language-based editors," pp. 1-13 in Conference Record of the Thirteenth ACM Symposium on
Principles of Programming Languages, (St. Petersburg, FL, Jan. 13-15, 1986), ACM,
New York, NY (1986).
Reps88.
Reps, T. and Teitelbaum, T., The Synthesizer Generator Reference Manual. Springer-
Verlag, New York, NY (Third Edition: 1988).
Rovner84.
Rovner, P., "On adding garbage collection and runtime types to a strongly-typed,
statically-checked, concurrent language," Rep. CSL-84-7, Xerox Palo Alto Research
Center, Palo Alto, CA (July 1984).
Schwartz84.
Schwartz, M., Delisle, N., and Begwani, V., "Incremental compilation in Magpie,"
Proceedings of the SIGPLAN 84 Symposium on Compiler Construction. (Montreal,
Can., June 20-22,1984), ACM SIGPLAN Notices 19, 6 (June 1984), pp. 122-131.
Stallman81.
Stallman, R.M., "EMACS: The extensible, customizable self-documenting display
editor," Proceedings of the ACM SIGPLAN-SIGOA Symposium on Text Manipulation,
(Portland, OR, June 8-10, 1981), ACM SIGPLAN Notices 16, 6 (June 1981),
pp. 147-156.
Teitelbaum81.
Teitelbaum, T. and Reps, T., "The Cornell Program Synthesizer: A syntax-directed
programming environment," Commun. of the ACM 24,9 (September 1981), pp. 563-
573.
Teitelman78.
Teitelman, W., Interlisp Reference Manual, Xerox Palo Alto Research Center, Palo
Alto, CA (December 1978).
Tenenbaum74.
Tenenbaum, A., "Automatic type analysis in a very high level language," Ph.D. disser-
tation, Computer Science Department, New York University, New York, NY
(October 1974).
Turing37.
Turing, A.M., "On computable numbers with an application to the Entscheidungsprob-
lem," Proc. London Math. Soc., series 2, 42 (1937), pp. 230-265.
Waite83.
Waite, W.M. and Goos, G., Compiler Construction, Springer-Verlag, New York, NY
(1983).
Warren75.
Warren, S.K., "The efficient evaluation of attribute grammars," M.A. dissertation,
Dept. of Mathematical Sciences, Rice University, Houston, TX (April 1975).
Warren76.
Warren, S.K., "The coroutine model of attribute grammar evaluation," Ph.D. disserta-
tion, Dept. of Mathematical Sciences, Rice University, Houston, TX (April 1976).
Wegman80.
Wegman, M., "Parsing for structural editors," pp. 320-327 in Proceedings of the
Twenty-First IEEE Symp. on Foundations of Computer Science (Syracuse, NY,
October 1980), IEEE Computer Society, Washington, D.C. (1980).
Wilcox76.
Wilcox, T.R., Davis, A.M., and Tindall, M.H., "The design and implementation of a
table driven, interactive diagnostic programming system," Commun. of the ACM 19,
11 (November 1976), pp. 609-616.
Wilner71.
Wilner, W.T., "Declarative semantic definition as illustrated by a definition of Simula
67," Ph.D. dissertation, Dept. of Computer Science, Stanford Univ., Stanford, CA
(June 1971).
Wood81.
Wood, S.R., "Z - The 95 percent text editor," Proceedings of the ACM SIGPLAN-
SIGOA Symposium on Text Manipulation, (Portland, OR, June 8-10,1981), ACM SIG-
PLAN Notices 16, 6 (June 1981), pp. 1-7.
Yeh83.
Yeh, D., "On incremental evaluation of ordered attributed grammars," BIT 23 (1983),
pp. 308-320.
Index
Zadeck, F.K., 44
zc_max_set_size, 244
zc_set, 244
Zimmermann, E., 54, 238, 251
Now available from GrammaTech, Inc.:
The Synthesizer Generator System
The Synthesizer Generator System is available on both a research and a
commercial basis. For additional information about how to acquire a
copy of the system, write:
Synthesizer Generator
GrammaTech, Inc.
One Hopkins Place
Ithaca, NY 14850
ISBN 0-387-96910-1
Texts and Monographs in Computer Science
Jeffrey R. Sampson
Adaptive Information Processing: An Introductory Survey
Niklaus Wirth
Programming in Modula-2, 3rd Edition