
Functional Programming

Christopher D. Clack
2020

© 2020 Christopher D. Clack


FUNCTIONAL PROGRAMMING

Lecture 21
GRAPH REDUCTION continued


CONTENTS

• Spine traversal
• Lazy and strict evaluation
• Combinator Graph Reduction
• Supercombinator Graph Reduction

This lecture continues our exploration of the implementation technique of Graph Reduction – specifically, interpretive graph reduction. The lecture will explain two different approaches to traversing the spine of application cells, and discuss the issue of lazy versus strict evaluation. This will be followed by a discussion of combinator graph reduction and how it differs from graph reduction of λ expressions. Supercombinator graph reduction is a further step in the development of Graph Reduction technology, and we explain how it provides benefits over the previously covered techniques.

As for the last lecture, students are expected to read Chapters 10, 11, 12 and 13 of "The Implementation of Functional Programming Languages" by Simon Peyton Jones, available for free download at: https://www.microsoft.com/en-us/research/uploads/prod/1987/01/slpj-book-1987.pdf


SPINE TRAVERSAL – the Spine Stack

• "descending" ("unwinding") and "ascending" ("rewinding") the spine
• Using a stack of pointers to the vertebrae

In the previous lecture we mentioned that interpretive graph reduction would "descend the spine". Notice that it will also be necessary to "ascend the spine" to find the correct vertebra node to overwrite with the result of a reduction. But how are "descending" and "ascending" (also known as "unwinding" and "rewinding") actually achieved? Here we give two mechanisms – first using a Spine Stack, and then using a technique called Pointer Reversal.

The Spine Stack: the code that performs interpretive graph reduction uses a stack data structure, where each item on the stack is the memory address of (a pointer to) a vertebra on the spine.

Unwinding: the first item at the base of the stack (by convention shown growing downwards in memory) points to the top vertebra – this is set to be the root of the program when graph reduction starts. By following that pointer, the address held in the left field of the vertebra node can be found, and a copy is placed next on the stack. By following the pointer just placed on the stack, the address in the left field of the next lower vertebra node can be found, and a copy placed on the stack. This continues until the tip is found.

Rewinding is trivial now that the pointers to the vertebrae are on the spine stack.

[Diagram: a spine of application cells (@ ... f) alongside the Spine Stack; each stack entry points to one vertebra, with arrows labelled UNWINDING (descending) and REWINDING (ascending)]
SPINE TRAVERSAL – the Spine Stack

• Recursively evaluate strict arguments to built-in functions
• Each recursively-searched spine can use the same Spine Stack

An advantage of the Spine Stack is that the graph reduction code now not only has access to every vertebra on the spine (so that it can ascend and descend the spine) but also, via the vertebrae, to the ribs and therefore the arguments (which is necessary for both β and δ reduction).

When recursively evaluating strict arguments to built-in functions such as "+", the pointers for traversing the argument spine can be built directly on the Spine Stack, as illustrated in this diagram. Notice that the existing Spine Stack entries (e.g. the blue and red pointers in the diagram) will not change while a more deeply-nested argument (e.g. the brown pointers) is being evaluated. Also notice that the Spine Stack entries for any spine can be discarded once that entire spine has been evaluated (and so the brown Spine Stack entries can be popped from the stack after that subgraph is evaluated, leaving the red Spine Stack entries to be considered next, and so on).

When recursively evaluating an argument, the entirety of the existing Spine Stack must be retained so that it can be returned to when the argument has been evaluated. Essentially, this is the implementation technique of "stack frames" often used in imperative language runtime systems.

[Diagram: a nested graph of "+" applications; blue, red and brown pointers on the Spine Stack mark the parent spine and two levels of recursively evaluated argument spines]
SPINE TRAVERSAL – Pointer Reversal

• Uses finite memory – just two locations, B and F

A Spine Stack is simple to implement, but has the disadvantage that it requires an unknown amount of memory (we don't know in advance how deep the stack will become).

An alternative is to use Pointer Reversal (independently invented by Deutsch and by Schorr & Waite). This technique only requires two memory locations, which we shall call B (for Back) and F (for Forward).

To start, B holds a unique pointer called TOP and F holds a pointer to the root vertebra. To unwind one step down the spine, two changes occur:

• swap the contents of B and the left field of the vertebra pointed to by F
• then swap the contents of B and F

A common technique for swapping the contents of two memory locations (without needing a third intermediate location) is to apply a bitwise XOR function three times.

This process repeats for each unwind step, and the result is that the spine is split into two parts:

1. the part pointed to by B, wherein the pointers in the left fields of the vertebrae are reversed, so that they point UP the spine rather than down
2. the part pointed to by F, wherein the pointers in the left fields of the vertebrae point DOWN the spine, unchanged

Rewinding the spine is the reverse process, stopping when B = TOP.

[Diagram: four snapshots of the same spine during unwinding; at each step B and F move one vertebra down the spine while the left fields behind B are reversed to point back up]
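The following Haskell sketch simulates one unwind step. Again, the cell representation and the names are our assumptions, not the lecture's code; a real implementation would manipulate raw words in the heap, and could perform each swap in place using the three-XOR trick mentioned above (shown here on plain Ints).

    {-# LANGUAGE LambdaCase #-}
    import Data.Bits (xor)
    import Data.IORef

    data Node = Ap Cell Cell | Tip String  -- Tip "TOP" doubles as the sentinel
    type Cell = IORef Node

    -- One unwind step: (1) swap B with the left field of the vertebra that
    -- F points to; (2) swap B and F. Left fields behind B end up reversed
    -- (pointing UP the spine); those ahead of F still point DOWN, unchanged
    unwindStep :: IORef Cell -> IORef Cell -> IO ()
    unwindStep bReg fReg = do
      f <- readIORef fReg
      readIORef f >>= \case
        Tip _  -> pure ()          -- reached the tip: stop unwinding
        Ap l r -> do
          b <- readIORef bReg
          writeIORef f (Ap b r)    -- left field := B   }  swap (1)
          writeIORef bReg l        -- B := old left     }
          b' <- readIORef bReg     -- then swap (2): B <-> F
          writeIORef bReg f
          writeIORef fReg b'

    -- The three-XOR swap of two words without a temporary location:
    xorSwap :: (Int, Int) -> (Int, Int)
    xorSwap (a, b) = let a1 = a `xor` b; b1 = a1 `xor` b in (a1 `xor` b1, b1)

    main :: IO ()
    main = do
      top <- newIORef (Tip "TOP")   -- the sentinel initially held in B
      f   <- newIORef (Tip "f")
      x   <- newIORef (Tip "x")
      v   <- newIORef (Ap f x)      -- a one-vertebra spine: f x
      bReg <- newIORef top
      fReg <- newIORef v
      unwindStep bReg fReg          -- B now points at v; F at the tip
      readIORef fReg >>= readIORef >>= \case
        Tip s -> putStrLn ("tip reached: " ++ s)
        _     -> putStrLn "still unwinding"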
SPINE TRAVERSAL – Pointer Reversal

• Evaluating strict arguments for operators

When a built-in operator needs a strict argument to be evaluated in advance, the pointer-reversal technique requires the use of a new tag (a modified application node tag, shown in red as "@" on the slide).

Pointer reversing into an argument requires the following changes:

• swap the contents of F and the right field of the vertebra pointed to by B
• then change the tag of the node pointed to by B from the normal "@" to the marked "@"

The reason for changing the tag is so that when rewinding out of an argument spine it is clear when B has reached the parent spine (since in this case we cannot rely on the special value TOP). At this point the next rewind step must reverse the changes stated above, so that the pointers are reset correctly and the tag is set back to normal.

[Diagram: three snapshots showing B and F reversing into the argument of "+"; the vertebra where the descent began carries the marked "@" tag until the rewind returns through it]
SPINE TRAVERSAL – Comparison

• Using a Spine Stack is very fast, with far fewer accesses to heap memory locations, but uses an unbounded amount of memory

• Using Pointer Reversal is very frugal with memory, but is very slow (unless implemented in hardware)

• Pointer Reversal is not merely useful for traversing spines (and argument spines): it also stores information about how far evaluation has progressed. This can be useful when evaluating multiple strict arguments (with a Spine Stack, after evaluating each argument the parent spine is viewed entirely afresh, with no such information)


LAZY AND STRICT EVALUATION

• Normal Order evaluation is the process already described

• Strict (Applicative Order) evaluation uses the same procedure, except that all arguments to all functions and built-in operators are treated in the same way as strict arguments for the built-in operators (i.e. they are evaluated first, before performing either a β or δ reduction)

• Lazy Evaluation is a combination of Normal Order evaluation and the fact that each subgraph is evaluated at most once – which is what we achieve by (i) copying pointers to arguments during β reduction and (ii) overwriting the root node of each redex with an indirection to the result
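Both ingredients are directly observable in a lazy language such as Haskell. In this small illustration (our example, not from the slides) the shared subgraph x is reduced at most once, and an argument that is never needed is never reduced at all:

    import Debug.Trace (trace)

    main :: IO ()
    main = do
      -- 'x' is one shared graph node used twice; the trace message
      -- appears only once because the first use overwrites the redex
      -- with its result (evaluated at most once)
      let x = trace "reducing x" (2 + 3) :: Int
      print (x + x)                              -- "reducing x" once; 10
      -- Normal order: the unused component is never evaluated
      print (fst (1 :: Int, undefined :: Int))   -- prints 1, no crash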


COMBINATOR GRAPH REDUCTION

• Fixed-set combinators – e.g. S, K and I
• No β reduction, only δ reduction
• δ rules expressed as graph manipulations

The method used for combinator graph reduction depends on whether a fixed set or a variable set of combinators is used.

Assuming a fixed set of combinators, e.g. S, K and I, these are viewed as built-in functions with rules for δ reduction. There are no λ abstractions and so no β reductions.

The δ rules for S, K and I are easily expressed as graph manipulations and derive directly from the combinator definitions:

    I x = x               (replace the root with IND pointing to x)
    K x y = x             (replace the root with IND pointing to x)
    S f g x = f x (g x)   (rebuild the root as (f x) applied to (g x), sharing x – see diagram)

Notice that none of these combinators need to know the value of their arguments, so they are not strict in their arguments (just as the built-in operator "if" is not strict in its second and third arguments).

Because we are dealing with a fixed set of combinators, it is easy to have these δ rules hard-wired into the graph reduction implementation (the runtime system code). Thus the symbols S, K and I (and perhaps others, if a larger set of combinators is used) each cause some (usually small) amount of native code to run.

[Diagram: graph rewrites for I, K and S – "I arg1" and "K arg1 arg2" each overwrite the root with IND pointing to arg1; "S arg1 arg2 arg3" overwrites the root with an application of (arg1 arg3) to (arg2 arg3), with arg3 shared]
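The following is a minimal runnable sketch of fixed-set combinator graph reduction in Haskell. The representation, and details such as returning WHNF for partial applications, are our assumptions; the point is that each δ rule physically overwrites the root of its redex, so results are shared.

    {-# LANGUAGE LambdaCase #-}
    import Data.IORef

    -- Heap cells are IORefs so redex roots can be overwritten in place;
    -- Ind is the indirection node (IND)
    data Node = Ap Cell Cell | S | K | I | Ind Cell | Lit Int
    type Cell = IORef Node

    -- Unwind the spine: returns the tip first, then vertebrae innermost-out
    unwind :: Cell -> IO [Cell]
    unwind c = go c []
      where
        go c' st = readIORef c' >>= \case
          Ap f _ -> go f (c' : st)
          Ind t  -> go t st
          _      -> pure (c' : st)

    argOf :: Cell -> IO Cell
    argOf v = readIORef v >>= \case
      Ap _ a -> pure a
      _      -> error "not a vertebra"

    -- Reduce to weak head normal form by repeated delta steps
    whnf :: Cell -> IO Node
    whnf root = do
      tip : vs <- unwind root
      node <- readIORef tip
      case (node, vs) of
        (I, v1 : _) -> do                      -- I x  =>  IND x
          x <- argOf v1
          writeIORef v1 (Ind x) >> whnf root
        (K, v1 : v2 : _) -> do                 -- K x y  =>  IND x
          x <- argOf v1
          writeIORef v2 (Ind x) >> whnf root
        (S, v1 : v2 : v3 : _) -> do            -- S f g x  =>  (f x) (g x)
          f <- argOf v1; g <- argOf v2; x <- argOf v3
          fx <- newIORef (Ap f x)
          gx <- newIORef (Ap g x)              -- x is shared, not copied
          writeIORef v3 (Ap fx gx) >> whnf root
        _ -> pure node                         -- WHNF or partial application

    main :: IO ()
    main = do
      [s, k1, k2] <- mapM newIORef [S, K, K]
      x    <- newIORef (Lit 42)
      a1   <- newIORef (Ap s k1)
      a2   <- newIORef (Ap a1 k2)
      root <- newIORef (Ap a2 x)               -- S K K 42
      whnf root >>= \case
        Lit n -> print n                       -- prints 42 (S K K = I)
        _     -> putStrLn "stuck"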
SUPERCOMBINATOR GRAPH REDUCTION

• Fixed combinators are treated like operators (not evaluated until all arguments are present)
• This will also apply to Super-Combinators
• We know that Super-Combinators have no free variables, and if they are not evaluated until all arguments are present then there is no need for an environment of bindings to be managed at runtime
• Thus, Super-Combinators lead to super-efficient implementation using compiled graph reduction

We start with the observation that the (fixed) combinator K is treated like a built-in function rather than a user-defined function, and in particular it is not evaluated until it has both of its arguments:

• this has nothing to do with the strictness of those arguments – e.g. the built-in operator "if" is only strict in its first argument but is not δ-reduced until all three arguments are present
• nor does this prevent partial applications – it is only about the timing of when evaluation occurs at runtime

This approach will also apply to Super-Combinators: although their definitions can't be known in advance (as there will be a different set of Super-Combinators for each program), we can treat them like operators and only evaluate them when all arguments are available.

Notice that Super-Combinators are (by definition) combinators and therefore have no free variables, and if they are never evaluated until all arguments are present (and their body is not a λ abstraction) this solves the problem seen in the previous lecture that required an environment of bindings to be managed at runtime.

Thus, Super-Combinators lead to super-efficient implementation using compiled graph reduction.
SUPERCOMBINATOR GRAPH REDUCTION

DEFINITION

This slide gives the definition of supercombinators appearing on page 223 (Section 13.2) of the required-reading textbook "The Implementation of Functional Programming Languages" by Simon Peyton Jones:

A supercombinator $S of arity n is a λ expression of the form

    λx1.λx2 ... λxn.E

where E is not a λ abstraction (this just ensures that all the 'leading lambdas' are accounted for by x1...xn) such that

(i) $S has no free variables,
(ii) any λ abstraction in E is a supercombinator,
(iii) n ≥ 0; that is, there need be no lambdas at all.

A supercombinator redex consists of the application of a supercombinator to n arguments, where n is its arity.

A supercombinator reduction replaces a supercombinator redex by an instance of the supercombinator body with the arguments substituted for free occurrences of the corresponding formal parameters.
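As an illustration (the Expr type and the examples are ours, not from the book), the clauses of the definition translate directly into a small recogniser:

    data Expr = Var String | Con Int | Lam String Expr | App Expr Expr

    freeVars :: Expr -> [String]
    freeVars (Var v)   = [v]
    freeVars (Con _)   = []
    freeVars (Lam v e) = filter (/= v) (freeVars e)
    freeVars (App f a) = freeVars f ++ freeVars a

    -- Strip the 'leading lambdas' to expose the arity n and the body E
    leading :: Expr -> (Int, Expr)
    leading (Lam _ e) = let (n, b) = leading e in (n + 1, b)
    leading e         = (0, e)

    -- Clauses (i) and (ii); clause (iii) just allows n to be zero
    isSupercombinator :: Expr -> Bool
    isSupercombinator e = null (freeVars e) && innerOK (snd (leading e))
      where
        innerOK l@(Lam _ _) = isSupercombinator l
        innerOK (App f a)   = innerOK f && innerOK a
        innerOK _           = True

    main :: IO ()
    main = mapM_ (print . isSupercombinator)
      [ Lam "x" (Lam "y" (App (Var "y") (Var "x")))  -- True : λx.λy. y x
      , Lam "x" (App (Var "x") (Var "z"))            -- False: z occurs free
      , Con 3                                        -- True : arity 0, a CAF
      ]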


SUPERCOMBINATORS OF NON-ZERO ARITY

Supercombinators of non-zero arity (that is, having at least one λ at the front) have no free variables (clause (i) of the definition), and we can therefore compile a fixed code sequence for them.

Furthermore, clause (ii) of the definition ensures that any λ abstractions in the body have no free variables, and hence do not need to be copied when instantiating the supercombinator body.


SUPERCOMBINATORS OF ZERO ARITY

A supercombinator with arity zero (that is, having no λs at the front) is just a constant expression (remember that it has no free variables).

These supercombinators are often called "constant applicative forms" or CAFs. For example:

• the constant 3
• the constant expression 4 + 5
• the constant function (+3)

The last example shows that CAFs can still be functions. Since a CAF has no λs at the front, it is never instantiated. Hence, no code need be compiled for it, and a single instance of its graph can freely be shared.
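This sharing is visible in GHC. In the small example below (ours, not from the slides), table is a CAF whose graph is built once and shared by every use, and addThree is a CAF whose value is a function:

    table :: [Int]
    table = map (* 2) [1 .. 1000]   -- a constant expression: arity zero

    addThree :: Int -> Int
    addThree = (+ 3)                -- a CAF that is still a function

    main :: IO ()
    main = print (table !! 10, addThree 4)   -- (22,7); 'table' built once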


SUPERCOMBINATORS CODE

What should supercombinator code do? There are many options for implementing supercombinators:

• Keep the body of the supercombinator as a tree, and use template-copying interpretation

• Because supercombinators are constructed once and for all at compile-time, rather than being generated on the fly at run-time, the "code" could just be a graph held in a contiguous block of store, instantiated with a (fast) block copy (but this misses opportunities for optimisation)

• Compile the body to a linear sequence of intermediate-code graph-manipulation instructions that direct the operation of a simple run-time interpreter to create an instance of the body when executed (see the sketch below)

• Compile the body to a linear sequence of native code

With either intermediate code or native code we can introduce optimisations such as using fast stack allocation rather than allocating graph nodes in a heap.
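As a sketch of the intermediate-code option, loosely in the style of the G-machine (the instruction set, names, and the example supercombinator $F x y = y x are our invention, not the lecture's), a tiny interpreter executes a compiled instruction sequence to build one instance of the body:

    -- The instance being built (a pure tree here, for simplicity)
    data Graph = GAp Graph Graph | GGlobal String | GInt Int
      deriving Show

    data Instr
      = PushArg Int        -- push a pointer to the i-th argument of the redex
      | PushGlobal String  -- push a reference to another supercombinator
      | PushInt Int
      | MkAp               -- pop function then argument, push an @ node

    -- Execute the instruction sequence against the redex's argument list
    instantiate :: [Instr] -> [Graph] -> Graph
    instantiate code args = head (foldl step [] code)
      where
        step st (PushArg i)    = (args !! i) : st
        step st (PushGlobal g) = GGlobal g : st
        step st (PushInt n)    = GInt n : st
        step (f : a : st) MkAp = GAp f a : st
        step _ MkAp            = error "stack underflow"

    -- Hypothetical compiled code for $F x y = y x: push x, push y, apply
    codeF :: [Instr]
    codeF = [PushArg 0, PushArg 1, MkAp]

    main :: IO ()
    main = print (instantiate codeF [GInt 1, GGlobal "inc"])
      -- GAp (GGlobal "inc") (GInt 1), i.e. an instance of (y x)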


SUMMARY

• Spine traversal
• Lazy and strict evaluation
• Combinator Graph Reduction
• Supercombinator Graph Reduction

In summary, this lecture has continued our exploration of the implementation technique of interpretive Graph Reduction. Two different approaches were covered for traversing the spine of application cells, and the issue of lazy versus strict evaluation was discussed. This was followed by a discussion of combinator graph reduction and supercombinator graph reduction, and how they differ from graph reduction of λ expressions.

As for the last lecture, students are expected to read Chapters 10, 11, 12 and 13 of "The Implementation of Functional Programming Languages" by Simon Peyton Jones.
