Self-Adjusting Binary Trees

Daniel Dominic Sleator


Robert Endre Tarjan
Bell L a b o r a t o r i e s
M u r r a y H i l i , N e w J e r s e y 07974

Abstract

We use the idea of self-adjusting trees to create new, simple data structures for priority queues (which we call heaps) and search trees. Unlike other efficient implementations of these data structures, self-adjusting trees have no balance condition. Instead, whenever the tree is accessed, certain adjustments take place. (In the case of heaps, the adjustment is a sequence of exchanges of children; in the case of search trees the adjustment is a sequence of rotations.) Self-adjusting trees are efficient in an amortized sense: any particular operation may be slow but any sequence of operations must be fast.

Self-adjusting trees have two advantages over the corresponding balanced trees in both applications. First, they are simpler to implement because there are fewer cases in the algorithms. Second, they are more storage-efficient because no balance information needs to be stored. Furthermore, a self-adjusting search tree has the remarkable property that its running time (for any sufficiently long sequence of search operations) is within a constant factor of the running time for the same set of searches on any fixed binary tree. It follows that a self-adjusting tree is (up to a constant factor) as fast as the optimal fixed tree for a particular probability distribution of search requests, even though the distribution is unknown.

1. Introduction

In this paper we present new ways of using binary trees to store heaps (otherwise known as "priority queues") and search trees (also called "dictionaries", "lists", or "sorted sets"). The ideas and techniques of analysis that we use for these two problems promise to be applicable to other data structure problems.

Standard tree structures for representing heaps (e.g. leftist trees [9,12]) and search trees (e.g. AVL trees [1], 2-3 trees [2], trees of bounded balance [13]) obtain their efficiency by obeying an explicit balance condition that indirectly bounds the length of the relevant paths in the tree. With such a condition any single access or update operation takes O(log n) time in the worst case, where n is the number of items in the tree.

We describe ways of doing away with any explicit balance condition while retaining the ability to do access and update operations efficiently. Rather than maintaining balance, we adjust the tree during each operation using simple adjustment heuristics. These adjustments are the same as those used in balanced trees (exchanging children in the case of heaps and performing single rotations in the case of search trees). The difference is that they are applied in a uniform fashion without regard to balance. The result is that the trees behave (in an amortized sense) as though they are balanced. This approach has the following advantages (in both applications):

(i) We can save space of at least one bit per node in the tree structure, since no balance information needs to be maintained.

(ii) Balanced tree algorithms are plagued by a multiplicity of cases. Our algorithms are simpler to state and to program.

(iii) In most balanced search tree schemes the tree remains static when only search operations are done. Since self-adjusting search trees adapt to the input sequence dynamically, they can perform better (by an arbitrary factor) than a fixed tree when the access pattern is non-uniform.

Self-adjusting trees have two disadvantages, the significance of which depends on the application. One is that more adjustments are made than in the corresponding balanced structures. (Maintaining a self-adjusting search tree requires more rotations than a balanced tree, and maintaining a self-adjusting heap takes more swapping of children than a leftist heap.) The cost of a rotation in a search tree, which we assume to be O(1), depends upon the application. If rotations are unusually expensive, self-adjusting search trees may be inefficient.

The other possible disadvantage is that by a carefully chosen sequence of operations it is possible to construct a very unbalanced binary tree. Thus the worst-case bound per operation is O(n), not O(log n). However with our adjustment heuristics the running time per operation is O(log n) when amortized over any sequence of operations. That is, a sequence of m operations (m ≥ n) will take O(m log n) time in the worst case, even though a few operations in the sequence may take Ω(n) time. Since almost all uses of heaps and search trees involve a sequence of operations rather than just a single operation, an amortized bound is generally as useful as a bound on each operation. The only situation in which this might not be true is a real-time application in which it is important to have a worst-case bound on the running time of each individual operation.

There is little previous work on self-adjusting binary search trees. Allen and Munro [3] (getting their start from Rivest's work [14] on self-organizing linear lists used for sequential search) proposed two adjustment heuristics based on single rotation: single exchange, in which an accessed item is rotated one step toward the tree root, and move to root, in which an accessed item is moved all the way to the tree root by rotating at every edge along the access path. Allen and Munro proved that move to root is efficient on the average, but single exchange is not. Bitner [7] studied the average-case behavior of several other heuristics.

Our results are much stronger than those of Bitner and Allen and Munro. Their heuristics are efficient for an average sequence of operations, but there are pathological sequences for which the running time is Ω(n) per operation.

A self-adjusting search tree has the further remarkable property that its running time for any sufficiently long sequence of search operations is within a constant factor of the running time for the same set of searches on any fixed binary tree. It follows that a self-adjusting tree is as efficient (to within a constant factor) as the optimal fixed tree for a particular probability distribution of search requests. Such an optimal tree can only be constructed under the optimistic assumption that the access probabilities are available in advance.

Another application of self-adjusting search trees is in the data structure for dynamic trees of Sleator and Tarjan [15,16,17]. We can substitute self-adjusting trees for biased trees [4,5,6] in that structure without affecting the running time. The resulting data structure is significantly simpler since weights no longer have to be maintained.

In Section 2 we describe self-adjusting heaps, prove a bound on their running time, and present programs to implement them. In Section 3 we describe self-adjusting search trees, and prove that they have the claimed properties. In Section 4 we give programs for two versions of self-adjusting search trees, and in Section 5 we discuss additional results and future work.

2. Self-Adjusting Heaps

A heap is a data structure consisting of a set of items selected from a totally ordered universe, on which the following operations are possible.

findmin(h): Return the minimum item in heap h.

deletemin(h): Delete the minimum item from heap h and return it.

insert(i, h): Insert item i into heap h, not previously containing i.

meld(h1, h2): Return the heap formed by combining disjoint heaps h1 and h2. This operation destroys h1 and h2.

There are several ways to implement heaps in a self-adjusting fashion. The one we discuss in detail is related to the leftist trees of Crane [9] and Knuth [12]. These heaps are so simple that we call them simply self-adjusting heaps.

A self-adjusting heap is a binary tree with one item per internal node. (All external nodes are null.) Each node x has three fields associated with it, denoted item(x), left(x), and right(x). The left and right fields are pointers to the left and right children, and the item field contains the item of that node. The items are stored in heap order: If x and y are nodes and x is the parent of y, then item(x) ≤ item(y). To identify and access the heap we use a pointer to the tree root.

At the end of this section we give programs for the heap operations; here we give an informal description of how the operations are implemented. Since heap order implies that the root is the minimum element, we can perform findmin in constant time by returning the item at the root. The other two operations are implemented using meld. To do deletemin we meld the left and right subtrees of the root and return the (old) root. To do insert we make a one-item heap out of the item to be inserted and meld it with the existing heap.

To do meld we first delete all the edges (but not the nodes) on the right paths (paths from the roots to null nodes through right children) of the two trees. This creates a forest of trees whose roots have no right child. The trees are then connected together in heap order by a new right path through all of the roots. In other words we merge the right paths of the two trees. (See Figure 1.) The time for a meld is proportional to the length of the new right path.

To make this algorithm efficient we must keep right paths short. Leftist trees accomplish this by maintaining the following property: from any node, the right path is a shortest path to an external node. Maintaining this property requires storing at every node the minimum distance to an external node, and, after a meld, backing up along the merged path recomputing distances and swapping left and right children as necessary to maintain the leftist property. The length of the right path in a leftist tree of n nodes is at most ⌊lg n⌋, so each of the heap operations has an O(log n) worst-case time bound.

In our self-adjusting version of this data structure we meld by merging the right paths of the two trees and then
swapping the left and right children of every node on the merged path. (See Figure 1.) This makes the potentially long right path formed by the merge into a left path. The theorem and corollary below bound the time needed by a sequence of melds, and by an arbitrary sequence of self-adjusting heap operations.

Figure 1. A meld of two self-adjusting heaps. (a) Merge of right paths. (b) Swapping of children along path formed by merge.

Theorem 1: In a sequence of melds starting with singleton heaps, the number of edges deleted during the melds is at most 3Σ⌊lg(n_i)⌋, where n_i denotes the number of nodes in the tree resulting from the ith meld.

Proof: This proof is based on the ideas used by Sleator [15], and Sleator and Tarjan [16] to bound the number of "splice" operations in a network flow algorithm. We define the weight of each node in a heap to be the number of nodes in the subtree rooted there. We use these weights to divide the edges into two classes: heavy and light. The edge connecting a node x to its parent p(x) is heavy if the weight of p(x) is less than twice that of x and light if the weight of p(x) is at least twice that of x. Two facts follow immediately:

Fact 1: Of the edges from a node to its children, at most one can be heavy.

Fact 2: The number of light edges on the path from a node x to the root of a tree of weight w is at most ⌊lg(w)⌋.

To get the bound we focus our attention on the number of right heavy edges. (These are the heavy edges that connect a node to its right child.) This quantity (which we denote by RH) starts at zero. As we meld, RH fluctuates, but it never falls below zero. Let a and b be the two trees to be melded by the ith meld. Let n_a and n_b be their weights, and let n_i = n_a + n_b be the weight of the tree resulting from the meld.

We wish to bound the total length of all the meld paths (the right paths traversed in the trees to be melded). To do this we consider the effect of the ith meld on RH. By Fact 2 the number of light edges on the meld path of heap a is at most ⌊lg(n_a)⌋. Similarly the number of such edges in heap b is at most ⌊lg(n_b)⌋. Thus the total number of light edges on the two paths is at most 2⌊lg(n_i)⌋ − 1. (See Figure 2.)

Figure 2. The movement of heavy edges in meld.

Let h_a be the number of heavy edges on the meld path of heap a, and let h_b be the number on the meld path of heap b. Let h_i be the number of right heavy edges incident to the leftmost path of the tree produced by the ith meld. Fact 1 tells us that each edge counted by h_i (except possibly a bottom one) corresponds to a light edge in the leftmost path of the heap produced by the ith meld. By Fact 2 the number of such light edges is at most ⌊lg(n_i)⌋, so h_i ≤ ⌊lg(n_i)⌋ + 1. The only right heavy edges removed in the meld process are those counted by h_a and h_b. The only ones added by the meld are those counted by h_i. Thus RH decreases by at least h_a + h_b, then increases by at most h_i ≤ ⌊lg(n_i)⌋ + 1.

Since RH is nonnegative the total increase bounds the total decrease. Therefore the number of heavy edges on all the meld paths is at most Σ(⌊lg(n_i)⌋ + 1). Furthermore the number of light edges on all the meld paths is at most Σ(2⌊lg(n_i)⌋ − 1). Combining these estimates gives the result. □
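In symbols, the combination in that last step is just the sum of the two estimates (our restatement, in LaTeX, of what the proof already says):

    \sum_i \bigl(\lfloor\lg n_i\rfloor + 1\bigr) + \sum_i \bigl(2\lfloor\lg n_i\rfloor - 1\bigr) \;=\; 3\sum_i \lfloor\lg n_i\rfloor .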
Note. The version of meld described above (and used in the proof) is not quite the same as that presented below. In the actual implementation the two right paths are traversed from the top down. When one of the paths ends the other is simply attached to it, and the process terminates. Only those nodes that are traversed have their children exchanged. (This differs from the description above in which all nodes on the right paths are always traversed.) The same theorem holds for the actual implementation, and the same proof works with slight modification. RH still increases by at most ⌊lg(n_i)⌋ + 1, and it decreases by at least the number of heavy edges on the meld paths. The original analysis holds for the light edges. □

Corollary 1: A sequence of m findmin, deletemin, insert, and meld operations takes O(Σ lg(n_i)) time, where n_i is the weight of the largest tree involved in the ith operation.

Proof: The time for findmin is O(1), and insert is just a special case of meld. Thus to get the result we only have to modify the above proof to consider deletemin. Deletemin simply removes the root, then does a meld. The only relevant effect of deleting the root is that it may decrease RH by one. This only improves our bound on the length of the meld paths, so we have the result. □

What follows is an implementation of the four operations on self-adjusting heaps. The data structure is as we described it in the text; each node has three fields: item(x), left(x) and right(x). The programs are written in a variant of Dijkstra's guarded command language [10]; we have used the symbol "|" to denote Dijkstra's "box", and the symbol "↔" to denote the "swap" operator. The variables of type heap are actually pointers to nodes. Parallel assignments all take place simultaneously, so the result is well-defined even if the same variables appear on the left and right sides.

    item function findmin(heap h);
        return item(h)
    end findmin;

    heap function insert(item i, heap h);
        create a new node to which n points;
        left(n), right(n), item(n) := null, null, i;
        return meld(n, h)
    end insert;

    heap function deletemin(modifies heap h);
        heap i;
        i := h;
        h := meld(left(h), right(h));
        return i
    end deletemin;

We have included two versions of meld: a recursive one, rmeld, and an iterative one, imeld. The function rxmeld is supplied to avoid doing extra tests for null in the recursive version.

    heap function rmeld(heap h1, h2);
        if h2 = null → return h1
        | h2 ≠ null → return rxmeld(h1, h2)
        fi
    end rmeld;

    heap function rxmeld(heap h1, h2);
        if h1 = null → return h2 fi;
        if item(h1) > item(h2) → h1 ↔ h2 fi;
        left(h1), right(h1) := rxmeld(right(h1), h2), left(h1);
        return h1
    end rxmeld;

In the iterative version of meld the invariant at the beginning of the loop is that there are three heaps rooted at x, h1, and h2 that contain all of the nodes. Node y is in the heap rooted at x, and its left child is eventually going to be the meld of heaps h1 and h2.

    heap function imeld(heap h1, h2);
        heap x, y;
        if h1 = null → return h2 | h2 = null → return h1 fi;
        if item(h1) > item(h2) → h1 ↔ h2 fi;
        x, y, h1, right(h1) := h1, h1, right(h1), left(h1);
        do h1 ≠ null →
            if item(h1) > item(h2) → h1 ↔ h2 fi;
            y, left(y), h1, right(h1) := h1, h1, right(h1), left(h1)
        od;
        left(y) := h2;
        return x
    end imeld;

Note. The swapping of h1 and h2 in the loop can be avoided by writing different pieces of code for the cases item(h1) > item(h2) and item(h1) ≤ item(h2). The four-way parallel assignment can be written with four separate assignments as:

    left(y) := h1;
    y := h1;
    h1 := right(y);
    right(y) := left(y);

With this implementation each iteration of the loop takes four assignments and two comparisons.
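As a cross-check on the notation, here is a rough Python rendering of the recursive version. It is a sketch under our own naming (HeapNode, find_min, and so on are not the paper's identifiers), not a tuned implementation; a simultaneous tuple assignment plays the role of the parallel assignment in rxmeld.

    class HeapNode:
        def __init__(self, item):
            self.item = item
            self.left = self.right = None

    def meld(h1, h2):
        # Merge the right paths of h1 and h2 in heap order, swapping the
        # children of each node placed on the merged path (Figure 1b).
        if h1 is None:
            return h2
        if h2 is None:
            return h1
        if h1.item > h2.item:
            h1, h2 = h2, h1
        # New left subtree: meld of the old right subtree with h2;
        # the old left subtree moves over to the right.
        h1.left, h1.right = meld(h1.right, h2), h1.left
        return h1

    def find_min(h):
        return h.item

    def insert(i, h):
        return meld(HeapNode(i), h)

    def delete_min(h):
        # Returns the minimum item and the resulting heap.
        return h.item, meld(h.left, h.right)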
We have tested the iterative and recursive versions of self-adjusting heaps (exactly as shown above) as well as the iterative and recursive versions of leftist heaps. (The iterative version we used is on page 619 of [12], and the recursive version is in [17].) Preliminary results indicate that recursive self-adjusting heaps and both forms of leftist heaps are about equally fast. However the iterative version of self-adjusting heaps is significantly faster than the others.

The leftist heap algorithms must make two passes over the merged path: one pass down to connect the pieces together, and one pass up to swap children and update the distance fields. The recursive version does this by saving the path in the recursion stack, and the iterative version does this by reversing pointers on the way down, and then fixing them on the way up. The iterative version avoids the overhead of recursion at the cost of more pointer assignments. The iterative version of self-adjusting heaps is fast because it has no recursive calls, does no extra pointer manipulation, and makes only one pass over the merged path. These advantages make up for the fact that the average meld path is longer in a self-adjusting heap than in a leftist heap. According to Brown [8], binomial heaps are faster than leftist heaps. It would be interesting to find out how self-adjusting heaps compare to binomial heaps.

In some applications it is useful to have another form of delete:

delete(x): Delete node x from the heap containing it, and return the resulting heap.

It is impossible to implement this type of delete with the data structure described above. To implement delete(x) it is necessary to find the parent of x so that its pointer to x can be changed. This means that we need a pointer from each node to its parent. If there is such a pointer then delete(x) can be done as follows: first do deletemin(x) (which removes node x from the tree rooted at x), then connect the resulting tree to the parent of x. All of the other operations can be modified in a straightforward fashion to update parent pointers.

There is a way to allow deletion in self-adjusting heaps while still using only two pointers per node. In node x we keep a down pointer and an across pointer. If x is the root, then across(x) is null. If x is an only child or a right child then across(x) points to the parent of x. If x is a left child and x has a sibling, then across(x) points to that sibling. If x has no children then down(x) is null, otherwise down(x) points to the leftmost child of x. This representation might be called a "triangular heap" since a node and its two children are connected by a cyclic "triangle" of pointers. Knuth [11] calls this the "binary tree representation of a tree, with right threads". Notice that if a node is an only child there is no distinction between it being a left child or a right child. This doesn't matter since the tree is heap ordered, and the algorithm can assume that an only child is a left child. By following at most two pointers from a node we can access its parent or its left or right child, which is all we need to implement all of the heap operations.
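To make the "triangle" concrete, here is a sketch of the navigation rules in Python; the accessor names are ours, and we follow the convention above that an only child is treated as a left child.

    class TriNode:
        def __init__(self, item):
            self.item = item
            self.down = None    # leftmost child; None if no children
            self.across = None  # right sibling, or the parent if this is a
                                # right or only child; None only at the root

    def left_child(x):
        return x.down           # an only child counts as a left child

    def right_child(x):
        c = x.down
        if c is None or c.across is x:
            return None         # no children, or a single (left) child
        return c.across         # the left child's across is its sibling

    def parent(x):
        z = x.across
        if z is None:
            return None         # x is the root
        if z.down is x or (z.down is not None and z.down.across is x):
            return z            # across(x) already points at the parent
        return z.across         # z is x's right sibling; its across is the parent

Each accessor follows at most two pointers, matching the claim above.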
3. Self-Adjusting Search Trees

The data structure we call a "search tree" might more appropriately be called a "symmetrically ordered binary tree", because most of its applications have nothing to do with searching. In the most general sense, a symmetrically ordered binary tree is a data structure that is used to represent a list of items. The fundamental property of the list of items that is captured by the symmetrically ordered binary tree is the order of the items in the list. The kind of operations that a symmetrically ordered binary tree can efficiently support are those that involve manipulation of nearby items in the list. In general a symmetrically ordered binary tree can represent the items as internal or as external nodes in the tree, but in the trees we discuss the items will be in the internal nodes. In a symmetrically ordered binary tree the items are arranged in symmetric order: if x is a node containing item i, then every item in the left subtree of x comes before i in the list, and every item in the right subtree of x comes after i in the list.

The basic operation that is generally used to modify a symmetrically ordered binary tree is the single rotation, because a rotation maintains the symmetric order of the nodes. (Case 1 of Figure 4 shows a rotation.) Rotations are used in AVL trees [1], trees of bounded balance [13], biased binary trees [6], and many others. Our self-adjusting symmetrically ordered binary trees are no exception. (The reader may be relieved to hear that, having dispelled any misunderstanding about what a search tree is, we shall proceed to call our data structure a self-adjusting search tree.)

For the purposes of the discussion that follows we have assumed that the object to be represented is a list of numbers ordered by value. In a node x, there are three fields: item(x) (the number stored in node x), and left(x) and right(x) (pointers to the left and right subtrees of x). Every external node is null and we access and identify a tree with a pointer to the tree root. We shall discuss the following operations:

access(i,s): If item i is in tree s return a pointer to its location, otherwise return null.

insert(i,s): Insert item i into tree s, and return the resulting tree.

delete(i,s): Delete item i from tree s if it is there, and return the resulting tree.

join2(s1,s2): Return a tree representing the items in s1 followed by those of s2, destroying s1 and s2. (This assumes all items of s1 are less than those of s2.)

join3(s1,i,s2): Return a tree representing the items in s1 followed by item i, followed by the items of s2. This destroys s1 and s2. (This assumes that items of s1 are less than i, and i is less than the items of s2.)

split(i,s): Assuming item i is in tree s, return a tree s1 containing all those items of s less than i and a tree s2 containing all those items greater than i. This operation destroys s.

The following operation is unique to self-adjusting search trees, and is the one from which we build all of the others.

splay(i,s): Return a tree representing the same list represented by s. If i is in the tree, then it becomes the root. If i is not in the tree, then either the immediate successor of i or the immediate predecessor of i becomes the root. This operation destroys s.

To do access(i,s) we splay(i,s); then i is in the tree if and only if it is at the root. To do insert(i,s) we splay(i,s), then break the resulting tree into two trees, one with items less than i, one with items greater than i. (This is just breaking either the left or the right link from the root.) Then we make these two trees the children of a new root with item i. To do join2(s1,s2) we splay(infinity,s1), which makes the rightmost node of s1 into the root; then we attach s2 as the right child of this root. To do join3(s1,i,s2) we make a node containing item i, and make its left child s1 and its right child s2. To do delete(i,s) we splay(i,s), delete the root, and join2 the left and right subtrees. To do split(i,s) we splay(i,s) and return the left and right subtrees of the root. (See Figure 3.)

Figure 3. How the operations are implemented using splay.
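To illustrate, the reductions just described look as follows in Python. This is a sketch under our own names: it assumes a function splay(t, i) returning the root of the splayed tree (one is sketched after Figure 5 below), and it uses float('inf') for the paper's splay(infinity, s1).

    class Node:
        def __init__(self, item):
            self.item = item
            self.left = self.right = None

    def access(i, t):
        # After the splay, i is in the tree iff it sits at the root.
        return splay(t, i)

    def insert(i, t):
        t = splay(t, i)
        root = Node(i)
        if t is None:
            return root
        if i < t.item:   # break the left link: t and its right subtree are > i
            root.left, root.right, t.left = t.left, t, None
        else:            # break the right link: t and its left subtree are < i
            root.left, root.right, t.right = t, t.right, None
        return root

    def join2(t1, t2):   # assumes all items of t1 are less than those of t2
        if t1 is None:
            return t2
        t1 = splay(t1, float('inf'))  # rightmost node of t1 becomes the root
        t1.right = t2
        return t1

    def join3(t1, i, t2):
        root = Node(i)
        root.left, root.right = t1, t2
        return root

    def delete(i, t):
        t = splay(t, i)
        if t is None or t.item != i:
            return t
        return join2(t.left, t.right)  # drop the root, join its subtrees

    def split(i, t):     # assumes item i is in tree t
        t = splay(t, i)
        return t.left, t.right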


To do splay(i,s) we first use the item fields to find the vertex that is going to be moved to the root. We start with y equal to the root of s and repeat the following search step until y = null or item(y) = i: If i < item(y), replace y by its left child; if i > item(y), replace y by its right child. Let x be the last non-null vertex reached by this process; this is the vertex to be moved to the root. To finish the splay we begin at node x and traverse the path to the root, performing a single rotation at each node. The rotations are done in pairs, in an order that depends on the structure of the tree. The following splay step is repeated until x is the tree root (see Figure 4): If x has a parent but no grandparent, rotate at p(x) (the parent of x). If x has a grandparent and x and p(x) are both left or both right children, rotate at p(p(x)) then at p(x). If x has a grandparent and x is a left and p(x) a right child, or vice-versa, rotate at p(x) and again at the new p(x). The overall effect of the splay is to move x to the root while rearranging the rest of the original path from x to the root so that any node in that path is about half as far from the root as it used to be. Figure 5 shows a series of splays on a tree that starts out being a long left path.

Figure 4. A splay step starting at node x.

Figure 5. Four splay operations.
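The splay steps just described can be written compactly in Python as a recursion that applies the rotations while unwinding; this is our formulation (not the paper's programs, which appear in Section 4) and it reuses the Node class from the previous sketch.

    def rotate_right(y):    # the single rotation of Case 1 and its uses
        x = y.left          # inside the zig-zig and zig-zag cases
        y.left, x.right = x.right, y
        return x

    def rotate_left(y):
        x = y.right
        y.right, x.left = x.left, y
        return x

    def splay(t, i):
        # Returns the new root: the node holding i, or the last non-null
        # node on the search path for i if i is absent.
        if t is None or t.item == i:
            return t
        if i < t.item:
            if t.left is None:
                return t
            if i < t.left.item:                       # case 2: rotate at the
                t.left.left = splay(t.left.left, i)   # grandparent first
                t = rotate_right(t)
            elif i > t.left.item:                     # case 3: rotate at p(x),
                t.left.right = splay(t.left.right, i) # then at the new p(x)
                if t.left.right is not None:
                    t.left = rotate_left(t.left)
            return rotate_right(t) if t.left is not None else t
        else:                                         # the mirror image
            if t.right is None:
                return t
            if i > t.right.item:
                t.right.right = splay(t.right.right, i)
                t = rotate_left(t)
            elif i < t.right.item:
                t.right.left = splay(t.right.left, i)
                if t.right.left is not None:
                    t.right = rotate_right(t.right)
            return rotate_left(t) if t.right is not None else t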

Splaying is reminiscent of the path compaction heuristics (path halving in particular) used in efficient algorithms for disjoint set union [17,18]. Although the techniques that Tarjan and van Leeuwen [18] used to analyze path halving can be modified to apply to splaying, there is a simpler analysis, which follows.

To analyze the running time of a sequence of tree operations we use a credit invariant. (This is what we called a token invariant or chip invariant in our previous work with Bent [4,5,6].) We assign to each item i an individual weight iw(i). These weights are real numbers greater than or equal to one, whose values we shall choose later. We define the total weight tw(x) of a node x to be the sum of the individual weights of all descendants of x, including x itself. Finally, we define the rank of a node x to be r(x) = ⌊lg(tw(x))⌋. We maintain the following credit invariant: Any internal node x holds r(x) credits.

Each credit represents the power to do a fixed amount of work (e.g. rotations, comparisons, or edge traversals). Each time we do any work we must pay for it with a credit. If we modify the structure we may have to put in credits to maintain the credit invariant, or we may be able to remove credits (because fewer are required after the modification than before) and use them to do work. If we have a structure that initially has C credits in it and we do a sequence of n operations where the ith one requires c(i) net credits (number spent on work + number put in the tree − number taken out), and the final structure has C′ credits in it, then the running time of the sequence is at most C − C′ + Σc(i). The quantity c(i) is called the credit time of the ith operation. The following lemma tells us the credit time of the splay operation.

Lemma 1. Splaying a tree with root v at a node x while maintaining the credit invariant takes 3(r(v) − r(x)) + 1 credits.

Proof. We shall need the following rank rule: If s and t are siblings with equal rank, and their parent is p, then r(p) ≥ 1 + r(s). This follows from the fact that tw(s) ≥ 2^r(s) and tw(t) ≥ 2^r(t), so tw(p) ≥ tw(s) + tw(t) ≥ 2^(r(s)+1). Thus r(p) is at least r(s) + 1.

Consider a splay step involving the nodes x, y = p(x), and z = p(p(x)), where p( ) denotes the parent function before the step. Let r( ) and r′( ), tw( ) and tw′( ) denote the rank and total weight functions before and after the step, respectively. To this step we allocate 3(r′(x) − r(x)) credits and one additional credit if this is the last step. There are three cases (see Figure 4):

Case 1: Node z is undefined. This is the last step of the splay and the extra credit allocated to the step pays for the work. We have r′(x) = r(y). Thus the number of credits that must be added to the tree to maintain the invariant is r′(y) − r(x) ≤ r′(x) − r(x), which is one third of the remaining credits on hand.

Case 2: Node z is defined and x and y are both left or both right children. We have r′(x) = r(z). The number of credits that must be added to the tree to maintain the invariant is r′(y) + r′(z) − r(y) − r(x) ≤ 2(r′(x) − r(x)), which is two thirds of the credits on hand. If r′(x) > r(x), there is at least one extra credit on hand to pay for the step. Otherwise, r′(x) = r(x) = r(y) = r(z). In this case r′(z) < r(x) by the rank rule. (The rank rule is applied to the tree occurring after one rotation, with root y of rank r′(x) = r(z), left subtree rooted at x with rank r(x), and right subtree rooted at z of rank r′(z).) Also r′(y) ≤ r(y). Thus by putting the credits from z onto y and putting all but one of the credits from x onto z we maintain the invariant and get one credit with which to pay for the operation.

Case 3: Node z is defined and x is a left and y is a right child or vice-versa. As in Case 2, r′(x) = r(z). In addition we have that tw′(y) ≤ tw(y), so r′(y) ≤ r(y). To maintain the invariant on x and y we need only move credits from z and y. To satisfy the invariant on z we use the credits on x and need an additional r′(z) − r(x) ≤ r′(x) − r(x), which is one third of the credits on hand. If r′(x) > r(x) then there is at least one extra credit on hand to pay for the step. Otherwise r′(x) = r(x) = r(y) = r(z), and by the rank rule either r′(y) < r′(x) or r′(z) < r′(x) or both. We can use the credit from the node that decreased in rank to pay for the operation.

Summing over all steps of a splay, we find that the total number of credits used is at most 3(r′(x) − r(x)) + 1 = 3(r(v) − r(x)) + 1, where r′( ) and r( ) denote the rank functions after and before the entire splay, respectively. □
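To make the summation explicit (this is just the proof's accounting written out, with r_j(x) the rank of x after the jth of k splay steps, so r_0(x) = r(x) and r_k(x) = r(v), because x ends the splay as the root, with the same total weight the old root v had):

    \sum_{j=1}^{k} 3\bigl(r_j(x) - r_{j-1}(x)\bigr) + 1 \;=\; 3\bigl(r_k(x) - r_0(x)\bigr) + 1 \;=\; 3\bigl(r(v) - r(x)\bigr) + 1 .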
In order to complete the analysis we must consider the effect of insertion, deletion, join2, join3, and split on the ranks of nodes. For the moment let us define the individual weight of every item to be 1. Then every node has a rank in the range [0, ⌊lg n⌋], and the lemma gives a bound of 3⌊lg n⌋ + 1 credits for splaying. To insert a new item i we first do a splay, then put the new item at the root. The number of credits needed at the root is ⌊lg n⌋. Joining two trees also requires at most ⌊lg n⌋ new credits at the root. (In both cases n is the size of the new tree.) A three way join requires at most ⌊lg n⌋ credits at the root. To delete, we do a splay, remove the root, then do a two way join. This needs no extra credits beyond those used by the two splays because the credits on the deleted root can be placed on the root of the final tree. Split needs no credits beyond those used in the splay.

Suppose we start with a set of singleton trees, do a series of operations and end up with a forest of trees. The number of credits is zero initially, and at the end it is at least zero. Combining this with the above paragraph gives us the following theorem:

Theorem 2: The total time required for a sequence of m self-adjusting search tree operations, starting with singleton trees, is O(m log n), where n is the number of items.

Our analysis of splay allows us to get an analogous but more general result when the individual weights of the nodes are not all the same. Suppose the initial configuration consists of a set of separate nodes, and the number of credits on node i with individual weight iw(i) is ⌊lg(iw(i))⌋. After a sequence of operations we reach a final configuration with the same set of nodes grouped into arbitrary trees. In this final forest of trees, the number of credits on node i with total weight tw(i) is ⌊lg(tw(i))⌋. Since tw(i) ≥ iw(i), the number of credits in the final configuration is at least as many as in the initial configuration. This means that the total running time of the sequence of operations is bounded by the number of credits allotted to the operations. Recall that this allotment of credits to an operation is called the credit time of the operation. The following theorem bounds the credit time of each of the basic operations as a function of the weights of the nodes involved.

Theorem 3:

The credit time of splay(x,s) is O(lg(tw(s)/tw(x))).

The credit time of split(x,s) is O(lg(tw(s)/tw(x))).

The credit time of join3(s1,i,s2) is O(lg((tw(s1) + tw(s2))/iw(i))).

The credit time of join2(s1,s2) is O(lg((tw(s1) + tw(s2))/tw(x))), where x is the rightmost node of tree s1.

The credit time of insert(x,s) is O(lg(tw′(s)/min(tw(x−), tw(x), tw(x+)))), where x− is the node immediately before x in the final tree, x+ is the one immediately after, and tw′(s) is the total weight of s after the operation.

The credit time of delete(x,s) is O(lg(tw(s)/min(tw(x−), tw(x)))), where x− is the node immediately before x in the initial tree.

Proof: All of these results follow by considering the credit times of the appropriate splay operations and combining these with the changes in the credits needed on various nodes. □

The remarkable thing about this result is that the algorithm achieves these bounds without actually having any information about the weights. This means that whatever the running time is, it must simultaneously satisfy the bounds given in Theorem 3 for all weight distributions.

The credit times for split, three-way join, insert, and delete given in Theorem 3 are the same as those that Bent, Sleator, and Tarjan give for biased trees [6]. The credit times for two-way joins on self-adjusting trees and biased trees are not comparable, because in self-adjusting trees the items are stored in the internal nodes and in biased trees they are stored in the external nodes. Self-adjusting trees are simpler than biased trees because no weight information needs to be stored or updated. The only situation in which biased trees may have an advantage is if worst-case per-operation running time is important. In locally biased trees, access operations have a good worst-case time bound; in globally biased trees, all the operations have a good worst-case time bound [6].

Another interesting consequence of Theorem 3 is that we can relate the behavior of a self-adjusting tree to that of any static tree.

Theorem 4: Let t be the number of comparisons that occur in a sequence of searches from the root in a static binary search tree with n nodes. The time to do the same sequence of splay operations in a self-adjusting search tree is O(t + n²).

Proof: Let the root of the static tree be r. Let the depth of a node x in the static tree (denoted d(x)) be the distance from x to the root, r. (d(r) = 0.) We assign individual weights to the nodes as follows: For the root r, iw(r) = 3^d (where d is the largest depth in the tree). For any other vertex x, iw(x) = 3^(−d(x)) iw(r). With this definition the deepest node has weight 1. It is easy to show by induction that 3 iw(x) ≥ tw(x) for all nodes x. (Here tw(x) denotes the total weight of x in the static tree.) In particular we have 3 iw(r) ≥ tw(r), from which it follows that iw(x) ≥ 3^(−d(x)−1) tw(r). Rearranging and taking logarithms gives us

    (lg 3)(d(x) + 1) ≥ lg(tw(r)/iw(x)).

The left hand side of this inequality is lg 3 times the number of comparisons needed to search for x in the static tree. The right hand side is the credit time to splay at x in a self-adjusting tree with the individual weights as specified above.

It remains for us to show that the number of credits initially in the self-adjusting tree is O(n²). It is clear that the total weight of any node in the self-adjusting tree is at most tw(r). But tw(r) ≤ 3 iw(r) = 3^(d+1) ≤ 3^n, because d, the largest depth in the tree, is at most n − 1. This means that the number of credits on each node is at most (lg 3)n, so the total number of credits in the tree initially is at most (lg 3)n². □
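The induction the proof alludes to is one line: each child c of x has iw(c) = iw(x)/3, so assuming 3 iw(c) ≥ tw(c) for the (at most two) children,

    tw(x) \;=\; iw(x) + \sum_{c} tw(c) \;\le\; iw(x) + 2 \cdot 3 \cdot \frac{iw(x)}{3} \;=\; 3\, iw(x) .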
A corollary of this result is that the running time of a self-adjusting tree is within a constant factor of the running time of the optimal static tree for any particular distribution. The surprising thing about this is that the self-adjusting tree behaves this way without knowing anything about the distribution in advance. (Note however that the self-adjusting tree takes some time to "learn" the distribution. This learning time is embodied in the O(n²) overhead.)

There is another interesting result concerning the running time of sequences of splays. It follows from the alternate analysis of splay (involving ideas from path compression), and we do not prove it here.

Theorem 5 (log(Δt) theorem): The credit time of a splay of node x is O(log(1 + Δt)), where Δt is the number of splays done between the current splay of x and the previous splay of x.

4. Implementation of Self-Adjusting Trees

There are several slightly different formulations of the splay operation that all have the desirable properties discussed in Section 3. The method of Section 3 is easy to describe, but results in a fairly complicated program. In this section we present programs for two splay algorithms: a top-down method, and a bottom-up method.

In the first formulation the information stored in each node is the same as that described in Section 3: each node has left, right and item fields. The difference is that here the splay operation rearranges the tree on the way down, instead of searching down and rearranging the tree on the way up. (We do not present programs for the other operations, because they are easy to write given splay.)

    global tree dummy;
    dummy := create tree node;

    tree function splay(item i, tree s);
        string state;
        tree l, r, lp, rp;
        if s = null → return null fi;
        left(dummy) := null; right(dummy) := null;
        state := "N"; l := r := dummy;
        do s ≠ null and i > item(s) →
            if state ≠ "L" → right(l) := s; lp := l; state := "L"
            | state = "L" → right(l) := left(s); right(lp) := s;
                            left(s) := l; state := "N"
            fi;
            l := s; s := right(s)
        | s ≠ null and i < item(s) →
            if state ≠ "R" → left(r) := s; rp := r; state := "R"
            | state = "R" → left(r) := right(s); left(rp) := s;
                            right(s) := r; state := "N"
            fi;
            r := s; s := left(s)
        od;
        if s ≠ null → right(l) := left(s); left(r) := right(s)
        | s = null →
            if r = dummy → s := l; right(lp) := left(l)
            | r ≠ dummy → s := r; left(rp) := right(r) fi
        fi;
        left(s) := right(dummy); right(s) := left(dummy);
        return s
    end splay;

Variables of type tree are pointers to nodes in the tree. Initially s (assumed to be non-null) points to the root of the tree, and the "left" and "right" trees are empty. (The right and left fields of the dummy node point to the roots of the left and right tree respectively. The dummy node obviates special purpose code for the case when the left or the right tree is null.) Splay walks down the tree building up the left tree from all those subtrees to the left of i and building the right tree from all those subtrees to the right of i. (The variables l and r and lp and rp point to the places in the left and right trees where the building takes place.) When it reaches either null or the item it was looking for, it stops and puts the left tree, the node it was looking for, and the right tree together and returns the result. The three-value variable state is used to alter the way in which a tree is added to the left tree or the right tree. This variable is what enables this version of splay to achieve the time bounds of Section 3.
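For comparison, here is the same top-down idea in Python. This is our transliteration, and a sketch only: it performs the zig-zig rotation eagerly instead of tracking the three-valued state variable, a common simplification; dummy plays the same role as in the program above, and Node, rotate_left, and rotate_right are those of the Section 3 sketches.

    def splay_topdown(i, t):
        if t is None:
            return None
        dummy = Node(None)
        l = r = dummy   # l: last node of the left tree; r: of the right tree
        while True:
            if i < t.item:
                if t.left is None:
                    break
                if i < t.left.item:            # zig-zig: rotate before linking
                    t = rotate_right(t)
                    if t.left is None:
                        break
                r.left, r, t = t, t, t.left    # link t into the right tree
            elif i > t.item:
                if t.right is None:
                    break
                if i > t.right.item:
                    t = rotate_left(t)
                    if t.right is None:
                        break
                l.right, l, t = t, t, t.right  # link t into the left tree
            else:
                break
        l.right, r.left = t.left, t.right      # reassemble the three pieces
        t.left, t.right = dummy.right, dummy.left
        return t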
state := "N" fi
Variables of type tree are pointers to nodes in the I left(y)=z -
tree. Initially s (assumed to be non-null) points to the root left(y), parent(r) := r, y;
of the tree, and the "left" tree and "right" trees are empty. z, r, y := y, y, parent(y);
(The right and left fields of the dummy node point to the if state~ "R" ~ state := "R"
roots of the left and right tree respectively. The dummy [ state="R" ~ r := rotateright(r);
node obviates special purpose code for the case when the state := "N" fi
left or the right tree is null.) Splay walks down the tree fi
building up the left tree from all those subtrees to the left od;
of i and building the right tree from all those subtrees to left(x), parent(l) := I, x;
the right of i. (The variables 1 and r and lp and rp point to right(x), parent(r) := r, x;
the places in the left and right trees where the building parent(x) := null
takes place.) When it reaches either null or the item it was end splay;
looking for, it stops and puts the left tree, the node it was

The function rotateleft does a left rotation of a node (a) and its right child (b). It returns the new parent (b). In the case that the left child of b is null, an assignment to the parent of null takes place. Allowing this simplifies the code, and only costs one extra node.

This implementation of splay is analogous to the first one, except that the traversal is bottom-up instead of top-down. The variables l and r point to the current left and right trees. The variable state is used as before to guide the rotations. The advantage of this bottom-up method over the one described in Section 3 is that there are fewer pointer updates.

Split, join and insert can be implemented essentially as described in Section 3. The operation delete(x) can be implemented with only one call to splay. First the children of x are joined together, then the resulting tree is placed where x was in the original tree. (We needn't traverse the path from x to the root, nor splay along that path.)

5. Additional Results

Our results on self-adjusting binary trees raise a number of general and specific questions about self-adjusting data structures, and we are continuing our work in this area. Below we mention one major application and several variants of self-adjusting search trees and heaps.

Self-adjusting search trees can be used in place of biased trees in the dynamic tree structure of Sleator and Tarjan [15,16,17]. The result is a considerable simplification of that structure. Details may be found in [17], and will appear in the full version of this paper.

There is another version of splay that might be called "move half way to the root". In this version the splayed node does not move all the way to the root, but its distance to the root is halved. The only difference between it and the bottom-up version is that in case 2 (see Figure 4) only the first rotation is done. The theorems of Section 3 apply to this version of splay. The reason it is not as useful as the move to root versions is that the join and split operations have to be implemented separately, rather than following automatically from splay.

It is possible to implement self-adjusting search trees with all of the items stored in the external nodes. Splay then moves a specified external node to within two steps of the root. The algorithm to do this is a simple modification of the algorithm to do bottom-up splay given in Section 4. One advantage of this is that the actual time to join two trees is constant, although the credit time is O(log n).

Another alternative implementation of splay is one in which the ranks are actually kept in the nodes. Rotations are only done in the splay when the rank of a node is equal to the rank of its grandparent (using the splay of Section 3). With this implementation the number of credits in the tree decreases by at least one with each rotation. This means that if we do a series of searches the number of rotations is bounded by the number of credits initially in the tree. Although this version does far fewer rotations, it lacks much of the beauty of the other forms of self-adjusting search trees. Weights must be specified in advance and kept in the nodes, and a different version of splay must be used for the split and join operations.

There are also other versions of self-adjusting heaps that deserve mention. We can build a heap directly out of a self-adjusting search tree by making each node of the search tree correspond to one of the keys in the heap. (The keys can be stored in an arbitrary order.) In each node of the self-adjusting search tree we store two fields, a key field and a minkey field. The key field is just the key represented by that node. The minkey field contains the minimum key in the subtree rooted there. To do a deletemin we first find the node with the minimum key by walking down the tree from the root, always taking the child that has the minimum minkey field, until we get to the node whose key is the minkey of the root. We then use our self-adjusting search tree routine to delete this node from the tree. (The routines can easily be modified to maintain the minkey field.) The advantage of this form of heap is that the keys can also be maintained in a total order that is independent of the heap order of the keys. (Splitting and joining of these heaps based on the symmetric order of the nodes is possible.)
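A sketch of that deletemin walk in Python (our names throughout; it assumes each node carries key and minkey fields and that delete is the Section 3 routine modified, as the text notes, to keep minkey up to date):

    def delete_min(t):
        # Walk down from the root, always entering the child whose minkey
        # equals the overall minimum, until we reach the minimum's node.
        m = t.minkey
        x = t
        while x.key != m:
            if x.left is not None and x.left.minkey == m:
                x = x.left
            else:
                x = x.right
        return m, delete(x.key, t)  # search-tree deletion from Section 3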
We can use a similar method to represent a heap by a self-adjusting search tree with all of the keys stored in the external nodes. In this version each external node has a key field, and each internal node has a minkey field. The advantage of this scheme is that we can meld two heaps (by joining the self-adjusting trees) in constant actual time, as mentioned earlier in this section.

We conjecture that under a suitable measure of complexity, self-adjusting trees perform within a constant factor of any binary search tree scheme, on any sequence of operations. We have formulated a rigorous version of this conjecture and are attempting to prove it. Details will appear in the full paper.

References

[1] G. M. Adel'son-Vel'skii and E. M. Landis, "An algorithm for the organization of information," Soviet Math. Dokl. 3 (1962), 1259-1262.

[2] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974.

[3] B. Allen and I. Munro, "Self-organizing search trees," Journal ACM 25 (1978), 526-535.

[4] S. W. Bent, Dynamic Weighted Data Structures, TM STAN-CS-82-916, Computer Science Dept., Stanford University, Stanford, CA 94305, 1982.

[5] S. W. Bent, D. D. Sleator, and R. E. Tarjan, "Biased 2-3 trees," Proc. Twenty-First Annual IEEE Symp. on Foundations of Computer Science (1980), 248-254.

[6] S. W. Bent, D. D. Sleator, and R. E. Tarjan, "Biased search trees," to appear.
[7] J. R. Bitner, "Heuristics that dynamically organize data structures," SIAM Journal on Computing 8 (1979), 82-110.

[8] M. R. Brown, The Analysis of a Practical and Nearly Optimal Priority Queue, TM STAN-CS-77-600, Computer Science Dept., Stanford University, Stanford, CA 94305, 1977.

[9] C. A. Crane, Linear Lists and Priority Queues as Balanced Binary Trees, TM STAN-CS-72-259, Computer Science Dept., Stanford University, Stanford, CA 94305, 1972.

[10] E. W. Dijkstra, A Discipline of Programming, Prentice-Hall, Englewood Cliffs, NJ, 1976.

[11] D. E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms, Second Edition, Addison-Wesley, Reading, MA, 1973.

[12] D. E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973.

[13] J. Nievergelt and E. M. Reingold, "Binary search trees of bounded balance," SIAM Journal on Computing 2 (1973), 33-43.

[14] R. L. Rivest, "On self-organizing sequential search heuristics," Comm. ACM 19 (1976), 63-67.

[15] D. D. Sleator, An O(nm log n) Algorithm for Maximum Network Flow, TM STAN-CS-80-831, Computer Science Dept., Stanford University, Stanford, CA 94305, 1980.

[16] D. D. Sleator and R. E. Tarjan, "A data structure for dynamic trees," Journal of Computer and System Sciences, to appear; see also Proc. Thirteenth Annual ACM Symposium on Theory of Computing (1981), 114-122.

[17] R. E. Tarjan, Data Structures and Network Algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1983, to appear.

[18] R. E. Tarjan and J. van Leeuwen, "Worst-case analysis of set union algorithms," Journal ACM, submitted.
