notes07
Roberto Tamassia
Computational Geometry Sem. II, 1992–1993
1 Introduction
In the last lecture, we looked at interval trees. For interval point enclosure problems, they
use linear space and optimal time. Today, we shall study priority search trees, useful for
problems that involve intervals intersecting other intervals.
Our examples will be based on a set of 13 intervals along a line l, shown in Fig. 1. The
intervals are named a through n, skipping the letter l, which denotes the line. They have
been drawn underneath the line so that overlapping sections can be told apart.
[Figure 1: The 13 intervals, drawn beneath the line l.]
We start by marking the endpoints of each interval on l. Then, from the two endpoints of
each interval, we draw diagonal lines upward until they meet (see Fig. 2).
We now have a one-to-one and onto mapping between the meeting points above the line l and
the intervals on l. We shall use lowercase letters for the names of intervals, and the
corresponding capital letters for the names of points.
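The mapping can be made concrete with a small sketch. It assumes that l is the X-axis and that the diagonals have slopes +1 and -1, which the notes suggest but do not state; the function names are illustrative only.

```python
from typing import Tuple


def interval_to_point(lo: float, hi: float) -> Tuple[float, float]:
    """Meeting point of the two 45-degree diagonals drawn from lo and hi.

    Assumes the line l is the X-axis and the diagonals have slopes +1 and -1.
    """
    if lo > hi:
        raise ValueError("expected lo <= hi")
    return ((lo + hi) / 2, (hi - lo) / 2)


def point_to_interval(x: float, y: float) -> Tuple[float, float]:
    """Inverse mapping: the interval whose diagonals meet at (x, y)."""
    return (x - y, x + y)


print(interval_to_point(2, 8))  # (5.0, 3.0)
print(point_to_interval(5, 3))  # (2, 8)
```

Under these assumptions, longer intervals map to higher points, and the midpoint of the interval gives the X-coordinate of its point.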
2 Range Queries
We have looked at range queries in earlier lectures. They answer questions of the following
type: Given a set of points S and a range R, find all points from S which fall within R. A
range is specified by providing one or more of the following: (1) an upper bound on X, (2)
a lower bound on X, (3) an upper bound on Y, (4) a lower bound on Y.
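As a baseline, a range query with any subset of the four bounds can be answered by scanning every point. The sketch below is a brute-force reference, not one of the data structures from the lectures; treating the bounds as inclusive is an assumption, and the names are illustrative.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]


def range_query(
    points: Iterable[Point],
    x_min: Optional[float] = None,
    x_max: Optional[float] = None,
    y_min: Optional[float] = None,
    y_max: Optional[float] = None,
) -> List[Point]:
    """Return the points inside the range; a bound left as None is absent."""

    def within(v: float, lo: Optional[float], hi: Optional[float]) -> bool:
        # Inclusive bounds are an assumption of this sketch.
        return (lo is None or v >= lo) and (hi is None or v <= hi)

    return [
        (x, y)
        for (x, y) in points
        if within(x, x_min, x_max) and within(y, y_min, y_max)
    ]


pts = [(1, 8), (7, 6), (-2, 5), (6, 4)]
print(range_query(pts, x_min=0, y_min=4.5))  # [(1, 8), (7, 6)]
```

The scan takes linear time per query, which is the cost that the tree structures in these notes are meant to avoid.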
Now, let us consider the following three questions on a fixed set of intervals:
[Figure 2: Each interval on l mapped to the point where the diagonals from its endpoints meet.]
We will show that all these questions can be represented as range queries on the set of
points created as above. For this purpose, we first want to rotate Fig. 2 by 45 degrees,
obtaining Fig. 3.
The next three diagrams (Figs. 4, 5, 6) show how the above three queries on intervals are
represented as queries on the corresponding points. In each case, the query is a quadrant
delineated by two rays, and the answer to the query is the set of intervals corresponding
to the points that fall within the quadrant (they are thickened in the diagrams).
In a regular range query, the range is a rectangle (i.e., all four of the bounds are specified).
We have already seen that balanced search trees are the best data structure for these queries.
However, as we can see from the diagrams above, in the range queries used for interval
containment problems, the range is always unbounded on two sides. This led to a search for
a faster algorithm that would somehow benefit from this fact.
[Figure 3: Fig. 2 rotated by 45 degrees.]
[Figures 4, 5, 6: The three interval queries represented as range queries on the corresponding points.]
3 Priority Search Trees: the Background
Range searching on a non-rectangular range arises in other settings as well. One example
is finding all points between two given vertical lines (see Fig. 7 (a)). The best approach
for this problem is to use a balanced binary search tree.
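For a static point set, the same query can be sketched with a sorted array and binary search, which stands in here for the balanced binary search tree named above (a substitution made for brevity). The class name `SlabIndex` and the inclusive bounds are assumptions of this sketch.

```python
from bisect import bisect_left, bisect_right
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


class SlabIndex:
    """Static index answering 'all points with x_lo <= x <= x_hi'."""

    def __init__(self, points: Iterable[Point]) -> None:
        self.pts: List[Point] = sorted(points)      # sorted by X-coordinate
        self.xs: List[float] = [x for x, _ in self.pts]

    def query(self, x_lo: float, x_hi: float) -> List[Point]:
        # Two binary searches locate the slab; the slice reports its points.
        return self.pts[bisect_left(self.xs, x_lo):bisect_right(self.xs, x_hi)]


index = SlabIndex([(7, 6), (1, 8), (15, 7), (-2, 5)])
print(index.query(0, 10))  # [(1, 8), (7, 6)]
```

Each query costs two binary searches plus time proportional to the number of points reported.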
[Figure 7: (a) Points between two vertical lines. (b) Points above a horizontal line.]
Another example would be to find all points above a given horizontal line (see Fig. 7
(b)). Though we could also use a balanced search tree here, a better approach in this case
is a heap.
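The advantage of a heap for this query is that a whole subtree can be skipped as soon as its root lies on or below the line. A minimal sketch follows, using Python's `heapq` (a min-heap) on negated Y-coordinates so that the highest point sits at the root; the function names are illustrative.

```python
import heapq
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def build_max_heap(points: Iterable[Point]) -> List[Tuple[float, float]]:
    """Array heap of (-y, x) entries; the point with the greatest Y is at index 0."""
    heap = [(-y, x) for (x, y) in points]
    heapq.heapify(heap)
    return heap


def points_above(heap: List[Tuple[float, float]], y_line: float, i: int = 0) -> List[Point]:
    """Report every point with y > y_line, starting from heap position i."""
    # If this entry is not above the line, nothing in its subtree is either.
    if i >= len(heap) or -heap[i][0] <= y_line:
        return []
    neg_y, x = heap[i]
    return (
        [(x, -neg_y)]
        + points_above(heap, y_line, 2 * i + 1)   # left child in the array layout
        + points_above(heap, y_line, 2 * i + 2)   # right child in the array layout
    )


heap = build_max_heap([(-1, 9), (1, 8), (15, 7), (7, 6), (-2, 5), (6, 4)])
print(sorted(points_above(heap, 6.5)))  # [(-1, 9), (1, 8), (15, 7)]
```

Every visited entry is either reported or is the child of a reported entry, so the work is proportional to the number of points reported (plus a constant).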
A priority search tree is a hybrid of a heap and a balanced search tree. Priority search
trees were introduced recently by McCreight [1] for range queries in which at least one
side of the range is unbounded.
The rest of these notes assumes a priority search tree in which the upper bound on Y is
missing (i.e., the range of a query specifies only a lower bound on Y). The algorithms are
easy to modify for cases where a different bound is missing.
4 Constructing a Priority Search Tree
Given a set S of points, a priority search tree on S is built as follows:
• The point P in S with the greatest Y-coordinate becomes the root R.
• Let X(P) be a value such that half of the points in S − {P} have an X-coordinate lower
than X(P), and half have a higher one.
• Recursively create a priority search tree on the lower half of S − {P}, and let its root
be the left child of R.
• Recursively create a priority search tree on the upper half of S − {P}, and let its root
be the right child of R.
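The steps above can be sketched as a short recursive routine. The `Node` class, the function names, and the particular choice of X(P) (the midpoint between the two halves) are assumptions of this sketch; the notes only require that X(P) separate the lower half from the upper half. The coordinates are the 13 points of the running example, read off the node labels in Fig. 8.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point                       # the point P stored at this node
    split: float                       # X(P): separates the two child subtrees
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point]) -> Optional[Node]:
    """Build a priority search tree; sorts by X once, then recurses."""
    return _build(sorted(points))


def _build(pts: List[Point]) -> Optional[Node]:
    if not pts:
        return None
    # The point with the greatest Y-coordinate becomes the root.
    i = max(range(len(pts)), key=lambda j: pts[j][1])
    root, rest = pts[i], pts[:i] + pts[i + 1:]   # rest stays sorted by X
    mid = len(rest) // 2
    lower, upper = rest[:mid], rest[mid:]
    if lower:
        split = (lower[-1][0] + upper[0][0]) / 2  # assumed choice of X(P)
    elif upper:
        split = upper[0][0]
    else:
        split = root[0]
    return Node(root, split, _build(lower), _build(upper))


POINTS = [(-1, 9), (1, 8), (15, 7), (7, 6), (-2, 5), (6, 4), (2, 3),
          (16, 2), (12, 1), (4, 0), (14, -1), (10, -2), (9, -3)]
tree = build(POINTS)
print(tree.point, tree.split)  # (-1, 9) 8.0
```

With this choice of X(P), every point in a left subtree has an X-coordinate of at most the split value and every point in a right subtree has one of at least the split value. The values chosen for nodes with a single child may differ from those shown in the figure.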
[Figure 8: Construction of a priority search tree on the 13 points: the tree after its third level, and the completed tree. Node labels in the completed tree: F(-1,9):8, N(1,8):3, A(15,7):11, H(7,6):5, I(-2,5), G(6,4), B(16,2):13, J(2,3), C(12,1), K(4,0), E(10,-2), D(14,-1), M(9,-3).]
Fig. 8 illustrates the construction of a priority search tree on our set of 13 points. The
points already in the tree are solid; the points being chosen next are shaded; null pointers
are shown as pointers to little boxes. In the final picture, each tree node P is labeled by its
coordinates, followed by X(P) if it differs from the X-coordinate of P. These labels will be
used in the next section.
5 Querying a Priority Search Tree
We shall not be concerned with updating the priority search tree dynamically; this topic will
be addressed in the next lecture.
A query on a priority search tree is as follows: given x', x'', and y', report all the points
in the tree whose X-coordinate is between x' and x'', and whose Y-coordinate is greater than
y'. The following algorithm performs this query:
• Let R be the root of the tree, x be its X-coordinate, y be its Y-coordinate, and X(R)
be the value of the axis separating the X-ranges of R's child subtrees.
• Compare y to y'. If y < y', we return without finding any points (all other nodes in
the tree will have an even smaller Y-coordinate).
• If x' <= x <= x'', report R.
• If x' < X(R), the X-range of the left subtree must overlap with the X-range of the
query. Recursively search the left subtree of R.
• If X(R) < x'', the X-range of the right subtree must overlap with the X-range of the
query. Recursively search the right subtree of R.
An example of a priority search tree query, with x' = 0, x'' = 11, y' = 4.5, is shown
in Fig. 9. Nodes that are visited but not reported are shaded; nodes that are visited and
reported are solid.
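The example query can be replayed in code. The sketch below hard-codes the tree from the node labels of Fig. 8; the left/right placement of the single children J and M is inferred from those labels and is an assumption, as are the `Node` class and the function name. The comparisons against X(R) are non-strict so that a point lying exactly on the split value is not missed, and a node is reported only when its X-coordinate lies in [x', x''].

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point
    split: float                       # X(P) from the node label
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# Tree read off the labels in Fig. 8 (child placement of J and M is inferred).
TREE = Node((-1, 9), 8,
    Node((1, 8), 3,
        Node((-2, 5), -2, None, Node((2, 3), 2)),
        Node((7, 6), 5, Node((4, 0), 4), Node((6, 4), 6))),
    Node((15, 7), 11,
        Node((10, -2), 10, Node((9, -3), 9)),
        Node((16, 2), 13, Node((12, 1), 12), Node((14, -1), 14))))


def query(node: Optional[Node], x_lo: float, x_hi: float, y_lo: float) -> List[Point]:
    """Report points with x_lo <= x <= x_hi and y >= y_lo in the subtree at node."""
    if node is None:
        return []
    x, y = node.point
    if y < y_lo:
        return []                      # heap order: no descendant can qualify
    found = [node.point] if x_lo <= x <= x_hi else []
    if x_lo <= node.split:             # left subtree may overlap the X-range
        found += query(node.left, x_lo, x_hi, y_lo)
    if node.split <= x_hi:             # right subtree may overlap the X-range
        found += query(node.right, x_lo, x_hi, y_lo)
    return found


print(query(TREE, 0, 11, 4.5))  # [(1, 8), (7, 6)]
```

The two reported points are N and H. Nodes such as F, I, and A are visited because their Y-coordinates pass the test, but they are not reported since their X-coordinates fall outside the query range.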
[Figure 9: The query x' = 0, x'' = 11, y' = 4.5 on the priority search tree of Fig. 8.]
2. A node is visited and not reported, but its X-coordinate falls within the [x', x''] range.
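• In this case, its parent must have been visited and must have a Y-coordinate of at
least y', so the parent falls into one of the other categories.
• Therefore, since each node has at most two children, the total number of these nodes
is at most 2 * (all other visited nodes).
Summing up the total number of visited nodes, we obtain a query time of O(k + log n),
as desired, where k is the number of points reported and n is the number of points in the tree.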
References
[1] Edward M. McCreight, “Priority Search Trees,” SIAM J. Comput., Vol. 14, No. 2, pp.
257–276, May 1985.