Segment Tree and Interval Tree

This document discusses segment trees and interval trees, which are data structures used for solving problems involving intervals or segments. Segment trees allow querying which intervals contain a given point in logarithmic time and are used to solve problems like reporting intersecting rectangles. Interval trees improve on segment trees for some problems. Higher dimensional versions are also discussed. The document outlines segment trees and how they can be used for problems like window queries, rectangle intersection counting, and computing the measure of unions of boxes. It provides details on the structure of segment trees, how to perform insertion and querying, and analyzes the time and space complexity.

Uploaded by

fkou0623
Copyright
© Attribution Non-Commercial (BY-NC)

Segment trees and interval trees

Lecture 11

Sergio Cabello
FMF
Univerza v Ljubljani

Includes slides by Antoine Vigneron

Sergio Cabello RC – More trees


Outline

▸ segment trees
  • stabbing queries
  • windowing problem
  • rectangle intersection
  • Klee’s measure problem
▸ interval trees
  • improvement for some problems
▸ higher dimension


Data structure for stabbing queries

▸ orthogonal range searching: data is points, queries are rectangles
▸ stabbing problem: data is rectangles, queries are points
▸ in one dimension
  • data: a set of n intervals
  • query: report the k intervals that contain a query point q
▸ in $\mathbb{R}^d$
  • data: a set of n isothetic (axis-parallel) boxes
  • query: report the k boxes that contain a query point q


Motivation

▸ in graphics and databases, objects are often stored according to their bounding box
[Figure: an object and its bounding box]
▸ query: which objects does point x belong to?
▸ first find the objects whose bounding boxes contain x
▸ then perform screening


Data structure for windowing queries

▸ windowing queries
  • data: a set of n disjoint segments in $\mathbb{R}^2$
  • query: report the k segments that intersect a query rectangle R
▸ motivation: zooming in on maps


Rectangle intersection

▸ input: a set B of n isothetic boxes in $\mathbb{R}^2$
▸ output: all the intersecting pairs in $B^2$
[Figure: five boxes $b_1, \dots, b_5$]
▸ output: $(b_1, b_3)$, $(b_2, b_3)$, $(b_2, b_4)$, $(b_3, b_4)$


Klee’s measure problem

▸ input: a set B of n isothetic boxes
▸ output: the area/volume of the union
▸ well understood in $\mathbb{R}^2$ ⇒ $O(n \log n)$ time
▸ the union can have complexity $\Theta(n^2)$. Example?
▸ poorly understood in $\mathbb{R}^d$ for $d > 2$


Segment tree
▸ a data structure to store intervals, or segments
▸ allows answering stabbing queries
  • in $\mathbb{R}^2$: report the segments that intersect a query vertical line l
  • in $\mathbb{R}$: report the segments that contain a query point
[Figure: a vertical query line l; the three segments it intersects are marked "reported"]
  • query time: O(log n + k)
  • space usage: O(n log n)
  • preprocessing time: O(n log n)

Notations

▸ let $S = (s_1, s_2, \dots, s_n)$ be a set of segments in $\mathbb{R}$
▸ let E be the set of the x-coordinates of the endpoints of the segments of S
▸ we assume general position, that is: $|E| = 2n$
▸ first sort E in increasing order
▸ $E = \{e_1 < e_2 < \dots < e_{2n}\}$


Atomic intervals
▸ E splits $\mathbb{R}$ into $2n + 1$ atomic intervals:
  • $[-\infty, e_1]$
  • $[e_i, e_{i+1}]$ for $i \in \{1, 2, \dots, 2n - 1\}$
  • $[e_{2n}, \infty]$
▸ these are the leaves of the segment tree
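The atomic intervals can be computed directly from the sorted endpoints. The following Python lines are a minimal sketch, not part of the original slides; the function name `atomic_intervals` and the representation of a segment as a `(left, right)` pair are assumptions made for illustration.

```python
import math

def atomic_intervals(segments):
    """Return the 2n + 1 atomic intervals induced by the endpoints.

    `segments` is a list of (left, right) pairs. General position is
    assumed: all 2n endpoint coordinates are distinct.
    """
    endpoints = sorted(x for seg in segments for x in seg)
    if len(set(endpoints)) != 2 * len(segments):
        raise ValueError("general position violated: repeated endpoint")
    # Pad with -inf and +inf, then pair up consecutive boundaries.
    bounds = [-math.inf] + endpoints + [math.inf]
    return list(zip(bounds, bounds[1:]))

if __name__ == "__main__":
    for lo, hi in atomic_intervals([(1, 4), (2, 6), (3, 5)]):
        print(lo, hi)
```

For n = 3 segments the list has 2n + 1 = 7 entries, from (-inf, 1) to (6, inf).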


Internal nodes
▸ the segment tree T is a balanced binary tree
▸ each internal node u with children v and v' is associated with an interval $I_u = I_v \cup I_{v'}$
▸ an elementary interval is an interval associated with a node of T (it can be an atomic interval)
[Figure: node u with children v and v'; the interval $I_u$ is the union of $I_v$ and $I_{v'}$]


Example
[Figure: example of a segment tree, drawn from the root down]


Partitioning a segment
▸ let s ∈ S be a segment whose endpoints have x-coordinates $e_i$ and $e_j$
▸ $[e_i, e_j]$ is split into several elementary intervals
▸ they are chosen as close as possible to the root
▸ s is stored in each node associated with these elementary intervals
[Figure: segment tree over the endpoint set E, showing the nodes that store a segment s]


Canonical subsets

▸ each node u is associated with a canonical subset S(u) of segments
▸ let $e_i < e_j$ be the x-coordinates of the endpoints of s ∈ S
▸ then s is stored in S(u) iff $I_u \subset [e_i, e_j]$ and $I_{\mathrm{parent}(u)} \not\subset [e_i, e_j]$
▸ standard segment tree: S(u) is stored as a list pointed to from u
▸ we can also attach more structure/data/pointers to u
▸ useful for multi-level data structures
▸ we will use it


Example

[Figure: example of a segment tree with the canonical subsets stored at its nodes]


Answering a stabbing query

[Figure: search path of a stabbing query, starting at the root]


Answering a stabbing query

Algorithm ReportStabbing(u, $x_l$)
Input: root u of T, x-coordinate $x_l$ of l
Output: segments in S that cross l
  if u == NULL
    then return
  output S(u), traversing the list pointed to from u
  if $x_l \in I_{u.\mathrm{left}}$
    then ReportStabbing(u.left, $x_l$)
  if $x_l \in I_{u.\mathrm{right}}$
    then ReportStabbing(u.right, $x_l$)

▸ it takes O(k + log n) time: for $x_l \notin E$ the search follows one root-to-leaf path of O(log n) nodes, and each reported segment is stored at exactly one node of that path


Inserting a segment

[Figure: segment tree over the endpoint set E, showing the nodes visited while inserting a segment s]


Insertion in a segment tree

Algorithm Insert(u, s)
Input: root u of T, segment s. Endpoints of s have x-coordinates $x^- < x^+$
  if $I_u \subset [x^-, x^+]$
    then insert s into the list storing S(u)
    else
      if $[x^-, x^+] \cap I_{u.\mathrm{left}} \neq \emptyset$
        then Insert(u.left, s)
      if $[x^-, x^+] \cap I_{u.\mathrm{right}} \neq \emptyset$
        then Insert(u.right, s)
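Construction, insertion and stabbing queries can be put together in a short Python sketch. This is an illustration under stated assumptions, not the author's code: the names `Node`, `build`, `insert`, `stab` and `segment_tree` are hypothetical, segments are closed `(left, right)` pairs, and node intervals are closed as on the slides. Two choices differ slightly from the pseudocode: the recursion in `insert` descends only into children that overlap the segment in more than a single point (so it never reaches a leaf that merely touches an endpoint), and `stab` returns a set so that a query exactly at an endpoint does not report a segment twice.

```python
import math

class Node:
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi            # closed interval I_u = [lo, hi]
        self.left, self.right = left, right
        self.segs = []                       # canonical subset S(u)

def build(bounds, a, b):
    """Balanced tree over the atomic intervals [bounds[i], bounds[i+1]], a <= i < b."""
    if b - a == 1:
        return Node(bounds[a], bounds[b])    # leaf: one atomic interval
    mid = (a + b) // 2
    left, right = build(bounds, a, mid), build(bounds, mid, b)
    return Node(left.lo, right.hi, left, right)

def insert(u, s):
    """Store s at the highest nodes whose interval is contained in s."""
    x_lo, x_hi = s
    if x_lo <= u.lo and u.hi <= x_hi:        # I_u is contained in [x-, x+]
        u.segs.append(s)
        return
    for child in (u.left, u.right):
        # Assumption: descend only on overlap in more than one point.
        if child is not None and x_lo < child.hi and child.lo < x_hi:
            insert(child, s)

def stab(u, x):
    """Return the set of segments that contain x."""
    found = set(u.segs)
    for child in (u.left, u.right):
        if child is not None and child.lo <= x <= child.hi:
            found |= stab(child, x)
    return found

def segment_tree(segments):
    """Sort the endpoints, build the empty tree, then insert every segment."""
    endpoints = sorted({x for s in segments for x in s})
    bounds = [-math.inf] + endpoints + [math.inf]
    root = build(bounds, 0, len(bounds) - 1)
    for s in segments:
        insert(root, s)
    return root

if __name__ == "__main__":
    root = segment_tree([(1, 4), (2, 6), (3, 5)])
    print(sorted(stab(root, 3.5)))
    print(sorted(stab(root, 1.5)))
```

Keeping `segs` as a plain list mirrors the "standard segment tree" of the slides; a multi-level structure would replace that list with another data structure.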


Main Property
Lemma
A segment s is stored at most twice at each level of T.
Proof.
▸ by contradiction
▸ suppose s is stored at more than 2 nodes at level i
▸ let u be the leftmost such node, u' be the rightmost
▸ let v be another node at level i storing s
[Figure: nodes u, v, u' at level i, and v.parent above v]
▸ then $I_{v.\mathrm{parent}} \subset [x^-, x^+]$, because $I_{v.\mathrm{parent}}$ lies between the left end of $I_u$ and the right end of $I_{u'}$
▸ so s cannot be stored at v

Analysis

▸ the lemma of the previous slide implies
  • each segment is stored in O(log n) nodes
  • space usage: O(n log n)
▸ insertion in O(log n) time
  • at most four nodes are visited at each level
▸ in the worst case the space usage is actually Θ(n log n) (example?)
▸ query time: O(k + log n)
▸ preprocessing
  • sort endpoints: Θ(n log n) time
  • build empty segment tree over these endpoints: O(n) time
  • insert n segments into T: O(n log n) time
  • overall: Θ(n log n) preprocessing time
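The space bound can be written out as a short calculation. This is a sketch that assumes the tree is built by repeated halving over the 2n + 1 atomic intervals; the symbol h for the height of T is introduced here for illustration.

```latex
\begin{align*}
h &= \lceil \log_2 (2n + 1) \rceil
  && \text{height of a balanced tree with } 2n + 1 \text{ leaves} \\
\#\{\, u : s \in S(u) \,\} &\le 2\,(h + 1)
  && \text{at most two nodes on each of the } h + 1 \text{ levels} \\
\sum_{u \in T} |S(u)| &\le 2n\,(h + 1) = O(n \log n)
  && \text{summing over the } n \text{ segments}
\end{align*}
```

The tree itself has O(n) nodes, so the lists S(u) dominate the space usage.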


Rectangle intersection

▸ input: a set B of n isothetic boxes in $\mathbb{R}^2$
▸ output: all the intersecting pairs in $B^2$
▸ using segment trees, we give an O(n log n + k) time algorithm, where k is the number of intersecting pairs
▸ this is optimal. Why?
▸ note: faster than our line segment intersection algorithm
▸ space usage: Θ(n log n) due to segment trees
▸ space usage is suboptimal


Two kinds of intersections
▸ overlap
  • intersecting edges
  • reduces to intersection reporting for isothetic segments
  • done as exercise (first homework)
▸ inclusion
  • we can find them using stabbing queries


Reporting overlaps

▸ equivalent to reporting intersecting edges
▸ plane sweep approach
▸ sweep line status: a balanced binary search tree (BBST) containing the horizontal line segments that intersect the sweep line, by increasing y-coordinates
▸ each time a vertical line segment is encountered, report its intersections by range searching in the BBST
▸ preprocessing time: O(n log n) for sorting endpoints
▸ running time: O(k + n log n)
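The sweep for intersecting axis-parallel edges can be sketched in Python as follows. This is an illustration only: the function name and the tuple layouts are assumptions, and a sorted Python list maintained with `bisect` stands in for the BBST. List insertion and deletion cost O(n), so this sketch shows the logic of the sweep but does not reach the O(k + n log n) bound.

```python
from bisect import bisect_left, bisect_right, insort

def axis_parallel_intersections(horizontals, verticals):
    """Report intersecting (horizontal, vertical) pairs by a left-to-right sweep.

    horizontals: list of (x1, x2, y) with x1 <= x2
    verticals:   list of (x, y1, y2) with y1 <= y2
    Returns a list of index pairs (h_index, v_index).
    """
    INSERT, QUERY, DELETE = 0, 1, 2          # tie-break order at equal x
    events = []
    for i, (x1, x2, _) in enumerate(horizontals):
        events.append((x1, INSERT, i))
        events.append((x2, DELETE, i))
    for j, (x, _, _) in enumerate(verticals):
        events.append((x, QUERY, j))
    events.sort()

    status = []                              # sorted list of (y, h_index)
    pairs = []
    for _, kind, idx in events:
        if kind == INSERT:
            insort(status, (horizontals[idx][2], idx))
        elif kind == DELETE:
            status.pop(bisect_left(status, (horizontals[idx][2], idx)))
        else:
            _, y1, y2 = verticals[idx]
            # Range search: all horizontals with y1 <= y <= y2.
            lo = bisect_left(status, (y1, -1))
            hi = bisect_right(status, (y2, len(horizontals)))
            pairs.extend((h, idx) for _, h in status[lo:hi])
    return pairs

if __name__ == "__main__":
    hs = [(0, 4, 1), (2, 6, 3)]
    vs = [(3, 0, 2), (5, 2, 5), (7, 0, 5)]
    print(axis_parallel_intersections(hs, vs))
```

Processing insertions before queries and queries before deletions at equal x treats the segments as closed, so edges that only touch are also reported.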


Reporting inclusions

▸ also using plane sweep: sweep a horizontal line from top to bottom
▸ sweep line status: the boxes that intersect the sweep line l, in a segment tree with respect to x-coordinates
  • the endpoints are the x-coordinates of the horizontal edges of the boxes
  • at a given time, only rectangles that intersect l are in the segment tree
  • we can perform insertions and deletions in a segment tree in O(log n) time
▸ each time a vertex of a box is encountered, perform a stabbing query in the segment tree


Remarks

▸ at each step a box intersection can be reported several times
▸ in addition, two boxes can overlap while a vertex of one also stabs the other
▸ to obtain each intersecting pair only once, make some simple checks. How?


Stabbing queries for boxes

▸ in $\mathbb{R}^d$, a set B of n boxes
▸ for a query point q, find all the boxes that contain it
▸ we use a multi-level data structure, with a segment tree in each level
▸ inductive definition, induction on d
▸ first, we store B in a segment tree T with respect to the $x_1$-coordinate
▸ for each node u of T, associate a $(d - 1)$-dimensional multi-level segment tree for the canonical subset S(u), with respect to $(x_2, x_3, \dots, x_d)$


Performing queries

▸ search for q in T using the $x_1$-coordinate
▸ for all nodes in the search path, query recursively the $(d - 1)$-dimensional multi-level segment tree
▸ there are O(log n) such queries
▸ by induction on d, it follows that
  • query time: $O(k + \log^d n)$
  • space usage: $O(n \log^d n)$
  • preprocessing time: $O(n \log^d n)$
▸ can be slightly improved...
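The induction for the query time can be spelled out as a recurrence. This is a sketch in notation introduced here for illustration: $Q_d(n)$ is the query time in dimension d, $n_u = |S(u)|$, and $k_u$ is the number of boxes reported from the structure attached to node u. It assumes q is in general position, so that each reported box is stored at exactly one node of the search path.

```latex
\begin{align*}
Q_1(n) &= O(\log n) + k \\
Q_d(n) &= O(\log n) + \sum_{u \in \text{path}} Q_{d-1}(n_u),
          \qquad n_u \le n \\
       &\le O(\log n) \cdot O(\log^{d-1} n) + \sum_{u \in \text{path}} k_u \\
       &= O(\log^{d} n) + k
\end{align*}
```

The space and preprocessing bounds follow in the same way, using that the canonical subsets of one segment tree have total size O(n log n).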


Windowing queries

▸ in $\mathbb{R}^2$, a set S of n disjoint segments
▸ for a query axis-aligned rectangle R, find all the segments intersecting R
▸ three types of segments intersect R:
  • segments with one endpoint inside R
  • segments that intersect a vertical side of R
  • segments that intersect a horizontal side of R
▸ first type: range tree over the endpoints of the segments
▸ second type: multi-level data structure with segment tree
  • store S in a segment tree T with respect to the x-coordinate
  • for each node u of T, store the segments of S(u) in a BST, sorted by their intersection with a vertical line


Windowing queries

▸ for segments of the second type:
  • a query visits O(log n) nodes of the main tree
  • the canonical subsets of those nodes are disjoint
  • in each node we spend O(log n) time, plus time to report segments (1d range-tree)
  • each segment is reported once, because of disjointness
▸ each segment reported at most twice: filter them
▸ for n disjoint segments:
  • preprocessing: $O(n \log^2 n)$ time
  • query: $O(k + \log^2 n)$ time
▸ where did we use that the segments are disjoint?


Klee’s measure problem
▸ in $\mathbb{R}^2$, a set S of n axis-parallel rectangles
▸ compute the area of the union
▸ solution using O(n log n) time
▸ sweep a vertical line $\ell$ from left to right
  • keep the length of $\ell \cap \left(\bigcup S\right)$
  • events: the length changes when rectangles start or stop intersecting $\ell$
  • relevant values: distance between consecutive events and the length
  • we compute the area to the left of $\ell$, updating it at each event
▸ use segment trees to maintain the length
▸ http://www.cgl.uwaterloo.ca/~krmoule/courses/cs760m/klee

Klee’s measure problem

▸ we need to maintain the length of the union of intervals under insertion and deletion of intervals
▸ make a segment tree (we know all endpoints in advance)
▸ at each node u we store
  • the list S(u) (actually its cardinality is enough)
  • length(u): the length of the part of $I_u$ covered by segments stored at u or below u
  • note that length(u) only depends on the subtree rooted at u
  • this allows quick updates
▸ length(root) is the real length we want
▸ insertion or deletion of an interval takes O(log n) time
  • if $S(u) \neq \emptyset$, then length(u) = length($I_u$)
  • else, length(u) = length(u.left) + length(u.right) (and 0 if u is a leaf)
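The sweep and the length-maintaining segment tree can be combined in a compact Python sketch. It is an illustration under assumptions, not the author's implementation: the name `union_area` and the rectangle layout `(x1, y1, x2, y2)` are hypothetical, the tree is stored in arrays indexed from the root at 1, it covers only the range between the smallest and largest y-coordinate, and each node keeps the cardinality of S(u) instead of the list.

```python
def union_area(rects):
    """Area of the union of axis-parallel rectangles (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    index = {y: i for i, y in enumerate(ys)}
    m = len(ys) - 1                      # number of atomic y-intervals
    if m < 1:
        return 0
    count = [0] * (4 * m)                # count[u] = |S(u)|
    length = [0] * (4 * m)               # length[u] = covered length inside I_u

    def update(u, a, b, i, j, delta):
        """Node u covers [ys[a], ys[b]]; add delta on [ys[i], ys[j]]."""
        if j <= a or b <= i:             # no overlap beyond a single point
            return
        if i <= a and b <= j:            # I_u contained in the updated interval
            count[u] += delta
        else:
            mid = (a + b) // 2
            update(2 * u, a, mid, i, j, delta)
            update(2 * u + 1, mid, b, i, j, delta)
        # Recompute length(u) from S(u) and the children.
        if count[u] > 0:
            length[u] = ys[b] - ys[a]
        elif b - a == 1:
            length[u] = 0
        else:
            length[u] = length[2 * u] + length[2 * u + 1]

    events = []
    for x1, y1, x2, y2 in rects:
        events.append((x1, +1, index[y1], index[y2]))   # rectangle starts
        events.append((x2, -1, index[y1], index[y2]))   # rectangle stops
    events.sort()

    area, prev_x = 0, None
    for x, delta, i, j in events:
        if prev_x is not None:
            area += length[1] * (x - prev_x)            # strip between events
        update(1, 0, m, i, j, delta)
        prev_x = x
    return area

if __name__ == "__main__":
    print(union_area([(0, 0, 2, 2), (1, 1, 3, 3)]))
```

Deletion works with a plain counter because every deleted interval was inserted earlier with the same endpoints, so it is removed from exactly the nodes where it was stored.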


Klee’s measure problem

▸ in $\mathbb{R}^3$, best known algorithm runs in $O(n^{3/2})$ time
▸ only lower bound: Ω(n log n)
▸ in $\mathbb{R}^3$, recent progress for unit boxes


Interval trees

▸ interval trees allow performing stabbing queries in one dimension
  • query time: O(k + log n)
  • preprocessing time: O(n log n)
  • space: O(n)
▸ based on a different approach


Preliminary

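▸ let $x_{\mathrm{med}}$ be the median of E
  • $S_l$: segments of S that are completely to the left of $x_{\mathrm{med}}$
  • $S_{\mathrm{med}}$: segments of S that contain $x_{\mathrm{med}}$
  • $S_r$: segments of S that are completely to the right of $x_{\mathrm{med}}$
[Figure: the position $x_{\mathrm{med}}$ splits the segments into $S_l$, $S_{\mathrm{med}}$ and $S_r$]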


Data structure

▸ recursive data structure
▸ left child of the root: interval tree storing $S_l$
▸ right child of the root: interval tree storing $S_r$
▸ at the root of the interval tree, we store $S_{\mathrm{med}}$ in two lists
  • $M_L$ is sorted according to the coordinate of the left endpoint (in increasing order)
  • $M_R$ is sorted according to the coordinate of the right endpoint (in decreasing order)


Example
[Figure: seven segments $s_1, \dots, s_7$]
$M_L = (s_4, s_6, s_1)$
$M_R = (s_1, s_4, s_6)$
Interval tree on $s_3$ and $s_5$
Interval tree on $s_2$ and $s_7$


Stabbing queries

▸ query: $x_q$, find the intervals that contain $x_q$
▸ if $x_q < x_{\mathrm{med}}$ then
  • scan $M_L$ in increasing order, and report the segments that are stabbed; stop when $x_q$ becomes smaller than the x-coordinate of the current left endpoint
  • recurse on $S_l$
▸ if $x_q > x_{\mathrm{med}}$
  • analogous, but on the right side
▸ if $x_q = x_{\mathrm{med}}$, report all of $S_{\mathrm{med}}$


Analysis

▸ query time
  • size of the subtree divided by at least two at each level
  • scanning through $M_L$ or $M_R$: proportional to the number of reported intervals
  • conclusion: O(k + log n) time
▸ space usage: O(n) (each segment is stored in two lists, and the tree is balanced)
▸ preprocessing time: easy to do it in O(n log n) time
▸ pseudocode
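The construction and the stabbing query of an interval tree can be sketched in Python as follows. This is an illustration, not the pseudocode of the lecture: the names `IntervalNode`, `build_interval_tree` and `stab_interval_tree` are hypothetical, segments are closed `(left, right)` pairs, and the median is taken as the middle element of the sorted endpoint list. For brevity the endpoints are re-sorted at every level, so this sketch does not aim at the O(n log n) preprocessing bound.

```python
class IntervalNode:
    def __init__(self, x_med, by_left, by_right, left, right):
        self.x_med = x_med
        self.by_left = by_left      # M_L: S_med sorted by left endpoint, increasing
        self.by_right = by_right    # M_R: S_med sorted by right endpoint, decreasing
        self.left = left            # interval tree on S_l
        self.right = right          # interval tree on S_r

def build_interval_tree(segments):
    if not segments:
        return None
    endpoints = sorted(x for s in segments for x in s)
    x_med = endpoints[len(endpoints) // 2]
    s_left = [s for s in segments if s[1] < x_med]
    s_right = [s for s in segments if s[0] > x_med]
    s_med = [s for s in segments if s[0] <= x_med <= s[1]]
    return IntervalNode(
        x_med,
        sorted(s_med, key=lambda s: s[0]),
        sorted(s_med, key=lambda s: s[1], reverse=True),
        build_interval_tree(s_left),
        build_interval_tree(s_right),
    )

def stab_interval_tree(node, xq):
    """Report all segments that contain xq."""
    out = []
    while node is not None:
        if xq < node.x_med:
            for s in node.by_left:          # stop at the first left endpoint > xq
                if s[0] > xq:
                    break
                out.append(s)
            node = node.left
        elif xq > node.x_med:
            for s in node.by_right:         # stop at the first right endpoint < xq
                if s[1] < xq:
                    break
                out.append(s)
            node = node.right
        else:                               # xq == x_med: all of S_med is stabbed
            out.extend(node.by_left)
            break
    return out

if __name__ == "__main__":
    tree = build_interval_tree([(1, 4), (2, 6), (3, 5), (7, 9), (8, 10)])
    print(sorted(stab_interval_tree(tree, 3.5)))
    print(sorted(stab_interval_tree(tree, 8.5)))
```

Since the median is itself an endpoint of some segment, $S_{\mathrm{med}}$ is never empty and the recursion terminates.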
