Segment Tree: Efficient Range Queries
Unit 5
1. Segment Tree
Segment Tree is a data structure that allows efficient querying and updating of
intervals or segments of an array.
The tree is built recursively by dividing the array into segments until each
segment represents a single element.
This structure enables fast query and update operations with a time complexity
of O(log n).
The following diagram shows a segment tree built for the array [1, 4, 5, 5, 9, 10, 10,
12, 19, 31, 41] of size 11. The example tree is built for range sum queries: every node
stores the sum of a range, the root node stores the sum of the whole array, and the leaf
nodes store the individual array elements.
Types of Operations:
The operations that the segment tree can perform must be binary and associative.
Some of the examples of operations are:
Finding Range Sum Queries
Finding number of zeros in the given range or finding index of Kth zero
At each level, we divide the array segment into two parts. If the given array
has elements [0, . . ., N-1], then the two parts will be [0, . . ., N/2-1] and
[N/2, . . ., N-1].
We then recurse until the lower and upper bounds of the range become equal.
The segment tree is generally represented using an array where the first value stores
the result for the total array range, and the children of the node at index i are at
(2*i + 1) and (2*i + 2) (with the root at index 0; equivalently, with the root at index 1
the children are at 2*i and 2*i + 1).
There are two important points to be noted while constructing the segment tree:
If the problem definition states that we need to calculate the sum over ranges, then the
value at nodes should store the sum of values over the ranges.
The child node values are merged back into the parent node to hold the value
for that particular range, [i.e., the range covered by all the nodes of its subtree].
In the end, the leaf nodes store information about single elements: taken
together, the leaf nodes hold the array from which the segment tree is built.
Following are the steps for constructing a segment tree:
The merge operation takes constant time if the operator takes constant time, so
building the whole tree takes O(N) time.
Range Query
Given two integers L and R return the sum of the segment [L, R]
The first step is constructing the segment tree with the addition operator and 0 as the
neutral element.
If the node's range lies completely inside the query range, simply return the
node's value as the answer.
Otherwise, we traverse the left and right children of the node and recursively
continue the process until every node we reach covers a range that lies
completely inside or completely outside [L, R].
While returning from each call, we merge the answers received from the two
children.
As the height of the segment tree is logN the query time will be O(logN) per query.
Point Updates
Given an index, idx, update the value of the array at index idx with value V
The element contributes only to the nodes on the path from its leaf to the root.
Thus only log N nodes are affected by the update.
For updating, traverse till the leaf that stores the value of index idx and update the
value. Then while tracing back in the path, modify the ranges accordingly.
Example code:
Below is the implementation of construction, query, and point update for a segment
tree:
class SegmentTreeExample {
    static int n;
    static int[] arr, tree;

    static void build(int node, int start, int end) {  // 1-based: children of node i are 2i and 2i+1
        if (start == end) { tree[node] = arr[start]; return; }
        int mid = (start + end) / 2;
        build(2 * node, start, mid);
        build(2 * node + 1, mid + 1, end);
        tree[node] = tree[2 * node] + tree[2 * node + 1];  // merge children
    }

    static void update(int node, int start, int end, int idx, int value) {
        if (start == end) { arr[idx] += value; tree[node] += value; return; }
        int mid = (start + end) / 2;
        if (idx <= mid) update(2 * node, start, mid, idx, value);
        else update(2 * node + 1, mid + 1, end, idx, value);
        tree[node] = tree[2 * node] + tree[2 * node + 1];
    }

    static int query(int node, int start, int end, int l, int r) {
        if (r < start || end < l) return 0;             // no overlap: neutral element
        if (l <= start && end <= r) return tree[node];  // total overlap
        int mid = (start + end) / 2;
        return query(2 * node, start, mid, l, r) + query(2 * node + 1, mid + 1, end, l, r);
    }

    public static void main(String[] args) {
        arr = new int[]{1, 3, 5, 7, 9, 11};  // sample array (illustrative)
        n = arr.length;
        tree = new int[4 * n];
        build(1, 0, n - 1);
        System.out.println("Sum of range [0, 3] = " + query(1, 0, n - 1, 0, 3));
        update(1, 0, n - 1, 1, 100);
        System.out.println("Sum of range [1, 3] = " + query(1, 0, n - 1, 1, 3));
    }
}
Output
Sum of range [0, 3] = 16
Sum of range [1, 3] = 115
Lazy Propagation
When there are several updates and the updates are performed on ranges, we
can delay some updates (avoiding recursive calls in update) and apply them
only when necessary.
A node in a segment tree stores the result of a query for a whole range of
indexes. Consequently, if the update operation's range includes a node, all of
that node's descendants must also be updated.
o Take the node with the value 27 in the picture above as an example. This
node contains the sum of values at the indexes 3 to 5. This node and all
of its descendants must be updated if our update query covers the range
of 2 to 5.
We create an array called lazy[] to represent the pending (lazy) updates. The
size of lazy[] is the same as the array that represents the segment tree in the
code, which is tree[].
o There are no pending changes on the segment tree node i if lazy[i] has a
value of 0.
o A non-zero value for lazy[i] indicates that before doing any queries on
node i in the segment tree, this sum needs to be added to the node.
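The scheme above can be sketched as follows. This is a minimal illustration rather than the full implementation from the text: it assumes range-add updates on a range-sum tree, with tree[] and lazy[] playing the roles just described, and the class and method names are illustrative.

```java
public class LazySegmentTree {
    static int n = 6;
    static long[] tree = new long[4 * n];
    static long[] lazy = new long[4 * n];  // lazy[i] != 0 => pending add for node i's whole range

    // apply an add of `val` to every element in the node's range, delaying it for descendants
    static void applyAdd(int node, int start, int end, long val) {
        tree[node] += (end - start + 1) * val;  // node stores a range sum
        lazy[node] += val;                      // remember the pending update
    }

    // push a pending update down to the two children before descending
    static void push(int node, int start, int end) {
        if (lazy[node] != 0 && start != end) {
            int mid = (start + end) / 2;
            applyAdd(2 * node, start, mid, lazy[node]);
            applyAdd(2 * node + 1, mid + 1, end, lazy[node]);
            lazy[node] = 0;
        }
    }

    // add `val` to every element in [l, r]
    static void update(int node, int start, int end, int l, int r, long val) {
        if (r < start || end < l) return;
        if (l <= start && end <= r) { applyAdd(node, start, end, val); return; }
        push(node, start, end);
        int mid = (start + end) / 2;
        update(2 * node, start, mid, l, r, val);
        update(2 * node + 1, mid + 1, end, l, r, val);
        tree[node] = tree[2 * node] + tree[2 * node + 1];
    }

    static long query(int node, int start, int end, int l, int r) {
        if (r < start || end < l) return 0;
        if (l <= start && end <= r) return tree[node];
        push(node, start, end);
        int mid = (start + end) / 2;
        return query(2 * node, start, mid, l, r) + query(2 * node + 1, mid + 1, end, l, r);
    }

    public static void main(String[] args) {
        update(1, 0, n - 1, 2, 5, 3);                  // add 3 to indexes 2..5 in one call
        System.out.println(query(1, 0, n - 1, 0, 3));  // indexes 2 and 3 contribute 3 each
    }
}
```

Note that update touches only O(log N) nodes per call; the pending additions are pushed down lazily the next time a query or update descends through a marked node.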
Time Complexity: O(N)
Auxiliary Space: O(MAX)
Applications:
Advantages:
Disadvantages:
Time complexity: Segment tree operations like update and query take O(log N)
time and build takes O(N) time. A Fenwick tree achieves the same asymptotic
bounds for prefix operations while being simpler and using less memory.
2. Tries
The trie data structure is organized around the common prefixes of strings. The root
node can have any number of children, depending on the strings present in the set.
The root of a trie does not contain any value, only pointers to its child nodes.
Standard Tries
Compressed Tries
Suffix Tries
The trie data structure also supports the standard operations that tree data structures
perform. They are −
Insertion
Deletion
Search
Insertion operation
The insertion operation in a trie is a simple approach. The root of a trie does not hold
any value, and insertion starts from the immediate child nodes of the root, which
act as keys to their child nodes. Each node in a trie represents a single character of
the input string, so the characters are added into the trie one by one, while the links
in the trie act as pointers to the next-level nodes.
Example
import java.util.Map;
import java.util.TreeMap;

class TrieNode {
    Map<Character, TrieNode> children = new TreeMap<>();  // sorted, so words print in order
    boolean isEndOfWord;
}

class Trie {
    private final TrieNode root = new TrieNode();

    void insert(String word) {
        TrieNode curr = root;
        for (char ch : word.toCharArray())
            curr = curr.children.computeIfAbsent(ch, c -> new TrieNode());
        curr.isEndOfWord = true;  // mark the last character as the end of a key
    }

    TrieNode getRoot() { return root; }

    static void printWords(TrieNode node, String prefix) {
        if (node.isEndOfWord) System.out.println(prefix);
        for (Map.Entry<Character, TrieNode> e : node.children.entrySet())
            printWords(e.getValue(), prefix + e.getKey());
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("Lamborghini");
        trie.insert("Mercedes-Benz");
        trie.insert("Land Rover");
        trie.insert("Maruti Suzuki");
        printWords(trie.getRoot(), "");
    }
}
Output
Lamborghini
Land Rover
Maruti Suzuki
Mercedes-Benz
Deletion operation
The deletion operation in a trie is performed using the bottom-up approach. The
element is searched for in a trie and deleted, if found. However, there are some special
scenarios that need to be kept in mind while performing the deletion operation.
Case 1 − The key is unique − in this case, the entire key path is deleted from the node.
(Unique key suggests that there is no other path that branches out from one path).
Case 2 − The key is not unique − only the end-of-word flag is updated. For example,
if the key to be deleted is see but it is a prefix of another key seethe, we delete see by
setting the end-of-word flag at its last e to false, leaving the nodes in place.
Case 3 − The key to be deleted already has a prefix − the values until the prefix are
deleted and the prefix remains in the tree. For example, if the key to be deleted
is heart but there is another key present he; so we delete a, r, and t until only he
remains.
Example
import java.util.Map;
import java.util.TreeMap;

class TrieNode {
    Map<Character, TrieNode> children = new TreeMap<>();
    boolean isEndOfWord;
}

class Trie {
    private final TrieNode root = new TrieNode();

    void insert(String word) {
        TrieNode curr = root;
        for (char ch : word.toCharArray())
            curr = curr.children.computeIfAbsent(ch, c -> new TrieNode());
        curr.isEndOfWord = true;
    }

    void delete(String word) { delete(root, word, 0); }

    // bottom-up deletion: returns true if the current node should be removed from its parent
    private boolean delete(TrieNode node, String word, int index) {
        if (index == word.length()) {
            if (!node.isEndOfWord) return false;  // key not present
            node.isEndOfWord = false;
            return node.children.isEmpty();       // prune only if nothing hangs below
        }
        char ch = word.charAt(index);
        TrieNode child = node.children.get(ch);
        if (child == null) return false;          // key not present
        boolean shouldDeleteChild = delete(child, word, index + 1);
        if (shouldDeleteChild) {
            node.children.remove(ch);
            return node.children.isEmpty() && !node.isEndOfWord;
        }
        return false;
    }

    TrieNode getRoot() { return root; }

    static void printWords(TrieNode node, String prefix) {
        if (node.isEndOfWord) System.out.println(prefix);
        for (Map.Entry<Character, TrieNode> e : node.children.entrySet())
            printWords(e.getValue(), prefix + e.getKey());
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("Lamborghini");
        trie.insert("Mercedes-Benz");
        trie.insert("Land Rover");
        trie.insert("Maruti Suzuki");
        // Before deletion
        printWords(trie.getRoot(), "");
        trie.delete("Lamborghini");
        trie.delete("Land Rover");  // second deleted key inferred from the expected output
        // After deletion
        printWords(trie.getRoot(), "");
    }
}
Output
Lamborghini
Land Rover
Maruti Suzuki
Mercedes-Benz
Maruti Suzuki
Mercedes-Benz
Search operation
Searching in a trie is a straightforward approach. We move down the levels of the
trie, following the child links that match successive characters of the key. Searching
continues until the end of the key is reached. If the final node is marked as the end of
a word, the search is successful; otherwise, it is unsuccessful.
Example
import java.util.Map;
import java.util.TreeMap;

class TrieNode {
    Map<Character, TrieNode> children = new TreeMap<>();
    boolean isEndOfWord;
}

class Trie {
    private final TrieNode root = new TrieNode();

    void insert(String word) {
        TrieNode curr = root;
        for (char ch : word.toCharArray())
            curr = curr.children.computeIfAbsent(ch, c -> new TrieNode());
        curr.isEndOfWord = true;
    }

    // full-key search: true only if `word` was inserted as a complete key
    boolean search(String word) {
        TrieNode curr = root;
        for (char ch : word.toCharArray()) {
            curr = curr.children.get(ch);
            if (curr == null) return false;  // path breaks: key absent
        }
        return curr.isEndOfWord;
    }

    TrieNode getRoot() { return root; }

    static void printWords(TrieNode node, String prefix) {
        if (node.isEndOfWord) System.out.println(prefix);
        for (Map.Entry<Character, TrieNode> e : node.children.entrySet())
            printWords(e.getValue(), prefix + e.getKey());
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("Lamborghini");
        trie.insert("Mercedes-Benz");
        trie.insert("Land Rover");
        trie.insert("Maruti Suzuki");
        printWords(trie.getRoot(), "");
        System.out.println("Searching Cars");
        // the queried keys below are illustrative
        System.out.println("Found? " + trie.search("Lamborghini"));
        System.out.println("Found? " + trie.search("Land Rover"));
        System.out.println("Found? " + trie.search("Land"));           // prefix only, not a key
        System.out.println("Found? " + trie.search("Maruti Suzuki"));
        System.out.println("Found? " + trie.search("Rolls Royce"));
    }
}
Output
Lamborghini
Land Rover
Maruti Suzuki
Mercedes-Benz
Searching Cars
Found? true
Found? true
Found? false
Found? true
Found? false
3. Game Theory
Here we will focus on two-player games that do not contain random elements.
Our goal is to find a strategy we can follow to win the game no matter what the
opponent does if such a strategy exists.
It is assumed that the game ends after a finite number of moves. In chess, for
example, an unlimited number of moves is possible in principle (especially
when only a king is left), but adding an extra constraint such as "the game
must end within n moves" provides a terminal condition. This is the kind of
assumption game theory requires here.
It turns out that there is a general strategy for such games, and we can analyze
the games using nim theory.
Initially, we will analyze simple games where players remove sticks from
heaps, and after this, we will generalize the strategy used in those games to
other games.
Let us consider a game where there is initially a heap of n-sticks. Players A and B
move alternately, and player A begins. On each move, the player has to remove 1, 2,
or 3 sticks from the heap, and the player who removes the last stick wins the game.
This game consists of states 0, 1, 2,..., n, where the number of the state corresponds to
the number of sticks left.
Tic Tac Toe = Tic Tac Toe is a classic two-player game where the players take
turns placing either X or O in a 3x3 grid until one player gets three in a row
horizontally, vertically, or diagonally, or all spaces on the board are filled.
A winning state is a state from which the player to move can force a win with optimal
play, and a losing state is a state from which the player to move loses if the opponent
plays optimally. It turns out that we can classify all states of a game so that each state
is either a winning state or a losing state.
In the above game, state 0 is clearly a losing state because the player cannot make any
moves.
States 1, 2 and 3 are winning states, because the player can remove all
remaining sticks and win. State 4, in turn, is a losing state, because any move
leads to one of the states 1, 2 or 3, which is a winning state for the opponent.
More generally, if there is a move that leads from the current state to a losing state, the
current state is a winning state, and otherwise, the current state is a losing state.
Using this observation, we can classify all states of a game, starting with losing
states where there are no possible moves. The states 0...15 of the above game can be
classified as follows (W denotes a winning state and L denotes a losing state):
State:  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Result: L W W W L W W W L W W W L  W  W  W
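The classification above follows from a short dynamic program: a state is winning exactly when some move (removing 1, 2 or 3 sticks) leads to a losing state. A minimal sketch (the class name is illustrative):

```java
public class StickGame {
    // win[k] = true if state k (k sticks left, player to move) is a winning state
    static boolean[] classify(int n) {
        boolean[] win = new boolean[n + 1];   // win[0] = false: no moves possible
        for (int k = 1; k <= n; k++)
            for (int take = 1; take <= 3 && take <= k; take++)
                if (!win[k - take]) win[k] = true;  // a move into a losing state wins
        return win;
    }

    public static void main(String[] args) {
        boolean[] win = classify(15);
        StringBuilder sb = new StringBuilder();
        for (boolean w : win) sb.append(w ? "W " : "L ");
        System.out.println(sb.toString().trim());  // L W W W L W W W L W W W L W W W
    }
}
```

The output reproduces the table: exactly the multiples of 4 are losing states.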
Example:-
Basketball = In basketball, a winning state is when a team scores more points than
their opponent at the end of the game, while a losing state is when a team scores
fewer points than their opponent.
Chess = In chess, a winning state is when a player checkmates their opponent's king,
while a losing state is when a player's king is checkmated.
State graph:
Let us now consider another stick game, where in each state k, it is allowed to remove
any number x of sticks such that x is smaller than k and divides k.
For example, in state 8 we may remove 1, 2 or 4 sticks, but in state 7 the only
allowed move is to remove 1 stick. The following picture shows the states 1...9 of the
game as a state graph, whose nodes are the states and edges are the moves between
them:
The states 1...9 of the game as a state graph, whose nodes are the states and edges are
the moves between them
The final state in this game is always state 1, which is a losing state because there are
no valid moves. The classification of states 1...9 is as follows:
State:  1 2 3 4 5 6 7 8 9
Result: L W L W L W L W L
Surprisingly, in this game, all even-numbered states are winning states, and all odd-
numbered states are losing states
The nim game is a simple game that has an important role in game theory because
many other games can be played using the same strategy.
First, we focus on nim, and then we generalize the strategy to other games.
There are n heaps in nim, and each heap contains some number of sticks.
The players move alternately, and on each turn, the player chooses a heap that still
contains sticks and removes any number of sticks from it.
The winner is the player who removes the last stick.
The states in nim are of the form [x1, x2,..., xn], where xk denotes the number of sticks
in heap k.
Analysis:
The states whose nim sum is 0 are losing states, and all other states are winning states.
For example, the nim sum of [10,12,5] is 10⊕12⊕5 = 3, so the state is a winning state.
Losing states:
The final state [0,0,...,0] is a losing state, and its nim sum is 0, as expected.
In other losing states, any move leads to a winning state, because when a single value
xk changes, the nim sum also changes, so the nim sum is different from 0 after the
move.
Winning states:
Let s ≠ 0 be the nim sum of the state. We can move to a losing state if there is a
heap k for which xk ⊕ s < xk: in this case, we remove sticks from heap k so that it
contains xk ⊕ s sticks, which produces a state with nim sum 0, i.e. a losing state.
Such a heap always exists: any heap whose xk has a one bit at the position of the
leftmost one bit of s.
As an example, consider again the state [10, 12, 5]. This state is a winning state
because its nim sum is 3, so there has to be a move that leads to a losing state. Next,
we will find such a move. The nim sum of the state is computed as follows:
10 1010
12 1100
5 0101
3 0011
In this scenario, the heap with 10 sticks is the only heap that has a one bit at the
position of the leftmost one bit of the nim sum (bit position 1, since the nim sum is
0011 and only 10 = 1010 has a one there):
10 1010
12 1100
5 0101
3 0011
The new size of the heap has to be 10⊕ 3 = 9, so we will remove just one stick. After
this, the state will be [9,12,5], which is a losing state:
9 1001
12 1100
5 0101
0 0000
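The whole calculation above can be automated: compute the nim sum s, then find a heap with a one bit at the position of the leftmost one bit of s and shrink it to xk ⊕ s. A minimal sketch with illustrative method names:

```java
public class NimMove {
    static int nimSum(int[] heaps) {
        int s = 0;
        for (int x : heaps) s ^= x;  // XOR of all heap sizes
        return s;
    }

    // returns {heap index, new size} of a move to a losing state, or null if already losing
    static int[] winningMove(int[] heaps) {
        int s = nimSum(heaps);
        if (s == 0) return null;                    // losing state: no winning move exists
        for (int k = 0; k < heaps.length; k++)
            if ((heaps[k] ^ s) < heaps[k])          // heap with a one bit at s's leftmost one bit
                return new int[]{k, heaps[k] ^ s};
        throw new IllegalStateException();          // unreachable: such a heap always exists
    }

    public static void main(String[] args) {
        int[] heaps = {10, 12, 5};
        System.out.println("nim sum = " + nimSum(heaps));             // 3
        int[] mv = winningMove(heaps);
        System.out.println("shrink heap " + mv[0] + " to " + mv[1]);  // heap of 10 -> 9
    }
}
```

On the example state it finds the same move as the hand calculation: reduce the heap of 10 sticks to 10 ⊕ 3 = 9.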
In a misère game, the goal of the game is the opposite, so the player who removes the
last stick loses the game.
It turns out that the misère nim game can be optimally played almost like the standard
nim game.
The idea is to first play the misère game like the standard game, but change the
strategy at the end of the game.
The new strategy will be introduced in a situation where each heap would contain at
most one stick after the next move.
In the standard game, we should choose a move after which there is an even number
of heaps with one stick.
However, in the misère game, we choose a move so that there is an odd number of
heaps with one stick.
This strategy works because a state where the strategy changes always appears in the
game, and this state is a winning state, because it contains exactly one heap that has
more than one stick, so the nim sum is not 0.
The Sprague–Grundy theorem generalizes the strategy used in nim to all games that
fulfil the following requirements:
The game consists of states, and the possible moves in a state do not depend
on whose turn it is.
The players have complete information about the states and allowed moves,
and there is no randomness in the game.
The idea is to calculate for each game state a Grundy number that corresponds to the
number of sticks in a nim heap. When we know the Grundy numbers of all states, we
can play the game like the nim game. The Grundy number of a state is
mex({g1, g2, ..., gn}),
where g1, g2,..., gn are the Grundy numbers of the states to which we can move, and
the mex function gives the smallest non-negative number that is not in the set.
For example, mex({0, 1, 3}) = 2, and the Grundy number of a state with no moves is mex(∅) = 0.
The Grundy number of a state corresponds to the number of sticks in a nim heap. If
the Grundy number is 0, we can only move to states whose Grundy numbers are
positive, and if the Grundy number is x > 0, we can move to states whose Grundy
numbers include all numbers 0,1,..., x−1.
As an example, consider a maze game where a figure stands on a floor square of a
maze.
On each turn, the player has to move the figure some number of steps left or
up.
The winner of the game is the player who makes the last move.
The following picture shows a possible initial state of the game, where @ denotes the
figure and # denotes a square where it can move.
The states of the game are all floor squares of the maze. In the above maze, the
Grundy numbers are as follows:
Thus, each state of the maze game corresponds to a heap in the nim game. For
example, the Grundy number for the lower-right square is 2, so it is a winning state.
We can reach a losing state and win the game by moving either four steps left or two
steps up.
Note that unlike in the original nim game, it may be possible to move to a state whose
Grundy number is larger than the Grundy number of the current state.
However, the opponent can always choose a move that cancels such a move, so it is
not possible to escape from a losing state.
3.7 Subgames:
Next, we will assume that our game consists of subgames, and on each turn the player
first chooses a subgame and then makes a move in it. The game ends when it is not
possible to make any move in any subgame. In this case, the Grundy number of a
game is the nim sum of the Grundy numbers of the subgames.
The game can be played like a nim game by calculating all Grundy numbers for
subgames and then their nim sum.
As an example, consider a game that consists of three mazes. In this game, on each
turn, the player chooses one of the mazes and then moves the figure in the maze.
Assume that the initial state of the game is as follows:
The Grundy numbers for the mazes are as follows
In the initial state, the nim sum of the Grundy numbers is 2⊕3⊕3 = 2, so the first
player can win the game. One optimal move is to move two steps up in the first maze,
which produces the nim sum 0⊕3⊕3 = 0.
Sometimes a move in a game divides the game into subgames that are independent of
each other. In this case, the Grundy number of the game is
mex({g1, g2, ..., gn}),
where n is the number of possible moves and gk = ak,1 ⊕ ak,2 ⊕ ... ⊕ ak,m, where
ak,1, ak,2, ..., ak,m are the Grundy numbers of the independent subgames generated
by move k.
An example of such a game is Grundy’s game. Initially, there is a single heap that
contains n sticks.
On each turn, the player chooses a heap and divides it into two nonempty heaps such
that the heaps are of different size. The player who makes the last move wins the
game.
Let f (n) be the Grundy number of a heap that contains n sticks. The Grundy number
can be calculated by going through all ways to divide the heap into two heaps.
In this game, the value of f (n) is based on the values of f (1),..., f (n−1). The base
cases are f (1) = f (2) = 0, because it is not possible to divide the heaps of 1 and 2
sticks. The first Grundy numbers are:
f (1) = 0
f (2) = 0
f (3) = 1
f (4) = 0
f (5) = 2
f (6) = 1
f (7) = 0
f (8) = 2
The Grundy number for n = 8 is 2, so it is possible to win the game. The winning
move is to create heaps 1+7 because f (1)⊕ f (7) = 0.
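These values follow directly from the recurrence described above; a minimal sketch that computes them (the class name is illustrative):

```java
import java.util.HashSet;
import java.util.Set;

public class GrundysGame {
    // f[i] = Grundy number of a single heap of i sticks in Grundy's game
    static int[] grundy(int n) {
        int[] f = new int[n + 1];  // f[1] = f[2] = 0: heaps of 1 and 2 cannot be divided
        for (int i = 1; i <= n; i++) {
            Set<Integer> reachable = new HashSet<>();
            for (int a = 1; 2 * a < i; a++)           // split i into a + (i - a) with a < i - a
                reachable.add(f[a] ^ f[i - a]);       // nim sum of the two independent heaps
            while (reachable.contains(f[i])) f[i]++;  // mex of the reachable set
        }
        return f;
    }

    public static void main(String[] args) {
        int[] f = grundy(8);
        for (int i = 1; i <= 8; i++)
            System.out.println("f(" + i + ") = " + f[i]);
    }
}
```

Running it reproduces the table, including f(8) = 2, and the same loop over splits can be reused to locate the winning move 1 + 7.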
4. Computability of Algorithms
In computer science, problems are divided into classes known as Complexity Classes.
In complexity theory, a Complexity Class is a set of problems with related complexity.
With the help of complexity theory, we try to cover the following.
The common resources required by a solution are time and space, meaning how
much time the algorithm takes to solve a problem and the corresponding memory
usage.
An algorithm having time complexity of the form O(n^k) for input size n and
constant k is called a polynomial-time solution. These solutions scale well. On
the other hand, time complexity of the form O(k^n) is exponential time.
4.1 P Class
The P in the P class stands for Polynomial Time. It is the collection of decision
problems(problems with a "yes" or "no" answer) that can be solved by a deterministic
machine (our computers) in polynomial time.
Features:
Most of the coding problems that we solve fall in this category like the below.
3. Merge Sort
4.2 NP Class
The solutions of NP-class problems might be hard to find, since they are
produced by a non-deterministic machine, but the solutions are easy to verify.
Example:
It indicates that if someone provides us with a candidate solution to the problem, we
can check whether it is correct or incorrect in polynomial time. Thus, for an NP-class
problem, a given answer can be verified in polynomial time.
This class contains many problems that one would like to be able to solve effectively:
3. Graph coloring.
Co-NP Class
Co-NP stands for the complement of NP Class. It means if the answer to a problem in
Co-NP is No, then there is proof that can be checked in polynomial time.
Features:
For a problem to be in NP or Co-NP, there is no need to verify all answers in
polynomial time; it is enough to verify one particular "yes" (for NP) or "no"
(for Co-NP) answer in polynomial time.
Features:
It may take a long time to check them. This means that if a solution for an
NP-hard problem is given, it may take a long time to check whether it is right
or not.
1. Halting problem.
3. No Hamiltonian cycle.
Features:
If one could solve an NP-complete problem in polynomial time, then one could
also solve any NP problem in polynomial time.
1. Hamiltonian Cycle.
2. Satisfiability.
3. Vertex cover.
Complexity
Characteristic feature
Class
Problem Statement: Given a graph G=(V, E), the problem is to determine if graph G
contains a Hamiltonian cycle consisting of all the vertices belonging to V.
If only the 2nd condition is satisfied, then the problem is called NP-Hard. But it is not
possible to reduce every NP problem into another NP problem to show its NP-
Completeness all the time. That is why, to show a problem is NP-Complete, we show
that the problem is in NP and that some known NP-Complete problem is reducible to
it: if B is NP-Complete and B ≤p C for C in NP, then C is NP-Complete.
flag=true
If flag is true:
Solution is correct
Else:
Solution is incorrect
E’ = the edges E of the original graph G, plus new edges between the newly
added vertex and each original vertex of the graph. The number of edges
increases by the number of vertices V, that is, E’ = E + V.
Let us assume that the graph G contains a hamiltonian path covering
the V vertices of the graph starting at a random vertex say Vstart and
ending at Vend, now since we connected all the vertices to an arbitrary
new vertex Vnew in G’. We extend the original Hamiltonian Path to a
Hamiltonian Cycle by using the edges Vend to Vnew and Vnew to
Vstart respectively. The graph G’ now contains the closed cycle traversing
all vertices once.
Thus we can say that the graph G’ contains a Hamiltonian Cycle if and only if the
graph G contains a Hamiltonian Path. Therefore, any instance of the Hamiltonian
Path problem can be reduced to an instance of the Hamiltonian Cycle problem, so the
Hamiltonian Cycle problem is NP-Hard. Conclusion: since the Hamiltonian Cycle
problem is both in NP and NP-Hard, it is an NP-Complete problem.
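The "in NP" half of this argument can be made concrete: a claimed cycle can be checked in O(V) time. A minimal sketch of such a verifier (the adjacency-matrix representation and the names used here are assumptions of this sketch):

```java
public class HamiltonianVerifier {
    // polynomial-time check: is `tour` a Hamiltonian cycle of the graph `adj`?
    static boolean verify(boolean[][] adj, int[] tour) {
        int v = adj.length;
        if (tour.length != v) return false;
        boolean[] seen = new boolean[v];
        for (int i = 0; i < v; i++) {
            if (seen[tour[i]]) return false;  // each vertex must appear exactly once
            seen[tour[i]] = true;
            if (!adj[tour[i]][tour[(i + 1) % v]]) return false;  // consecutive vertices adjacent
        }
        return true;  // all checks passed: flag = true, solution is correct
    }

    public static void main(String[] args) {
        boolean[][] adj = {
            {false, true,  false, true},
            {true,  false, true,  false},
            {false, true,  false, true},
            {true,  false, true,  false},
        };  // the 4-cycle 0-1-2-3-0
        System.out.println(verify(adj, new int[]{0, 1, 2, 3}));  // true
        System.out.println(verify(adj, new int[]{0, 2, 1, 3}));  // false: 0 and 2 not adjacent
    }
}
```

Finding a Hamiltonian cycle is hard, but as the sketch shows, checking one runs in linear time, which is exactly what membership in NP requires.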
The Traveling Salesman Problem (TSP) is one of the most famous problems in
computer science and combinatorial optimization. In the context of Design and
Analysis of Algorithms (DAA), it serves as a primary example of an NP-Hard
optimization problem, while its "Decision" version is NP-Complete.
"Given a list of cities and the distances between each pair of cities, what is the shortest
possible route that visits each city exactly once and returns to the origin city?"
Key Components:
Graph Representation: Cities are nodes, and paths between them are weighted
edges.
Hamiltonian Cycle: The salesman must find a cycle that visits every vertex
once.
1. NP: If someone gives you a solution (a tour), you can verify in polynomial time
whether its total weight is less than a value K.
Note: The "Optimization" version (find the absolute shortest tour) is NP-Hard. The
"Decision" version (is there a route shorter than K?) is NP-Complete.
5.3 Example:
Given a set of cities and distance between every pair of cities, the problem is to find
the shortest possible tour that visits every city exactly once and returns to the starting
point.
From \ To A B C D
A 0 10 15 20
B 10 0 35 25
C 15 35 0 30
D 20 25 30 0
For $n=4$ cities, we fix the starting point at A. There are $(n-1)! = 3! = 6$ possible
Hamiltonian cycles.
Route Index   Hamiltonian Cycle   Calculation of Total Weight   Total Cost
1             A-B-C-D-A           10 + 35 + 30 + 20             95
2             A-B-D-C-A           10 + 25 + 30 + 15             80
3             A-C-B-D-A           15 + 35 + 25 + 20             95
4             A-C-D-B-A           15 + 30 + 25 + 10             80
5             A-D-B-C-A           20 + 25 + 35 + 15             95
6             A-D-C-B-A           20 + 30 + 35 + 10             95
The minimum total cost is 80, achieved for example by the tour A-B-D-C-A.
In DAA, we visualize these calculations using a State-Space Tree. Each level of the
tree represents choosing the next city to visit.
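The exhaustive walk over this state-space tree can be sketched directly on the 4-city matrix above; the class and method names are illustrative:

```java
public class TspBruteForce {
    static int best;

    // exhaustive search: fix city 0 (A) as the start, try every order of the remaining cities
    static int solve(int[][] d) {
        best = Integer.MAX_VALUE;
        int[] rest = new int[d.length - 1];
        for (int i = 0; i < rest.length; i++) rest[i] = i + 1;
        permute(d, rest, 0, 0, 0);
        return best;
    }

    static void permute(int[][] d, int[] rest, int i, int cost, int last) {
        if (i == rest.length) { best = Math.min(best, cost + d[last][0]); return; }
        for (int j = i; j < rest.length; j++) {
            int t = rest[i]; rest[i] = rest[j]; rest[j] = t;  // choose rest[j] as the next city
            permute(d, rest, i + 1, cost + d[last][rest[i]], rest[i]);
            t = rest[i]; rest[i] = rest[j]; rest[j] = t;      // undo the choice (backtrack)
        }
    }

    public static void main(String[] args) {
        int[][] d = {
            {0, 10, 15, 20},
            {10, 0, 35, 25},
            {15, 35, 0, 30},
            {20, 25, 30, 0},
        };
        System.out.println("Shortest tour cost: " + solve(d));  // 80
    }
}
```

Each recursion level corresponds to one level of the state-space tree (choosing the next city), and the (n-1)! leaves are the complete tours; fixing the start city avoids counting rotations of the same cycle.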
4. Mathematical Complexity
Since we cannot solve TSP in polynomial time for large datasets, we use different
algorithmic strategies:
Brute Force (Exhaustive Search)   O(n!)   Guaranteed optimal; slow.
5. Applications