Lecture 18

CSE 326: Data Structures

Lecture #18
Exploring Graphs
Bart Niswonger
Summer Quarter 2001
Today’s Outline
• Stuff Bart didn’t finish Friday
• Graph Algorithms
– Shortest Path
• Dijkstra
– Minimum Spanning Tree
• Kruskal
• Prim
Single Source, Shortest Path
Given a graph G = (V, E) and a vertex s ∈ V,
find the shortest path from s to every vertex in V

Many variations:
– weighted vs. unweighted
– cyclic vs. acyclic
– positive weights only vs. negative weights
allowed
– multiple weight types to optimize
Dijkstra’s Algorithm for Single Source Shortest Path
• Classic algorithm for solving shortest path
in weighted graphs without negative
weights
• A greedy algorithm (irrevocably makes
decisions without considering future
consequences)
• Intuition:
– shortest path from source vertex to itself is 0
– cost of going to adjacent nodes is at most edge
weights
– cheapest of these must be shortest path to that
node
– update paths for new node and continue picking
cheapest path
Dijkstra’s Pseudocode
(actually, our pseudocode for Dijkstra’s algorithm)

Mark every node as unknown
Initialize the cost of each node to ∞
Initialize the cost of the source to 0
While there are unknown nodes left in the graph
    Select the unknown node n with the lowest cost
    Mark n as known
    For each node a which is adjacent to n
        a’s cost = min(a’s old cost, n’s cost + cost of (n, a))
Dijkstra’s Algorithm in Action
[Figure: example weighted graph on vertices A through H, next to a table with columns vertex / known / cost and one empty row for each vertex A through H.]
The Cloud Proof
[Figure: THE KNOWN CLOUD, containing the Source. G is reached by the next shortest path from inside the known cloud. A supposedly better path to the same node leaves the cloud through P.]

But, if the path to G is the next shortest path, the path to P must be at least as long.
So, how can the path through P to G be shorter?
Inside the Cloud (Proof)
Everything inside the cloud has the correct shortest
path

Proof is by induction on the # of nodes in the cloud:


– initial cloud is just the source with shortest path 0
– inductive step: once we prove the shortest path to G is
correct, we add it to the cloud

Negative weights blow this proof away!


Data Structures (for Dijkstra’s Algorithm)
|V| times:
    Select the unknown node with the lowest cost
        findMin/deleteMin
|E| times:
    a’s cost = min(a’s old cost, …)
        decreaseKey
        find by name
runtime:
Revenge of Dijkstra Pseudocode
Initialize the cost of each node to ∞
s.cost = 0;
heap.insert(s);
while (! heap.empty())
    n = heap.deleteMin()
    for (each node a which is adjacent to n)
        if (n.cost + edge[n,a].cost < a.cost) then
            a.cost = n.cost + edge[n,a].cost
            a.path = n;
            if (heap.contains(a)) then
                heap.decreaseKey(a)
            else heap.insert(a)
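The pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, not the course's reference implementation. Python's heapq module has no decreaseKey, so the sketch swaps in the common "lazy deletion" variant: it pushes a fresh heap entry whenever a cost improves and skips stale entries when they are removed. The adjacency-dict layout, the function name dijkstra, and the small example graph are assumptions made for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Return (cost, path) dicts for shortest paths from source.

    graph: dict mapping node -> list of (neighbor, weight) pairs with
    non-negative weights (a hypothetical layout chosen for this sketch).
    """
    cost = {node: float("inf") for node in graph}
    path = {node: None for node in graph}
    cost[source] = 0
    heap = [(0, source)]
    known = set()
    while heap:
        c, n = heapq.heappop(heap)        # deleteMin
        if n in known:
            continue                      # stale entry: n was already finalized
        known.add(n)
        for a, w in graph[n]:
            if c + w < cost[a]:
                cost[a] = c + w
                path[a] = n
                # Stands in for decreaseKey / insert in the pseudocode.
                heapq.heappush(heap, (cost[a], a))
    return cost, path

# Small made-up graph (not the one drawn on the slides).
example = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1), ("D", 4)],
    "C": [("A", 5), ("B", 1), ("D", 1)],
    "D": [("B", 4), ("C", 1)],
}
cost, path = dijkstra(example, "A")
print(cost)   # shortest costs from A
print(path)   # predecessor of each node on its shortest path
```

Lazy deletion can leave up to |E| entries in the heap, which keeps the sketch short at the price of some extra memory compared with a true decreaseKey.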
Single Source & Goal
Suppose we only care about
shortest path from source s to a
particular vertex g
– Run Dijkstra to completion
– Stop early? When?
• When g is added to the priority queue
• When g is removed from the priority
queue
• When the priority queue is empty
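A hedged sketch of the middle option: stop as soon as g is removed from the priority queue. With non-negative weights its cost is final at that point, whereas stopping when g is first added can be too early, because a cheaper route to g may still turn up. The function name shortest_path_to_goal, the adjacency-dict layout, and the three-node example are assumptions for illustration. As Python's heapq has no decreaseKey, stale heap entries are skipped instead.

```python
import heapq

def shortest_path_to_goal(graph, s, g):
    """Return (cost, route) from s to g, or None if g is unreachable.

    graph: dict mapping node -> list of (neighbor, weight) pairs with
    non-negative weights (hypothetical layout for this sketch).
    """
    cost = {s: 0}
    parent = {s: None}
    heap = [(0, s)]
    known = set()
    while heap:
        c, n = heapq.heappop(heap)
        if n in known:
            continue                      # stale entry
        known.add(n)
        if n == g:                        # g removed from the queue: stop early
            route = []
            while n is not None:
                route.append(n)
                n = parent[n]
            return c, route[::-1]
        for a, w in graph.get(n, []):
            if c + w < cost.get(a, float("inf")):
                cost[a] = c + w
                parent[a] = n
                heapq.heappush(heap, (c + w, a))
    return None

# G is first added with cost 10, but the path through A costs only 3.
example = {"S": [("G", 10), ("A", 1)], "A": [("G", 2)], "G": []}
result = shortest_path_to_goal(example, "S", "G")
print(result)
```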
Spanning Trees
Spanning tree: a subset of the edges from a
connected graph that…
…touches all vertices in the graph (spans the
graph)
…forms a tree (is connected and contains no
cycles)
[Figure: small weighted graph with edge costs 4, 7, 9, 2, 1, 5.]

Minimum spanning tree (MST): the spanning tree with the least total edge cost.
Applications of MSTs
• Communication networks

• VLSI design

• Transportation systems

• Good approximation to some NP-hard problems
Kruskal’s Algorithm for MSTs
A greedy algorithm:

Initialize all vertices to unconnected
While there are still unmarked edges
    Pick a lowest cost edge e = (u, v) and mark it
    If u and v are not already connected, add e
    to the minimum spanning tree and connect u and v
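The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: edges are sorted once instead of being pulled from a heap, "connected" is tracked with a simple union-find using path halving (no union by rank), and the names kruskal and find as well as the example graph are made up for this example.

```python
def kruskal(vertices, edges):
    """Return the list of MST edges for a connected undirected graph.

    edges: list of (weight, u, v) tuples (hypothetical layout).
    """
    parent = {v: v for v in vertices}     # every vertex starts unconnected

    def find(x):
        # Follow parent links to the set representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):         # lowest cost edge first
        ru, rv = find(u), find(v)
        if ru != rv:                      # u and v not already connected
            parent[ru] = rv               # connect u and v (union)
            mst.append((w, u, v))
    return mst

# Small made-up graph (not the one drawn on the slides).
vertices = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (2, "B", "C"), (3, "A", "C"), (4, "C", "D"), (5, "B", "D")]
mst = kruskal(vertices, edges)
print(mst, sum(w for w, _, _ in mst))
```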
Kruskal’s Algorithm in Action (1/5)

[Figure: the example weighted graph on vertices A through H.]
Kruskal’s Algorithm in Action (2/5)

[Figure: the example weighted graph on vertices A through H.]
Kruskal’s Algorithm in Action (3/5)

[Figure: the example weighted graph on vertices A through H.]
Kruskal’s Algorithm in Action (4/5)

[Figure: the example weighted graph on vertices A through H.]
Kruskal’s Algorithm Completed (5/5)

[Figure: the example weighted graph on vertices A through H.]
Why Greediness Works
The algorithm produces a spanning tree. Why?
Proof by contradiction: Kruskal’s finds the minimum:
Assume another spanning tree has lower cost than
Kruskal’s
Pick an edge e1 = (u,v) in that tree that’s not in Kruskal’s
Kruskal’s connects u’s and v’s sets with another edge e2
But e2 must have at most the same cost as e1!
So, swap e2 for e1 (at worst keeping the cost the same)
Repeat until the tree is identical to Kruskal’s:
contradiction!

QED: Kruskal’s algorithm finds an MST


Data Structures (for Kruskal’s Algorithm)
Once:
    Initialize heap of edges… buildHeap
|E| times:
    Pick the lowest cost edge… findMin/deleteMin
|E| times:
    If u and v are not already connected… union
    …connect u and v.

runtime: |E| + |E| log |E| + |E| ack(|E|,|V|)
Prim’s Algorithm
• Can also find Minimum Spanning Trees using a
variation of Dijkstra’s algorithm:
Pick an initial node
Until graph is connected:
    Choose edge (u,v) which is of minimum cost
    among edges where u is in tree but v is not
    Add (u,v) to the tree

• Same “greedy” proof, same asymptotic complexity
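A minimal Python sketch of Prim's algorithm as described above, assuming a connected undirected graph stored as an adjacency dict. The heap holds candidate edges (u, v) with u in the tree; entries whose far end has joined the tree in the meantime are skipped. The function name prim, the data layout, and the example graph are assumptions for illustration.

```python
import heapq

def prim(graph, start):
    """Return MST edges as (u, v, weight), growing the tree from start.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    with each undirected edge listed in both directions (hypothetical layout).
    """
    in_tree = {start}
    tree_edges = []
    frontier = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(frontier)
    while frontier and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(frontier)   # cheapest edge leaving the tree
        if v in in_tree:
            continue                        # both ends already in the tree
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(frontier, (wx, v, x))
    return tree_edges

# Small made-up graph (not the one drawn on the slides).
example = {
    "A": [("B", 1), ("C", 3)],
    "B": [("A", 1), ("C", 2), ("D", 5)],
    "C": [("A", 3), ("B", 2), ("D", 4)],
    "D": [("C", 4), ("B", 5)],
}
tree = prim(example, "A")
print(tree)
```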
Does Greedy Always Work?
• Consider the following problem:
– Given a graph G = (V,E) and a designated
subset of vertices S, find a minimum cost
tree that includes all of S
• Exactly the same as a minimum
spanning tree, except that it doesn’t
have to include ALL the vertices – only
the specified subset of vertices.
– Does Kruskal or Prim work?
Nope!
• Greedy can fail to be optimal
– because different solutions may contain different
“non-designated” vertices, the proof that you can convert
one to the other doesn’t go through
• This Minimum Steiner Tree problem has no
known solution of O(n^k) for any fixed k
– This is an NP-complete problem
– Finding a spanning tree and then pruning it is a
pretty good approximation
Huge Graphs
• Consider some really huge graphs…
– All cities and towns in the World Atlas
– All stars in the Galaxy
– All ways 10 blocks can be stacked
Huh???
Implicitly Generated Graphs
• A huge graph may be implicitly specified by rules
for generating it on-the-fly
• Blocks world:
– vertex = relative positions of all blocks
– edge = robot arm could stack one block
stack(green,blue)

stack(blue,red)

stack(green,red)
Blocks World
source: initial state of the blocks
goal: desired state of the blocks
path from source to goal = sequence
of actions (program) for robot arm!

• n blocks → n^n states
• 10 blocks → 10 billion states
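A hedged sketch of generating such a graph on the fly: instead of storing all vertices, a function produces the neighbors of a state only when the search asks for them. The state encoding (a frozenset of tuples, each tuple one stack listed bottom to top) and the function name successors are assumptions for illustration; the slides do not fix a representation.

```python
def successors(state):
    """Yield every state reachable by moving one top block.

    state: frozenset of tuples, each tuple a stack from bottom to top
    (hypothetical encoding chosen for this sketch).
    """
    stacks = list(state)
    for i, src in enumerate(stacks):
        block, rest = src[-1], src[:-1]
        others = stacks[:i] + stacks[i + 1:]
        leftover = [rest] if rest else []      # what remains of the source stack
        if rest:
            # Put the block down on the table as a new one-block stack.
            yield frozenset(others + leftover + [(block,)])
        for j, dst in enumerate(others):
            # Stack the block on top of another stack.
            untouched = others[:j] + others[j + 1:]
            yield frozenset(untouched + leftover + [dst + (block,)])

# Three blocks, all on the table.
start = frozenset({("red",), ("green",), ("blue",)})
neighbors = set(successors(start))
print(len(neighbors))
```

A search such as Dijkstra's algorithm can call successors(n) wherever it would otherwise read an adjacency list, so only the states it actually reaches are ever built.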
Problem: Branching Factor
• Dijkstra’s algorithm is basically breadth-
first search (modulo arc weights)
– Visits all nodes (exhaustive search)
• Suppose we know that goal is only d steps
away.
• If out-degree of each node is 10,
potentially visits 10^d vertices
– 10 step plan => 10 billion vertices!

Cannot search such huge graphs exhaustively!
An Easier Case
• Suppose you live in Manhattan; what do you do?

[Figure: Manhattan street grid, 50th St through 52nd St crossing 2nd Ave through 10th Ave, with start S and goal G marked.]
Best-First Search
The Manhattan distance (Δx + Δy) is an
estimate of the distance to the goal
– a heuristic value

Best-First Search
– Order nodes in priority to minimize
estimated distance to the goal
Compare: Dijkstra
– Order nodes in priority to minimize
distance from the start
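A minimal sketch of best-first search on a street grid, ordering the priority queue by the Manhattan estimate alone. The grid encoding ((x, y) intersections, a blocked set of closed intersections), the function names manhattan and best_first, and the example are assumptions for illustration.

```python
import heapq

def manhattan(a, b):
    """Heuristic: |dx| + |dy| between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def best_first(start, goal, width, height, blocked=frozenset()):
    """Greedy best-first search on a 4-connected grid.

    Returns a route from start to goal (not necessarily the shortest),
    or None if the goal cannot be reached.
    """
    parent = {start: None}                    # also records which nodes were seen
    heap = [(manhattan(start, goal), start)]
    while heap:
        _, n = heapq.heappop(heap)            # node with the smallest estimate
        if n == goal:
            route = []
            while n is not None:
                route.append(n)
                n = parent[n]
            return route[::-1]
        x, y = n
        for a in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= a[0] < width and 0 <= a[1] < height
                    and a not in blocked and a not in parent):
                parent[a] = n
                heapq.heappush(heap, (manhattan(a, goal), a))
    return None

route = best_first((0, 0), (3, 2), width=5, height=4)
print(route)
```

Replacing the priority with the distance travelled from the start turns the same loop into Dijkstra's ordering, which is the comparison the slide draws.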
Best First in Action
• Suppose you live in Manhattan; what do you do?

[Figure: Manhattan street grid, 50th St through 52nd St crossing 2nd Ave through 10th Ave, with start S and goal G marked.]
Being Misled
• Will get back on track!

[Figure: street grid, 50th St through 52nd St crossing 2nd Ave through 10th Ave, with S near 10th Ave and G near 2nd Ave.]
Optimality
• Does Best-First Search find the
shortest path
– when the goal is first seen?
– when the goal is removed from
priority queue?
Sub-Optimal Solution
• Goal is by definition at distance 0: will be
removed from priority queue immediately,
even if a shorter path exists!

[Figure: street grid, 51st St and 52nd St crossing 4th Ave through 9th Ave, with S and G marked and one route labeled "5 blocks".]
Synergy?
Dijkstra / Breadth First guaranteed to find
optimal solution
Best First often visits far fewer vertices,
but may not provide optimal solution

– Can we get the best of both?


A*
A* - Order vertices in priority queue to
minimize
(distance from start) + (estimated
distance to goal)

f(n) = g(n) + h(n)

f(n) = priority of a node


g(n) = true distance from start
h(n) = heuristic distance to goal
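A minimal sketch of A* on a street grid, using f(n) = g(n) + h(n) as the priority with the Manhattan distance as h. The grid encoding, the function names manhattan and a_star, unit-length blocks, and the walled example are assumptions for illustration. As Python's heapq has no decreaseKey, improved entries are pushed again and already-expanded nodes are skipped on removal.

```python
import heapq

def manhattan(a, b):
    """Heuristic h: |dx| + |dy| between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(start, goal, width, height, blocked=frozenset()):
    """A* on a 4-connected grid with unit-cost moves.

    Returns (cost, route), or None if the goal cannot be reached.
    """
    g = {start: 0}                            # true distance from start
    parent = {start: None}
    heap = [(manhattan(start, goal), start)]  # priority f = g + h
    closed = set()
    while heap:
        _, n = heapq.heappop(heap)
        if n in closed:
            continue                          # stale entry
        if n == goal:                         # goal removed from the queue
            cost, route = g[n], []
            while n is not None:
                route.append(n)
                n = parent[n]
            return cost, route[::-1]
        closed.add(n)
        x, y = n
        for a in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= a[0] < width and 0 <= a[1] < height and a not in blocked:
                new_g = g[n] + 1
                if new_g < g.get(a, float("inf")):
                    g[a] = new_g
                    parent[a] = n
                    heapq.heappush(heap, (new_g + manhattan(a, goal), a))
    return None

# A wall at x = 2 forces a detour through (2, 2).
result = a_star((0, 1), (4, 1), width=5, height=3,
                blocked=frozenset({(2, 0), (2, 1)}))
print(result)
```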
Optimality
• Suppose the estimated distance (h)
is ≤ the true distance to the goal
– (heuristic is a lower bound)

• Then: when the goal is removed
from the priority queue, we are
guaranteed to have found a shortest
path!
Optimality Revisited

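[Figure: street grid, 50th St through 52nd St crossing 4th Ave through 9th Ave, with S and G marked, one route labeled "5 blocks", and vertices annotated with g + h priorities such as 1+4=5. Caption: Dijkstra would have visited these guys!]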
Revised Cloud Proof
• Suppose we have found a path of cost c to G which is not optimal
– priority(G) = f(G) = g(G) + h(G) = c + 0 = c
• Say N is the last vertex on an optimal path P to G which has been
added to the queue but not yet dequeued.
– There must be such an N, otherwise the optimal path would have been found.
– priority(N) = f(N) = g(N) + h(N) ≤ g(N) + actual cost N to G
= cost of path P < c
• So N will be dequeued before G is dequeued
• Repeat argument to show entire optimal path will be expanded before G
is dequeued.

[Figure: optimal path from S through N to G, alongside the path of cost c already found to G.]
A Little History
• A* invented by Nils Nilsson &
colleagues in 1968
– or maybe some guy in Operations
Research?
• Cornerstone of artificial intelligence
– still a hot research topic!
– iterative deepening A*, automatically
generating heuristic functions, …
• Method of choice for searching large
(even infinite) graphs when a good
heuristic function can be found
What About Those Blocks?
• “Distance to goal” is not always physical
distance
• Blocks world:
– distance = number of stacks to perform
– heuristic lower bound = number of blocks out of place

# out of place = 2, true distance to goal = 3


Other Examples
• Simplifying Integrals
– vertex = formula
– goal = closed form formula without
integrals
– arcs = mathematical transformations
n 1
x
 x dx  n  1
n

– heuristic = number of integrals remaining


in formula
DNA Sequencing
• Problem: given chopped up DNA,
reassemble
• Vertex = set of pieces
• Arc = stick two pieces together
• Goal = only one piece left
• Heuristic = number of pieces remaining - 1
Solving Simultaneous Equations
• Input: set of equations
• Vertex = assignment of values to some
of the variables
• Edge = Assign a value to one more
variable
• Goal = Assignment that simultaneously
satisfies all the equations
• Heuristic = Number of equations not
yet satisfied
What ISN’T A*?

essentially, nothing.
To Do
• Project IV
– Write a Graph Data Structure!
• Start reading Chapter 10
• Think about your favorite T-shirt
Coming Up
• Kinds of algorithms

• No Quiz!
• Other Data Structures
