Lecture 14 – Graphs
[Figure: example weighted graph with vertices Masaki, Mwenge, Oysterbay, Magomeni, Town and JNIA, connected by weighted edges]
Applications of graphs
Besides representing many things (e.g. a tunnel of pipes, a railway map, cities connected by flights or ferries), graphs can also be used to answer questions such as:
How do we get quickly from A to B?
What is the shortest route between A and B?
There is also the famous Travelling Salesman Problem, which involves finding the shortest route through the structure that visits each city precisely once.
Graphs are used extensively in computer science to depict network graphs, or semantic graphs or
even to depict the flow of computation.
Graphs are widely used in Compilers to depict allocation of resources to processes or to indicate
data flow analysis, etc.
Graphs are also used for query optimization in database languages in some specialized compilers.
In social networking sites, graphs are the main structures used to depict the network of people.
Graphs are extensively used to model transportation systems, especially road networks. A
popular example is Google Maps, which uses graphs extensively to provide directions all over the
world.
Graph terminology
The kind of structure in the above figure is known formally as a graph.
A graph consists of a set of nodes (also called vertices or points) and
edges (also called lines, links or, in directed graphs, arcs), displayed as connections between
the nodes.
A graph is said to be simple if it has no self-loops (i.e., edges connected at both ends to the
same vertex) and no more than one edge connecting any pair of vertices.
If there are labels on the edges (usually non-negative real numbers), we say that the graph is
weighted.
In directed graphs (also called digraphs), each edge comes with one or two directions, which
are usually indicated by arrows.
In undirected graphs, we assume that every edge can be viewed as going both ways, that is, an
edge between A and B goes from A to B as well as from B to A.
A path is a sequence of nodes or vertices v1, v2, . . . , vn such that vi and vi+1 are connected by an
edge for all 1 ≤ i ≤ n−1.
Graph terminology (cont)
A circle (also called a cycle) is a non-empty path whose first vertex is the same as its last vertex.
A path is simple if no vertex appears on it twice (with the exception of a circle, where the first and last
vertex may be the same – this is because we have to ‘cut open’ the circle at some point to get a path, so this
is inevitable).
An undirected graph is connected if every pair of vertices has a path connecting them.
A digraph (directed graph) is:
weakly connected if for every two vertices A and B there is either a path from A to B or a path from B to A.
strongly connected if there are paths leading both ways.
So, in a weakly connected digraph, there may be two vertices i and j such that there exists no path from i to
j.
Because a graph, unlike a tree, does not come with a natural ‘starting point’ from which there is a unique
path to each vertex, it does not make sense to speak of parents and children in a graph.
Instead, if two vertices A and B are connected by an edge e, we say that they are neighbours, and the edge
connecting them is said to be incident to A and B.
Two edges that have a vertex in common (for example, one connecting A and B and one connecting B and
C) are said to be adjacent.
Graph terminology (cont)
[Figure: the example weighted graph again – vertices Masaki, Mwenge, Oysterbay, Magomeni, Town and JNIA]
Graph representations
Graphs represented in computer memory
Two common ways
Adjacency matrices
Adjacency lists
Adjacency Matrices
Let G be a graph with n vertices, where n > 0
Let V(G) = {v1, v2, ..., vn}
The adjacency matrix of G is an n × n matrix A such that A[i][j] = 1 if there is an edge from vi to vj, and A[i][j] = 0 otherwise
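As an illustration (not part of the original slides), a minimal Java sketch of an adjacency-matrix graph might look as follows; the class and method names are assumptions chosen for the example.

// Minimal adjacency-matrix graph sketch; vertices are labelled 0 .. n-1.
public class AdjacencyMatrixGraph {
    private final boolean[][] adj;   // adj[i][j] is true iff there is an edge from vi to vj

    public AdjacencyMatrixGraph(int n) {
        adj = new boolean[n][n];     // initially no edges
    }

    public void addEdge(int i, int j) {
        adj[i][j] = true;            // for an undirected graph, also set adj[j][i] = true
    }

    public boolean hasEdge(int i, int j) {
        return adj[i][j];
    }
}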
Adjacency Lists
Given:
A graph G with n vertices, where n > 0
V(G) = {v1, v2, ..., vn}
For each vertex v there is a linked list; each node of the list stores a vertex u such that (v, u) ∈ E(G)
Use an array A of size n, such that A[i] is a reference variable pointing to the first node of the linked list of vertices to which vi is adjacent
Each node has two components: vertex and link
The vertex component contains the index of a vertex adjacent to vertex i
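A minimal Java sketch of this structure, assuming the node components named above (vertex and link) and an array A of list heads, could look like this; all identifiers are illustrative.

// Adjacency-list graph sketch using the node components described above.
public class AdjacencyListGraph {
    static class Node {
        int vertex;   // index of a vertex adjacent to the list's own vertex
        Node link;    // reference to the next node in the list
        Node(int vertex, Node link) { this.vertex = vertex; this.link = link; }
    }

    private final Node[] A;           // A[i] points to the first node of vi's list

    public AdjacencyListGraph(int n) {
        A = new Node[n];              // all lists start empty
    }

    public void addEdge(int i, int j) {
        A[i] = new Node(j, A[i]);     // insert j at the front of vi's list
    }
}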
Adjacency Lists (cont)
Implementing graphs
Unlike other data structures, in graphs we have connections between the nodes that we need to
keep track of.
Array-based implementation:
rename the vertices of the graph so that they are labelled by non-negative integer indices, say
from 0 to n − 1, if they do not have these labels already.
Then we only need to keep track of which vertex has an edge to which other vertex, and, for
weighted graphs, what the weights on the edges are.
For unweighted graphs, we can do this quite easily in an n × n two-dimensional binary array (matrix)
adj – the so-called adjacency matrix.
In the case of weighted graphs, we instead have an n × n weight matrix weights.
Example of array-based implementation – directed graph
Adjacency matrix adj (unweighted graph):

        A  B  C  D  E
        0  1  2  3  4
  A  0  0  1  0  1  0
  B  1  0  0  1  0  0
  C  2  1  0  0  0  1
  D  3  0  0  1  0  1
  E  4  0  0  0  0  0

Weight matrix weights (weighted graph):

        A  B  C  D  E
        0  1  2  3  4
  A  0  0  1  ∞  4  ∞
  B  1  2  0  2  2  6
  C  2  ∞  3  0  2  1
  D  3  ∞  ∞  ∞  0  1
  E  4  ∞  ∞  3  2  0
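As a hedged illustration, the weight matrix of the directed example above could be stored in Java roughly as follows, using a large sentinel constant to stand for ∞ (no edge); the class name and the choice of sentinel are assumptions.

// Possible storage of the weight matrix above; INF marks "no edge" (∞).
public class WeightMatrixExample {
    static final int INF = Integer.MAX_VALUE;

    // rows/columns: A=0, B=1, C=2, D=3, E=4
    static final int[][] weights = {
        {  0,   1, INF,   4, INF },
        {  2,   0,   2,   2,   6 },
        {INF,   3,   0,   2,   1 },
        {INF, INF, INF,   0,   1 },
        {INF, INF,   3,   2,   0 }
    };
}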
Example of array-based implementation – undirected graph
[Figure: the example weighted undirected graph – Masaki, Mwenge, Oysterbay, Magomeni, Town and JNIA]
Weight matrix:

                JNIA  Magomeni  Masaki  Mwenge  Oysterbay  Town
                  0       1        2       3        4        5
  JNIA       0    0       8        ∞       ∞        ∞       10
  Magomeni   1    8       0        ∞       5        4        ∞
  Masaki     2    ∞       ∞        0       6        3        ∞
  Mwenge     3    ∞       5        6       0        5        ∞
  Oysterbay  4    ∞       4        3       5        0        5
  Town       5   10       ∞        ∞       ∞        5        0
Array based implementation
This means that adj[i][j] == adj[j][i] will hold for all i and j from 0 to n − 1, so there is
some redundant information here. We say that such a matrix is symmetric – it equals its
mirror image along the main diagonal.
Mixed implementation
The array based implementation has a potential problem with the adjacency/weight matrix
representation:
If the graph has very many vertices, the associated array will be extremely large (e.g.,
10,000 entries are needed if the graph has just 100 vertices).
Then, if the graph is sparse (i.e., has relatively few edges), the adjacency matrix contains
many 0s and only a few 1s, and it is a waste of space to reserve so much memory for so
little information.
A solution to this problem is to number all the vertices as before, but, rather than using a
two-dimensional array, use a one-dimensional array that points to a linked list of
neighbours for each vertex.
The above weighted graph can be represented as follows, with each triple consisting of a
vertex name, connection weight and pointer to the next triple.
This implementation uses so-called adjacency lists.
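A minimal sketch of such a triple, assuming Java and illustrative field names, is shown below; it simply extends the adjacency-list node from the earlier sketch with a weight field.

// Hypothetical node type for the adjacency-list-of-triples representation:
// vertex name (index), connection weight, and pointer to the next triple.
class Triple {
    int vertex;    // index of the neighbouring vertex
    int weight;    // weight of the connecting edge
    Triple next;   // pointer to the next triple in the list

    Triple(int vertex, int weight, Triple next) {
        this.vertex = vertex;
        this.weight = weight;
        this.next = next;
    }
}
// A one-dimensional array Triple[] lists of size n would then hold, at lists[i],
// a reference to the first triple of vertex vi's neighbour list.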
Mixed implementation (cont)
If there are very few edges, we will have very short lists at each
entry of the array, thus saving space over the adjacency/weight
matrix representation.
Note that if we are considering undirected graphs, there is still a
certain amount of redundancy in this representation, since every
edge is represented twice, once in each list corresponding to the
two vertices it connects.
Pointer-based implementation
The standard pointer-based implementation of binary trees, which is essentially a
generalization of linked lists, can be generalized for graphs.
Graph ADT operations
Edge Objects:
The edge object for an edge e has member variables for:
Start vertex value
End vertex value
Weight of the edge
Graph Data Structure (adjacency list)
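A possible Java sketch of the edge object and an adjacency-list graph built from it is given below; the field names (start, end, weight) follow the member variables listed above, while the Graph class and its methods are assumptions made for the example.

// Edge object with start vertex, end vertex and weight, plus an adjacency-list graph.
import java.util.ArrayList;
import java.util.List;

class Edge {
    int start;    // start vertex value
    int end;      // end vertex value
    int weight;   // weight of the edge
    Edge(int start, int end, int weight) {
        this.start = start; this.end = end; this.weight = weight;
    }
}

class Graph {
    private final List<List<Edge>> adjacency;   // one edge list per vertex

    Graph(int n) {
        adjacency = new ArrayList<>();
        for (int i = 0; i < n; i++) adjacency.add(new ArrayList<>());
    }

    void addEdge(int start, int end, int weight) {
        adjacency.get(start).add(new Edge(start, end, weight));
    }

    List<Edge> neighbours(int v) { return adjacency.get(v); }
}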
Processing a graph
Requires ability to traverse the graph, i.e. to systematically visit all vertices/edges
Traversing a graph
Similar to traversing a binary tree but a bit more complicated
Two most common graph traversal algorithms
Depth first traversal
Breadth first traversal
Depth first traversal
Given a vertex i to start from, put it on an empty stack
Then we take it from the stack, mark it as done, look up its neighbours one after the other,
and put them onto the stack.
We then repeatedly pop the next vertex from the stack, mark it as done, and put its
neighbours on the stack, provided they have not been marked as done.
Depth first traversal algorithm
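Since the pseudocode itself is not reproduced here, the following is a hedged Java sketch of the stack-based procedure described above; it reuses the hypothetical Graph and Edge classes from the earlier sketch and takes the number of vertices n as a parameter.

// Stack-based depth first traversal sketch.
import java.util.ArrayDeque;
import java.util.Deque;

class DepthFirstTraversal {
    static void dfs(Graph g, int start, int n) {
        boolean[] done = new boolean[n];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);                         // put the start vertex on an empty stack
        while (!stack.isEmpty()) {
            int v = stack.pop();                   // take the next vertex from the stack
            if (done[v]) continue;                 // skip vertices already marked as done
            done[v] = true;                        // mark it as done (visit it here)
            System.out.println("visiting " + v);
            for (Edge e : g.neighbours(v)) {       // push neighbours not yet done
                if (!done[e.end]) stack.push(e.end);
            }
        }
    }
}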
Breadth first traversal
Given a vertex i to start from, put it on an empty queue
We then remove the first vertex from the queue, mark it as done, and one by one put its
neighbours at the end of the queue.
We then visit the next vertex in the queue and again put its neighbours at the end of the
queue, provided they have not been marked as done.
We do this until the queue is empty.
To see why this is called breadth first search, we can imagine a tree being built up in this
way, where the starting vertex is the root, and the children of each vertex are its neighbours
(that haven’t already been visited). We would then first follow all the edges emanating
from the root, leading to all the vertices on level 1, then find all the vertices on the level
below, and so on, until we find all the vertices on the ‘lowest’ level.
Breadth first traversal algorithm
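Analogously, a hedged Java sketch of the queue-based procedure described above (again reusing the hypothetical Graph and Edge classes) might be:

// Queue-based breadth first traversal sketch.
import java.util.ArrayDeque;
import java.util.Queue;

class BreadthFirstTraversal {
    static void bfs(Graph g, int start, int n) {
        boolean[] done = new boolean[n];
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(start);                          // put the start vertex on an empty queue
        done[start] = true;                        // mark when enqueued so no vertex is added twice
        while (!queue.isEmpty()) {
            int v = queue.remove();                // remove the first vertex from the queue
            System.out.println("visiting " + v);
            for (Edge e : g.neighbours(v)) {       // enqueue neighbours not yet marked
                if (!done[e.end]) {
                    done[e.end] = true;
                    queue.add(e.end);
                }
            }
        }
    }
}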
Difference between the two traversals
Difference between DFS and BFS (cont)
1. Definition: DFS stands for Depth First Search; BFS stands for Breadth First Search.
2. Data structure: DFS uses a Stack; BFS uses a Queue.
3. Source: DFS is better when the target is far from the source; BFS is better when the target is close to the source.
4. Suitability for decision trees: DFS is more suitable for decision trees – with one decision we traverse further to explore it, and if we reach a conclusion we have won; BFS considers all neighbours first, so it is less suitable for the decision trees used in puzzle games.
5. Time complexity: both DFS and BFS run in O(V + E), where V is the number of vertices and E the number of edges.
Note: DFS is used synonymously with DFT (depth first traversal), and BFS with BFT (breadth first traversal).
Shortest Path
A common graph based problem is that we have some situation represented as a weighted
digraph with edges labelled by non-negative numbers and need to answer the following
question: For two particular vertices, what is the shortest route from one to the other?
Here, by “shortest route” we mean a path which, when we add up the weights along its
edges, gives the smallest overall weight for the path.
This number is called the length of the path.
Thus, a shortest path is one with minimal length. Sometimes the weight can be amount of
money or time.
Applications of shortest-path algorithms include:
internet packet routing e.g. OSPF, BGP
train-ticket reservation systems
driving route finders such as Google Maps
Dijkstra’s algorithm
Although we usually only need the shortest path from a given start node s to a given end node z, it is
actually most convenient to compute the shortest paths from s to all other nodes.
Given the start node, Dijkstra’s algorithm computes shortest paths starting from s and
ending at each possible node.
It maintains all the information it needs in simple arrays, which are iteratively updated
until the solution is reached.
Dijkstra’s algorithm pseudocode
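As the pseudocode is not reproduced here, the following is a hedged Java sketch of Dijkstra's algorithm using simple arrays, as described above; it assumes non-negative edge weights and reuses the hypothetical Graph and Edge classes from earlier.

// Array-based Dijkstra sketch: returns shortest distances from s to every vertex.
class Dijkstra {
    static int[] shortestPaths(Graph g, int s, int n) {
        final int INF = Integer.MAX_VALUE;
        int[] dist = new int[n];                   // current best distance from s
        boolean[] finished = new boolean[n];       // vertices whose distance is final
        java.util.Arrays.fill(dist, INF);
        dist[s] = 0;

        for (int round = 0; round < n; round++) {
            // pick the unfinished vertex with the smallest tentative distance
            int u = -1;
            for (int v = 0; v < n; v++) {
                if (!finished[v] && dist[v] != INF && (u == -1 || dist[v] < dist[u])) u = v;
            }
            if (u == -1) break;                    // remaining vertices are unreachable
            finished[u] = true;

            // relax all edges leaving u
            for (Edge e : g.neighbours(u)) {
                if (dist[u] + e.weight < dist[e.end]) {
                    dist[e.end] = dist[u] + e.weight;
                }
            }
        }
        return dist;                               // dist[z] is the shortest distance from s to z
    }
}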
Minimum Spanning Tree
Suppose you have been given a weighted undirected graph such
as the one on the right.
Think of the vertices as representing houses, and the weights as
the distances between them.
Imagine that you are tasked with supplying all these houses with
some commodity such as water, gas, or electricity.
You will want to keep the amount of digging and laying of pipes
or cable to a minimum.
So, what is the best pipe or cable layout that you can find, i.e.
what layout has the shortest overall length?
A spanning tree of a graph is a subgraph that is a tree which
connects all the vertices together, so it ‘spans’ the original graph
but using fewer edges.
Other applications are Network design (telephone, electrical,
hydraulic, TV cable, computer, road), handwriting recognition etc
Minimum Spanning Trees
Algorithms to find MST
Two well-known algorithms for finding a minimum spanning tree of a graph, based on Greedy
algorithms:
Prim’s algorithm
Kruskal’s algorithm
Greedy Algorithms
We say that an algorithm is greedy if it makes its decisions based only on what is best from the
point of view of ‘local considerations’, with no concern about how the decision might affect the
overall picture.
Prim’s algorithm (cont)