
Chapter 6 - DS

This document summarizes key concepts related to graphs and hashing. It defines graphs as a group of vertices connected by edges. Graphs can be directed or undirected. Common graph representations include adjacency matrices and adjacency lists. Graph traversal algorithms like depth-first search and breadth-first search are used to search graphs. Applications of graphs include representing networks and finding shortest paths. Hashing is also discussed but not described in detail.
Chapter 6

Graph and Hashing

CO6 - Demonstrate basic terminologies and representation of graph and hashing
Graph
• A graph can be defined as a group of vertices
and the edges that are used to connect these
vertices.
• Unlike a tree, a graph may contain cycles: the
vertices (nodes) can maintain any complex
relationship among them instead of only a
parent-child relationship.
• A graph G can be defined as an ordered pair
G(V, E), where V(G) represents the set of
vertices and E(G) represents the set of edges
which connect these vertices.
• A graph G(V, E) with five vertices (A, B, C, D, E)
and six edges ((A,B), (B,C), (C,E), (E,D), (D,B),
(D,A)) is shown in the following figure.
Graph Terminology
• Node or Vertex - An element of a graph. Vertices
are connected to each other through edges.
• Edge - A link (line) between two vertices in a
graph.
• Adjacent Nodes - If two nodes u and v are
connected via an edge e, then the nodes u and v
are called neighbors or adjacent nodes.
• Degree of the Node - The degree of a node is the
number of edges that are connected with that
node. A node with degree 0 is called an isolated
node.
Directed and Undirected Graph
• A graph can be directed or undirected.
• In an undirected graph, edges have no direction associated
with them.
• An undirected graph is shown in the above figure, since its
edges have no direction attached.
• If an edge exists between vertices A and B, then it can be
traversed from B to A as well as from A to B.
• In a directed graph, each edge is an ordered
pair of vertices.
• An edge (A, B) represents a path from vertex A
to vertex B only, not from B to A.
• Node A is called initial node while node B is
called terminal node.
• A directed graph is shown in the following
figure.
• Predecessor - The predecessor of a node is the node
that immediately precedes that node.
Node 2 is a predecessor of node 1.
• Successor - The successor of a node is the node that
immediately follows that node.
Node 1 is a successor of node 2.
• Indegree - The indegree of a vertex V is the number of edges
coming into the vertex V. Notation: deg⁻(V).
• Outdegree - The outdegree of a vertex V is the number of
edges going out from the vertex V. Notation: deg⁺(V).
Vertex 'a' has two edges, 'ad' and 'ab', which are going
outwards. Hence its outdegree is 2. Similarly, there is an edge
'ga', coming towards vertex 'a'. Hence the indegree of 'a' is 1.
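
The counting above can be sketched in a few lines of Python. This is only an illustration: the function name `degrees` is hypothetical, and the edge list contains just the three edges named in the text, not the whole figure.

```python
from collections import Counter

def degrees(edges):
    """Return (indegree, outdegree) counters for directed edges (u, v)."""
    indegree = Counter()
    outdegree = Counter()
    for u, v in edges:
        outdegree[u] += 1  # the edge leaves u
        indegree[v] += 1   # the edge enters v
    return indegree, outdegree

# Assumption: only the edges 'ad', 'ab' and 'ga' mentioned in the text.
edges = [("a", "d"), ("a", "b"), ("g", "a")]
indegree, outdegree = degrees(edges)
print(outdegree["a"], indegree["a"])  # 2 1
```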
• Path - A path is the sequence of nodes (or,
equivalently, of edges) followed in order to reach
some terminal node V from the initial node U.
Path(A, C) = { AB, BC }
• Closed Path - A path is called a closed path
if the initial node is the same as the terminal
node, i.e. V0 = VN.
• Simple Path - A path is simple if all of its nodes
are distinct. If all nodes are distinct with the
exception V0 = VN, then such a path P is called a
closed simple path.
• Cycle - A cycle can be defined as a path which
has no repeated edges or vertices except the first
and last vertices.
• Connected Graph - A connected graph is one
in which some path exists between every
two vertices (u, v) in V. There are no isolated
nodes in a connected graph.

Figure: Not Connected Graph and Connected Graph


• Complete Graph - A complete graph is one in which every
node is connected with all other nodes. A complete undirected
graph contains n(n-1)/2 edges, where n is the number of nodes
in the graph.
• Weighted Graph - In a weighted graph, each edge is assigned
some data such as a length or weight. The weight of an edge e
is written w(e) and is typically a positive value indicating
the cost of traversing the edge.
• Length - The length of the graph is defined as the
number of edges contained in the graph.
Length of the graph: 8
(edges AB, BC, CD, DE, EF, FA, AC, CE)
Representations of a graph
• In graph theory, a graph representation is a
technique to store a graph in the memory of a
computer.
• To represent a graph we need the set of vertices
and, for each vertex, the neighbors of that vertex
(the vertices which are directly connected to it by
an edge). If it is a weighted graph, then a weight
is also associated with each edge.
• The adjacency matrix and the adjacency list are
the commonly used representations.
Adjacency Matrix
• The adjacency matrix is a sequential representation.
• It is used to represent which nodes are adjacent
to each other, i.e. whether there is an edge
connecting a given pair of nodes.
• In this representation, we construct an n x n
matrix A. If there is an edge from vertex i to
vertex j, then the corresponding element of A is
a[i][j] = 1, otherwise a[i][j] = 0.
• For a weighted graph, store the weight of the
edge instead of 1s and 0s.
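
A minimal sketch of this construction in Python, using the five-vertex, six-edge graph defined at the start of the chapter. The function name `adjacency_matrix` is hypothetical, and treating that graph as undirected is an assumption, since its figure is not reproduced here.

```python
def adjacency_matrix(vertices, edges, directed=False):
    """Build an n x n matrix with 1 where an edge exists and 0 elsewhere."""
    index = {v: i for i, v in enumerate(vertices)}  # vertex name -> row/column
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            # An undirected edge can be traversed both ways,
            # so the matrix is symmetric.
            matrix[index[v]][index[u]] = 1
    return matrix

vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "E"), ("E", "D"), ("D", "B"), ("D", "A")]
matrix = adjacency_matrix(vertices, edges)  # assumed undirected
for name, row in zip(vertices, matrix):
    print(name, row)
```

For a weighted graph, the same sketch would store the edge weight in place of 1.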
Undirected graph representation

1 represents an edge from row vertex to column vertex, and 0
represents no edge from row vertex to column vertex.
Directed graph representation

1 represents an edge from row vertex to column vertex, and 0
represents no edge from row vertex to column vertex.
Undirected weighted graph representation
• Advantage: The representation is easy to
implement and follow.
• Disadvantage: It takes a lot of space (n x n
entries regardless of the number of edges), and
visiting all the neighbors of a vertex requires
scanning an entire row of n entries, which takes
time when n is large.
Adjacency List

• The adjacency list is a linked representation.
• In this representation, for each vertex in the
graph, we maintain the list of its neighbors. It
means every vertex of the graph has a list
of its adjacent vertices.
• We have an array of vertices which is indexed
by the vertex number, and for each vertex v
the corresponding array element points to
a singly linked list of neighbors of v.
• Advantages:
1. An adjacency list saves a lot of space.
2. We can easily insert or delete as we use a linked
list.
3. This kind of representation is easy to follow and
clearly shows the adjacent nodes of a node.
• Disadvantages:
1. Testing whether two vertices are adjacent to
each other is slower than with an adjacency
matrix, because the neighbor list of one vertex
has to be searched.
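
The same idea as a short Python sketch. The names are hypothetical, a dictionary of Python lists stands in for the array of singly linked lists described above, and the example graph (the five-vertex graph from the start of the chapter) is assumed to be undirected.

```python
def adjacency_list(vertices, edges, directed=False):
    """Map every vertex to the list of its neighbors."""
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        if not directed:
            neighbors[v].append(u)
    return neighbors

vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "E"), ("E", "D"), ("D", "B"), ("D", "A")]
graph = adjacency_list(vertices, edges)  # assumed undirected

print(graph["A"])         # ['B', 'D']
# Adjacency test: has to search the neighbor list of one vertex.
print("C" in graph["B"])  # True
```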
Traversal of graphs
• Graph traversal is a technique used for
searching for a vertex in a graph.
• It is also used to decide the order in which
vertices are visited during the search.
• A graph traversal selects the edges to be used in
the search process without creating loops. That
means using graph traversal we visit all the
vertices of the graph without getting into a looping
path.
• There are two types of traversal algorithms:
1. Breadth First Search
2. Depth First Search.
DFS (Depth First Search)
• DFS traversal of a graph produces a spanning tree as the final
result.
• It is an edge-based technique.
• Use a Stack data structure, with maximum size equal to the
total number of vertices in the graph, to implement DFS traversal.
• DFS order for the graph in the figure: A, B, D, C, E, F
A spanning tree is a tree that connects all the
vertices of a graph with the minimum possible
number of edges (n - 1 edges for n vertices).
BFS (Breadth First Search)

• BFS traversal of a graph produces a spanning tree as the final
result.
• It is a vertex-based technique.
• Use a Queue data structure, with maximum size equal to the
total number of vertices in the graph, to implement BFS traversal.
• BFS order for the graph in the figure: A, B, C, D, E, F
Steps to implement DFS traversal...

Step 1 - Define a Stack whose size is the total number of
vertices in the graph.
Step 2 - Select any vertex as the starting point for traversal. Visit
that vertex and push it onto the Stack.
Step 3 - Visit any one of the non-visited adjacent vertices of the
vertex which is at the top of the stack and push it onto the
stack.
Step 4 - Repeat step 3 until there is no new vertex to be
visited from the vertex which is at the top of the stack.
Step 5 - When there is no new vertex to visit,
use backtracking and pop one vertex from the stack.
(Backtracking is coming back to the vertex from which we
reached the current vertex.)
Step 6 - Repeat steps 3, 4 and 5 until the stack becomes empty.
Step 7 - When the stack becomes empty, produce the final
spanning tree by removing unused edges from the graph.
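
The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `dfs` is hypothetical, and the six-vertex graph is made up for the example and is not necessarily the graph in the slide's figure.

```python
def dfs(graph, start):
    """Iterative DFS; returns the visit order and the spanning-tree edges."""
    seen = {start}
    order = [start]      # Step 2: visit the start vertex ...
    stack = [start]      # ... and push it onto the stack
    tree_edges = []
    while stack:         # Step 6: repeat until the stack is empty
        top = stack[-1]
        for neighbor in graph[top]:
            if neighbor not in seen:
                # Step 3: visit ONE non-visited neighbor of the top vertex
                seen.add(neighbor)
                order.append(neighbor)
                tree_edges.append((top, neighbor))
                stack.append(neighbor)
                break
        else:
            # Step 5: nothing new to visit from the top vertex, so backtrack
            stack.pop()
    # Step 7: the edges actually used form the spanning tree
    return order, tree_edges

# Hypothetical undirected graph, stored as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}
order, tree_edges = dfs(graph, "A")
print(order)       # ['A', 'B', 'D', 'C', 'E', 'F']
print(tree_edges)
```

The stack never holds more entries than there are vertices, because each vertex is pushed only once, when it is first visited.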
Steps to implement BFS traversal...

Step 1 - Define a Queue whose size is the total number of
vertices in the graph.
Step 2 - Select any vertex as the starting point for traversal.
Visit that vertex and insert it into the Queue.
Step 3 - Visit all the non-visited adjacent vertices of the
vertex which is at the front of the Queue and insert them
into the Queue.
Step 4 - When there is no new vertex to be visited from
the vertex which is at the front of the Queue, delete
that vertex.
Step 5 - Repeat steps 3 and 4 until the queue becomes empty.
Step 6 - When the queue becomes empty, produce the final
spanning tree by removing unused edges from the
graph.
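
A minimal Python sketch of these steps. The function name `bfs` is hypothetical, and the six-vertex graph is made up for illustration; it is not necessarily the graph in the slide's figure.

```python
from collections import deque

def bfs(graph, start):
    """BFS; returns the visit order and the spanning-tree edges."""
    seen = {start}
    order = [start]           # Step 2: visit the start vertex ...
    queue = deque([start])    # ... and insert it into the queue
    tree_edges = []
    while queue:              # Step 5: repeat until the queue is empty
        front = queue[0]
        for neighbor in graph[front]:
            if neighbor not in seen:
                # Step 3: visit ALL non-visited neighbors of the front vertex
                seen.add(neighbor)
                order.append(neighbor)
                tree_edges.append((front, neighbor))
                queue.append(neighbor)
        queue.popleft()       # Step 4: delete the front vertex
    # Step 6: the edges actually used form the spanning tree
    return order, tree_edges

# Hypothetical undirected graph, stored as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}
order, tree_edges = bfs(graph, "A")
print(order)       # ['A', 'B', 'C', 'D', 'E', 'F']
print(tree_edges)
```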
Applications of Graph
1. Computer Science
• In computer science, graph theory is used in the study of algorithms such as:
1) Dijkstra's Algorithm
2) Prim's Algorithm
3) Kruskal's Algorithm
• Graphs are used to define the flow of computation.
• Graphs are used to represent networks of communication.
• Graphs are used to represent data organization.
• Graph transformation systems work on rule-based in-memory
manipulation of graphs. Graph databases ensure transaction-safe,
persistent storing and querying of graph structured data.
• Graph theory is used to find the shortest path in a road network or any other network.
• In Google Maps, various locations are represented as vertices or nodes
and the roads are represented as edges and graph theory is used to find
the shortest path between two nodes.
• Graph theory is also used in network security.
2. In Electrical Engineering graph theory is used
in designing of circuit connections. These circuit
connections are named as topologies.
3. Graph theory is also used in sociology. For example, to
explore rumor spreading, or to measure actors'
prestige notably through the use of social network
analysis software.
4. In physics and chemistry, graph theory is used to study
molecules.
5. In mathematics, operations research is an important
field. Graph theory provides many useful applications
in operations research.
Hashing
• Hashing is a searching technique that takes constant time
on average. The average-case time complexity of a hash lookup is O(1).
• Two other techniques for searching are linear search and binary search.
• The worst-case time complexity is O(n) for linear search and O(log n)
for binary search. In both techniques, the search time
depends upon the number of elements, but we want a technique
that takes constant time. Hashing provides this.
• In the hashing technique, a hash table and a hash function are used.
Using the hash function, we can calculate the address at which the
value can be stored.
• The main idea behind hashing is to store (key, value) pairs.
If the key is given, then the algorithm computes the index at which
the value would be stored. It can be written as:
Index = hash(key)
Hash Table
• The hash table is one of the most important data
structures. It uses a special function, known
as a hash function, that maps a given key
to an index so that elements can be accessed faster.
• A hash table is a data structure that stores
information with two main components,
i.e. key and value.
• A hash table is a common way to implement
an associative array.
• For example, suppose the key is John and the
value is his phone number. When we pass the key
to the hash function, it gives the index:
Hash(key) = index
Hash(John) = 3

• Drawback of Hash function
Ideally, a hash function assigns each key a unique index.
In practice a hash table often uses an imperfect hash function,
which causes a collision when the hash function
generates the same index for two different keys.
There are four ways of calculating the hash
function:
• Division method
• Multiplication method
• Mid square method
• Folding method
Division Method
This is the easiest method to create a hash function. The
hash function can be described as:
h(k) = k mod n   or   h(k) = (k mod n) + 1
Here, h(k) is the hash value obtained by dividing the key
value k by the size of the hash table n and taking the remainder.
The second form is used when table indexes start at 1.
It is best that n is a prime number, as that helps
the keys to be distributed more uniformly.
An example of the Division Method is as follows:
k = 1276, n = 10
h(1276) = 1276 mod 10 = 6
The hash value obtained is 6.
A disadvantage of the division method is that consecutive
keys map to consecutive hash values in the hash table.
This can lead to poor performance.
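
As a small illustration, the division method is a single remainder operation in Python. The function name `division_hash` is hypothetical; the keys 20, 21 and 22 are made-up values chosen to show consecutive keys landing in consecutive slots.

```python
def division_hash(key, table_size):
    """Division method: the remainder of key divided by the table size."""
    return key % table_size

print(division_hash(1276, 10))                       # 6
# Consecutive keys map to consecutive hash values:
print([division_hash(k, 10) for k in (20, 21, 22)])  # [0, 1, 2]
```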
Multiplication Method
The hash function used for the multiplication method is:
h(k) = floor( n (kA mod 1) )
Here, k is the key, n is the size of the hash table, and
A is a constant with 0 < A < 1.
k and A are multiplied and the fractional part of the product is kept. This is then
multiplied by n and truncated to an integer to get the hash value.
An example of the Multiplication Method is as follows:
k = 123
n = 100
A = 0.618033
h(123) = floor( 100 (123 * 0.618033 mod 1) )
= floor( 100 (76.018059 mod 1) )
= floor( 100 (0.018059) )
= floor( 1.8059 )
= 1
The hash value obtained is 1.
An advantage of the multiplication method is that it can work with any value of
A, although some values work better than others. The value used above,
A = (sqrt(5) - 1)/2 = 0.618033..., is a commonly recommended choice.
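
The same calculation as a short Python sketch. The function name `multiplication_hash` and the default argument are illustrative choices; the constant is the A from the example above.

```python
import math

def multiplication_hash(key, table_size, a=0.618033):
    """Multiplication method: floor(n * fractional part of k*A)."""
    fractional = (key * a) % 1.0   # keep only the fractional part of k*A
    return math.floor(table_size * fractional)

print(multiplication_hash(123, 100))  # 1
```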
Mid Square Method
The mid square method often works well as a hash function,
because the middle digits of the square depend on all the
digits of the key. It involves squaring the value of the key
and then extracting the middle r digits as the hash value. The
value of r can be decided according to the size of the
hash table.
An example of the Mid Square Method is as follows:
Suppose the hash table has 100 memory locations.
So r = 2, because two digits are required to map the key
to a memory location.
k = 50
k*k = 2500
The middle two digits of 2500 are 50, so h(50) = 50.
The hash value obtained is 50.
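
A minimal Python sketch of the mid square method. The function name is hypothetical, and two details are assumptions not stated in the text: when the square has an odd number of digits the window is taken one position to the left of center, and squares shorter than r digits are padded with leading zeros.

```python
def mid_square_hash(key, r):
    """Mid square method: the middle r digits of key * key."""
    squared = str(key * key).zfill(r)  # assumption: pad short squares with zeros
    start = (len(squared) - r) // 2    # assumption: lean left when not centered
    return int(squared[start:start + r])

print(mid_square_hash(50, 2))  # 2500 -> middle digits "50" -> 50
```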
Folding Method
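• The key k is partitioned into a number of parts k1,
k2, ..., kn, where each part, except possibly the last,
has the same number of digits as the required
address.
• Then the parts are added together, ignoring the
last carry.
• H(k) = k1 + k2 + ... + kn
Dividing the key k into 2 parts and adding yields the
following hash addresses:
• H(3205) = 32 + 05 = 37
• H(7148) = 71 + 48 = 119, which gives 19 after ignoring the carry
• H(2345) = 23 + 45 = 68
In a common variant, every second part is reversed before
adding. That variant gives H(3205) = 32 + 50 = 82 and
H(7148) = 71 + 84 = 155, which gives 55 after ignoring the carry.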
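
A short Python sketch of folding. The function name and the `reverse_alternate` flag are hypothetical. With the flag off, the parts are added as they are (3205 gives 32 + 05 = 37, and 2345 gives 68); with the flag on, every second part is reversed before adding, which is the variant that produces the values 82 and 55 for the keys 3205 and 7148.

```python
def folding_hash(key, part_digits, reverse_alternate=False):
    """Folding method: split the key's digits into parts and add them."""
    digits = str(key)
    parts = [digits[i:i + part_digits]
             for i in range(0, len(digits), part_digits)]
    if reverse_alternate:
        # Variant: reverse the digits of every second part (k2, k4, ...).
        parts = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    total = sum(int(p) for p in parts)
    # Ignore the last carry by keeping only part_digits digits.
    return total % (10 ** part_digits)

for key in (3205, 7148, 2345):
    print(key, folding_hash(key, 2), folding_hash(key, 2, reverse_alternate=True))
# 3205 37 82
# 7148 19 55
# 2345 68 77
```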
Collision
When two different key values hash to the same index, both
compete for the same slot in the table. This problem is known as a
collision.

Example - In Division Method
Key value = 6
6 % 10 = 6
Key value = 26
26 % 10 = 6

Therefore, the two values would be stored at the same index, i.e. at index 6,
and this leads to the collision problem.

Techniques to resolve collisions:
1. Open Hashing: It is also known as closed addressing.
2. Closed Hashing: It is also known as open addressing.
Open Hashing
• In Open Hashing, one of the methods used to resolve the
collision is known as the chaining method. Each table slot holds
a linked list (chain) of all the keys that hash to that index.
A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10 and h(k) = (2k+3) % m

index = h(3) = (2(3)+3)%10 = 9
The value 3 would be stored at the index 9.
index = h(2) = (2(2)+3)%10 = 7
The value 2 would be stored at the index 7.
index = h(9) = (2(9)+3)%10 = 1
The value 9 would be stored at the index 1.
index = h(6) = (2(6)+3)%10 = 5
The value 6 would be stored at the index 5.
index = h(11) = (2(11)+3)%10 = 5
index = h(13) = (2(13)+3)%10 = 9
index = h(7) = (2(7)+3)%10 = 7
index = h(12) = (2(12)+3)%10 = 7
The values 11, 13, 7 and 12 collide with earlier keys and are
appended to the chains at indexes 5, 9, 7 and 7.

Index   Chain
0
1       9
2
3
4
5       6 -> 11
6
7       2 -> 7 -> 12
8
9       3 -> 13
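
The chaining example can be reproduced with a short Python sketch. The function name is hypothetical, and Python lists stand in for the linked lists of a real chained table.

```python
def build_chained_table(keys, table_size):
    """Insert keys into a chained hash table using h(k) = (2k + 3) % m."""
    table = [[] for _ in range(table_size)]   # one (initially empty) chain per slot
    for key in keys:
        index = (2 * key + 3) % table_size
        table[index].append(key)              # colliding keys join the chain
    return table

table = build_chained_table([3, 2, 9, 6, 11, 13, 7, 12], 10)
for index, chain in enumerate(table):
    print(index, chain)
```

Indexes 5, 7 and 9 end up with chains of more than one key, matching the collisions computed above.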


Closed Hashing

• In Closed hashing, three techniques are used
to resolve the collision:
1. Linear probing
2. Quadratic probing
3. Double Hashing technique
Linear probing
• Linear probing is one of the forms of open
addressing.
• Each cell in the hash table contains a key-value
pair. When a collision occurs because a new key
maps to a cell already occupied by another key,
the linear probing technique searches for the
closest following free location and adds the new
key to that empty cell.
• In this case, searching is performed sequentially,
starting from the position where the collision
occurs, until an empty cell is found.
• Consider the earlier example for linear probing:
A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10 and h(k) = (2k+3) % m
• The key values 3, 2, 9, 6 are stored at the indexes 9, 7,
1, 5 respectively.
• The calculated index value of 11 is 5, which is already
occupied by another key value, i.e. 6. When linear
probing is applied, the next empty cell after index
5 is 6; therefore, the value 11 will be added at
index 6.
• The next key value is 13. The index value associated
with this key value is 9 when the hash function is applied.
The cell at index 9 is already filled. The probe wraps
around to the start of the table, where index 0 is empty;
therefore, the value 13 will be added at index 0.
• The next key value is 7. The index value associated with
the key value is 7 when the hash function is applied. The
cell at index 7 is already filled. When linear probing is
applied, the next empty cell after index 7 is 8;
therefore, the value 7 will be added at index 8.
• The next key value is 12. The index value associated
with the key value is 7 when the hash function is applied.
The cell at index 7 is already filled. With m = 10 the valid
indexes are 0 to 9, so the probe checks 8, 9, 0 and 1 (all
occupied) and finds the first empty cell at index 2;
therefore, the value 12 will be added at index 2.

index = h(3) = (2(3)+3)%10 = 9
index = h(2) = (2(2)+3)%10 = 7
index = h(9) = (2(9)+3)%10 = 1
index = h(6) = (2(6)+3)%10 = 5
index = h(11) = (2(11)+3)%10 = 5
index = h(13) = (2(13)+3)%10 = 9
index = h(7) = (2(7)+3)%10 = 7
index = h(12) = (2(12)+3)%10 = 7

Index   Key
0       13
1       9
2       12
3
4
5       6
6       11
7       2
8       7
9       3
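
A minimal Python sketch of linear probing for this example. The function name is hypothetical. Because the table has m = 10 cells with indexes 0 to 9, every probe wraps around with the modulo operation, so the key 12 (home index 7) ends up at index 2.

```python
def linear_probe_insert(table, key):
    """Insert key using h(k) = (2k + 3) % m with linear probing.

    Returns the index at which the key was stored.
    """
    m = len(table)
    home = (2 * key + 3) % m
    for i in range(m):
        index = (home + i) % m      # move one cell at a time, wrapping around
        if table[index] is None:    # empty cell found
            table[index] = key
            return index
    raise RuntimeError("hash table is full")

table = [None] * 10
indexes = [linear_probe_insert(table, k) for k in (3, 2, 9, 6, 11, 13, 7, 12)]
print(indexes)  # [9, 7, 1, 5, 6, 0, 8, 2]
print(table)    # [13, 9, 12, None, None, 6, 11, 2, 7, 3]
```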
Quadratic Probing
• In the case of linear probing, searching is
performed linearly. In contrast, quadratic
probing is an open addressing technique that
uses a quadratic polynomial for searching until an
empty slot is found.
• With quadratic probing, rather than always
moving one spot, move i^2 spots from the point
of collision, where i is the number of attempts
to resolve the collision. The i-th probe therefore
checks index (h(k) + i^2) mod m.
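
A minimal Python sketch of quadratic probing. The function name is hypothetical, and reusing the keys and the hash function h(k) = (2k + 3) % m from the linear probing example is an assumption made for comparison. Giving up after m attempts is also a choice of this sketch: quadratic probing is not guaranteed to reach every cell of the table.

```python
def quadratic_probe_insert(table, key):
    """Insert key using h(k) = (2k + 3) % m with quadratic probing.

    Returns the index at which the key was stored.
    """
    m = len(table)
    home = (2 * key + 3) % m
    for i in range(m):
        index = (home + i * i) % m   # i-th attempt: i^2 cells from the home index
        if table[index] is None:
            table[index] = key
            return index
    # Quadratic probing may miss empty cells, so this can happen
    # even when the table is not completely full.
    raise RuntimeError("no empty cell found")

table = [None] * 10
indexes = [quadratic_probe_insert(table, k) for k in (3, 2, 9, 6, 11, 13, 7, 12)]
print(indexes)  # [9, 7, 1, 5, 6, 0, 8, 3]
```

The key 12 has home index 7 and probes 8, 1 and 6 (all occupied) before landing at index 3.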
Double Hashing
• Double hashing is an open addressing technique which
is used to resolve collisions.
• When a collision occurs, this technique uses a
secondary hash of the key.
• The first hash value gives the starting index, and the
second hash value is used as the step size to move forward
until an empty location is found:
index = (hash1(key) + i * hash2(key)) % TABLE_SIZE, for i = 0, 1, 2, ...
• The first hash function is typically
hash1(key) = key % TABLE_SIZE
A popular second hash function is:
hash2(key) = PRIME - (key % PRIME)
where PRIME is a prime smaller than TABLE_SIZE.
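
A minimal Python sketch of double hashing with the two functions above. The values TABLE_SIZE = 11 and PRIME = 7, the sample keys, and the function names are all assumptions chosen for illustration. A prime table size is used so that every step size from 1 to PRIME eventually reaches every cell.

```python
TABLE_SIZE = 11   # assumption: a prime table size
PRIME = 7         # assumption: a prime smaller than TABLE_SIZE

def hash1(key):
    return key % TABLE_SIZE

def hash2(key):
    return PRIME - (key % PRIME)   # always between 1 and PRIME, never 0

def double_hash_insert(table, key):
    """Insert key with double hashing; returns the index used."""
    start, step = hash1(key), hash2(key)
    for i in range(TABLE_SIZE):
        index = (start + i * step) % TABLE_SIZE
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")

table = [None] * TABLE_SIZE
indexes = [double_hash_insert(table, k) for k in (22, 33, 44, 12, 55)]
print(indexes)  # [0, 2, 5, 1, 3]
```

The keys 22, 33, 44 and 55 all have the same first hash value (0), but their different second hash values send them along different probe sequences.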
