Lec10 12 Edit Distance

The document discusses the concept of Edit Distance, which measures the similarity between two strings based on the minimum number of edit operations (insertion, deletion, substitution) required to convert one string into another. It includes examples to illustrate how to compute the edit distance using dynamic programming, specifically the Wagner-Fischer algorithm, and highlights its applications in spelling correction and DNA sequence alignment. The document also explains the initialization and recursion involved in the dynamic programming approach to calculate the edit distance efficiently.

Uploaded by

krishn05082002

ABV-Indian Institute of Information Technology and Management

Gwalior

Lecture : 10
Edit distance (w1,w2)
(Cost of all Edit Operations)

===================================================================================

Instructors – Dr. Sunil Kumar


Office : A – 108 (A-Block) :: Tel No – 0751-2449710 (O)
Email - [email protected]
Spelling Correction!!!
▪ Let us assume a user wants to write the word “sentence” but types one of the following wrongly typed words:
• sentance
• sentanse
• sentense

▪ In such cases, the system often suggests a set of words that are similar to the wrongly typed word.
DNA Sequence Alignment!!!
▪ In computational biology, DNA sequence alignment is one of the important problems:
• X : ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
• Y : GTCGTTCGGAATGCGTTGCTCTGTAA

▪ What is the best possible alignment between the above two DNA sequences?
▪ Here, “best” can be defined as the alignment that minimizes the edit distance between X and Y.
On what basis does the system make its suggestions?

Minimum Edit Distance


Edit Distance
❑ Edit distance gives a measure of similarity between two strings in terms of edit
operations.
❑ Minimum Edit Distance is the minimum number of edit operations that we need
to perform to convert one string (a source word) to another string (a target word)
❑ What are edit operations :
▪ Insertion (I)
▪ Deletion (D)
▪ Substitution (S)
❑ Each edit operation can be associated with a cost: For example,
▪ Cost (I) = 𝑐1 units
▪ Cost (D) = 𝑐2 units
▪ Cost (S) = 𝑐3 units
❑ If Cost (I) = Cost (D) = Cost (S) = 1 unit : Levenshtein distance (v1)
❑ If Cost (I) = Cost (D) = 1 unit & Cost (S) = 2 units : Levenshtein distance (v2)
Computing Edit Distance
❑ Example – 01 : SENTANCE to SENTENCE

• Source word w1 : S E N T A N C E
• Target word w2 : S E N T E N C E
• Operations     :         s

• A single edit is needed: substitute E for A to obtain the target word from the source word.
• Hence, D(w1, w2) = cost(S) = 1
Computing Edit Distance
❑ Example – 02 : SENTANSE to SENTENCE

• Source word w1 : S E N T A N S E
• Target word w2 : S E N T E N C E
• Operations     :         s   s

• Two edit operations are needed: substitute E for A and substitute C for S to obtain the target word from the source word.
• Hence, D(w1, w2) = cost(S) + cost(S) = 1 + 1 = 2
Computing Edit Distance
❑ Example – 03 : SNTANSE to SENTENCE

• Source word w1 : S * N T A N S E
• Target word w2 : S E N T E N C E
• Operations     :   i     s   s

• Three edit operations are needed:
• Insert E
• Substitute E for A and
• Substitute C for S
• Hence, D(w1, w2) = cost(I) + cost(S) + cost(S) = 1 + 1 + 1 = 3
Computing Edit Distance
❑ Example – 04 : SSNTANSE to SENTENCE

• Source word w1 : S S * N T A N S E
• Target word w2 : * S E N T E N C E
• Operations     : d   i     s   s

• Four edit operations are needed:
• Delete S
• Insert E
• Substitute E for A and
• Substitute C for S
• Hence, D(w1, w2) = cost(D) + cost(I) + cost(S) + cost(S) = 1 + 1 + 1 + 1 = 4
Computing Edit Distance
❑ Example – 05 : INTENTION to EXECUTION

• Source word w1 : I N T E N T I O N
• Target word w2 : E X E C U T I O N
• Operations     : s s s s s

• Five edit operations are needed:
• Substitute E for I
• Substitute X for N
• Substitute E for T
• Substitute C for E
• Substitute U for N
• Hence, D(w1, w2) = 5 × cost(S) = 5 × 1 = 5 [Levenshtein (v1)]
• D(w1, w2) = 5 × cost(S) = 5 × 2 = 10 [Levenshtein (v2)]
Computing Edit Distance
▪ Example – 06 : INTENTION to EXECUTION

• Source word w1 : I N T E * N T I O N
• Target word w2 : * E X E C U T I O N
• Operations     : d s s   i s

• Five edit operations are needed:
• Delete I
• Substitute E for N
• Substitute X for T
• Insert C
• Substitute U for N
• Hence, D(w1, w2) = cost(D) + cost(S) + cost(S) + cost(I) + cost(S) = 1 + 1 + 1 + 1 + 1 = 5 [Levenshtein (v1)]
• D(w1, w2) = cost(D) + cost(S) + cost(S) + cost(I) + cost(S) = 1 + 2 + 2 + 1 + 2 = 8 [Levenshtein (v2)]
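The costs of the two alignments in Examples 05 and 06 can be checked with a short Python sketch. The function name `alignment_cost` and the one-letter operation codes (`d`, `i`, `s`) are illustrative assumptions that mirror the markers written under the alignments; they are not part of the lecture.

```python
# Cost tables for the two Levenshtein variants defined in the lecture.
LEVENSHTEIN_V1 = {"i": 1, "d": 1, "s": 1}  # substitution costs 1 unit
LEVENSHTEIN_V2 = {"i": 1, "d": 1, "s": 2}  # substitution costs 2 units


def alignment_cost(operations, costs):
    """Sum the cost of a sequence of edit operations ('i', 'd' or 's')."""
    return sum(costs[op] for op in operations)


example_05 = "sssss"  # five substitutions
example_06 = "dssis"  # delete, substitute, substitute, insert, substitute

print(alignment_cost(example_05, LEVENSHTEIN_V1))  # 5
print(alignment_cost(example_05, LEVENSHTEIN_V2))  # 10
print(alignment_cost(example_06, LEVENSHTEIN_V1))  # 5
print(alignment_cost(example_06, LEVENSHTEIN_V2))  # 8
```

Both alignments cost the same under Levenshtein (v1), but the alignment of Example 06 is cheaper under Levenshtein (v2) because it uses fewer substitutions.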
Minimum Edit Distance as Search
▪ The space of all possible edit sequences is enormous, so we cannot search it naively.
▪ Many distinct paths end up in the same state.
▪ Not all of them are important; we only need to keep the shortest path to each state.
▪ We can use dynamic programming to search for the shortest path [the path with minimum cost].
▪ Idea of dynamic programming, to solve a large problem:
• Decompose the problem at hand into several subproblems
• Solve each subproblem
• Combine the solutions of the subproblems to arrive at the solution of the larger problem at hand.
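As a minimal sketch of this idea, the edit distance can be written as a recursion over prefixes whose results are cached, so that each subproblem is solved only once. The function name `edit_distance_recursive` and the parameter `sub_cost` are assumptions made for illustration; insertion and deletion are assumed to cost 1 unit.

```python
from functools import lru_cache


def edit_distance_recursive(source, target, sub_cost=2):
    """Top-down edit distance with memoisation (insert/delete cost 1)."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Subproblem: cost of converting source[:i] into target[:j].
        if i == 0:
            return j  # insert the first j characters of the target
        if j == 0:
            return i  # delete the first i characters of the source
        same = source[i - 1] == target[j - 1]
        return min(
            solve(i - 1, j) + 1,  # delete a source character
            solve(i, j - 1) + 1,  # insert a target character
            solve(i - 1, j - 1) + (0 if same else sub_cost),  # match or substitute
        )

    return solve(len(source), len(target))


print(edit_distance_recursive("INTENTION", "EXECUTION"))              # 8
print(edit_distance_recursive("INTENTION", "EXECUTION", sub_cost=1))  # 5
```

Without the cache the same prefix pairs would be recomputed many times; with it, at most (n+1)(m+1) subproblems are solved.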
Minimum Edit Distance as Search
▪ Dynamic programming for going : INTENTION to EXECUTION
▪ Large problem at hand : convert I N T E N T I O N into E X E C U T I O N
▪ One sequence of edits:
• Start : I N T E N T I O N
• Delete “I” : N T E N T I O N
• Substitute N by E : E T E N T I O N
• Substitute T by X : E X E N T I O N
• Insert U : E X E N U T I O N
• Substitute N by C : E X E C U T I O N
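The sequence of edits from INTENTION to EXECUTION can be replayed in a few lines of Python. The helper `apply_edit` and the 0-based positions in `steps` are assumptions introduced here to make the slide's steps executable.

```python
def apply_edit(word, op, pos, char=""):
    """Apply one edit at 0-based position pos and return the new word."""
    if op == "delete":
        return word[:pos] + word[pos + 1:]
    if op == "insert":
        return word[:pos] + char + word[pos:]
    if op == "substitute":
        return word[:pos] + char + word[pos + 1:]
    raise ValueError(f"unknown operation: {op}")


steps = [
    ("delete", 0, ""),       # delete "I"
    ("substitute", 0, "E"),  # substitute N by E
    ("substitute", 1, "X"),  # substitute T by X
    ("insert", 4, "U"),      # insert U
    ("substitute", 3, "C"),  # substitute N by C
]

word = "INTENTION"
trace = [word]
for op, pos, char in steps:
    word = apply_edit(word, op, pos, char)
    trace.append(word)

print(" -> ".join(trace))
```

Each intermediate word in `trace` is a state in the search space; different edit orders can pass through the same state.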
Lecture : 11
Minimum Edit Distance Algorithm
(By Wagner and Fischer)
▪ Consider TWO Strings X and Y:
• X of length n
• Y of length m

▪ Let us define 𝐷(𝑖, 𝑗) : the edit distance between X[1…i] & Y[1…j]
• X[1…i] : the first 𝑖 characters of X
• Y[1…j] : the first 𝑗 characters of Y

▪ Thus, Edit distance between X and Y is 𝐷(𝑛, 𝑚)

Wagner, Robert A., and Michael J. Fischer. "The string-to-string correction problem." Journal of the ACM (JACM)
21.1 (1974): 168-173.
Minimum Edit Distance Algorithm
▪ Dynamic Programming Approach:
▪ To compute 𝐷(𝑛, 𝑚):
• Compute 𝐷(𝑖, 𝑗) for small i and j
• Use the previously computed values of 𝐷(𝑖, 𝑗) to compute 𝐷(𝑖, 𝑗) for larger i and j
• Continue until we end up with 𝐷(𝑛, 𝑚)
Minimum Edit Distance Algorithm
▪ Dynamic Programming Formulation:
▪ Initialisation:
• 𝐷(0, 0) = 0
• 𝐷(𝑖, 0) = 𝑖 : delete 𝑖 characters from X
• 𝐷(0, 𝑗) = 𝑗 : insert 𝑗 characters from Y
Minimum Edit Distance Algorithm
▪ Dynamic Programming Formulation:
▪ Initialisation:
• 𝐷(0, 0) = 0
• 𝐷(𝑖, 0) = 𝑖
• 𝐷(0, 𝑗) = 𝑗
▪ Recursion [Levenshtein (v2) is used here]:
• for i = 1 to n
• for j = 1 to m
\[
D(i, j) = \min
\begin{cases}
D(i-1, j) + 1 \\
D(i, j-1) + 1 \\
D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases}
\end{cases}
\]
• Obtain 𝐷(𝑛, 𝑚)
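The initialisation and recursion above can be sketched as a bottom-up table fill in Python. The function names and the `sub_cost` parameter are illustrative assumptions; Python strings are 0-indexed, so X[i] in the formula corresponds to `x[i - 1]` in the code.

```python
def edit_distance_table(x, y, sub_cost=2):
    """Fill the (n+1) x (m+1) table D for source x and target y."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]

    # Initialisation: D(i, 0) = i deletions, D(0, j) = j insertions.
    for i in range(1, n + 1):
        D[i][0] = i
    for j in range(1, m + 1):
        D[0][j] = j

    # Recursion: each cell depends on its lower, left and diagonal neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(
                D[i - 1][j] + 1,         # deletion
                D[i][j - 1] + 1,         # insertion
                D[i - 1][j - 1] + diag,  # match or substitution
            )
    return D


def edit_distance(x, y, sub_cost=2):
    """Return D(n, m), the edit distance between x and y."""
    return edit_distance_table(x, y, sub_cost)[len(x)][len(y)]


print(edit_distance("INTENTION", "EXECUTION"))              # 8, Levenshtein (v2)
print(edit_distance("INTENTION", "EXECUTION", sub_cost=1))  # 5, Levenshtein (v1)
```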
Edit Distance Table
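▪ Example – 06 : LAYER to YEAR
▪ Initialisation:
• 𝐷(0, 0) = 0
• 𝐷(𝑖, 0) = 𝑖 (for example, 𝐷(4, 0) = 4)
• 𝐷(0, 𝑗) = 𝑗 (for example, 𝐷(0, 2) = 2)
▪ Table after initialisation (rows 𝑖 : source LAYER, bottom to top; columns 𝑗 : target YEAR):

R | 5
E | 4
Y | 3
A | 2
L | 1
# | 0  1  2  3  4
    #  Y  E  A  R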
Edit Distance Table
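▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(1, 1), where X[1] = L ≠ Y[1] = Y:
\[ D(1, 1) = \min \begin{cases} D(0, 1) + 1 = 2 \\ D(1, 0) + 1 = 2 \\ D(0, 0) + 2 = 2 \end{cases} \Rightarrow D(1, 1) = 2 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2
L | 1  2
# | 0  1  2  3  4
    #  Y  E  A  R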
Edit Distance Table
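▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(1, 2), where X[1] = L ≠ Y[2] = E:
\[ D(1, 2) = \min \begin{cases} D(0, 2) + 1 = 3 \\ D(1, 1) + 1 = 3 \\ D(0, 1) + 2 = 3 \end{cases} \Rightarrow D(1, 2) = 3 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2
L | 1  2  3
# | 0  1  2  3  4
    #  Y  E  A  R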
Edit Distance Table
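▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(1, 3), where X[1] = L ≠ Y[3] = A:
\[ D(1, 3) = \min \begin{cases} D(0, 3) + 1 = 4 \\ D(1, 2) + 1 = 4 \\ D(0, 2) + 2 = 4 \end{cases} \Rightarrow D(1, 3) = 4 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2
L | 1  2  3  4
# | 0  1  2  3  4
    #  Y  E  A  R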
Edit Distance Table
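▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(1, 4), where X[1] = L ≠ Y[4] = R:
\[ D(1, 4) = \min \begin{cases} D(0, 4) + 1 = 5 \\ D(1, 3) + 1 = 5 \\ D(0, 3) + 2 = 5 \end{cases} \Rightarrow D(1, 4) = 5 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R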
Edit Distance Table
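▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(2, 1), where X[2] = A ≠ Y[1] = Y:
\[ D(2, 1) = \min \begin{cases} D(1, 1) + 1 = 3 \\ D(2, 0) + 1 = 3 \\ D(1, 0) + 2 = 3 \end{cases} \Rightarrow D(2, 1) = 3 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2  3
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R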
Edit Distance Table
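▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(2, 2), where X[2] = A ≠ Y[2] = E:
\[ D(2, 2) = \min \begin{cases} D(1, 2) + 1 = 4 \\ D(2, 1) + 1 = 4 \\ D(1, 1) + 2 = 4 \end{cases} \Rightarrow D(2, 2) = 4 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2  3  4
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R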
Edit Distance Table
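▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(2, 3), where X[2] = A = Y[3] = A:
\[ D(2, 3) = \min \begin{cases} D(1, 3) + 1 = 5 \\ D(2, 2) + 1 = 5 \\ D(1, 2) + 0 = 3 \end{cases} \Rightarrow D(2, 3) = 3 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2  3  4  3
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R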
Edit Distance Table
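▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Computing 𝐷(2, 4), where X[2] = A ≠ Y[4] = R:
\[ D(2, 4) = \min \begin{cases} D(1, 4) + 1 = 6 \\ D(2, 3) + 1 = 4 \\ D(1, 3) + 2 = 6 \end{cases} \Rightarrow D(2, 4) = 4 \]
▪ Table so far:

R | 5
E | 4
Y | 3
A | 2  3  4  3  4
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R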
Edit Distance Table
▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Row 𝑖 = 3 (X[3] = Y) is filled one cell at a time:
• 𝐷(3, 1) = 2
• 𝐷(3, 2) = 3
• 𝐷(3, 3) = 4
• 𝐷(3, 4) = 5
▪ Table after row 3:

R | 5
E | 4
Y | 3  2  3  4  5
A | 2  3  4  3  4
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R
Edit Distance Table
▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Row 𝑖 = 4 (X[4] = E) is filled one cell at a time:
• 𝐷(4, 1) = 3
• 𝐷(4, 2) = 2
• 𝐷(4, 3) = 3
• 𝐷(4, 4) = 4
▪ Table after row 4:

R | 5
E | 4  3  2  3  4
Y | 3  2  3  4  5
A | 2  3  4  3  4
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R
Edit Distance Table
▪ Example – 06 : LAYER to YEAR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 1 \\ D(i, j-1) + 1 \\ D(i-1, j-1) + \begin{cases} 2 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Row 𝑖 = 5 (X[5] = R) is filled one cell at a time:
• 𝐷(5, 1) = 4
• 𝐷(5, 2) = 3
• 𝐷(5, 3) = 4
• 𝐷(5, 4) = 3
▪ Completed table:

R | 5  4  3  4  3
E | 4  3  2  3  4
Y | 3  2  3  4  5
A | 2  3  4  3  4
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R

▪ Hence, the edit distance between LAYER and YEAR is 𝐷(5, 4) = 3.
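The completed table for LAYER to YEAR can be reproduced with a short Python sketch that fills the table and prints it in the slide's orientation, with the last source character on the top row. The names `fill_table` and `render` are assumptions chosen for illustration.

```python
def fill_table(x, y, sub_cost=2):
    """Return the full edit distance table for source x and target y."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i  # i deletions
    for j in range(m + 1):
        D[0][j] = j  # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + 1, D[i][j - 1] + 1, D[i - 1][j - 1] + diag)
    return D


def render(D, x, y):
    """Format the table as text, with row i = n on top as in the slides."""
    row_labels = "#" + x
    lines = []
    for i in range(len(x), -1, -1):
        cells = "".join(f"{value:3d}" for value in D[i])
        lines.append(f"{row_labels[i]} |{cells}")
    lines.append("   " + "".join(f"{c:>3}" for c in "#" + y))
    return "\n".join(lines)


table = fill_table("LAYER", "YEAR")
print(render(table, "LAYER", "YEAR"))
print("D(5, 4) =", table[5][4])
```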
Lecture : 12
Edit Distance with Backtrace
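▪ Example – 06 : LAYER to YEAR (PATH-01)

R | 5  4  3  4  3
E | 4  3  2  3  4
Y | 3  2  3  4  5
A | 2  3  4  3  4
L | 1  2  3  4  5
# | 0  1  2  3  4
    #  Y  E  A  R

▪ Backtrace legend : Substitute, No change, Delete, Insert
▪ Alignment along PATH-01:
• Source word : L A Y E * R
• Target word : * * Y E A R
• Operations  : d d     i

NOTE : There could be more than one path with the same cost.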
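A backtrace can be sketched by walking from 𝐷(𝑛, 𝑚) back to 𝐷(0, 0) and, at each cell, choosing a neighbour that explains the cell's value. The function name `backtrace` and the tie-breaking order (diagonal first, then delete, then insert) are assumptions of this sketch; a different order may return another path with the same cost.

```python
def backtrace(x, y, sub_cost=2):
    """Return (cost, operations) for one minimum-cost alignment of x to y."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + 1, D[i][j - 1] + 1, D[i - 1][j - 1] + diag)

    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diag = 0 if x[i - 1] == y[j - 1] else sub_cost
            if D[i][j] == D[i - 1][j - 1] + diag:
                name = "no change" if diag == 0 else "substitute"
                ops.append((name, x[i - 1], y[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and D[i][j] == D[i - 1][j] + 1:
            ops.append(("delete", x[i - 1], "*"))
            i -= 1
        else:
            ops.append(("insert", "*", y[j - 1]))
            j -= 1
    ops.reverse()  # operations were collected from the end of the words
    return D[n][m], ops


cost, ops = backtrace("LAYER", "YEAR")
print("cost   :", cost)
print("source :", " ".join(s for _, s, _ in ops))
print("target :", " ".join(t for _, _, t in ops))
```

The walk visits at most n + m cells, since every step decreases 𝑖, 𝑗, or both.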
Practice Problem – (Major – 2024)

Time : 5 min
▪ Major-2024 (Solution-3a)

▪ Dynamic Programming Formulation:


▪ Initialisation:
• 𝐷(0, 0) = 0
• 𝐷(𝑖, 0) = 2𝑖
• 𝐷(0, 𝑗) = 2𝑗
▪ Recursion:
• for i = 1 to n
• for j = 1 to m
\[
D(i, j) = \min
\begin{cases}
D(i-1, j) + 2 \\
D(i, j-1) + 2 \\
D(i-1, j-1) + \begin{cases} 3 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases}
\end{cases}
\]
• Obtain 𝐷(𝑛, 𝑚)
Practice Problem – (Major – 2024)

Time : 5 min
▪ Major-2024 (Solution) : ANCHOR to ACTOR
▪ Initialisation:
• 𝐷(0, 0) = 0
• 𝐷(𝑖, 0) = 2 ∗ 𝑖
• 𝐷(0, 𝑗) = 2 ∗ 𝑗
▪ Empty table (rows 𝑖 : source ANCHOR, bottom to top; columns 𝑗 : target ACTOR):

R |
O |
H |
C |
N |
A |
# |
     #  A  C  T  O  R
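▪ Major-2024 (Solution) : ANCHOR to ACTOR
▪ Recursion:
\[ D(i, j) = \min \begin{cases} D(i-1, j) + 2 \\ D(i, j-1) + 2 \\ D(i-1, j-1) + \begin{cases} 3 & \text{if } X[i] \neq Y[j] \\ 0 & \text{if } X[i] = Y[j] \end{cases} \end{cases} \]
▪ Table after initialisation:

R | 12
O | 10
H |  8
C |  6
N |  4
A |  2
# |  0  2  4  6  8 10
     #  A  C  T  O  R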
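▪ Major-2024 (Solution) : ANCHOR to ACTOR

R | 12 10  8  9  7  5
O | 10  8  6  7  5  7
H |  8  6  4  5  7  9
C |  6  4  2  4  6  8
N |  4  2  3  5  7  9
A |  2  0  2  4  6  8
# |  0  2  4  6  8 10
     #  A  C  T  O  R

▪ Backtrace legend : Substitute, No change, Delete, Insert
▪ Alignment along the backtrace path:
• Source word : A N C H O R
• Target word : A * C T O R
• Operations  :   d   s
▪ Hence, 𝐷(6, 5) = cost(D) + cost(S) = 2 + 3 = 5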
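The practice problem's table can be checked with a Python sketch in which the three operation costs are parameters. The function name `weighted_table` and its parameter names are assumptions; the default values follow the problem's formulation (insertion and deletion cost 2, substitution costs 3).

```python
def weighted_table(x, y, ins_cost=2, del_cost=2, sub_cost=3):
    """Fill the edit distance table with separate costs per operation."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + del_cost  # D(i, 0) = 2i with the defaults
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + ins_cost  # D(0, j) = 2j with the defaults
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(
                D[i - 1][j] + del_cost,
                D[i][j - 1] + ins_cost,
                D[i - 1][j - 1] + diag,
            )
    return D


table = weighted_table("ANCHOR", "ACTOR")
for label, row in zip(reversed("#ANCHOR"), reversed(table)):
    print(label, "|", " ".join(f"{value:2d}" for value in row))
print("D(6, 5) =", table[6][5])  # 5
```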
Minimum Edit Distance as Search
▪ Example – 07 : INTENTION to EXECUTION

N 9
O 8
I 7
T 6
N 5
E 4
T 3
N 2
I 1
# 0 1 2 3 4 5 6 7 8 9
# E X E C U T I O N
Minimum Edit Distance as Search
▪ Example – 07 : INTENTION to EXECUTION (PATH-01)
[Figure: backtrace path PATH-01 on the edit distance table; legend: Substitute, No change, Delete, Insert]
Minimum Edit Distance as Search
▪ Example – 07 : INTENTION to EXECUTION
All Possible Alignments
[Figure: alignment grid with X0 … Xn on the vertical axis and Y0 … Ym on the horizontal axis]
• Every non-decreasing path from (0,0) to (m, n) corresponds to a possible alignment between the source string and the target string.
• The optimal alignment is the one that has the minimum cost.

• Consider the following costs:
• Time complexity : O(mn)
• Space complexity : O(mn)
• Backtrace : O(m+n)
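When only the distance is needed and no backtrace, the O(mn) space can be reduced by keeping just two rows of the table. This is a common variant rather than something stated in the lecture, and the function name `edit_distance_two_rows` is an assumption of this sketch.

```python
def edit_distance_two_rows(x, y, sub_cost=2):
    """Edit distance using O(m) extra space (insert/delete cost 1)."""
    previous = list(range(len(y) + 1))  # row i - 1, starts as D(0, j) = j
    for i, cx in enumerate(x, start=1):
        current = [i]  # D(i, 0) = i
        for j, cy in enumerate(y, start=1):
            diag = 0 if cx == cy else sub_cost
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + diag,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


print(edit_distance_two_rows("LAYER", "YEAR"))           # 3
print(edit_distance_two_rows("INTENTION", "EXECUTION"))  # 8
```

The running time is still O(mn); the trade-off is that the alignment itself can no longer be recovered, because earlier rows are discarded.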
Applications of Edit Distance
▪ Spell Checking: Finding the closest correct word to a misspelled one.
▪ Plagiarism Detection: Measuring similarity between two documents.
▪ Speech Recognition: Comparing predicted and actual transcriptions.
▪ Machine Translation Evaluation: Comparing machine-generated
translations to human translations.
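As an illustration of the spell checking application, candidate words can be ranked by their edit distance to the wrongly typed word. The vocabulary below is a small hypothetical list, the function names are assumptions, and Levenshtein (v1) costs are used; a real spell checker would use a full dictionary and typically also word frequencies.

```python
def levenshtein(x, y):
    """Levenshtein (v1) distance: insert, delete and substitute cost 1."""
    previous = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        current = [i]
        for j, cy in enumerate(y, start=1):
            diag = 0 if cx == cy else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + diag))
        previous = current
    return previous[-1]


def suggest(word, vocabulary, limit=3):
    """Return up to `limit` vocabulary words closest to `word`."""
    ranked = sorted(vocabulary, key=lambda cand: (levenshtein(word, cand), cand))
    return ranked[:limit]


# Hypothetical vocabulary, for illustration only.
vocabulary = ["sentence", "sentient", "sequence", "silence", "sentinel"]

for typo in ["sentance", "sentanse", "sentense"]:
    print(typo, "->", suggest(typo, vocabulary, limit=1))
```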
Acknowledgement!
Sources for this lecture include materials from works by Speech and Language
Processing by Jurafsky, Puspak Bhattacharya, Christopher D. Manning, Mohit Iyyer,
Tanmoy Chakraborty, James Martin, Song young in, Christian Korthals, Andrew
McCallum, Diana Maynard, and others.
References are given for the source image contents.

Queries!
