Multimedia Application L3
NLP is a field of computer science and AI concerned with computational linguistics
and with the interaction between humans and computers through natural language.
Applications
Text processing and analysis
Text to speech, speech to text, speech to speech
Machine translation
Search engines
Sentiment analysis and opinion mining
Advanced text editors and IDEs
Question answering
Spam and fraud detection
Regular expression
A regular expression (often shortened to regex) is a language for
specifying text search strings.
Applications
Searching in text, finding patterns in text, string matching, the Linux terminal
E.g. Ctrl+F
Quantifiers:
“*” = zero or more
“+” = one or more
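In most regex dialects the "zero or more" quantifier is written * and the "one or more" quantifier is written +. The sketch below illustrates both, plus a Ctrl+F style search, using Python's standard re module. The sample text and patterns are made up for this example.

```python
import re

text = "NLP, nlp and #nlp all mention nlp."

# Ctrl+F style search: find every occurrence of a literal string, ignoring case.
matches = re.findall(r"nlp", text, flags=re.IGNORECASE)
print(matches)

# "*" allows zero or more of the preceding item; "+" requires at least one.
star = re.fullmatch(r"ab*", "a") is not None  # "a" followed by zero b's
plus = re.fullmatch(r"ab+", "a") is not None  # needs at least one b
print(star, plus)
```

Here ab* accepts "a", "ab", "abb", and so on, while ab+ rejects the bare "a".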
Necessity of normalization
Word boundary detection
Separating words from each other
Example: Uzbekistan, New York, cats and dogs
Example 2: #nlp
While the Unix command sequence just removed all the numbers and
punctuation, for most NLP applications we’ll need to keep these in our
tokenization. We often want to break off punctuation as a separate
token; commas are a useful piece of information for parsers, periods
help indicate sentence boundaries.
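A minimal sketch of a tokenizer that keeps numbers and breaks punctuation off as separate tokens, as described above. The function name tokenize and the pattern are assumptions chosen for illustration, not a standard tokenizer.

```python
import re

def tokenize(text):
    # \w+(?:'\w+)? keeps words, numbers and simple contractions together;
    # [^\w\s] emits each punctuation mark (",", ".", "#") as its own token.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

print(tokenize("Cats, dogs and 2 birds live in New York."))
print(tokenize("#nlp"))
```

Note that this simple pattern splits "New York" into two tokens and separates "#" from "nlp"; multiword names and hashtags need extra rules.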
Methods of Text Normalization:
Tokenization
Input: Uzbekistan is beautiful
Output: [Uzbekistan, is, beautiful]
Stop-word removal: remove common words that carry little meaning
Stop words: am, is, are, a, an, the
Example:
Input: Tashkent has a beautiful park.
Output: Tashkent beautiful park
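The stop-word example can be sketched as follows. The STOP_WORDS set is a tiny illustrative list (real lists are much longer), and remove_stop_words is a hypothetical helper name.

```python
# Illustrative stop-word list only; not a complete or standard list.
STOP_WORDS = {"am", "is", "are", "a", "an", "the", "has"}

def remove_stop_words(sentence):
    # Split on whitespace and strip surrounding punctuation from each token.
    tokens = [tok.strip(".,!?;:") for tok in sentence.split()]
    # Keep non-empty tokens whose lowercase form is not a stop word.
    return [tok for tok in tokens if tok and tok.lower() not in STOP_WORDS]

print(remove_stop_words("Tashkent has a beautiful park."))
```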
Preprocessing Text
Word tokenization
Text Normalization: Case folding
Example
Input: [CAT, I have an Apple, General Motors, US]
Output: [cat, i have an apple, general motors, us]
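Case folding can be sketched with Python's built-in str.casefold(); the function name case_fold is an assumption for this example.

```python
def case_fold(items):
    # str.casefold() is a more aggressive form of lower() meant for caseless matching.
    return [item.casefold() for item in items]

print(case_fold(["CAT", "I have an Apple", "General Motors", "US"]))
```

The trade-off is visible in the last item: after folding, the country "US" can no longer be told apart from the pronoun "us".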
Minimum Edit distance
Spell correction
Example spellings: Cornel, Korenel, Corrnel, Cononel
Applications
• Search
• Machine translation
• Information extraction
• Speech recognition
Bioinformatics
sample 1 : [A,T,G,G,C]
sample 2 : [A,T,C,G,C]
Minimum Edit Distance
Alignment of INTENTION (source) with EXECUTION (destination):
I N T E * N T I O N
* E X E C U T I O N
D S S   I S
Editing Operation
o Insertion, I
o Deletion, D
o Substitution, S
Minimum Edit Distance
Operation cost
I=1
D=1
S = D + I = 2
Cost (D + S + S + I + S) = 1+2+2+1+2 = 8
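The cost calculation can be checked with a short sketch that sums operation costs along an alignment. The label "M" for a matching letter (cost 0) and the name alignment_cost are assumptions added for this example.

```python
# Operation costs from the slide; "M" (match, cost 0) is an assumed extra label.
COSTS = {"I": 1, "D": 1, "S": 2, "M": 0}

def alignment_cost(ops):
    # Sum the cost of each operation in the alignment string.
    return sum(COSTS[op] for op in ops)

# INTE*NTION -> *EXECUTION: D S S (match) I S, then four matches.
print(alignment_cost("DSSMISMMMM"))
```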
Minimum Edit Distance
DEAL DE*A L
S SS S D I
Rows: source (i); columns: destination (j)
i\j  0  1  2  3  4  5
0       #  L  E  D  A
1    #  0  1  2  3  4
2    D  1
3    E  2
4    A  3
5    L  4
Initialization (counting # as position 0):
D(i,0) = i
D(0,j) = j
The Edit distance table
Rows: source (i); columns: destination (j)
i\j  0  1  2  3  4  5
0       #  L  E  D  A
1    #  0  1  2  3  4
2    D  1  2  3
3    E  2
4    A  3
5    L  4
i = 2 (index value: D), j = 2 (index value: L)
D(i,j) = min{D(1,2)+1, D(2,1)+1, D(1,1)+2}
       = min{2,2,2} = 2
The Edit distance table
Rows: source (i); columns: destination (j)
i\j  0  1  2  3  4  5
0       #  L  E  D  A
1    #  0  1  2  3  4
2    D  1  2  3
3    E  2
4    A  3
5    L  4
i = 2 (index value: D), j = 2 (index value: L)
D(i,j) = min{D(1,2)+1, D(2,1)+1, D(1,1)+2}
       = min{2,2,2} = 2
i = 2 (index value: D), j = 3 (index value: E)
D(i,j) = min{D(1,3)+1, D(2,2)+1, D(1,2)+2}
       = min{3,3,3} = 3
The Edit distance table
Rows: source (i); columns: destination (j)
i\j  0  1  2  3  4  5
0       #  L  E  D  A
1    #  0  1  2  3  4
2    D  1  2  3  2  3
3    E  2  3  2  3  4
4    A  3  4  3  4  3
5    L  4  3  4  5  4
i = 5 (index value: L), j = 5 (index value: A)
D(i,j) = min{D(4,5)+1, D(5,4)+1, D(4,4)+2}
       = min{4,6,6} = 4
DP for the minimum Edit Distance Table
N   9
O   8
I   7
T   6
N   5
E   4
T   3
N   2
I   1
#   0   1   2   3   4   5   6   7   8   9
    #   E   X   E   C   U   T   I   O   N
The Edit Distance Table
N   9   8   9  10  11  12  11  10   9   8
O   8   7   8   9  10  11  10   9   8   9
I   7   6   7   8   9  10   9   8   9  10
T   6   5   6   7   8   9   8   9  10  11
N   5   4   5   6   7   8   9  10  11  10
E   4   3   4   5   6   7   8   9  10   9
T   3   4   5   6   7   8   7   8   9   8
N   2   3   4   5   6   7   8   7   8   7
I   1   2   3   4   5   6   7   6   7   8
#   0   1   2   3   4   5   6   7   8   9
    #   E   X   E   C   U   T   I   O   N
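The table-filling procedure can be sketched as a small dynamic-programming function, assuming the costs used in these slides (insertion 1, deletion 1, substitution 2). The function and parameter names are illustrative, and the code uses 0-based positions with row and column 0 standing for the # symbol.

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=2):
    n, m = len(source), len(target)
    # D[i][j] = cost of turning the first i source letters into the first j target letters.
    D = [[0] * (m + 1) for _ in range(n + 1)]
    # Initialization: first column (deletions only) and first row (insertions only).
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + del_cost
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + ins_cost
    # Fill every other cell from its three neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + del_cost,   # deletion
                          D[i][j - 1] + ins_cost,   # insertion
                          D[i - 1][j - 1] + sub)    # substitution or match
    return D[n][m]

print(min_edit_distance("INTENTION", "EXECUTION"))
print(min_edit_distance("DEAL", "LEDA"))
print(min_edit_distance("ATGGC", "ATCGC"))
```

The three calls correspond to the INTENTION/EXECUTION table, the DEAL/LEDA table, and the two bioinformatics samples.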
Context Free Grammar (CFG)
A context-free grammar is a formal grammar used to generate
all possible strings of a given formal language.
Advantages of CFG
Production rules:
S → aSa
S → bSb
S → c
Check that the string abbcbba can be derived from the given CFG.
S ⇒ aSa
  ⇒ abSba
  ⇒ abbSbba
  ⇒ abbcbba
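The same check can be sketched as a recursive function that mirrors the three production rules. The name derives is hypothetical, and the code is tied to this particular grammar rather than being a general CFG parser.

```python
def derives(s):
    # S -> c
    if s == "c":
        return True
    # S -> aSa or S -> bSb: the outer symbols must be the same terminal,
    # and the inside must again be derivable from S.
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])
    return False

print(derives("abbcbba"))
print(derives("abcab"))
```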
Categories of CFG
Recursive CFG
S -> Aa
A -> Ab | c
Output: { ca, cba, cbba, cbbba, … }
Derivation of cba:
S -> Aa
  -> Aba
  -> cba
Context Free Grammar (CFG)
Example of a CFG
S -> Aa
A -> Ab | c
Input: {c, b, b, a}
Derivation of cbba:
S -> Aa
  -> Aba
  -> Abba
  -> cbba
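A minimal sketch of this recursive grammar in code: derive lists the sentential forms produced by applying A -> Ab a chosen number of times and then A -> c, and accepts tests membership with the equivalent regular expression cb*a. Both function names are assumptions for illustration.

```python
import re

def derive(n_b):
    # Start with S -> Aa.
    steps = ["S", "Aa"]
    form = "Aa"
    # Apply A -> Ab once for every b we want in the final string.
    for _ in range(n_b):
        form = form.replace("A", "Ab", 1)
        steps.append(form)
    # Finish with A -> c.
    form = form.replace("A", "c", 1)
    steps.append(form)
    return steps

def accepts(s):
    # The grammar generates exactly: c, then zero or more b's, then a.
    return re.fullmatch(r"cb*a", s) is not None

print(" -> ".join(derive(2)))
print(accepts("".join(["c", "b", "b", "a"])))
```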
Parsing using CFG
Parse tree
Chapter 2 Chapter 5
Question
Thank you