Analysis of Statistical Parsing in Natural Language Processing
Abstract:- A statistical language model is a probability distribution P(s) over all possible word sequences (or any other linguistic unit such as words, sentences, paragraphs, documents, or spoken utterances). A number of statistical language models have been proposed in the literature. The dominant approach in statistical language modeling is the n-gram model.

I. INTRODUCTION

A. n-gram Model
As discussed earlier, the goal of a statistical language model is to estimate the probability (likelihood) of a sentence. This is achieved by decomposing the sentence probability into a product of conditional probabilities using the chain rule as follows:

P(s) = P(w1, w2, w3, ..., wn)
     = P(w1) P(w2/w1) P(w3/w1 w2) ... P(wn/w1 w2 ... wn-1)
     = ∏ P(wi/hi),  i = 1, ..., n

where hi is the history of word wi, defined as w1 w2 ... wi-1.
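The decomposition above can be sketched in a few lines of code. The following Python fragment is a minimal sketch (the names sentence_probability, bigram_probability, and cond_prob are illustrative and not part of the original text): it multiplies the conditional probabilities P(wi/hi) over the sentence, and shows how a bi-gram model approximates the full history hi by the single previous word.

def sentence_probability(words, cond_prob):
    # Chain rule: P(s) = product over i of P(w_i / w_1 ... w_{i-1}).
    # cond_prob(word, history) is assumed to return P(word / history).
    prob = 1.0
    for i, word in enumerate(words):
        history = tuple(words[:i])      # h_i = w_1 w_2 ... w_{i-1}
        prob *= cond_prob(word, history)
    return prob

def bigram_probability(words, bigram_prob, start_symbol="<s>"):
    # Bi-gram approximation: only the previous word is kept as history.
    prob = 1.0
    previous = start_symbol
    for word in words:
        prob *= bigram_prob.get((previous, word), 0.0)
        previous = word
    return prob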
Example-1

Training set:
The Arabian Knights
These are the fairy tales of the east
The stories of the Arabian knights are translated in many languages

Bi-gram model

Test sentence (s): The Arabian knights are the fairy tales of the east.

P(s) = P(The/<s>) x P(Arabian/the) x P(Knights/Arabian) x P(are/knights) x P(the/are) x P(fairy/the) x P(tales/fairy) x P(of/tales) x P(the/of) x P(east/the)
     = 0.67 x 0.4 x 1.0 x 1.0 x 0.5 x 0.2 x 1.0 x 1.0 x 1.0 x 0.2
     = 0.0268
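The bi-gram probabilities used above are maximum-likelihood estimates of the form P(w2/w1) = count(w1 w2) / count(w1). A minimal sketch of that estimation over the Example-1 training set is given below (a sketch only, not from the original text); the exact values depend on how sentence boundaries and letter case are handled, so individual probabilities may differ slightly from those quoted in the worked example.

from collections import Counter

training = [
    "the arabian knights",
    "these are the fairy tales of the east",
    "the stories of the arabian knights are translated in many languages",
]

unigrams, bigrams = Counter(), Counter()
for sentence in training:
    tokens = ["<s>"] + sentence.split()     # <s> marks the sentence start
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p(w2, w1):
    # Maximum-likelihood estimate: count(w1 w2) / count(w1)
    return bigrams[(w1, w2)] / unigrams[w1]

print(p("the", "<s>"))       # 2/3, cf. P(The/<s>) = 0.67
print(p("arabian", "the"))   # 2/5, cf. P(Arabian/the) = 0.4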
As each probability is necessarily less than 1, multiplying the probabilities might cause numerical underflow, particularly for long sentences. To avoid this, calculations are made in log space, where the calculation corresponds to adding the logs of the individual probabilities and taking the antilog of the sum.
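A small sketch of the log-space calculation, using the bi-gram probabilities quoted in the worked example (assumed values, as above):

import math

# The factors are the bi-gram probabilities from the worked example.
# Note that zero probabilities would need special handling, since log(0)
# is undefined.
factors = [0.67, 0.4, 1.0, 1.0, 0.5, 0.2, 1.0, 1.0, 1.0, 0.2]

log_prob = sum(math.log(p) for p in factors)   # add the logs ...
prob = math.exp(log_prob)                      # ... then take the antilog
print(log_prob, prob)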
estimate the probability mass that needs to be assigned to
Bi-gram model missing or low-frequency n-grams.