NLP UNIT 4

https://www.youtube.com/watch?v=X92-Chomhw8

https://www.youtube.com/watch?v=hGuXUIefVkc

https://www.youtube.com/watch?v=wSONlMwa9rE
NLP PARSING BASICS
• Parsing in NLP is the process of determining the syntactic structure of a text by analysing its constituent words according to an underlying grammar of the language.
• Consider an example grammar in which each line is a rule to be applied to the example sentence “Tom ate an apple” (a toy version is sketched in code below).
• The outcome of the parsing process is a parse tree in which sentence is the root; intermediate nodes such as noun_phrase and verb_phrase have children, and hence are called non-terminals; and the leaves of the tree, ‘Tom’, ‘ate’, ‘an’, ‘apple’, are called terminals.
• https://devopedia.org/natural-language-parsing
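A minimal sketch of this idea (the slides do not name a toolkit, so NLTK is assumed here): a toy grammar for “Tom ate an apple” and a chart parser that prints the resulting tree, with sentence as the root, noun_phrase/verb_phrase as non-terminals, and the words as terminals.

# A toy context-free grammar for "Tom ate an apple", encoded with NLTK (assumed available).
import nltk

grammar = nltk.CFG.fromstring("""
    sentence -> noun_phrase verb_phrase
    noun_phrase -> 'Tom' | det noun
    verb_phrase -> verb noun_phrase
    det -> 'an'
    noun -> 'apple'
    verb -> 'ate'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("Tom ate an apple".split()):
    tree.pretty_print()   # sentence is the root; noun_phrase, verb_phrase are non-terminals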
Statistical Parsing
• Statistical parsing is a group of parsing methods within natural language processing. The
methods have in common that they associate grammar rules with a probability.
Grammar rules are traditionally viewed in computational linguistics as defining the valid
sentences in a language.
• Within this mindset, the idea of associating each rule with a probability then provides
the relative frequency of any given grammar rule and, by deduction, the probability of a
complete parse for a sentence. (The probability associated with a grammar rule may be
induced, but the application of that grammar rule within a parse tree and the
computation of the probability of the parse tree based on its component rules is a form
of deduction.)
• Using this concept, statistical parsers search over the space of all candidate parses, computing each candidate's probability, in order to derive the most probable parse of a sentence. The Viterbi algorithm is one popular method of searching for the most probable parse.
• "Search" in this context is an application of search algorithms in artificial intelligence.
• As an example, think about the sentence "The can can hold water".
• A reader would instantly see that there is an object called "the can" and that this object
is performing the action 'can' (i.e. is able to); and the thing the object is able to do is
"hold"; and the thing the object is able to hold is "water".
• Using more linguistic terminology, "The can" is a noun phrase composed of a
determiner followed by a noun, and "can hold water" is a verb phrase which is itself
composed of a verb followed by a verb phrase.
• But is this the only interpretation of the sentence? Certainly "The can-can" is a perfectly
valid noun-phrase referring to a type of dance, and "hold water" is also a valid verb-
phrase, although the coerced meaning of the combined sentence is non-obvious.
• This lack of meaning is not seen as a problem by most linguists, but from a pragmatic point of view it is desirable to obtain the first interpretation rather than the second, and statistical parsers achieve this by ranking the interpretations based on their probability.
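A minimal sketch of this ranking (NLTK and the toy rule probabilities are assumptions for illustration, not from the slides): a small probabilistic grammar that admits both readings of “The can can hold water”, with the Viterbi parser returning the more probable, modal reading.

# Toy probabilistic grammar for "The can can hold water"; probabilities are made up.
import nltk

grammar = nltk.PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> Det N [0.55] | Det N N [0.05] | 'water' [0.4]
    VP -> Modal VP [0.4] | V NP [0.6]
    Det -> 'The' [1.0]
    N -> 'can' [1.0]
    Modal -> 'can' [1.0]
    V -> 'hold' [1.0]
""")

tokens = "The can can hold water".split()

# The Viterbi algorithm searches the space of candidate parses for the most probable one.
viterbi = nltk.ViterbiParser(grammar)
for tree in viterbi.parse(tokens):
    print(tree, tree.prob())   # prints the "modal" reading, not the "can-can dance" reading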
PCFG
• A Probabilistic Context-Free Grammar (PCFG) is simply a Context-Free Grammar with probabilities assigned
to the rules such that the sum of all probabilities for all rules expanding the same non-terminal is equal to
one.
• A PCFG is a simple extension of a CFG in which every production rule is associated with a probability (Booth
and Thompson 1973).
• Formally, a PCFG is a quintuple G = (Σ, N, S, R, D), where Σ is a finite set of terminal symbols, N is a finite set of nonterminal symbols, S ∈ N is the start symbol, R is a finite set of production rules of the form A → α, where A ∈ N and α is a string of terminals and nonterminals, and D : R → [0, 1] is a function that assigns a probability to each member of R.
• A PCFG can be used to estimate a number of useful probabilities concerning a sentence and its parse tree(s),
including the probability of a particular parse tree (useful in disambiguation) and the probability of a
sentence or a piece of a sentence (useful in language modeling).
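For instance, the probability of a parse tree is the product of the probabilities D(r) of the rules used to build it, and the probabilities of the rules expanding any one non-terminal must sum to 1. A minimal sketch with made-up numbers:

# Probability of a parse tree = product of the probabilities of the rules it uses.
# The rule set and probabilities below are toy values, assumed for illustration.
from math import prod

D = {
    "S -> NP VP": 1.0,
    "NP -> Det N": 0.55, "NP -> Det N N": 0.05, "NP -> 'water'": 0.4,   # sums to 1.0
    "VP -> Modal VP": 0.4, "VP -> V NP": 0.6,                           # sums to 1.0
    "Det -> 'The'": 1.0, "N -> 'can'": 1.0, "Modal -> 'can'": 1.0, "V -> 'hold'": 1.0,
}

# Rules used in the "modal" parse of "The can can hold water"
rules_in_tree = ["S -> NP VP", "NP -> Det N", "Det -> 'The'", "N -> 'can'",
                 "VP -> Modal VP", "Modal -> 'can'", "VP -> V NP",
                 "V -> 'hold'", "NP -> 'water'"]

p_tree = prod(D[r] for r in rules_in_tree)
print(p_tree)   # 1.0 * 0.55 * 0.4 * 0.6 * 0.4 * ... = 0.0528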
Generative Models
• People all around the world speak many different languages, but a computer system or any other computerized machine only understands a single language: binary (1s and 0s).
• Natural Language Processing (NLP) is the process of converting human language into a form a computer can work with. Although many diverse models have been suggested so far, the need for a generative, predictive model that can be optimised for the nature of the problem being addressed is still an area of ongoing research.
• A generative model offers a single platform for diverse areas of NLP, addressing specific problems such as reading text, hearing speech, interpreting it, measuring sentiment, and determining which parts are important.
• This is achieved by a process of elimination once the relevant components are identified; the single platform provides the same model for generating and reproducing optimised solutions to different problems.
• Generative models are a class of statistical models that can generate new data instances. These models are used in unsupervised machine learning to perform tasks such as:
– Probability and likelihood estimation,
– Modeling data points,
– Describing phenomena in the data,
– Distinguishing between classes based on these probabilities.
• Because these models capture the joint probability (and often use Bayes' theorem to obtain posteriors from it), generative models can tackle more complex tasks than analogous discriminative models.
• Generative models thus focus on the distribution of the individual classes in a dataset, and their learning algorithms model the underlying patterns or distribution of the data points. They use the joint probability of an input feature vector (x) and the desired output or label (y) occurring together.
• These models use probability estimates and likelihood to model data points and
differentiate between different class labels present in a dataset. Unlike discriminative
models, these models are also capable of generating new data points.
Mathematical things involved in Generative Models
• Training a generative classifier involves estimating a function f : X -> Y, or the probability P(Y|X):
• Assume some functional form for the probabilities P(Y) and P(X|Y)
• Estimate the parameters of P(X|Y) and P(Y) from the training data
• Use Bayes' theorem to calculate the posterior probability P(Y|X)
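A minimal sketch of these three steps (a hand-written Bernoulli Naive Bayes is assumed as the generative model; the data and numbers are made up for illustration):

# Minimal sketch of a generative classifier (Bernoulli Naive Bayes, written out by hand)
# following the three steps above.
import numpy as np

X = np.array([[1, 0, 1],          # binary feature vectors (toy data)
              [1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])
y = np.array([1, 1, 0, 0])        # class labels

classes = np.unique(y)

# Steps 1-2: assume a functional form and estimate P(Y) and P(X|Y) from the training data.
prior = np.array([np.mean(y == c) for c in classes])               # P(Y = c)
likelihood = np.array([X[y == c].mean(axis=0) for c in classes])   # P(x_j = 1 | Y = c)
likelihood = np.clip(likelihood, 1e-6, 1 - 1e-6)                   # avoid zero probabilities

def posterior(x):
    # Step 3: Bayes' theorem, P(Y = c | x) proportional to P(Y = c) * prod_j P(x_j | Y = c).
    joint = prior * np.prod(likelihood ** x * (1 - likelihood) ** (1 - x), axis=1)
    return joint / joint.sum()

print(posterior(np.array([1, 0, 0])))   # posterior distribution over the two classes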
Some Examples of Generative Models
• ‌Naïve Bayes
• Bayesian networks
• Markov random fields
• ‌Hidden Markov Models (HMMs)
• Latent Dirichlet Allocation (LDA)
• Generative Adversarial Networks (GANs)
• Autoregressive Model
Discriminative Models

• Discriminative models are a class of models used in statistical classification, mainly for supervised machine learning. They are also known as conditional models, since they learn the boundaries between classes or labels in a dataset.
• Discriminative models (true to their name) separate classes rather than modeling how the data was generated, and they make few assumptions about the distribution of the data points. However, they are not capable of generating new data points. The ultimate objective of a discriminative model is therefore to separate one class from another.
• If there are outliers present in the dataset, discriminative models work better than generative models, i.e. discriminative models are more robust to outliers.
Mathematical things involved in Discriminative Models
• Training a discriminative classifier involves estimating a function f : X -> Y, or the probability P(Y|X)
• Assume some functional form for the probability P(Y|X)
• Estimate the parameters of P(Y|X) from the training data
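A minimal sketch of these steps (logistic regression is assumed as the discriminative model; the data is made up for illustration):

# Minimal sketch of a discriminative classifier (logistic regression, written out by hand):
# assume P(Y=1|X) = sigmoid(w.x + b) and estimate w, b from the training data by
# gradient ascent on the conditional log-likelihood.
import numpy as np

X = np.array([[0.5, 1.2], [1.0, 0.8], [-0.7, -1.0], [-1.2, -0.3]])
y = np.array([1, 1, 0, 0])

w, b, lr = np.zeros(X.shape[1]), 0.0, 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # assumed functional form for P(Y=1|X)
    # gradient of sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ]
    w += lr * (X.T @ (y - p))
    b += lr * np.sum(y - p)

x_new = np.array([0.8, 0.9])
print(1.0 / (1.0 + np.exp(-(x_new @ w + b))))     # estimated P(Y = 1 | x_new)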
Some Examples of Discriminative Models
• ‌Logistic regression
• Support Vector Machines (SVMs)
• ‌Traditional neural networks
• ‌Nearest neighbor
• Conditional Random Fields (CRFs)
• Decision Trees and Random Forest
Difference between Generative and Discriminative Models
• A father has two kids, Kid A and Kid B. Kid A has a special character: he can learn everything in depth. Kid B has a special character: he can only learn the differences between the things he has seen.
• One fine day, the father takes his two kids (Kid A and Kid B) to a zoo. This zoo is very small and has only two kinds of animals, a lion and an elephant. After they come out of the zoo, the father shows them an animal and asks both of them, “Is this animal a lion or an elephant?”
• Kid A quickly draws pictures of a lion and an elephant on a piece of paper based on what he saw inside the zoo. He compares both images with the animal standing before him and, based on the closest match between image and animal, answers: “The animal is a lion.”
• Kid B knows only the differences; based on the distinguishing properties he has learned, he answers: “The animal is a lion.”
• Both of them identify the kind of animal, but the way they learn and the way they arrive at the answer are entirely different. In machine learning, we generally call Kid A a generative model and Kid B a discriminative model.
• In general, a discriminative model models the decision boundary between the classes, while a generative model explicitly models the actual distribution of each class. In the end, both predict the conditional probability P(Animal | Features), but the two models learn different probabilities along the way.
• A generative model learns the joint probability distribution p(x, y) and predicts the conditional probability with the help of Bayes' theorem. A discriminative model learns the conditional probability distribution p(y|x) directly. Both kinds of model are generally used in supervised learning problems.
Difference between Discriminative and
Generative Models
• Discriminative models draw boundaries in the data space, while generative models try to model how data is
placed throughout the space.
• A generative model focuses on explaining how the data was generated, while a discriminative model focuses
on predicting the labels of the data.
• In mathematical terms, a discriminative model is trained by learning parameters that maximize the conditional probability P(Y|X), while a generative model is trained by learning parameters that maximize the joint probability P(X, Y).
• Discriminative models work with existing data, i.e. discriminative modeling identifies tags and sorts data and can be used to classify it, while generative modeling produces new data.
• Because the two families use different approaches to machine learning, each is suited to particular tasks: generative models are useful for unsupervised learning tasks, while discriminative models are useful for supervised learning tasks.
• Discriminative models are computationally cheaper than generative models.
• Generative models need less data to train than discriminative models, since generative models are more biased: they make stronger assumptions (e.g. conditional independence).
• In general, generative models can cope with missing data in the dataset, whereas discriminative models cannot.
• Local discriminative models generally take the form of conditional history-
based models, where the derivation of a candidate analysis y is modeled as
a sequence of decisions with each decision conditioned on relevant parts of
the derivation history.
• In a local discriminative model, the score of an analysis y, given the
sentence x, factors into the scores of different decisions in the derivation of
y. In a global discriminative model, by contrast, no such factorization is
assumed, and component scores can all be defined on the entire analysis y.
• This has the advantage that the model may incorporate features that
capture global properties of the analysis, without being restricted to a
particular history-based derivation of the analysis (whether generative or
discriminative).
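A minimal sketch of this distinction (all names and signatures here are illustrative assumptions, not from any particular parser): a local model whose score factors into per-decision scores conditioned on the history, versus a global model whose features can inspect the entire analysis.

# Local vs global discriminative scoring, in the abstract.

def local_score(x, decisions, decision_score):
    # Local (history-based) model: the score of an analysis factors into per-decision
    # scores, each conditioned on the sentence x and the derivation history so far.
    total, history = 0.0, []
    for d in decisions:
        total += decision_score(x, history, d)
        history.append(d)
    return total

def global_score(x, y, weights, global_features):
    # Global model: no factorization is assumed; features may look at the entire analysis y.
    return sum(weights.get(name, 0.0) * value
               for name, value in global_features(x, y).items())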
END OF UNIT 4