
NLP

11.a. Explain the major components of an NLP system. Discuss how linguistic knowledge and statistical models contribute to building effective NLP systems.
Natural Language Processing (NLP) is a branch of artificial intelligence (AI) that
enables machines to understand, interpret, and generate human language. A well-
structured NLP system comprises several fundamental components:
Text Preprocessing: Before applying NLP techniques, raw text data must be
processed to enhance accuracy:
Tokenization: Splitting text into words, sentences, or phrases (e.g., "Natural
Language Processing" → ["Natural", "Language", "Processing"])
Stopword Removal: Eliminating common words like "the" and "is" to focus on
significant terms.
Stemming & Lemmatization: Reducing words to their base form (e.g., "running"
→ "run").
Part-of-Speech (POS) Tagging: Identifying words as nouns, verbs, adjectives, etc.
Named Entity Recognition (NER): Detecting proper names, organizations,
locations, etc.
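A minimal preprocessing sketch in Python using NLTK is shown below. It assumes the relevant NLTK data packages (punkt, stopwords, wordnet, the POS tagger, and the NE chunker models) have been downloaded, and the sample sentence is invented for illustration.

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

text = "Natural Language Processing enables machines to understand running text."

tokens = nltk.word_tokenize(text)                                  # tokenization
content = [w for w in tokens if w.isalpha()
           and w.lower() not in stopwords.words("english")]        # stopword removal
stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
stems = [stemmer.stem(w) for w in content]                         # stemming: "running" -> "run"
lemmas = [lemmatizer.lemmatize(w.lower()) for w in content]        # lemmatization
pos_tags = nltk.pos_tag(tokens)                                    # part-of-speech tagging
entities = nltk.ne_chunk(pos_tags)                                 # named entity recognition (chunk tree)

print(tokens)
print(stems)
print(pos_tags)
```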
Syntax Analysis: Determines the grammatical structure of sentences using
techniques such as:
Parsing: Mapping a sentence into a parse tree based on grammar rules.
Dependency Parsing: Analyzing relationships between words (e.g., subject-verb
agreement).
Phrase Chunking: Identifying noun and verb phrases.
Semantics Processing: Focuses on understanding the meaning of words and
sentences:
Word Sense Disambiguation (WSD): Identifying word meanings based on
context.
Semantic Role Labeling (SRL): Assigning roles to words (e.g., subject, object,
action).
Ontology & Knowledge Graphs: Structuring knowledge for better understanding.
Pragmatics & Discourse Analysis: Examines how context affects meaning:
Coreference Resolution: Determining what pronouns refer to (e.g., "He" in "John
said he was tired").
Sentiment Analysis: Assessing emotions in text (e.g., positive/negative sentiment
in reviews).
Dialogue Systems: Developing chatbots or conversational AI that maintain
coherence.
Machine Learning & Deep Learning Models: Modern NLP systems rely on
statistical and neural network models to process language:
Traditional ML models: Decision trees, Naïve Bayes, Support Vector Machines
(SVM).
Deep Learning models: Recurrent Neural Networks (RNN), Transformers (e.g.,
BERT, GPT).
Role of Linguistic Knowledge & Statistical Models in NLP
1. Linguistic Knowledge: Rule-Based Approaches
Linguistics provides theoretical foundations that enhance NLP systems:
Morphological Analysis: Understanding word structures (prefixes, suffixes, roots).
Syntax Rules: Defining sentence structure using grammar frameworks.
Semantic Theory: Formal meaning representation (e.g., predicate logic).
Pragmatics: Interpreting contextual nuances to improve accuracy.
Although rule-based systems were prevalent in early NLP applications, they often
struggled with ambiguity and exceptions in language.
2. Statistical Models: Data-Driven Approaches
The rise of machine learning (ML) and deep learning revolutionized NLP:
Probabilistic Language Models: Predict the next word or phrase (e.g., n-gram
models).
Supervised Learning: Uses labeled datasets for training classifiers (e.g., spam
detection).
Unsupervised Learning: Identifies patterns without predefined labels (e.g., topic
modeling).
Neural Networks: Deep learning architectures that automatically capture linguistic
patterns.
Example: Machine Translation (Google Translate)
Linguistic Rules: Grammar and sentence structure influence translations.
Statistical Models: Neural networks analyze vast amounts of bilingual text to
improve accuracy.
Hybrid Approach: Combining Linguistic & Statistical Knowledge
Today's best NLP systems integrate both:
Linguistic rules improve structure & interpretation.
Machine learning enhances adaptability & scalability.
For instance, speech recognition combines phonetic rules with AI models trained
on large voice datasets.
Conclusion
To build effective NLP systems, a multi-layered approach is essential,
incorporating:
Linguistic principles for structured understanding.
Statistical models for adaptability and data-driven learning.
Deep learning to process and generate natural language efficiently.

12.a. Explain the concept of N-grams and language models. Discuss the purpose of smoothing techniques. Also explain the working of the Naïve Bayes classifier in text classification with a suitable example.
What are N-Grams?
An N-gram is a contiguous sequence of N items (words or characters) from a given
text. These are useful in language modeling, text prediction, and natural
language processing (NLP) applications.
Unigram (1-gram): Considers single words. Example: "The", "cat", "runs"
Bigram (2-gram): Considers word pairs. Example: "The cat", "cat runs"
Trigram (3-gram): Considers three-word combinations. Example: "The cat runs"
Higher N-Grams: Longer sequences enhance prediction accuracy but require
larger datasets.
N-grams are used in statistical language models to predict the next word based
on previous words in a sentence.
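As a rough illustration (the toy corpus below is invented), N-grams can be extracted with a sliding window and bigram counts turned into unsmoothed maximum-likelihood probabilities:

```python
from collections import Counter

corpus = "the cat runs . the cat sleeps . the dog runs .".split()

unigrams = Counter(corpus)                   # 1-gram counts
bigrams = Counter(zip(corpus, corpus[1:]))   # 2-gram counts

def bigram_prob(w1, w2):
    """Unsmoothed estimate of P(w2 | w1) = count(w1 w2) / count(w1)."""
    return bigrams[(w1, w2)] / unigrams[w1]

print(list(bigrams)[:3])          # first bigrams: ('the', 'cat'), ('cat', 'runs'), ('runs', '.')
print(bigram_prob("the", "cat"))  # 2/3: "the" occurs 3 times, "the cat" occurs 2 times
```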
Language Models and Their Role
A language model (LM) is a probabilistic framework that predicts the likelihood of
a sequence of words.
Types of Language Models
Rule-Based Models: Uses predefined grammar rules (early NLP techniques).
Statistical Language Models (SLMs): Predict words using probability
distributions, relying on observed data.
Neural Language Models: Use deep learning (e.g., Transformer-based models like
GPT, BERT) for contextual word predictions.
Language models help in machine translation, autocomplete, speech
recognition, and chatbots.
Purpose of Smoothing Techniques
In NLP, smoothing helps address the zero probability problem—when an N-
gram is missing from the training data. Without smoothing, missing N-grams get
zero probability, making predictions inaccurate.
Types of Smoothing Techniques
Laplace (Add-One) Smoothing:
Adds 1 to every word count to avoid zero probabilities.
Example: In a bigram model, if "artificial intelligence" has zero occurrences,
adding one ensures a non-zero probability.
Good-Turing Smoothing:
Adjusts probabilities by redistributing counts from frequent words to rare/unseen
words.
Helps especially in small datasets.
Backoff and Interpolation:
Backoff: If an N-gram is unseen, revert to lower-order probabilities (e.g., unigram
instead of trigram).
Interpolation: Combines different N-gram probabilities dynamically for better
predictions.
Smoothing improves accuracy of language models by ensuring they handle unseen
text effectively.
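A small sketch of add-one (Laplace) smoothing for a bigram model, using an invented toy corpus, might look like this: P(w2 | w1) = (count(w1 w2) + 1) / (count(w1) + V), where V is the vocabulary size.

```python
from collections import Counter

corpus = "the cat runs . the cat sleeps .".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)   # vocabulary size

def smoothed_bigram_prob(w1, w2):
    # Add 1 to every bigram count; add V to the denominator so probabilities still sum to 1.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

# The unseen bigram "cat dog" no longer gets zero probability:
print(smoothed_bigram_prob("cat", "dog"))   # small but non-zero
print(smoothed_bigram_prob("the", "cat"))   # seen bigrams remain the most likely continuations
```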
Naïve Bayes Classifier in Text Classification
The Naïve Bayes classifier is a probabilistic machine learning model based on
Bayes' Theorem, assuming features (words) are independent.
Formula:
P(C|X) = P(X|C) · P(C) / P(X)
Where:
P(C|X) is the probability of class C given input X (the text).
P(X|C) is the likelihood of observing text X in class C.
P(C) is the prior probability of class C.
P(X) is a normalization factor.
Because Naïve Bayes assumes words occur independently given the class, the likelihood P(X|C) factorizes into a product of individual word probabilities, which keeps classification simple and fast.
Example: Spam vs. Non-Spam Email Classification
Consider an email containing: "Win a free iPhone now!"
Step 1: Build Word Probabilities
Spam emails often contain words like "win," "free," "prize".
Calculate probabilities of words appearing in spam vs. non-spam using training
data.
Step 2: Apply Naïve Bayes
Compute:
P(Spam | Win, Free, iPhone, Now)
and compare it with:
P(Not Spam | Win, Free, iPhone, Now)
If the spam probability exceeds the non-spam probability, classify the email as Spam.
Step 3: Improve Accuracy
Apply Laplace Smoothing to handle unseen words.
Use bigram models to capture word sequences.
Benefits of Naïve Bayes:
Fast and efficient for text classification.
Works well for spam filters, sentiment analysis, and topic categorization.
Handles high-dimensional data effectively.
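A hedged sketch of this workflow with scikit-learn is given below. The tiny training set is made up purely for illustration, and alpha=1.0 in MultinomialNB corresponds to the Laplace (add-one) smoothing mentioned above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "win a free prize now",                # spam
    "free iPhone offer click now",         # spam
    "meeting agenda for tomorrow",         # not spam
    "please review the attached report",   # not spam
]
train_labels = ["spam", "spam", "not_spam", "not_spam"]

# CountVectorizer builds word-count features; MultinomialNB applies Bayes' theorem
# with the word-independence assumption and add-one smoothing.
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(train_texts, train_labels)

print(model.predict(["Win a free iPhone now!"]))   # likely ['spam'] on this toy data
```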
Final Thoughts
N-Grams & Language Models help computers process language.
Smoothing Techniques improve accuracy by handling missing data.
Naïve Bayes Classifier is widely used for text classification in spam detection,
sentiment analysis, and recommendation systems.

13.a. Explain constituency parsing in detail. Describe context-free grammars with an example. Discuss CKY parsing and Earley's algorithm with their advantages.
Constituency Parsing is a syntactic analysis method in Natural Language
Processing (NLP) that represents sentences as hierarchical structures. It identifies
how words group together to form meaningful phrases (constituents) following
grammatical rules.
Key Concepts of Constituency Parsing
Parse Tree:
A hierarchical structure showing syntactic components (noun phrases, verb phrases,
etc.).
Example:
Sentence → [NP] [VP]
NP → [Det] [Noun] ("The cat")
VP → [Verb] [NP] ("chased the mouse")
Phrase Structure Rules:
Rules defining valid syntactic formations.
Example:
S → NP VP (A sentence consists of a noun phrase and a verb phrase)
NP → Det Noun (Noun phrase consists of a determiner and a noun)
VP → Verb NP (Verb phrase consists of a verb and a noun phrase)
Ambiguity in Parsing:
Some sentences have multiple valid parse trees (e.g., "I saw the man with the
telescope").
Parsing algorithms resolve ambiguity using probabilistic models.
Context-Free Grammar (CFG)
A Context-Free Grammar (CFG) is a formal system for defining syntactic
structures using production rules.
Definition
A CFG consists of:
Non-Terminals (N): Categories like NP (Noun Phrase), VP (Verb Phrase).
Terminals (Σ): Words from a sentence.
Production Rules (P): Define relationships.
Start Symbol (S): Root of the parse tree.
Example CFG
For the sentence "The cat sleeps":
S → NP VP
NP → Det Noun
VP → Verb
Det → "The"
Noun → "cat"
Verb → "sleeps"
This grammar generates valid sentence structures.
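A short Python sketch of this same toy grammar, expressed with NLTK's CFG class and parsed with its chart parser, might look like this:

```python
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Noun
VP -> Verb
Det -> 'The'
Noun -> 'cat'
Verb -> 'sleeps'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("The cat sleeps".split()):
    print(tree)   # (S (NP (Det The) (Noun cat)) (VP (Verb sleeps)))
```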
Applications
CFGs are foundational for:
Syntax Analysis (Programming languages, NLP)
Natural Language Understanding
Machine Translation
CKY Parsing Algorithm: Efficient Parsing Using Dynamic Programming
The Cocke-Kasami-Younger (CKY) Algorithm is used for parsing context-free
grammars in Chomsky Normal Form (CNF).
Steps in CKY Parsing
Convert CFG to CNF
Each rule is transformed into binary form (A → B C or A → terminal).
Fill a Parsing Table Using Dynamic Programming
A bottom-up approach checks possible combinations of words to form valid
constituents.
Trace Back to Construct Parse Tree
If S (start symbol) appears in the final cell, the sentence is valid.
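A minimal CKY recognizer sketch in Python, for an assumed toy grammar already in CNF, illustrates the table-filling idea:

```python
from collections import defaultdict

lexical = defaultdict(set)   # terminal -> {A}   for rules A -> terminal
binary = defaultdict(set)    # (B, C)  -> {A}    for rules A -> B C

for lhs, rhs in [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("VP", ("Verb", "NP")),
    ("Det", "the"), ("Noun", "cat"), ("Noun", "mouse"), ("Verb", "chased"),
]:
    (binary if isinstance(rhs, tuple) else lexical)[rhs].add(lhs)

def cky_recognize(words):
    n = len(words)
    # table[i][j] holds the non-terminals that can span words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] |= lexical[w]
    for span in range(2, n + 1):               # bottom-up over span lengths
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # try every split point
                for B in table[i][k]:
                    for C in table[k][j]:
                        table[i][j] |= binary[(B, C)]
    return "S" in table[0][n]                  # valid iff S covers the whole sentence

print(cky_recognize("the cat chased the mouse".split()))   # True
```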
Advantages of CKY Parsing
Polynomial Complexity: Efficient for structured grammars.
Handles Ambiguity: Explores multiple parse trees.
Used in Probabilistic Parsing Models: Helps resolve ambiguity using
probabilities.
Earley’s Algorithm: Parsing General CFGs
Unlike CKY, Earley's Algorithm parses any CFG without requiring CNF
conversion.
Steps of Earley's Algorithm
Initialization:
Seed the chart with the start rule, then repeatedly apply three operations: prediction (expanding expected non-terminals), scanning (matching input words), and completion (closing finished rules) to generate possible derivations.
Parsing with a Chart:
The chart maintains a state set at each input position, tracking expected rules, matched components, and completed phrases.
Final Parsing Decision:
If a completed state for the start symbol covers the entire sentence, it is a valid parse.
Advantages of Earley's Algorithm
Works with Any CFG: No CNF conversion needed.
Handles Left-Recursive Rules Directly: The grammar can be used as written, with no transformation step.
Incremental Parsing: Processes input strictly left to right; for unambiguous grammars it runs in O(n²), so it is often faster than O(n³) in practice.
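A brief sketch using NLTK's EarleyChartParser on the earlier toy grammar (no CNF conversion needed) might look like this:

```python
from nltk import CFG
from nltk.parse import EarleyChartParser

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det Noun
VP -> Verb
Det -> 'The'
Noun -> 'cat'
Verb -> 'sleeps'
""")

parser = EarleyChartParser(grammar)
for tree in parser.parse("The cat sleeps".split()):
    print(tree)   # same parse tree, built via predict/scan/complete chart operations
```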
Comparison: CKY vs. Earley’s Algorithm

Feature                  | CKY Parsing      | Earley's Algorithm
Requires CNF Conversion  | ✅ Yes           | ❌ No
Handles Left-Recursion   | ❌ No            | ✅ Yes
Time Complexity          | O(n³)            | O(n³) (average-case faster)
Probabilistic Models     | ✅ Commonly used | ✅ Works well

Conclusion
Constituency Parsing helps analyze sentence structures using parse trees.
Context-Free Grammar (CFG) defines formal grammar rules.
CKY Algorithm is efficient but requires CNF conversion.
Earley's Algorithm is more flexible and handles left-recursion.
Both play key roles in syntactic analysis, speech recognition, and machine
translation

14.a. Explain word senses in NLP and the use of WordNet. How is word sense disambiguation performed? Describe the different approaches with examples.
In Natural Language Processing (NLP), words often carry multiple meanings
based on context. Word Sense refers to the specific meaning of a word in a given
sentence or phrase.
Example of Word Senses
Consider the word "bank":
Financial Institution: "I deposited money in the bank."
Riverbank: "He walked along the bank of the river."
Tilt or Lean: "The airplane will bank to the left."
A language model must determine the correct meaning based on context—this
process is called Word Sense Disambiguation (WSD).
Role of WordNet in NLP
WordNet is a lexical database for the English language, providing structured
relationships between words, including synonyms, antonyms, hypernyms, and
hyponyms.
Features of WordNet
Synsets: Groups of words with similar meanings (e.g., "happy," "joyful," "content").
Semantic Relationships: Links between words based on hierarchy (e.g., "car" →
"vehicle" → "transport").
Lexical Categorization: Words are classified into noun, verb, adjective, and
adverb categories.
Applications of WordNet
Word Sense Disambiguation (WSD)
Text Classification & Search Optimization
Machine Translation & Question Answering Systems
WordNet helps NLP models understand word meanings by providing structured
lexical relationships.
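A short Python sketch of querying WordNet through NLTK (assuming the wordnet corpus has been downloaded) could look like this:

```python
from nltk.corpus import wordnet as wn

for syn in wn.synsets("bank")[:3]:            # first few senses of "bank"
    print(syn.name(), "->", syn.definition())

car = wn.synsets("car")[0]
print(car.hypernyms())                        # more general concepts, e.g. motor vehicle
print(wn.synsets("happy")[0].lemma_names())   # lemmas grouped in the same synset
```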
Word Sense Disambiguation (WSD)
WSD is the process of assigning the correct sense to a word based on its context.
Approaches to WSD
Supervised Learning Approach
Uses labeled datasets where words are tagged with correct senses.
Example Algorithm: Naïve Bayes Classifier
Example Usage: Sentiment analysis, chatbot responses.
Limitation: Requires large labeled datasets.
Unsupervised Learning Approach
Groups words based on context similarities without predefined labels.
Example Algorithm: Clustering (K-Means, Word Embeddings)
Example Usage: Topic modeling in search engines.
Limitation: Less accurate than supervised methods.
Knowledge-Based Approach
Uses lexical databases like WordNet to determine meaning.
Example Algorithm: Lesk Algorithm (Dictionary-Based)
Example Usage: Machine translation, search engines.
Limitation: Works well but lacks adaptability.
Hybrid Approach
Combines statistical models with lexical databases for improved accuracy.
Example: Using WordNet with deep learning models.
Advantage: Balances accuracy and adaptability.
Example of WSD in Action
Consider: "He went to the bank to withdraw cash."
Using Lesk Algorithm, the model searches WordNet for definitions related to
"bank."
It matches "bank" with "financial institution" instead of "riverbank" based on
cash withdrawal context.
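A sketch of this dictionary-based approach using NLTK's built-in simplified Lesk implementation (nltk.wsd.lesk) is shown below; simplified Lesk is known to misfire on short contexts, so the output is only indicative.

```python
from nltk.wsd import lesk

sentence = "He went to the bank to withdraw cash".split()
sense = lesk(sentence, "bank", pos="n")        # pick the noun synset whose gloss overlaps most
print(sense, "->", sense.definition() if sense else None)
# The overlap-based choice tends toward a money-related sense of "bank" here
# rather than the river-bank sense, because of context words like "withdraw" and "cash".
```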
Final Thoughts
Word Senses help NLP systems interpret ambiguous words.
WordNet acts as a powerful lexical knowledge base.
WSD Techniques enhance machine understanding of language.
Hybrid Models are emerging for more accurate NLP applications.

15.a. Explain discourse coherence and discourse structure parsing. How do Centering Theory and entity-based coherence modeling help in understanding text? Provide examples.
What is Discourse Coherence?
Discourse coherence refers to the logical and meaningful flow of information
within a text or conversation. It ensures that ideas are connected and
understandable to the reader or listener. Coherence can be achieved through
linguistic cohesion (e.g., pronoun references, linking words) and semantic
consistency (e.g., maintaining a central theme).
What is Discourse Structure Parsing?
Discourse structure parsing focuses on analyzing and interpreting the
organization of a text by identifying relationships between sentences, clauses,
and paragraphs. It helps in:
Determining how parts of text are related (e.g., contrast, causality,
elaboration).
Improving natural language understanding (NLU) systems, such as chatbots
or summarization models.
Discourse parsers use linguistic rules, statistical models, and deep learning
algorithms to identify the hierarchical structure of discourse.
Centering Theory and Entity-Based Coherence Modeling
Centering Theory and Entity-Based Coherence Modeling provide frameworks to
analyze how coherence is maintained within a text.
1. Centering Theory
Developed by Grosz, Joshi, and Weinstein (first circulated in the 1980s and published in full in 1995), Centering Theory explains how pronouns and noun phrases maintain local coherence in a discourse.
It focuses on discourse centers—specific entities that hold attention across
sentences.
Key Concepts of Centering Theory
Forward-looking centers (Cf): The set of possible entities that might become the
focus in subsequent sentences.
Backward-looking center (Cb): The primary entity maintained from the previous
sentence.
Centering transitions:
Continue: The same entity remains central.
Smooth Shift: A new entity becomes central with minimal disruption.
Rough Shift: A sudden change causes loss of coherence.
Example of Centering Theory
Sentence 1: John went to the store. He bought some milk.
"John" is the Cb (backward-looking center).
"He" refers to John, maintaining coherence (continuation).
Sentence 2: John went to the store. The cashier gave him a receipt.
The focus begins to shift from "John" to "the cashier", while "him" still links back to John (a smooth shift rather than a rough one).
If an unrelated subject were introduced abruptly, coherence would be disrupted.
2. Entity-Based Coherence Modeling
Entity-Based Coherence Modeling focuses on tracking references to entities
throughout a text. It helps determine coherence using the distribution and
recurrence of entities.
How It Works
Identify Named Entities (NE) in a document (e.g., persons, locations,
organizations).
Track Repeated Mentions: A coherent text repeatedly refers to the same entities
in a structured way.
Measure Entity Transition Patterns: If an entity suddenly disappears or is
replaced without logical progression, coherence might be lost.
Example of Entity-Based Coherence
Compare these two texts:
❌ Incoherent Passage: "John loves football. The weather was cold. Mike plays
tennis." (No clear connection between ideas—topics change abruptly.)
✅ Coherent Passage: "John loves football. He watches it every weekend. His
favorite team is Manchester United." (Entities—"John" and "football"—are
consistently referenced.)
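A rough sketch of entity tracking with spaCy (assuming the en_core_web_sm model is installed) could compare the two passages above by counting repeated entity mentions; a full entity-grid model would go further, as noted in the comments.

```python
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")

coherent = ("John loves football. He watches it every weekend. "
            "His favorite team is Manchester United.")
incoherent = "John loves football. The weather was cold. Mike plays tennis."

for label, text in [("coherent", coherent), ("incoherent", incoherent)]:
    doc = nlp(text)
    mentions = Counter(ent.text for ent in doc.ents)   # named-entity mentions per passage
    print(label, dict(mentions))

# A fuller entity-grid model would also resolve pronouns to their entities and
# track each mention's grammatical role (subject, object, other) across sentences,
# not just raw mention counts.
```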
How These Models Help in NLP Applications
Text Summarization: Ensures summaries maintain key entities and coherence.
Dialogue Systems: AI chatbots use Centering Theory to maintain conversation
flow.
Machine Translation: Prevents incorrect entity shifts in translations.
Automated Essay Grading: Assesses whether an essay maintains coherent entity references and a logical flow of ideas.
