Introduction

The document provides an overview of natural language processing including why text is important, tasks in NLP like understanding and generation, and applications such as machine translation, question answering, and information extraction.

Uploaded by

saisuraj1510
Copyright
© All Rights Reserved

Natural Language Processing & Applications
Why Text?

[Figures omitted. Sources: www.pinterest.com, RECOMND, lifeboat.com]
Natural Language Processing

• Language is a hallmark of human intelligence.
• NLP = Natural Language Understanding + Natural Language Generation
• Processes information contained in natural language text.
• Also known as Computational Linguistics (CL), Human Language Technology (HLT), or Natural Language Engineering (NLE)
• Can machines understand human language?
Ultimate goal: analyze, understand, and generate human languages just as humans do.

Fitting in CS taxonomy
Computers
• Databases
• Algorithms
• Networking
• Artificial Intelligence
  • Robotics
  • Natural Language Processing
  • Search

NLP Tasks

Natural Language Understanding
Taking a spoken or typed sentence and working out what it means.

Natural Language Generation
Taking a formal representation of what you want to say and working out a way to express it in a natural (human) language (e.g., English).

Working towards

• Applying computational techniques to the language domain.
• Using the resulting theories to build systems that are of social use.
• Making computers learn our language rather than us learning theirs.

Natural language understanding
Input: raw speech signal / raw text
• Speech recognition: yields the sequence of words spoken / written
• Syntactic analysis: yields the structure of the sentence
• Semantic analysis: yields a partial representation of the meaning of the sentence
• Discourse & pragmatic analysis: yields the final representation of the meaning of the sentence

Need for Language Technologies – In Daily life

A computer could be used for:
• Answering the phone and replying to a question
• Translating a daily newspaper
• Reading the whole newspaper and reporting only the important news
• Automatically generating movie subtitles
• Correcting sentences
• Grading answers to descriptive questions
• Understanding text in journals and books and building an expert system

Application Areas
➢ Machine Translation
➢ Information Retrieval
Selecting from a set of documents the ones that are relevant to a query
➢ Text Categorization
Classifying text into fixed topic categories
➢ Question Answering
➢ Information Extraction
Converting unstructured text into structured data

Application Areas (cont.)
➢ Spoken language control systems
➢ Spelling and grammar checkers
➢ Sentiment Analysis
➢ Text-to-Speech & Speech Recognition
➢ Natural Language Dialogue Interfaces to Databases

Question Answering

Source: Google
Information Retrieval

• NLP improves web search.
• Search for ‘Jaguar’: the animal or the car maker?
• Search for ‘Apple’: the fruit or the company?
• A search for ‘notebook’ should also find ‘laptop’.
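
The "search for notebook, find laptop" idea can be sketched as a tiny inverted index with query expansion. This is only an illustration: the `SYNONYMS` table, the sample documents, and the function names are assumptions made up for the example, not part of any real search engine.

```python
from collections import defaultdict

# Hypothetical synonym table; a real system would use a thesaurus or embeddings.
SYNONYMS = {"notebook": {"laptop"}, "laptop": {"notebook"}}

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents matching any query term or one of its synonyms."""
    hits = set()
    for term in tokenize(query):
        for variant in {term} | SYNONYMS.get(term, set()):
            hits |= index.get(variant, set())
    return sorted(hits)

# Made-up documents for illustration.
docs = {
    1: "cheap laptop deals",
    2: "jaguar speed in the wild",
    3: "paper notebook for school",
}
index = build_index(docs)
print(search(index, "notebook"))  # matches both the laptop and the notebook document
```

Note that synonym expansion alone does not resolve the ‘Jaguar’ or ‘Apple’ cases; those need word sense disambiguation.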

Source: Google
Email Spam Filtering/Categorizing

Source : junkemailfilter.com

Text Categorization
• Assign a label to a document that represents its content (e.g., an ACM keyword or a Yahoo category).
• E.g., decide whether a newspaper article is about politics, business, or sports.
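
A minimal sketch of the newspaper example, assuming a hand-written keyword list per category. The keyword sets and the `categorize` function are hypothetical; practical categorizers usually learn such evidence from labelled documents instead.

```python
import re

# Hypothetical keyword lists; a real categorizer would learn these from labelled data.
CATEGORY_KEYWORDS = {
    "politics": {"election", "minister", "parliament", "vote"},
    "business": {"market", "shares", "profit", "bank"},
    "sports": {"match", "team", "goal", "cricket"},
}

def categorize(article):
    """Return the category whose keywords overlap most with the article, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", article.lower()))
    scores = {label: len(words & keywords) for label, keywords in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(categorize("The team scored a late goal to win the match"))
```

Counting keyword overlap is about the simplest possible scoring rule; it is used here only to make the task concrete.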

Source: Medium
Machine Translation
• Multilingual Usage
• Machine-assisted human Translation
• Scope: creating language resources.

Source: www.localizer.co

Source: Google

Duplicate Question detection

Knowledge Extraction

Source: http://aritter.github.io

Information Extraction
Information extraction systems
• Find and understand relevant parts of text.
• Produce a structured representation of the relevant information from text, in the form of:
  • entities,
  • relations between entities,
  • events in which the entities are involved.
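
As a rough illustration of turning unstructured text into a structured event record, the sketch below uses a single regular expression. The sentence template, the company names, and the field names are all assumptions invented for the example; real IE systems rely on far more robust entity and relation recognizers.

```python
import re

# Assumed sentence template: "<Buyer> acquired <Target> for $<amount> million".
PATTERN = re.compile(
    r"(?P<buyer>[A-Z]\w+) acquired (?P<target>[A-Z]\w+) for \$(?P<amount>\d+) million"
)

def extract_takeovers(text):
    """Turn every sentence that fits the template into a structured event record."""
    events = []
    for match in PATTERN.finditer(text):
        events.append({
            "event": "takeover",               # event type
            "buyer": match.group("buyer"),     # entity
            "target": match.group("target"),   # entity
            "amount_musd": int(match.group("amount")),
        })
    return events

# Made-up text for illustration.
text = "Yesterday Acme acquired Globex for $50 million. Analysts were surprised."
print(extract_takeovers(text))
```

The record holds two entities (buyer, target) and the event that relates them, mirroring the three kinds of output listed above.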

Information Extraction

Source: cs.washington.edu
Applications of IE Systems

• Extracting diagnoses, symptoms, physical findings, and test results from medical patient records
• Gathering earnings, profits, board members, etc. from company reports
• Automatic verification of construction industry specification documents
• Extracting details from real estate advertisements
• Building job databases from textual job vacancy postings
• Extracting company take-over events
• Categorizing customer feedback by product name
• Extracting locations from social media texts for security applications

Semantic Web

• Linked Data
• Vocabularies / Domain Information
• Inference
• Query

Source: Google

Tools
• Apache OpenNLP: a Java machine learning toolkit for natural language processing.
• OpenCalais: tags the people, places, companies, facts, and events in content to increase its value, accessibility, and interoperability.
• DBpedia Spotlight: a tool for automatically annotating mentions of DBpedia resources in text.
• Natural Language Toolkit (NLTK): a suite of libraries and programs for NLP.
• General Architecture for Text Engineering (GATE)
• spaCy: a free, open-source library with a Python API, designed for speed and accuracy.
• Stanford CoreNLP: a Java annotation pipeline framework that provides most of the common core NLP steps, from tokenization through to coreference resolution.
Aspects of Language Processing
• Phonology
• Word, lexicon: lexical analysis
  • Morphology, word segmentation
• Syntax
  • Sentence structure, phrase, grammar, …
• Semantics
  • Meaning
• Discourse analysis
  • Meaning of a text
  • Relationship between sentences
• Pragmatics
  • The study of meaning in different contexts of use
Phonology
Speech processing
• Humans process speech remarkably well.
• A speech interface can replace keyboards and monitors.
• Speech recognition converts acoustic signals to text.
• Phoneme: the smallest recognizable unit of speech in a language.
• Grapheme: a way of writing down a phoneme.
Applications: speech recognition, text-to-speech conversion

Lexical Analysis - Morphology

Humorous mis-segmentations that show why morphological analysis needs care:
• Delegate (de + leg + ate): "take the legs from"
• Cashier (cashy + er): "more wealthy"

Source: www.pinterest.co.uk
Morphology
• Studies structures and patterns in words.
• A word is a sequence of morphemes.
• Morpheme: the smallest meaningful unit in a word.
• Analyses how words are formed from morphemes, e.g., dogs = dog + s.
• Inflectional morphology: keeps the same part of speech.
  • Buses = Bus + es
  • Carried = Carry + ed
• Derivational morphology: changes the part of speech.
  • Destruct + ion = Destruction (noun)
  • Beauty + ful = Beautiful (adjective)
• Affixes: prefixes and suffixes. Rules govern how they fuse with the stem.

Applications: spell checkers, lemmatization, information retrieval
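
The inflectional examples above (dogs, buses, carried) can be reproduced with a few suffix rules. This is a sketch under strong assumptions: the `RULES` table is hypothetical, covers only these patterns, and over-segments ordinary words ending in "s" (for instance, it would split "bus" into "bu" + "s").

```python
# Hypothetical rules: (surface suffix, text to restore on the stem, morpheme label).
# Order matters: longer, more specific suffixes come first.
RULES = [
    ("ied", "y", "ed"),  # carried -> carry + ed
    ("ses", "s", "es"),  # buses   -> bus + es
    ("s", "", "s"),      # dogs    -> dog + s
]

def segment(word):
    """Split an inflected word into (stem, suffix); return (word, '') if no rule applies."""
    for surface, restore, morpheme in RULES:
        # Require at least two characters before the suffix to avoid tiny stems.
        if word.endswith(surface) and len(word) > len(surface) + 1:
            return word[: -len(surface)] + restore, morpheme
    return word, ""

for w in ["dogs", "buses", "carried", "dog"]:
    print(w, segment(w))
```

The "restore" column is where the fusion rules show up: "carried" loses its "y" on the surface, so the rule puts it back when recovering the stem.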


Syntax
• Words convey more when they are put together.
• Syntax is the grammatical structure of the sentence.
• Syntactic analysis (parsing) is the process of assigning a parse tree to a sentence.
• Parsing, given a sentence and a grammar:
  • checks that the sentence is correct according to the grammar,
  • returns a parse tree representing the structure of the sentence.

Applications: grammar checking tools, information extraction, phrase identification


Syntactic Analysis - Grammar
sentence -> noun_phrase, verb_phrase
noun_phrase -> proper_noun
noun_phrase -> determiner, noun
verb_phrase -> verb, noun_phrase

proper_noun -> [mary]
noun -> [apple]
verb -> [ate]
determiner -> [the]
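
The grammar above can be run with a small top-down parser. The sketch below transcribes the rules into Python dictionaries; the function names (`parse`, `match_sequence`, `parse_sentence`) are illustrative choices, and the approach only works because this grammar has no left-recursive rules.

```python
# The grammar above as Python data: each non-terminal maps to its alternatives.
GRAMMAR = {
    "sentence": [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["proper_noun"], ["determiner", "noun"]],
    "verb_phrase": [["verb", "noun_phrase"]],
}
LEXICON = {
    "proper_noun": {"mary"},
    "noun": {"apple"},
    "verb": {"ate"},
    "determiner": {"the"},
}

def parse(symbol, words, pos=0):
    """Yield (tree, next_pos) for every way `symbol` can cover words[pos:next_pos]."""
    if symbol in LEXICON:
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield (symbol, words[pos]), pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        for children, end in match_sequence(rhs, words, pos):
            yield (symbol, *children), end

def match_sequence(symbols, words, pos):
    """Yield (list of subtrees, next_pos) for the symbols matched in order."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in parse(symbols[0], words, pos):
        for rest, end in match_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end

def parse_sentence(text):
    """Return the first complete parse tree, or None if the grammar rejects the sentence."""
    words = text.lower().split()
    for tree, end in parse("sentence", words):
        if end == len(words):
            return tree
    return None

print(parse_sentence("Mary ate the apple"))
print(parse_sentence("ate Mary the apple"))  # rejected by the grammar
```

The grammar checks structure only: "the apple ate Mary" is also accepted, which is why semantic analysis is a separate step.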

Parsing
• Analyze the structure of a sentence

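Parse tree for "The student put the book on the table", in bracket notation:

[S [NP [D The] [N student]]
   [VP [V put]
       [NP [D the] [N book]]
       [PP [P on] [NP [D the] [N table]]]]]
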
Semantic Analysis
• What do you mean?
• Words: lexical semantics
• Sentences: compositional semantics
• Converts syntactic structures into a semantic format (a meaning representation).
• Semantics: the meaning of a word or phrase within a sentence

Applications: event extraction, knowledge base construction

Semantic Representations
• The meaning representation of a sentence is built from its syntactic structure(s).
• Ways of representing the meaning of a sentence:
  • Logical forms
    Sentence: A tall man plays basketball
    Representation: ∃x man(x) ∧ tall(x) ∧ plays(x, basketball)
  • Semantic role labelling
    Sentence: Three people were killed when X fired a gun.
    Representation: Kill(Agent: X, Victim: 3 people, Instrument: Gun)
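
One reason logical forms are useful is that they can be evaluated against a model of the world. The sketch below checks the formula for "A tall man plays basketball" against a toy model; the individuals and facts are invented purely for illustration.

```python
# A toy world model (hypothetical facts, for illustration only).
MAN = {"raj", "tom"}
TALL = {"raj"}
PLAYS = {("raj", "basketball"), ("tom", "chess")}

def a_tall_man_plays(sport):
    """Evaluate: there exists x such that man(x) and tall(x) and plays(x, sport)."""
    return any(x in TALL and (x, sport) in PLAYS for x in MAN)

print(a_tall_man_plays("basketball"))  # raj satisfies all three predicates
print(a_tall_man_plays("chess"))       # tom plays chess but is not tall
```

The existential quantifier becomes `any(...)` over the candidate individuals, and each predicate becomes a set-membership test.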

Discourse Analysis
• The meaning of an individual sentence may depend on the sentences that precede it and may influence the meaning of the sentences that follow it.
• Issues related to discourse integration:
  • Anaphora: resolving what a pronoun refers to (co-reference resolution)
  • Ellipsis: incomplete sentences
• Anaphora examples:
  • I read the book by Dr. Kalam. It was great.
  • He hits the car with a stone. It bounces back.

Anaphora Resolution
• Anaphora resolution (AR) is the process of determining the antecedent of an anaphor.
• Anaphor: the expression that refers back to an earlier item (he, it).
• Antecedent: the entity to which the anaphor refers (John, ice cream).
• Example: “Mary bought a book for Kelly. She didn’t like it.”
  • Does “She” refer to Mary or to Kelly?
  • “It” refers to the book.
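
A naive way to see why "She" is hard and "it" is easy in the Mary/Kelly example is to filter earlier mentions by a single agreement feature. This is a sketch under stated assumptions: the `animate` feature is hand-written here, whereas a real resolver would have to infer such features and use many more cues.

```python
# Hand-written features for the example sentence; a real system would infer these.
ENTITIES = [  # in order of mention
    {"text": "Mary", "animate": True},
    {"text": "book", "animate": False},
    {"text": "Kelly", "animate": True},
]
# Pronoun -> whether it needs an animate antecedent (simplified assumption).
PRONOUNS = {"she": True, "he": True, "it": False}

def candidates(pronoun, entities):
    """Return earlier mentions compatible with the pronoun, most recent first."""
    wants_animate = PRONOUNS[pronoun.lower()]
    matches = [e["text"] for e in entities if e["animate"] == wants_animate]
    return matches[::-1]

print(candidates("She", ENTITIES))  # two candidates remain: the pronoun is ambiguous
print(candidates("it", ENTITIES))   # a single candidate remains
```

Agreement features narrow the candidate list, but choosing between Mary and Kelly needs further knowledge about who would plausibly dislike the book.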

Discourse Structures - Ellipsis

• Ellipsis: incomplete sentences
• “What’s your name?”
• “Sri, and yours?”

The second sentence is incomplete, but its meaning can be inferred from the first.

Pragmatics
• Uses the context of the utterance
• Where, by whom, to whom, why, and when it was said
• Intentions: inform, request, promise, criticize, …

Challenges in NLP: Ambiguity
Morphology
• Words with different parts of speech (POS)
Ex. Issue (I have an issue / Please issue a ticket)
• Words with different meanings but the same POS
Ex. Bank (river bank, Indian Bank)

Syntax Ambiguity

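Two parse trees for "Teacher strikes idle kids", in bracket notation:

Reading 1 ("strikes" as noun, "idle" as verb):
[S [NP [N Teacher] [N strikes]] [VP [V idle] [NP [N kids]]]]

Reading 2 ("strikes" as verb, "idle" as adjective):
[S [NP [N Teacher]] [VP [V strikes] [NP [Adj idle] [N kids]]]]
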
Attachment Ambiguity

• A sentence has attachment ambiguity if a constituent fits more than one position in a parse tree.
• Attachment ambiguity arises from uncertainty about which part of the sentence a phrase or clause attaches to.
• “John saw Mary with a telescope”
  • John saw (Mary with a telescope): Mary has the telescope.
  • John (saw Mary) (with a telescope): John used the telescope to see her.
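
The two readings can be found mechanically by enumerating every parse tree a grammar allows. The sketch below assumes a small binary grammar written for this one sentence (the rule set and tag names are illustrative) and searches over all ways of splitting each span, in the spirit of chart parsing.

```python
from functools import lru_cache

# Hypothetical binary grammar: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # PP attaches to the verb phrase
    ("NP", "NP", "PP"),  # PP attaches to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {"john": "NP", "mary": "NP", "saw": "V", "with": "P", "a": "Det", "telescope": "N"}

def all_parses(sentence):
    """Return every parse tree of the sentence rooted at S."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def trees(label, i, j):
        """All trees rooted at `label` that cover words[i:j]."""
        if j - i == 1:
            return ((label, words[i]),) if LEXICON.get(words[i]) == label else ()
        found = []
        for parent, left, right in BINARY:
            if parent != label:
                continue
            for k in range(i + 1, j):  # try every split point
                for left_tree in trees(left, i, k):
                    for right_tree in trees(right, k, j):
                        found.append((label, left_tree, right_tree))
        return tuple(found)

    return trees("S", 0, len(words))

for tree in all_parses("John saw Mary with a telescope"):
    print(tree)
```

The two rules that introduce a PP are what create the ambiguity: removing either one leaves a single parse.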

Semantic Ambiguity
• The meaning of the words themselves can be misinterpreted.
• Example 1: “The car hit the pole while it was moving.”
  • The car, while moving, hit the pole.
  • The car hit the pole while the pole was moving.
• Example 2: “I saw the Prudential building flying into Boston.”
  • The speaker was flying into Boston and saw the building.
  • The building was flying into Boston.
• Resolving such cases needs semantic restrictions and domain knowledge (an ontology).

Sample Ontology

Discourse Ambiguity

“We gave the monkeys the bananas because they were hungry” (they = the monkeys)

“We gave the monkeys the bananas because they were over-ripe” (they = the bananas)

Pragmatic Ambiguity

“You’re late”

What is the speaker’s intention: informing or criticizing?

Enabling Computing Techniques
• Stemming: reduce words to their base form.
• Part-of-speech tagging: determine for each word whether it is a noun, adjective, verb, …
• Parsing: sentence to parse tree.
• WordNet: a lexical database with 206,941 word-sense pairs.
• Word sense disambiguation: e.g., bank (financial bank vs. river bank).
• Semantic similarity metrics
• Vector representations of words and sentences
• Neural network based models: Word2Vec, GloVe, ELMo.
• Pretrained models: BERT, etc.
• Large language models: GPT, Llama, etc.
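
Word sense disambiguation for "bank" can be sketched with a simplified gloss-overlap heuristic in the style of the Lesk algorithm. The glosses and stopword list below are made up for the example; a real system would take sense definitions from a resource such as WordNet.

```python
import re

# Hypothetical one-line glosses; a real system would take them from WordNet.
SENSES = {
    "financial bank": "an institution that accepts deposits and lends money",
    "river bank": "sloping land beside a river or stream of water",
}
STOPWORDS = {"a", "an", "and", "of", "or", "that", "the", "to", "in", "on", "at", "i", "my"}

def content_words(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def disambiguate(context):
    """Pick the sense whose gloss shares the most content words with the context."""
    ctx = content_words(context)
    # On a tie, max() keeps the first sense listed in SENSES.
    return max(SENSES, key=lambda sense: len(ctx & content_words(SENSES[sense])))

print(disambiguate("I went to the bank to deposit money"))
print(disambiguate("We sat on the bank of the river near the water"))
```

Exact word overlap is brittle ("deposit" does not match "deposits" here), which is one motivation for combining it with stemming or with vector representations of words.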
Conclusion

• Complete human-level natural language understanding is still a distant goal.
• Develop algorithms for each level.
• Find an appropriate match between the application domain and the available methods.
