
CCNLP UNIT 3

1
2
SEMANTIC ANALYSIS

Semantic analysis examines the structure of sentences, including the arrangement of words, phrases, and clauses, to determine the relationships between individual terms in a specific context and derive their meaning. It is a crucial task in natural language processing (NLP) systems.

3
TYPES OF AMBIGUITY

• Lexical Ambiguity results when a word has more than one possible meaning. For example, “board” could mean the verb “to get on” or it could refer to a flat slab of wood.

• Syntactic Ambiguity is present when more than one parse of a sentence exists.
Eg. “He lifted the branch with the red leaf.”
The verb phrase may contain “with the red leaf” as part of the embedded noun phrase describing the branch, or “with the red leaf” may be interpreted as a prepositional phrase describing the action instead of the branch, implying that he used the red leaf to lift the branch.

• Semantic Ambiguity exists when more than one possible meaning exists for a sentence, as in “He lifted the branch with the red leaf.” It may mean that the person in question used a red leaf to lift the branch, or that he lifted a branch that had a red leaf on it.

4
• Referential Ambiguity results from referring to something without explicitly naming it, using words like “it”, “he” and “they”. These words require the referent to be looked up and may be impossible to resolve. For example, in the sentence “The interface sent the peripheral device data which caused it to break”, the word “it” could refer to the peripheral device, the data, or the interface.

• Local Ambiguity occurs when a part of a sentence is unclear but is resolved when the sentence as a whole is examined.
Eg. “this hall is colder than the room”
exemplifies local ambiguity, as the phrase “is colder than” is indefinite until “the room” is defined.

5
SEMANTIC ANALYSIS

• The purpose of semantic analysis is to derive the exact, or dictionary, meaning from the text.
• The task of the semantic analyzer is to check the text for meaningfulness.
Eg. “Rama writes a book.” is meaningful, whereas “A book writes Rama.” is not.

• Lexical analysis checks the semantics of individual words; semantic analysis checks larger units of text (groups of words).

6
RELATIONS BETWEEN WORDS

• Hyponymy
It is the relationship between a generic term and instances of that generic term. The generic term is called the hypernym and its instances are called hyponyms.
Eg. the word “color” is a hypernym, and the colors blue, yellow, etc. are hyponyms.

• Homonymy
Homonyms are words having the same spelling or the same form but different and unrelated meanings.
Eg. the word “bat” is a homonym because it can refer to an implement used to hit a ball or to a nocturnal flying mammal.
7
• Polysemy
Polysemy means “many signs”. It refers to a word with different but related senses: the same spelling but different and related meanings.
Eg. the word “bank” is polysemous, having the following meanings −
• A financial institution.
• The building in which such an institution is located.
• The verb “to bank on”, meaning to rely on.

• Synonymy
• It is the relation between two lexical items having different forms but expressing the same or a closely related meaning. Examples are ‘author/writer’, ‘fate/destiny’.
8
• Antonymy
• It is the relation between two lexical items whose semantic components are symmetric relative to an axis. The scope of antonymy is as follows −
• Application of a property or not − Examples are ‘life/death’, ‘certitude/incertitude’
• Application of a scalable property − Examples are ‘rich/poor’, ‘hot/cold’
• Application of a usage − Examples are ‘father/son’, ‘moon/sun’.
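
As a quick illustration of these lexical relations, the snippet below is a minimal sketch using NLTK's WordNet interface; the specific synset names (color.n.01, writer.n.01, rich.a.01) are assumptions chosen for the examples above and may differ in your WordNet version.

# Minimal sketch: exploring lexical relations with NLTK's WordNet interface.
# Assumes nltk is installed and the WordNet data has been downloaded.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

# Hyponymy / hypernymy: "color" is a hypernym of specific colors.
color = wn.synset("color.n.01")
print([s.name() for s in color.hyponyms()][:5])

# Homonymy / polysemy: "bat" and "bank" each have several senses.
print([s.definition() for s in wn.synsets("bat")][:2])
print([s.definition() for s in wn.synsets("bank")][:3])

# Synonymy: lemmas of the same synset are synonyms ("writer"/"author").
print(wn.synset("writer.n.01").lemma_names())

# Antonymy: lemma-level antonyms ("rich"/"poor").
rich = wn.synset("rich.a.01").lemmas()[0]
print([a.name() for a in rich.antonyms()])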

9
BUILDING BLOCKS OF A SEMANTIC SYSTEM

• To create a representation of the meaning of a sentence, semantic analysis uses the following elements.
• Entities − represent individuals such as a particular person, location, etc.
eg. Pune, apple, Rama are all entities.
• Concepts − represent the general category of individuals, such as a person, city, fruit, etc.
• Relations − represent the relationship between entities and concepts.
eg. “Apple is a fruit.”
• Predicates − represent the verb structures.
eg. semantic roles and case grammar are examples of predicates.

10
HOW TO REPRESENT MEANING

• Semantic analysis uses the following approaches for the representation of meaning −
• First order predicate logic (FOPL)
• Semantic Nets
• Frames
• Conceptual dependency (CD)
• Rule-based architecture
• Case Grammar
• Conceptual Graphs

11
SYNTAX-DRIVEN SEMANTIC ANALYZER

• Establish an analogy with syntax-directed translation: each grammar rule is paired with a semantic attachment that computes the meaning of the rule's left-hand side from the meanings of (some of) its constituents.

A → α1 … αn    { f(αj.sem, …, αk.sem) }

Fig 18.4
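
As a brief illustration (a common textbook-style set of attachments, simplified; not necessarily the exact rules of Fig 18.4):

ProperNoun → Rama      { Rama }
Verb → writes          { λy.λx. Writes(x, y) }
NP → ProperNoun        { ProperNoun.sem }
VP → Verb NP           { Verb.sem(NP.sem) }
S → NP VP              { VP.sem(NP.sem) }

Applying these attachments to “Rama writes a book” (treating “a book” simply as the constant Book) yields the meaning Writes(Rama, Book).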

12
LEXICAL SEMANTICS

• The first part of semantic analysis, representing the meaning of individual words, is called lexical semantics. It includes words, sub-words, affixes (sub-units), compound words and phrases. All the words, sub-words, etc. are collectively called lexical items.

Following are the steps involved in lexical semantics −

• Classification of lexical items like words, sub-words, affixes, etc.
• Decomposition of lexical items like words, sub-words, affixes, etc.
• Analysis of the differences as well as the similarities between various lexical semantic structures.

13
Entity and event resolution

14
ENTITY AND EVENT RESOLUTION

• Entity extraction, or named entity recognition (NER), is finding mentions of key “things” (aka “entities”) such as people, places, organizations, dates, and times within text. Entity mentions are the words in text that refer to entities, such as “Bill Clinton,” “White House,” and “U.S.”
• Entity resolution (aka entity linking) takes it one step further and distinguishes between similarly named entities such as George W. Bush and George H. W. Bush. Or, from the mention of “Clinton,” it figures out within that document whether “Clinton” refers back to Bill Clinton or Hillary Clinton by looking at the context in which the entity appears (aka coreference resolution).
• This is possible because entity resolution takes the mention of each entity, looks at the surrounding context, and compares it to a knowledge base (such as Wikidata). For example, if the entity is “Neil Armstrong,” is he mentioned in the context of “American, NASA, astronaut” or “Canadian, NHL, referee”?

16
COREFERENCE RESOLUTION
Coreference resolution is the task of finding all expressions that refer to the same entity in a text.

It is an important step for a lot of higher-level NLP tasks that involve natural language understanding, such as document summarization, question answering, and information extraction.
Predicate argument structure in NLP?

• Predicate argument structure is based on the function features of lexical items (most often verbs).

• The function features determine the thematic roles to be played by the other words in the sentence.

• However, function features and thematic roles don't always coincide.
18
Word Sense Disambiguation (WSD)

19
WORD SENSE DISAMBIGUATION (WSD)
• Words have different meanings based on the context of their usage in the sentence. Word sense disambiguation (WSD), in natural language processing (NLP), is the process of deciding which meaning of a word is activated by its use in a particular context.
• Lexical ambiguity, syntactic or semantic, is one of the very first problems that any NLP system faces.
• Part-of-speech (POS) taggers with a high level of accuracy can resolve a word's syntactic ambiguity.
• The problem of resolving semantic ambiguity is called WSD (word sense disambiguation) – the task of selecting the correct sense for a word. Resolving semantic ambiguity is harder than resolving syntactic ambiguity.
Eg. I can hear bass sound.
He likes to eat grilled bass.
After disambiguation with WSD, the correct meanings are assigned as follows:
I can hear bass/frequency sound.
He likes to eat grilled bass/fish.

20
– Sita has a strong interest in computational linguistics.
– Sita pays a large amount of interest on her credit card.

• Applications of WSD
- question answering
- information retrieval
- text classification

21
• A WSD system / algorithm takes as input a word in context along with a fixed inventory (storage / dictionary) of potential word senses, and returns as output the correct word sense for that use.
• The inventory of senses depends on the task at hand –
- For machine translation from English to Spanish, the sense inventory for a word might be the set of its possible Spanish translations.

• Basic components for word sense disambiguation:

- Dictionary – used to specify the senses to be disambiguated

- Test corpus – it is of two types:
• Lexical sample: the occurrences of a small sample of target words need to be disambiguated, and
• All-words: all the words in a piece of running text need to be disambiguated.

22
DIFFERENT APPROACHES TO WSD
• Dictionary and knowledge-based methods
These rely primarily on dictionaries, thesauri, and lexical knowledge bases, without using any corpus evidence (a simplified Lesk sketch follows this list).
• Supervised methods
These make use of sense-annotated corpora for training.
• Semi-supervised methods
These methods require a very small amount of annotated text and a large amount of plain unannotated text.
• Unsupervised methods
This approach works directly from raw, unannotated corpora.
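
As a concrete example of the dictionary/knowledge-based family, the snippet below is a minimal sketch of the simplified Lesk algorithm as shipped in NLTK; the example sentences come from the earlier slide, and which WordNet synset is returned depends on the WordNet sense inventory.

# Minimal sketch: dictionary/knowledge-based WSD with NLTK's simplified Lesk.
# Assumes nltk and its WordNet data are installed.
from nltk.wsd import lesk

sent1 = "I can hear bass sound".split()
sent2 = "He likes to eat grilled bass".split()

# lesk() picks the WordNet sense whose gloss overlaps most with the context words.
print(lesk(sent1, "bass"))              # the synset chosen for this context
print(lesk(sent2, "bass", pos="n"))     # restricting candidates to noun senses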

23
SUPERVISED WORD SENSE DISAMBIGUATION

• For a lexical sample task, a small pre-selected set of target words is chosen, along with an inventory of senses for each word.
• As the set of words and the set of senses are small, supervised ML approaches are used for the lexical sample task.
• For each word, a number of instances (from the corpus) can be selected and hand-labeled with the correct sense of the target word in each.
• Classifiers can be trained on the labeled examples. The trained classifiers can then be used to label unlabeled occurrences of the target words.

• For the all-words task, systems are given the entire text and a lexicon with an inventory of senses for each entry, and are required to disambiguate every content word in the text.
This task is similar to POS tagging, but the scope is larger: the set of tags (senses) is much larger.

24
• Supervised WSD uses data hand-labeled with correct word senses.
• Hence the supervised WSD approach extracts features from the text that are helpful in predicting particular senses.
• Based on these features, a classifier is trained.
• This classifier is used to assign senses to unlabeled words in context.
• A commonly used classifier is Naïve Bayes (a minimal sketch follows below).

• Hand-labeled corpora available –

• Line-hard-serve corpus: about 4,000 sense-tagged examples (for the lexical sample task)
• Interest corpus: 2,369 examples
• SemCor: a corpus with 234,000 manually tagged examples (for the all-words task)
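
As a rough illustration of this pipeline, here is a minimal sketch for one target word with toy hand-labeled data and bag-of-words context features (the sentences and sense labels are invented, not taken from the corpora above), using scikit-learn's Naïve Bayes.

# Minimal sketch: supervised WSD for the target word "bass" with Naive Bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Each training instance is the context of one occurrence of "bass",
# hand-labeled with its sense.
contexts = [
    "I can hear the bass sound from the speakers",
    "turn up the bass on the amplifier",
    "he likes to eat grilled bass",
    "they caught a large bass in the lake",
]
senses = ["frequency", "frequency", "fish", "fish"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(contexts, senses)

# Assign a sense to a new, unlabeled occurrence of the target word.
print(clf.predict(["turn up the bass sound"])[0])   # -> 'frequency'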
25
MEANING REPRESENTATION IN NLP

• Meaning representation can be seen as a formal structure capturing the meaning of linguistic input.

• The semantics, or meaning, of an expression in natural language can be abstractly represented as a logical form. Once an expression has been fully parsed and its syntactic ambiguities resolved, its meaning should be uniquely represented in logical form.

26
HOW TO REPRESENT THE MEANING OF A SENTENCE

• There are four commonly used meaning representation languages:

• First-order logic
• Abstract Meaning Representation (AMR) using a directed graph
• Abstract Meaning Representation (AMR) using the textual form
• Frame-based or slot-filler representation
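
As a brief worked example (an illustrative sentence of my own, in standard first-order logic notation), the sentence “Every student reads a book” can be represented as

∀x (Student(x) ⇒ ∃y (Book(y) ∧ Reads(x, y)))

i.e. for every x that is a student, there exists some y that is a book and that x reads.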

27
28
VECTOR SEMANTICS AND EMBEDDINGS

29
VECTOR SEMANTICS AND EMBEDDINGS

• Vector semantics represents a word as a point in a multi-dimensional vector space.

• Vector models are also called embeddings, due to the fact that each word is embedded in a particular vector space.
• Vector models offer many advantages in NLP.

30
VECTOR SEMANTICS AND EMBEDDINGS

31
• The following figure shows a visualization of embeddings learned for sentiment analysis, showing the location of selected words projected down from a 60-dimensional space into a two-dimensional space. Notice the distinct regions containing positive words, negative words, and neutral function words.

32
33
FEATURE EXTRACTION AND EMBEDDINGS IN NLP

1. One-hot encoding:
• For better analysis of the text we want to process, we need a numerical representation of each word.

• This can be achieved using the one-hot encoding method.

• Here each word is treated as a class: wherever the word occurs in a document we assign it a 1 in the table, and all other words in that document get 0.
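
A minimal sketch of one-hot encoding over a toy vocabulary (the sentence is just an example):

# Minimal sketch: one-hot encoding a toy vocabulary with plain Python.
sentence = "the cat sat on the mat"
vocab = sorted(set(sentence.split()))          # ['cat', 'mat', 'on', 'sat', 'the']
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    # A vector of zeros with a single 1 at the word's position.
    vec = [0] * len(vocab)
    vec[index[word]] = 1
    return vec

for w in sentence.split():
    print(w, one_hot(w))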
34
35
WORD EMBEDDING:
• One-hot encoding works well when we have a small set of data. When there is a huge vocabulary, we can't encode it using this method, as the complexity increases a lot.

• We require a method that can control the size of the word representation. We do this by limiting it to a fixed-sized vector.

• We want to find an embedding for each word.
Eg: if two words are similar, they must be closer to each other in the representation, and two pairs of opposite words, if they exist, should have the same difference of distances. These properties help us find synonyms, analogies, etc.
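
A small sketch of this distance intuition, using hand-picked toy 2-D vectors rather than learned embeddings:

# Minimal sketch: cosine similarity and the "same difference" analogy intuition,
# with hand-made toy vectors standing in for real embeddings.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8]),
    "queen": np.array([0.9, 0.2]),
    "man":   np.array([0.7, 0.9]),
    "woman": np.array([0.7, 0.3]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["king"], emb["queen"]))   # similar words: high similarity
print(emb["king"] - emb["man"])            # roughly equal offsets suggest
print(emb["queen"] - emb["woman"])         # the analogy king:man :: queen:woman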
37
WORD2VEC

• Word2vec is widely used in many NLP models.

• It transforms words into vectors.
• Word2vec is a two-layer network that processes text. The input is a text corpus and the output is a set of vectors:
• feature vectors that represent the words in that corpus.
• Word2vec is not a deep neural network; it converts text into a numerical form that deep neural networks can then work with.
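
For reference, a minimal sketch of training word2vec with the gensim library (one common implementation; the tiny corpus and the parameter values here are only placeholders):

# Minimal sketch: training word2vec with gensim (4.x) on a tiny toy corpus.
from gensim.models import Word2Vec

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "cat", "sat", "on", "the", "mat"],
]

# sg=0 selects CBOW, sg=1 selects skip-gram (both architectures are described below).
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0, epochs=50)

print(model.wv["king"][:5])                   # first few embedding dimensions
print(model.wv.most_similar("king", topn=2))  # nearest neighbours in the toy space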

38
• These models work using context.
• This means that the embedding is learned by looking at nearby words: if a group of words is always found close to the same words, they will end up having similar embeddings.
• Thus, countries will be closely related, so will animals, and so on.

40
• To label how words are close to each other, we first set a window size.

• The window size determines which nearby words we pick. For example, given a window size of 2, for every word we pick the 2 words before it and the 2 words after it:
41
CONSTRUCTING WORD PAIRS

42
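
A minimal sketch of constructing (target, context) word pairs with a window size of 2 (the sentence is just an example):

# Minimal sketch: building (target, context) pairs with a window size of 2.
sentence = "the quick brown fox jumps over the lazy dog".split()
window_size = 2

pairs = []
for i, target in enumerate(sentence):
    # Pick up to `window_size` words before and after the target word.
    start = max(0, i - window_size)
    end = min(len(sentence), i + window_size + 1)
    for j in range(start, end):
        if j != i:
            pairs.append((target, sentence[j]))

print(pairs[:6])
# e.g. [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown'), ...]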
TYPES OF ARCHITECTURES

• Continuous Bag of Words (CBOW)

• Skip-gram

43
• Skip-gram: works well with a small amount of training data and represents even rare words or phrases well.
• CBOW: several times faster to train than skip-gram, with slightly better accuracy for frequent words.

44
CBOW

45
CBOW

• As indicated in the figure, the context words are initially supplied as input to an embedding layer.
• The word embeddings are then transferred to a lambda layer, where they are averaged.
• The averaged embedding is then passed to a dense softmax layer, which predicts the target word. We compute the loss by comparing the prediction with the actual target word, and then run backpropagation in each epoch to update the embedding layer in the process.
• Once training is complete, we may extract the embeddings of the required words from the embedding layer. A minimal sketch of this architecture follows below.
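
A minimal sketch of the CBOW architecture just described, using Keras; the vocabulary size, embedding dimension, and context length are illustrative values, not prescriptions.

# Minimal sketch of CBOW in Keras: embedding -> lambda (average) -> dense softmax.
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size, embed_dim, context_len = 5000, 100, 4   # 2 context words on each side

model = models.Sequential([
    layers.Input(shape=(context_len,), dtype="int32"),
    # Embedding layer: one embedding per context word id.
    layers.Embedding(vocab_size, embed_dim),
    # Lambda layer: average the context-word embeddings.
    layers.Lambda(lambda x: tf.reduce_mean(x, axis=1)),
    # Dense softmax layer: predict the target word over the vocabulary.
    layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()

# After training on (context_word_ids, target_word_id) pairs, the learned
# embeddings can be read off the first trainable layer:
# embeddings = model.layers[0].get_weights()[0]   # shape (vocab_size, embed_dim)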

46
SKIP GRAM

47
• Individual embedding layers are passed both the target and context word of a pair, yielding dense word embeddings for each of these two words.
• The dot product of these two embeddings is computed using a 'merge layer', and the dot product value is obtained.
• The value of the dot product is then transmitted to a dense sigmoid layer, which outputs a value between 0 and 1 (is this a true context word for the target, or not?).
• The output is compared to the actual value or label, the loss is calculated, and backpropagation is used to update the embedding layers at each epoch. A minimal sketch follows below.
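
A minimal sketch of this skip-gram setup (the pairwise, negative-sampling style of training described above), using the Keras functional API; sizes are illustrative.

# Minimal sketch of skip-gram pair scoring: two embeddings -> dot product -> sigmoid.
import tensorflow as tf
from tensorflow.keras import layers, Model

vocab_size, embed_dim = 5000, 100

target_in = layers.Input(shape=(1,), dtype="int32")
context_in = layers.Input(shape=(1,), dtype="int32")

# Individual embedding layers for the target and the context word.
target_emb = layers.Embedding(vocab_size, embed_dim)(target_in)
context_emb = layers.Embedding(vocab_size, embed_dim)(context_in)

# Merge layer: dot product of the two embeddings along the embedding axis.
dot = layers.Dot(axes=2)([target_emb, context_emb])
dot = layers.Flatten()(dot)

# Dense sigmoid layer: 1 for a true (target, context) pair, 0 otherwise.
out = layers.Dense(1, activation="sigmoid")(dot)

model = Model([target_in, context_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()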
48
DISCOURSE PROCESSING

49
50
51
52
COHERENCE

53
COHERENCE

• The discourse is coherent if it has meaningful connections between its utterances.

• This property is called the coherence relation. For example, some sort of explanation must be there to justify the connection between utterances.

54
55
56
RELATIONSHIP BETWEEN ENTITIES

• Another property that makes a discourse coherent is that there must be a certain kind of relationship between its entities.

• Such kind of coherence is called entity-based coherence.

57
TEXT COHERENCE

• A coherence relation defines the possible connection between utterances in a discourse.
• We take two terms, S0 and S1, to represent the meanings of the two related sentences.

58
1. RESULT

• It infers that the state asserted by term S0 could cause the state asserted by S1.
• For example, these two statements show the relationship Result:

Rohit was caught in the fire. His skin burned.


59
2. EXPLANATION

• It infers that the state asserted by S1 could cause the state asserted by S0.
• For example, these two statements show the relationship Explanation:

Rohit fought with Shyam’s friend. He was drunk.


60
3. PARALLEL

• It infers p(a1, a2, …) from the assertion of S0 and p(b1, b2, …) from the assertion of S1, where ai and bi are similar for all i.

• For example, these two statements are Parallel:

Rohit wanted a car. Shyam wanted money.

61
4. ELABORATION

• It infers the same proposition P from both of the assertions, S0 and S1.
• For example, these two statements show the relation Elaboration:

Rohit was from Chandigarh. Shyam was from Chennai.


62
5. OCCASION
• It happens when a change of state can be inferred from the assertion of S0, whose final state can be inferred from S1, and vice versa.

• For example, these two statements show the relation Occasion:

Rohit picked up the book. He gave it to Shyam.

63
DISCOURSE COHERENCE AND STRUCTURE

The coherence of an entire discourse can also be considered in terms of a hierarchical structure between coherence relations.

64
• For example, the following passage can be represented as a hierarchical structure −

[Figure: an example passage about Rohit and its hierarchical structure of coherence relations]

65
66
REFERENCE RESOLUTION

• Interpreting the sentences of any discourse is an important task, and to achieve this we need to know who or what entity is being talked about.
• Here, the interpretation of references is the key element. A reference may be defined as a linguistic expression used to denote an entity or individual.
• Reference resolution may be defined as the task of determining which entities are referred to by which linguistic expressions.
67
TERMINOLOGY USED IN REFERENCE RESOLUTION

• Referring expression − The natural language expression that is used to perform reference is called a referring expression. For example, the pronouns and noun phrases in the passage used above are referring expressions.

• Referent − It is the entity that is referred to. For example, in the last given example, Rohit is a referent.

• Corefer − When two expressions are used to refer to the same entity, they are said to corefer. For example, “Rohit” and “he” corefer.
TERMINOLOGY USED IN REFERENCE RESOLUTION
(CONTD)

• Antecedent − The term that licenses the use of another term. For example, Rohit is the antecedent of the reference “he”.
• Anaphora & anaphoric − Anaphora may be defined as reference to an entity that has been previously introduced into the sentence; the referring expression is called anaphoric. [See next slide]
• Discourse model − The model that contains the representations of the entities that have been referred to in the discourse and the relationships they are engaged in.
69
ANAPHORA

70
ANAPHORA

71
ANAPHORA

• In rhetoric, anaphora is the repetition of a phrase at the beginning of successive phrases, sentences, or verses, used for emphasis. In NLP, the term is used in the sense described on the next slide.

72
• Anaphora refers to the use of a pronoun to refer back to a previously mentioned noun or noun phrase. For example, in the sentence “John went to the store. He bought some milk,” the pronoun “he” is used anaphorically to refer back to John.

• On the other hand, cataphora refers to the use of a pronoun to refer forward to a noun or noun phrase that is introduced later in the sentence or in a subsequent sentence. For example, in the sentence “After he finished his homework, John went to the store,” the pronoun “he” is used cataphorically to refer forward to John.
REFERENCE RESOLUTION TASKS

1. COREFERENCE RESOLUTION
2. PRONOMINAL ANAPHORA RESOLUTION

74
1. COREFERENCE RESOLUTION
• It is the task of finding referring expressions in a text that refer to the same entity.
• It is the task of finding coreferring expressions.
• A set of coreferring expressions is called a coreference chain.

• For example, “he”, “his”, “she”, “it” are referring expressions in the example.

75
EG: COREFERENCE RESOLUTION

76
• Constraint on coreference resolution

• In English, the main problem for coreference resolution is the pronoun “it”.
• The reason behind this is that the pronoun “it” has many uses.
• For example, it can refer much like “he” and “she”. But the pronoun “it” is also used in ways that do not refer to specific things. For example:
• It’s raining.
• It is really good.
• What is coreference resolution?
• Coreference resolution (CR) is the task of finding all linguistic expressions (called mentions) in a given text that refer to the same real-world entity.
• After finding and grouping these mentions, we can resolve them by replacing, as stated above, pronouns with noun phrases.

78
79
TYPICAL ALGORITHM
1. A series of words that are potentially referring to real-world entities are extracted. We call these words mentions.
2. For each mention and each pair of mentions, we compute a set of features. This is commonly done by averaging the word embeddings of the mention and its adjacent words, to take context information into account.
3. Then, we input these features into machine learning models to find the most likely antecedent for each mention (if there is one). A toy sketch of this pipeline follows below.
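
A toy sketch of steps 2 and 3: hand-made mention vectors and a trivial similarity rule stand in for learned embeddings and a trained model.

# Toy sketch: scoring candidate antecedents for each mention by cosine similarity
# of averaged "embedding" vectors. Real systems use learned features and a trained
# classifier; the vectors below are invented for illustration.
import numpy as np

# Assumed output of step 1: mentions in document order, each with a toy context
# vector standing in for the average of the embeddings of the mention and its
# adjacent words.
mentions = [
    ("John",      np.array([0.9, 0.1, 0.0])),
    ("the store", np.array([0.1, 0.8, 0.3])),
    ("He",        np.array([0.85, 0.15, 0.05])),
]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Steps 2-3: for each mention, pick the most similar earlier mention as antecedent.
for i, (mention, vec) in enumerate(mentions):
    if i == 0:
        continue
    scores = [(cosine(vec, prev_vec), prev) for prev, prev_vec in mentions[:i]]
    best_score, best_antecedent = max(scores)
    print(f"{mention!r} -> most likely antecedent: {best_antecedent!r} ({best_score:.2f})")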
81
WHY COREFERENCE RESOLUTION?

• It can be applied to a variety of NLP tasks such as text understanding, information extraction, machine translation, sentiment analysis, and document summarization.

• It is a great way to obtain unambiguous sentences, which can be much more easily understood by computers.

82
2. PRONOMINAL ANAPHORA RESOLUTION
• Unlike coreference resolution, pronominal anaphora resolution may be defined as the task of finding the antecedent for a single pronoun.

• For example, if the pronoun is “his”, the task of pronominal anaphora resolution is to find the word “Rohit”, because Rohit is the antecedent.

83
END

84
