Week 5
Introduction
Concepts of word senses
Vector semantics & embedding
Tf-idf
Word2vec
Introduction
In natural language, the meaning of a word is fully reflected
by its context and its contextual relations.
Semantics can address meaning at the levels of words,
phrases, sentences, or larger units of discourse.
Word representations are inputs for the learning models in
Natural Language Understanding tasks.
Word embedding is a term used for the representation of
words for text analysis, typically as a real-valued vector that
encodes the meaning of the word.
Words that are closer in the vector space are expected to be
similar in meaning.
Concepts of word senses
Word senses have a complex many-to-many association with
words (homonymy, multiple senses).
The notion of word similarity is
very useful in larger semantic
tasks.
While words don’t have many
synonyms, most words do have
lots of similar words.
Cat is not a synonym of dog, but
cats and dogs are certainly similar
words.
Knowing how similar two words are can help in computing
how similar the meanings of two phrases or sentences are.
Word relatedness is also called “word association”
One common kind of relatedness between words is that they
belong to the same semantic field.
A semantic field is a set of words which cover a particular
semantic domain and can bear structured relations with
each other.
hospitals: surgeon, scalpel, nurse, anaesthetic, hospital
restaurants: waiter, menu, plate, food, chef
houses: door, roof, kitchen, family, bed
Affective meanings or connotations are related to a writer or
reader’s emotions, sentiment, opinions, or evaluations.
Some words have positive connotations (happy) while others
have negative connotations (sad).
Consider the differences in connotation between fake,
knockoff, forgery and copy, replica, reproduction.
Some words describe positive evaluation (great, love) and
others negative evaluation (terrible, hate).
The analysis of positive or negative evaluative language is
called sentiment analysis.
It is a common application of NLP, e.g. in analysing business
forms, customer product reviews, and social media.
Vector semantics & embedding
In vector semantics, we define meaning
as a point in space based on
distribution.
Similar words are nearby in semantic
space.
Crucially, as we'll see, we build this
space automatically by seeing which
words are nearby in text.
Every modern NLP algorithm uses
embeddings as the representation of
word meaning.
Imagine we have a collection of documents, e.g. Shakespeare's plays.
Term-document matrix : Each row represents a word in the
vocabulary and each column represents a document from the
collection.
Each cell in this matrix records the number of times a
particular word occurs in a particular document.
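As a quick illustration (my own sketch, not part of the slides), such a matrix can be built directly from raw counts in Python; the two documents below are made up.

```python
from collections import Counter

# Toy documents (made up for illustration)
docs = {
    "doc1": "the fool and the battle",
    "doc2": "the fool laughed at the fool",
}

# Count how often each word occurs in each document
counts = {name: Counter(text.split()) for name, text in docs.items()}
vocab = sorted({word for c in counts.values() for word in c})

# Term-document matrix: rows are words, columns are documents
for word in vocab:
    row = [counts[name][word] for name in docs]
    print(f"{word:10s} {row}")
```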
Term-document matrix: (the slide shows a table of counts of words such as fool and battle across several Shakespeare plays)
Example: Based on the vector
space of [fool, battle], calculate
cosine similarity of word counts
between pairs of documents
Example: Based on the vector space of [fool, battle], calculate
cosine similarity of term counts between pairs of documents.
1. Henry V (4, 13) and Julius Caesar (1, 7)
Dot product = 4*1 + 13*7 = 95
|Henry V| = sqrt(4^2 + 13^2) = 13.60
|Julius Caesar| = sqrt(1^2 + 7^2) = 7.07
cosine = 95 / (13.60 * 7.07) = 0.988
2. Julius Caesar (1, 7) and Twelfth Night (58, 0)
Dot product = 1*58 + 7*0 = 58
|Julius Caesar| = 7.07, |Twelfth Night| = 58
cosine = 58 / (7.07 * 58) = 0.14
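The same calculation can be checked in a few lines of NumPy (a sketch I'm adding; the vectors are the [fool, battle] counts used above).

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# [fool, battle] counts from the term-document matrix
henry_v       = np.array([4, 13])
julius_caesar = np.array([1, 7])
twelfth_night = np.array([58, 0])

print(cosine(henry_v, julius_caesar))        # ~0.988
print(cosine(julius_caesar, twelfth_night))  # ~0.14
```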
tf-idf
tf (term frequency) may be defined as the ratio of the number
of times a term t appears in a document d to the total number
of words in that document:
tf_{t,d} = \frac{\mathrm{Count}(t,d)}{N_d}
There are other variant definitions of tf.
Frequency is clearly useful; if sugar appears a lot near
apricot, that's useful information.
But overly frequent words like the, it, or they are not very
informative about the context
tf-idf
idf: inverse document frequency
idf_t = \log\left(\frac{N}{df_t}\right)
where N is the total number of documents in the collection and
df_t is the number of documents that contain the term t.
Words like "the" or "good" have very low idf
tf-idf value for term t in document d:
tf\text{-}idf_{t,d} = tf_{t,d} \times idf_t
Example:
Document 1: I love artificial intelligence, big love!
Document 2: I like computational intelligence.
word            index  tf(D1)  tf(D2)  idf(D1)      idf(D2)      tf-idf(D1)       tf-idf(D2)
I               0      1/6     1/4     log(2/2)=0   log(2/2)=0   0                0
love            1      2/6     0       log(2/1)     -            (2/6)*log(2/1)   0
like            2      0       1/4     -            log(2/1)     0                (1/4)*log(2/1)
artificial      3      1/6     0       log(2/1)     -            (1/6)*log(2/1)   0
computational   4      0       1/4     -            log(2/1)     0                (1/4)*log(2/1)
intelligence    5      1/6     1/4     log(2/2)=0   log(2/2)=0   0                0
big             6      1/6     0       log(2/1)     -            (1/6)*log(2/1)   0
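The table can be reproduced with a short script (my own sketch, not part of the slides); it uses log base 10, as in the exam-style answer later, and strips punctuation so the document lengths are 6 and 4 words.

```python
import math
import re

raw_docs = [
    "I love artificial intelligence, big love!",
    "I like computational intelligence.",
]
# Keep only alphabetic tokens, lowercased: 6 words in D1, 4 words in D2
docs = [re.findall(r"[a-z]+", d.lower()) for d in raw_docs]

def tf(term, doc):
    # term frequency: count of term in doc over document length
    return doc.count(term) / len(doc)

def idf(term, docs):
    # inverse document frequency, log base 10 as in the worked examples
    df = sum(term in doc for doc in docs)
    return math.log10(len(docs) / df)

for i, doc in enumerate(docs, start=1):
    for term in sorted(set(doc)):
        print(f"D{i} {term:15s} tf-idf = {tf(term, doc) * idf(term, docs):.4f}")
```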
Exam style question:
The table below contains three documents, each consisting of
one sentence.
a) Consider all three documents. Identify words with zero Tf-idf
value for each document.
b) Which word in these documents has the highest tf value?
c) Which country name in these documents has the highest
tf-idf value?
Doc 1 Germany, Germany, France.
Doc 2 Has Germany won over France?
Doc 3 England lost to Germany.
Doc 1 Germany, Germany, France.
Doc 2 Has Germany won over France?
Doc 3 England lost to Germany.
a) Consider all three documents. Identify words with zero Tf-idf value in
each document.
Since Germany appears in all three documents, its idf
= log(3/3) = 0, so its tf-idf is zero in each document.
b) Which word in these documents has the highest tf value?
Germany in Doc 1, with tf = 2/3, since Doc 1 is the shortest document.
c) Which country name in any document has the highest tf-idf value?
Comparing France and England:
Doc 1: France: (1/3) log(3/2) = 0.0587
Doc 3: England: (1/4) log(3/1) = 0.119
So England has the highest tf-idf value.
Word2vec
The word2vec algorithm learns word associations from a
large corpus of text.
Each distinct word is represented by a particular list of
numbers, i.e. a vector.
It is a popular embedding method and very fast to train.
Word2vec provides various options. We'll do skip-gram with
negative sampling (SGNS)
The intuition of word2vec is that, instead of counting how
often each word w occurs near, say, apricot, we train a
classifier on a binary prediction task:
“Is word w likely to show up near apricot?”
Avoids the need for any sort of hand-labeled supervision
signal.
Train a logistic regression classifier instead of a multi-layer
neural network.
Semantic and syntactic patterns can be reproduced using
vector arithmetic.
"Brother" - "Man" + "Woman" produces a result which is
closest to the vector representation of "Sister" in the model.
You can download word2vec from
https://code.google.com/archive/p/word2vec/
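As an aside (not on the slides), the same algorithm is available off the shelf, e.g. in the gensim library (gensim 4.x parameter names assumed); the toy corpus below is only illustrative, so a large corpus or the pre-trained vectors above are needed for meaningful analogies.

```python
from gensim.models import Word2Vec

# Toy corpus (illustrative only; real training needs a large corpus)
sentences = [
    ["the", "man", "walked", "with", "his", "brother"],
    ["the", "woman", "walked", "with", "her", "sister"],
    ["the", "king", "rules", "the", "country"],
    ["the", "queen", "rules", "the", "country"],
]

# sg=1 selects skip-gram; negative=5 enables negative sampling (SGNS)
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 sg=1, negative=5, epochs=100, seed=1)

print(model.wv["brother"][:5])  # first few dimensions of one embedding
# Analogy-style query: Brother - Man + Woman
print(model.wv.most_similar(positive=["brother", "woman"],
                            negative=["man"], topn=3))
```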
(Diagram: skip-gram architecture; the output is a score over the V vocabulary words as possible context words.)
Positive (target, context) training pairs are drawn from a +/- 2
word window around each target word.
Loss function for one target word w with one positive context c_pos and
k negative samples c_neg1 ... c_negk:
L_{CE} = -\left[\log \sigma(c_{pos} \cdot w) + \sum_{i=1}^{k} \log \sigma(-c_{neg_i} \cdot w)\right]
where
\sigma(c \cdot w) = \frac{1}{1 + \exp(-c \cdot w)}
This is to be minimised by updating the embeddings w and c.
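A minimal NumPy sketch (mine, not from the slides) of one gradient-descent step on this loss; the embedding dimension, learning rate, and k = 2 negatives are arbitrary illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(w, c_pos, c_negs, lr=0.1):
    """One SGD step on L = -[log sigma(c_pos.w) + sum_i log sigma(-c_neg_i.w)].

    w      : target-word embedding, shape (d,)
    c_pos  : positive context embedding, shape (d,)
    c_negs : k negative context embeddings, shape (k, d)
    """
    g_pos = sigmoid(c_pos @ w) - 1.0        # error term for the positive pair
    g_negs = sigmoid(c_negs @ w)            # error terms for the k negative pairs

    grad_w      = g_pos * c_pos + g_negs @ c_negs   # dL/dw
    grad_c_pos  = g_pos * w                         # dL/dc_pos
    grad_c_negs = np.outer(g_negs, w)               # dL/dc_neg_i

    return w - lr * grad_w, c_pos - lr * grad_c_pos, c_negs - lr * grad_c_negs

# Toy usage: 50-dimensional embeddings, k = 2 negative samples
rng = np.random.default_rng(0)
w, c_pos, c_negs = rng.normal(size=50), rng.normal(size=50), rng.normal(size=(2, 50))
w, c_pos, c_negs = sgns_step(w, c_pos, c_negs)
```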
Two sets of embeddings
Start with V random d-dimensional vectors as initial
embeddings
SGNS learns two sets of embeddings
Target embeddings matrix W
Context embedding matrix C
It's common to just add them together, representing word
i as the vector w_i + c_i.
Exam style question:
A word2vec algorithm, as shown below, employs a skip-gram
with a +/- 1 word window and takes the incoming sentence for
training.
a) Suggest 4 pairs of positive {word, context} examples drawn
from the sentence.
b) Explain why negative examples are needed and how these
can be generated.
“I like to eat chicken and French fries”
(Diagram: the input word W(t) feeds a logistic model that predicts the context words W(t-1) and W(t+1).)
a) Suggest 4 pairs of positive {word, context} examples drawn from the sentence.
Sentence: "I like to eat chicken and French fries"
Positive examples:
w      c
like   I
like   to
to     like
to     eat
(or other choices)
b) Explain why negative examples are needed and how these can be generated.
Since the logistic model needs negative samples to train, each positive example is
matched with noise context words that are unlikely to appear near the input word.
There are generally more negative samples than positive ones, e.g. twice as many.
Negative examples:
w      c              w      c
like   zeal           like   sellotape
like   saki           like   cradle
to     tooth          to     kechi
to     deadly         to     Samsung
(or other choices)
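A small sketch (mine, not from the slides) of how such positive and negative pairs could be generated automatically; the noise vocabulary and k = 2 negatives per positive pair are illustrative choices. In practice, word2vec samples negatives from a weighted unigram distribution over the corpus vocabulary rather than uniformly.

```python
import random

sentence = "I like to eat chicken and French fries".split()
# Noise words added to the vocabulary purely for illustration
vocab = set(sentence) | {"zeal", "saki", "tooth", "sellotape", "cradle"}

def positive_pairs(tokens, window=1):
    """(target, context) pairs from a +/- window around each target word."""
    pairs = []
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((w, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs

def negative_pairs(pairs, vocab, k=2, seed=1):
    """k random noise contexts per positive pair, avoiding the true pair."""
    rng = random.Random(seed)
    negs = []
    for w, c in pairs:
        candidates = [v for v in vocab if v not in (w, c)]
        negs.extend((w, rng.choice(candidates)) for _ in range(k))
    return negs

pos = positive_pairs(sentence)
print(pos[:4])                    # e.g. ('I', 'like'), ('like', 'I'), ('like', 'to'), ...
print(negative_pairs(pos, vocab)[:4])
```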