Part of Speech Tagging

Part-of-speech (POS) tagging is the process of assigning a part-of-speech tag, such as noun, verb, or adjective, to each word in a sentence. It is useful for tasks like information retrieval, text-to-speech, and word sense disambiguation. Two choices matter in practice: picking an appropriate tagset and training a statistical model on a tagged corpus. Statistical tagging assigns the tag sequence that maximizes the probability of the tags given the words, using tag and word probabilities estimated from a training corpus. Ambiguity is what makes POS tagging challenging.

Uploaded by

Howell Erivera Yangco


Part-of-Speech Tagging

The process of assigning a part-of-speech to each word in a sentence

WORDS: heat water in a large vessel
TAGS:  N, V, P, DET, ADJ

Example

Word     Tag (alternatives in parentheses)
heat     verb (noun)
water    noun (verb)
in       prep (noun, adv)
a        det (noun)
large    adj (noun)
vessel   noun

What is POS tagging good for?

- Useful in information retrieval
- Text to speech: object(N) vs. object(V); discount(N) vs. discount(V)
- Word sense disambiguation
- Useful as a preprocessing step of parsing: a unique tag for each word reduces the number of parses

Choosing a tagset
- Need to choose a standard set of tags to do POS tagging
  - One tag for each part of speech
- Could pick a very coarse tagset: N, V, Adj, Adv, Prep
- More commonly used sets are finer-grained
  - E.g., the UPenn TreeBank II tagset has 36 word tags, such as PRP, PRP$, VBG, VBD, JJR, JJS (it also has tags for phrases)
- Even more finely-grained tagsets exist

Why is POS tagging hard?

Ambiguity
- Plants/N need light and water.
- Each one plant/V one.
- Flies like a flower
  - Flies: noun or verb?
  - like: preposition, adverb, conjunction, noun, or verb?
  - a: article, noun, or preposition?
  - flower: noun or verb?

Methods for POS tagging

- Rule-based POS tagging, e.g., ENGTWOL [Voutilainen, 1995]
  - A large collection (> 1000) of constraints on what sequences of tags are allowable
- Transformation-based tagging, e.g., Brill's tagger [Brill, 1995]
  - Sorry, I don't know anything about this
- Stochastic (probabilistic) tagging, e.g., TnT [Brants, 2000]
  - I'll discuss this in a bit more detail

Stochastic Tagging
- Based on the probability of a certain tag occurring, given various possibilities
- Necessitates a training corpus: a collection of sentences that have already been tagged
- Several such corpora exist
  - One of the best known is the Brown University Standard Corpus of Present-Day American English (or just the Brown Corpus)
  - About 1,000,000 words from a wide variety of sources
  - POS tags assigned to each word

Approach 1
- Assign each word its most likely POS tag
- If $w$ has tags $t_1, \ldots, t_k$, then can use
  $P(t_i \mid w) = \dfrac{c(w, t_i)}{c(w, t_1) + \cdots + c(w, t_k)}$,
  where $c(w, t_i)$ = number of times w/t_i appears in the corpus
- Success: 91% for English
- Example: heat :: noun/89, verb/5
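
As a rough illustration of Approach 1, the sketch below counts word/tag pairs and picks the most frequent tag for each word. The function names, the lowercasing of words, and the default tag for unseen words are assumptions made for this example, not something the slides specify. The toy corpus reuses the slide's counts for "heat" (noun 89, verb 5).

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_sentences):
    """Count c(w, t) and return, for each word, the distribution P(t | w)."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1  # lowercasing is an assumption
    probs = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())  # c(w, t1) + ... + c(w, tk)
        probs[word] = {tag: c / total for tag, c in tag_counts.items()}
    return probs


def tag_sentence(words, probs, default_tag="N"):
    """Assign each word its most likely tag; unseen words get default_tag (an assumption)."""
    tags = []
    for word in words:
        dist = probs.get(word.lower())
        tags.append(max(dist, key=dist.get) if dist else default_tag)
    return tags


# Toy corpus built from the slide's example counts: heat :: noun/89, verb/5
toy_corpus = [[("heat", "N")]] * 89 + [[("heat", "V")]] * 5
probs = train_most_frequent_tag(toy_corpus)
print(round(probs["heat"]["N"], 3))            # 89 / 94, prints 0.947
print(tag_sentence(["heat", "water"], probs))  # prints ['N', 'N']
```

Note that this tagger always labels "heat" as a noun, even in "heat water in a large vessel", because it ignores context. That limitation is what motivates Approach 2.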

Approach 2
- Given: a sequence of words $W = w_1, w_2, \ldots, w_n$ (a sentence), e.g., W = heat water in a large vessel
- Assign a sequence of tags $T = t_1, t_2, \ldots, t_n$
- Find $T$ that maximizes $P(T \mid W)$
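
The slides do not show how $P(T \mid W)$ is made computable, so the following is only the usual textbook route, stated under two assumptions that are not in the source: each word depends only on its own tag, and each tag depends only on the previous tag (a bigram model, with $t_0$ standing for a hypothetical start-of-sentence marker).

```latex
\begin{align*}
\hat{T} &= \arg\max_{T} P(T \mid W) \\
        &= \arg\max_{T} \frac{P(W \mid T)\, P(T)}{P(W)}
           && \text{(Bayes' rule)} \\
        &= \arg\max_{T} P(W \mid T)\, P(T)
           && \text{($P(W)$ does not depend on $T$)} \\
        &\approx \arg\max_{t_1, \ldots, t_n} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})
           && \text{(independence assumptions)}
\end{align*}
```

Under these assumptions only two kinds of conditional probability are needed, $P(w_i \mid t_i)$ and $P(t_i \mid t_{i-1})$, and both can be estimated from counts in a tagged corpus.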

Practical Statistical Tagger

Getting the Conditional Probabilities

Let
- $c(t_i)$ = frequency of $t_i$ in the corpus
- $c(w_i, t_i)$ = frequency of w_i/t_i in the corpus
- $c(t_{i-1}, t_i)$ = frequency of $t_{i-1}\, t_i$ in the corpus

Then we can use
  $P(t_i \mid t_{i-1}) = \dfrac{c(t_{i-1}, t_i)}{c(t_{i-1})}$, $\quad P(w_i \mid t_i) = \dfrac{c(w_i, t_i)}{c(t_i)}$
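
The count-based estimates above can be sketched in a few lines, together with a Viterbi search for the best tag sequence. This is a minimal illustration, not the tagger from the slides: the `START` marker, the function names, the three-sentence toy corpus, and the choice of Viterbi decoding are all assumptions, and there is no smoothing, so any unseen word or tag pair gets probability zero.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical start-of-sentence marker (an assumption, not in the slides)


def estimate_counts(tagged_sentences):
    """Collect c(t), c(w, t) and c(t_prev, t) from sentences of (word, tag) pairs."""
    tag_c, word_tag_c, bigram_c = Counter(), Counter(), Counter()
    for sentence in tagged_sentences:
        prev = START
        tag_c[START] += 1
        for word, tag in sentence:
            tag_c[tag] += 1
            word_tag_c[(word, tag)] += 1
            bigram_c[(prev, tag)] += 1
            prev = tag
    return tag_c, word_tag_c, bigram_c


def p_trans(prev, tag, tag_c, bigram_c):
    """P(t_i | t_{i-1}) = c(t_{i-1}, t_i) / c(t_{i-1})."""
    return bigram_c[(prev, tag)] / tag_c[prev]


def p_emit(word, tag, tag_c, word_tag_c):
    """P(w_i | t_i) = c(w_i, t_i) / c(t_i)."""
    return word_tag_c[(word, tag)] / tag_c[tag]


def viterbi(words, tag_c, word_tag_c, bigram_c):
    """Return the tags maximizing prod P(w_i | t_i) * P(t_i | t_{i-1}), or None."""
    if not words:
        return []
    tags = [t for t in tag_c if t != START]
    # columns[i][t] = (best log-probability of a path ending in t, previous tag)
    columns = [{START: (0.0, None)}]
    for word in words:
        column = {}
        for t in tags:
            emit = p_emit(word, t, tag_c, word_tag_c)
            candidates = []
            for prev, (score, _) in columns[-1].items():
                p = p_trans(prev, t, tag_c, bigram_c) * emit
                if p > 0:  # unsmoothed: zero-probability paths are dropped
                    candidates.append((score + math.log(p), prev))
            if candidates:
                column[t] = max(candidates)
        columns.append(column)
    if not columns[-1]:
        return None  # no tag sequence has non-zero probability
    tag = max(columns[-1], key=lambda t: columns[-1][t][0])
    path = []
    for column in reversed(columns[1:]):  # follow back-pointers from the last word
        path.append(tag)
        tag = column[tag][1]
    return path[::-1]


# Tiny hand-made corpus, for illustration only
corpus = [
    [("heat", "V"), ("water", "N")],
    [("the", "DET"), ("heat", "N")],
    [("the", "DET"), ("water", "N")],
]
tag_c, word_tag_c, bigram_c = estimate_counts(corpus)
print(viterbi(["heat", "water"], tag_c, word_tag_c, bigram_c))  # prints ['V', 'N']
print(viterbi(["the", "heat"], tag_c, word_tag_c, bigram_c))    # prints ['DET', 'N']
```

In this toy corpus "heat" is seen once as a verb and once as a noun, so a per-word tagger cannot decide between them, while the sequence model tags it as a verb at the start of a sentence and as a noun after a determiner. A practical tagger would add smoothing and handling for unknown words.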

UPenn TreeBank II word tags

CC - Coordinating conjunction
CD - Cardinal number
DT - Determiner
EX - Existential there
FW - Foreign word
IN - Preposition or subordinating conjunction
JJ - Adjective
JJR - Adjective, comparative
JJS - Adjective, superlative
LS - List item marker
MD - Modal
NN - Noun, singular or mass
NNS - Noun, plural
NNP - Proper noun, singular
NNPS - Proper noun, plural
PDT - Predeterminer
POS - Possessive ending
PRP - Personal pronoun
PRP$ - Possessive pronoun
RB - Adverb
RBR - Adverb, comparative
RBS - Adverb, superlative
RP - Particle
SYM - Symbol
TO - to
UH - Interjection
VB - Verb, base form
VBD - Verb, past tense
VBG - Verb, gerund or present participle
VBN - Verb, past participle
VBP - Verb, non-3rd person singular present
VBZ - Verb, 3rd person singular present
WDT - Wh-determiner
WP - Wh-pronoun
WP$ - Possessive wh-pronoun
WRB - Wh-adverb
