Named Entity Recognition in NLP


Named Entity Recognition
INTRODUCTION TO NATURAL LANGUAGE PROCESSING IN PYTHON

Katharine Jarmul
Founder, kjamistan
What is Named Entity Recognition?
NLP task to identify important named entities in the text
People, places, organizations

Dates, states, works of art

... and other categories!

Can be used alongside topic identification

... or on its own!

Who? What? When? Where?
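
A minimal sketch of how recognized entities can answer those questions. The entity list is hand-written from the New York example sentence used in this deck, and the mapping from labels to questions (`QUESTION_FOR_LABEL`) is an illustrative assumption, not part of any library.

```python
from collections import defaultdict

# Hand-written (text, label) pairs; a real NER system would produce these.
entities = [
    ("New York", "GPE"),
    ("MOMA", "ORGANIZATION"),
    ("Ruth Reichl", "PERSON"),
]

# Hypothetical mapping from entity labels to the question each one answers.
QUESTION_FOR_LABEL = {
    "PERSON": "Who?",
    "ORGANIZATION": "What?",
    "DATE": "When?",
    "GPE": "Where?",
}

def group_by_question(pairs):
    """Group entity texts under the question their label answers."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[QUESTION_FOR_LABEL.get(label, "Other")].append(text)
    return dict(grouped)

answers = group_by_question(entities)
print(answers)
```

Labels missing from the mapping fall into an "Other" bucket rather than being dropped.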



Example of NER

(Source: Europeana Newspapers (http://[Link]-[Link]))



nltk and the Stanford CoreNLP Library
The Stanford CoreNLP library:
Integrated into Python via nltk

Java based

Support for NER as well as coreference and dependency trees



Using nltk for Named Entity Recognition
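import nltk
# Requires the nltk tokenizer, tagger, and chunker data (fetched via nltk.download)
sentence = '''In New York, I like to ride the Metro to
visit MOMA and some restaurants rated
well by Ruth Reichl.'''
# Split the sentence into word tokens
tokenized_sent = nltk.word_tokenize(sentence)
# Tag each token with its part of speech
tagged_sent = nltk.pos_tag(tokenized_sent)
tagged_sent[:3]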

[('In', 'IN'), ('New', 'NNP'), ('York', 'NNP')]
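
To see why part-of-speech tags are a useful first step, here is a rough sketch that joins runs of consecutive proper nouns (`NNP`) into candidate phrases. This is a simplified heuristic for illustration only, not what `nltk.ne_chunk` does internally; the tag list is transcribed from this deck's output for the example sentence, and `proper_noun_runs` is a hypothetical helper.

```python
from itertools import groupby

# (word, tag) pairs transcribed from the example sentence's tagged output.
tagged_sent = [
    ("In", "IN"), ("New", "NNP"), ("York", "NNP"), (",", ","),
    ("I", "PRP"), ("like", "VBP"), ("to", "TO"), ("ride", "VB"),
    ("the", "DT"), ("Metro", "NNP"), ("to", "TO"), ("visit", "VB"),
    ("MOMA", "NNP"), ("and", "CC"), ("some", "DT"),
    ("restaurants", "NNS"), ("rated", "VBN"), ("well", "RB"),
    ("by", "IN"), ("Ruth", "NNP"), ("Reichl", "NNP"), (".", "."),
]

def proper_noun_runs(tagged):
    """Join each run of consecutive NNP tokens into one candidate phrase."""
    runs = []
    for is_nnp, group in groupby(tagged, key=lambda pair: pair[1] == "NNP"):
        if is_nnp:
            runs.append(" ".join(word for word, _ in group))
    return runs

candidates = proper_noun_runs(tagged_sent)
print(candidates)
```

The heuristic finds candidate spans but assigns no entity type, which is the part a trained chunker adds.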



print(nltk.ne_chunk(tagged_sent))

(S
  In/IN
  (GPE New/NNP York/NNP)
  ,/,
  I/PRP
  like/VBP
  to/TO
  ride/VB
  the/DT
  (ORGANIZATION Metro/NNP)
  to/TO
  visit/VB
  (ORGANIZATION MOMA/NNP)
  and/CC
  some/DT
  restaurants/NNS
  rated/VBN
  well/RB
  by/IN
  (PERSON Ruth/NNP Reichl/NNP)
  ./.)
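
The labeled chunks in that output are the named entities. As a small sketch, the following pulls them out of the printed text with a regular expression; `CHUNK` and `entities_from_tree_text` are hypothetical names. With a live tree object returned by `nltk.ne_chunk`, walking `tree.subtrees()` and checking each `label()` would likely be the more direct route.

```python
import re

# Printed chunk tree for the example sentence, copied from the output above.
tree_text = """(S
  In/IN
  (GPE New/NNP York/NNP)
  ,/,
  I/PRP
  like/VBP
  to/TO
  ride/VB
  the/DT
  (ORGANIZATION Metro/NNP)
  to/TO
  visit/VB
  (ORGANIZATION MOMA/NNP)
  and/CC
  some/DT
  restaurants/NNS
  rated/VBN
  well/RB
  by/IN
  (PERSON Ruth/NNP Reichl/NNP)
  ./.)"""

# Matches "(LABEL word/TAG word/TAG ...)" with no nested parentheses,
# so the outer "(S ...)" wrapper is never matched.
CHUNK = re.compile(r"\((\w+) ([^()]+)\)")

def entities_from_tree_text(text):
    """Return (entity text, label) pairs for every labeled chunk."""
    found = []
    for label, body in CHUNK.findall(text):
        # Drop the "/TAG" suffix from each token to recover the word.
        words = [token.rsplit("/", 1)[0] for token in body.split()]
        found.append((" ".join(words), label))
    return found

entities = entities_from_tree_text(tree_text)
print(entities)
```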



Let's practice!
Introduction to spaCy
What is spaCy?
NLP library similar to gensim, with different implementations

Focus on creating NLP pipelines to generate models and corpora

Open-source, with extra libraries and tools

displaCy



displaCy entity recognition visualizer

(source: https://[Link]/displacy-ent/)



import spacy
nlp = spacy.load('en_core_web_sm')
[Link]

<[Link] at 0x7f76b75e68b8>

doc = nlp("""Berlin is the capital of Germany;
and the residence of Chancellor Angela Merkel.""")
doc.ents

(Berlin, Germany, Angela Merkel)

print(doc.ents[0], doc.ents[0].label_)

Berlin GPE
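
In the spirit of an entity visualizer, this is a plain-text sketch that marks each entity inline. The entities are hard-coded rather than taken from a loaded model: only the Berlin/GPE pairing appears in the output above, so the labels for Germany and Angela Merkel are assumptions, and `annotate` is a hypothetical helper. With real spaCy spans, the `start_char` and `end_char` attributes would make the string search unnecessary.

```python
text = ("Berlin is the capital of Germany; "
        "and the residence of Chancellor Angela Merkel.")

# Only ("Berlin", "GPE") is confirmed by the deck; the other labels are assumed.
entities = [("Berlin", "GPE"), ("Germany", "GPE"), ("Angela Merkel", "PERSON")]

def annotate(text, entities):
    """Wrap each entity, searched left to right, as [text|LABEL]."""
    pieces = []
    cursor = 0
    for ent_text, label in entities:
        start = text.find(ent_text, cursor)
        if start == -1:
            continue  # entity text not found after the cursor; skip it
        pieces.append(text[cursor:start])
        pieces.append(f"[{ent_text}|{label}]")
        cursor = start + len(ent_text)
    pieces.append(text[cursor:])
    return "".join(pieces)

marked = annotate(text, entities)
print(marked)
```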



Why use spaCy for NER?
Easy pipeline creation

Different entity types compared to nltk

Informal language corpora

Easily find entities in Tweets and chat messages

Quickly growing!



Let's practice!
Multilingual NER with polyglot
What is polyglot?
NLP library which uses word vectors

Why polyglot?
Vectors for many different languages

More than 130!



Spanish NER with polyglot
from polyglot.text import Text
text = """El presidente de la Generalitat de Cataluña,
Carles Puigdemont, ha afirmado hoy a la alcaldesa
de Madrid, Manuela Carmena, que en su etapa de
alcalde de Girona (de julio de 2011 a enero de 2016)
hizo una gran promoción de Madrid."""
ptext = Text(text)
ptext.entities

[I-ORG(['Generalitat', 'de']),
I-LOC(['Generalitat', 'de', 'Cataluña']),
I-PER(['Carles', 'Puigdemont']),
I-LOC(['Madrid']),
I-PER(['Manuela', 'Carmena']),
I-LOC(['Girona']),
I-LOC(['Madrid'])]
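
A short sketch of post-processing that output: joining each token list into a string and collecting the unique entities per type. The `(tag, tokens)` pairs are transcribed by hand from the output above so the example runs without polyglot installed; `unique_entities_by_type` is a hypothetical helper. It is assumed here that real polyglot chunks expose their tag as `.tag` and can be iterated like a list of words.

```python
# (tag, tokens) pairs transcribed from the polyglot output above.
chunks = [
    ("I-ORG", ["Generalitat", "de"]),
    ("I-LOC", ["Generalitat", "de", "Cataluña"]),
    ("I-PER", ["Carles", "Puigdemont"]),
    ("I-LOC", ["Madrid"]),
    ("I-PER", ["Manuela", "Carmena"]),
    ("I-LOC", ["Girona"]),
    ("I-LOC", ["Madrid"]),
]

def unique_entities_by_type(chunks):
    """Map each tag (without its 'I-' prefix) to unique names, first-seen order."""
    by_type = {}
    for tag, tokens in chunks:
        kind = tag.split("-", 1)[-1]  # "I-LOC" -> "LOC"
        name = " ".join(tokens)
        names = by_type.setdefault(kind, [])
        if name not in names:
            names.append(name)
    return by_type

by_type = unique_entities_by_type(chunks)
print(by_type)
```

Note that the overlapping "Generalitat de" (ORG) and "Generalitat de Cataluña" (LOC) chunks are both kept, since the sketch does not try to resolve conflicting spans.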



Let's practice!