
1. Introduction to Natural Language Processing (NLP)

Natural language processing (NLP) allows computers to understand and derive meaning from human language to perform useful tasks. NLP analyzes text at different levels, including words, syntax, semantics, pragmatics, and discourse, to understand language beyond individual words. It involves natural language understanding to analyze input text and derive meaning, as well as natural language generation to produce understandable output text. While language presents many challenges for computers, NLP techniques help machines process natural human language.
Introduction to Natural Language Processing (NLP)
Why NLP?
• According to industry estimates, only 21% of the available
data is present in structured form (report dated 12 Jan 2017).
• Data is being generated as we speak, as we tweet, as we send
messages on WhatsApp, and in various other activities.
• The majority of this data exists in textual form, which is highly
unstructured in nature.
• A few notable examples: posts on social media, user-to-user
chat conversations, news, blogs and articles, product
or service reviews, and patient records in the healthcare
sector. More recent examples include chatbots and other
voice-driven bots.
Why NLP?
• Although this data is high-dimensional, the information in it
is not directly accessible unless it is processed (read and
understood) manually or analyzed by an automated system.
• To produce significant and actionable insights from text
data, it is important to become acquainted with the techniques and
principles of Natural Language Processing (NLP).
• Natural language processing is required when you want an
intelligent system, such as a robot, to act on your instructions.
Why NLP?
• Computers are confused by (human) language.
• To remove this confusion, we use NLP.
What is Natural Language Processing?
• Computers can’t directly understand text like humans can.
- Humans automatically break down sentences into units of
meaning.
• The field of study that focuses on the interactions between
human language and computers is called Natural Language
Processing, or NLP for short.
• NLP is a way for computers to analyze, understand, and derive
meaning from human language in a smart and useful way.
What is Natural Language Processing?
• NLP refers to computer systems that process human language in
terms of its meaning.
• Unlike common word-processor operations, which treat text as
a mere sequence of symbols, NLP considers the hierarchical
structure of language: several words make a phrase, several
phrases make a sentence and, ultimately, sentences convey ideas.
• By analyzing language for its meaning, NLP systems have long filled
useful roles, such as correcting grammar, converting speech to text,
and automatically translating between languages.
What is Natural Language Processing?
• NLP is used to analyze text, allowing machines to understand
how humans speak.
• To understand human language is to understand not only the
words, but also the concepts and how they are linked together to
create meaning.
• Although language is one of the easiest things for humans
to learn, its ambiguity is what makes natural
language processing a difficult problem for computers to
master.
Knowledge of Language
• Words (words and their composition)
• Syntax (structure of sentences)
• Semantics (explicit meaning of sentence)
• Discourse and pragmatics (implicit and
contextual meaning)
Small Applications

• Spelling correctors
• Optical Character Recognition software
• Grammar and style checkers
Big Applications
• Question answering
• Conversational agents (live chat, etc.)
• Text summarization
• Machine translation
Modern Applications
Next, we will discuss
• Knowledge of language
• Ambiguity
• Models and algorithms

Note: The field of NLP involves making computers perform useful tasks with
the natural languages humans use. The input and output of an NLP system can
be:
• Speech
• Written text
Knowledge of Language
• Phonetics and phonology: speech sounds, their
production, and the rule systems that govern their
use
• Morphology: words and their composition from
more basic units
- Cat, cats (inflectional morphology)
- Child, children
- Friend, friendly (derivational morphology)
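The distinction between inflectional and derivational morphology can be illustrated with a small sketch. This is a toy analyzer that only handles the kinds of forms listed above; the rule tables and the function name `analyze` are assumptions made for illustration, and real morphological analyzers rely on far larger lexicons and rule sets.

```python
# Toy morphological analyzer (illustrative sketch, not a real lemmatizer).
# The tables and rules below are hypothetical and cover only a few forms.

IRREGULAR_PLURALS = {"children": "child"}  # irregular inflection


def analyze(word):
    """Return (stem, description) for a small set of English word forms."""
    w = word.lower()
    if w in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[w], "irregular plural (inflectional)"
    if w.endswith("ly") and len(w) > 3:
        # -ly builds a new word from a base, e.g. friend -> friendly
        return w[:-2], "derived with -ly (derivational)"
    if w.endswith("s") and len(w) > 2:
        # plural -s marks number without creating a new word
        return w[:-1], "regular plural (inflectional)"
    return w, "base form"


if __name__ == "__main__":
    for form in ["cat", "cats", "children", "friendly"]:
        print(form, "->", analyze(form))
```

The naive suffix rules will misanalyze many words (for example, words that merely happen to end in "s"), which is one reason practical systems combine rules with a lexicon.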
Knowledge of Language
• Comparison:
- Phonetics: Analyzes the production of all human speech
sounds, regardless of language.
- Phonology : Analyzes the sound patterns of a particular
language by determining which phonetic sounds
are significant, and explaining how these sounds
are interpreted by the native speaker.
Note
• Phonology is the basis for further work in morphology, syntax, discourse, and
orthography design.
• Words are constructed from phonemes (e.g., “th” + “i” + “ng” = “thing”).
Knowledge of Language
• Syntax: the structuring of words into larger,
well-formed phrases and sentences

Note:
Syntax means the structure of sentences.
Knowledge of Language
• Semantics: the meaning of words and phrases
- Lexical semantics: the study of the meanings of words
- Compositional semantics: how to combine word meanings
- Word-sense disambiguation:
River bank vs. financial bank
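Word-sense disambiguation for the "bank" example can be sketched with a simplified overlap heuristic in the spirit of the Lesk algorithm: pick the sense whose definition shares the most words with the surrounding sentence. The sense definitions, the stopword list, and the function names here are hand-written assumptions for illustration, not taken from any real dictionary or library.

```python
import re

# Hypothetical, hand-written sense definitions (glosses) for illustration.
SENSES = {
    "bank": {
        "river bank": "sloping land beside a river or stream of water",
        "financial bank": "institution that accepts deposits and lends money",
    }
}

STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to",
             "on", "in", "by", "at", "i", "we", "my"}


def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS}


def disambiguate(word, sentence):
    """Choose the sense whose gloss overlaps most with the sentence."""
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:  # ties keep the first sense listed
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(disambiguate("bank", "We sat on the bank of the river and watched the water"))
    print(disambiguate("bank", "I went to the bank to deposit money"))
```

Exact word overlap is a weak signal ("deposit" does not match "deposits" here), so this should be read only as a demonstration of the idea that context selects the sense.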
Knowledge of Language
• Pragmatics: deals with how sentences are used and understood
in different situations, and how the situation affects their
interpretation. It concerns the practical side of language use.
• Uses the context of the utterance
– Where, by whom, to whom, why, and when it was said
– Intentions: inform, request, promise, criticize, ...
Examples
– Do you have a stapler?
– What is the time by your watch?
• Handling ambiguity
– Pragmatic ambiguity: “You’re late”: is the speaker’s
intention to inform or to criticize?
Knowledge of Language
• Discourse: deals with how the immediately preceding sentence
can affect the interpretation of the next sentence. Discourse is spoken or
written communication between people, especially serious discussion of a
particular subject.
• Discourse defines what statements can be said about a topic.
Example:
– Sue took the trip to New York. She had a great
time there.
Expressions in the second sentence that are interpreted through the first:
• Sue/she
• New York/there
• took/had (time)
Ambiguity
• There is ambiguity at all levels of language
Example:
• I saw the woman with the telescope
• Syntactically ambiguous:
– I saw (NP the woman with the telescope): the woman has the telescope
– I saw (NP the woman) (PP with the telescope): the telescope was used to see her
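The two readings above can be counted mechanically. The sketch below uses a CKY-style chart over a tiny grammar in Chomsky normal form and counts how many distinct parse trees cover the sentence. The grammar, the lexicon, and the function name `count_parses` are assumptions written for this one example, not a general English grammar.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left, right).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon mapping words to their categories.
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"],
    "woman": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(sentence, start="S"):
    """Count parse trees for the sentence using a CKY-style chart."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][symbol] = number of ways symbol derives words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[(i, i + 1)][tag] += 1
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    ways_left = chart[(i, k)].get(left, 0)
                    ways_right = chart[(k, j)].get(right, 0)
                    if ways_left and ways_right:
                        chart[(i, j)][parent] += ways_left * ways_right
    return chart[(0, n)].get(start, 0)


if __name__ == "__main__":
    print(count_parses("I saw the woman with the telescope"))
    print(count_parses("I saw the woman"))
```

With this grammar the first sentence has two parses, one for each attachment of the prepositional phrase, while the shorter sentence has one.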
Models and Algorithms
• Models (as we are using the term here):
– Formalisms to represent linguistic knowledge
• Algorithms:
– Used to manipulate the representations and
produce the desired behavior
• choosing among possibilities and
combining pieces

Note: These are algorithms that take one kind of
structure as input and output another.
NLP Pipeline
Components of NLP
• NLP has two components:
- Natural Language Understanding (NLU)
- Natural Language Generation (NLG)
Components of NLP
1. Natural Language Understanding
• Taking some spoken/typed sentence and working
out what it means
2. Natural Language Generation
• Taking some formal representation of what you want
to say and working out a way to express it in a
natural (human) language (e.g., English)
Natural Language Understanding (NLU)
• Understanding involves the following tasks:
- Mapping the given natural-language input into useful
representations.
- Analyzing different aspects of the language.
Natural Language Generation (NLG)
• It is the process of producing meaningful phrases and
sentences in the form of natural language from some
internal representation.
• NLU is harder than NLG.
Natural Language Generation (NLG)
• Talking back!
• What to say, or text planning
– flight(AA,london,boston,$560,2pm),
– flight(BA,london,boston,$640,10am),
• How to say it
– “There are two flights from London to Boston. The first one is
with American Airlines, leaves at 2 pm, and costs $560 …”
• Speech synthesis
– Simple: human recordings of basic templates
– More complex: string together phonemes in the phonetic spelling of
each word
• Difficult due to stress, intonation, timing, and liaisons between words
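The "what to say" and "how to say it" stages above can be sketched with simple templates. In this sketch the `Flight` record mirrors the flight(...) facts on the slide, while the airline-name table (including the expansion of "BA" as British Airways), the wording templates, and the function name `describe` are assumptions made for illustration.

```python
from typing import NamedTuple


class Flight(NamedTuple):
    """Formal representation of one flight fact (the 'what to say' stage)."""
    airline: str
    origin: str
    destination: str
    price: int   # price in dollars
    time: str


# Assumed code-to-name mapping; only "AA" is spelled out in the slide text.
AIRLINE_NAMES = {"AA": "American Airlines", "BA": "British Airways"}
NUMBER_WORDS = {1: "one", 2: "two", 3: "three"}
ORDINALS = ["first", "second", "third"]


def describe(flights):
    """Turn flight records into English text (the 'how to say it' stage)."""
    if not flights:
        return "There are no flights."
    n = len(flights)
    origin = flights[0].origin.capitalize()
    destination = flights[0].destination.capitalize()
    verb = "is" if n == 1 else "are"
    noun = "flight" if n == 1 else "flights"
    count = NUMBER_WORDS.get(n, str(n))
    sentences = [f"There {verb} {count} {noun} from {origin} to {destination}."]
    for i, flight in enumerate(flights):
        ordinal = ORDINALS[i] if i < len(ORDINALS) else "next"
        name = AIRLINE_NAMES.get(flight.airline, flight.airline)
        sentences.append(
            f"The {ordinal} one is with {name}, "
            f"leaves at {flight.time}, and costs ${flight.price}."
        )
    return " ".join(sentences)


if __name__ == "__main__":
    facts = [
        Flight("AA", "london", "boston", 560, "2 pm"),
        Flight("BA", "london", "boston", 640, "10 am"),
    ]
    print(describe(facts))
```

Template filling like this covers only surface realization for one fixed message type; it does not address content selection or the speech-synthesis difficulties listed above.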
Difficulties in NLU
• Natural language has an extremely rich form and structure.
• It is very ambiguous. There can be different levels of ambiguity:
- Lexical ambiguity: ambiguity at the most basic level, the
level of the word. For example, is the word “board” a
noun or a verb?
- Syntax-level ambiguity: a sentence can be parsed in
different ways.
- Referential ambiguity: arises when referring to something using
pronouns. For example: Rima went to Gauri. She said, “I am
tired.” It is unclear which of the two “she” refers to.
• One input can have several different meanings.
• Many inputs can mean the same thing.
Steps in NLP
• There are generally five steps:

Lexical Analysis - It involves identifying and analyzing the structure
of words. The lexicon of a language is the collection of words and
phrases in that language. Lexical analysis divides the whole chunk of
text into paragraphs, sentences, and words.

Syntactic Analysis (Parsing) - It involves analyzing the words in a
sentence for grammar and arranging them in a manner that shows
the relationships among the words. A sentence such as “The
school goes to boy” is rejected by an English syntactic analyzer.
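The lexical analysis step described above (dividing text into paragraphs, sentences, and words) can be sketched with regular expressions. The splitting rules and the function name `lexical_analysis` are simplifying assumptions; real tokenizers handle abbreviations, numbers, and punctuation far more carefully.

```python
import re


def lexical_analysis(text):
    """Split text into paragraphs, then sentences, then words.

    Returns a nested list: paragraphs -> sentences -> word tokens.
    """
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        # Words are runs of letters or digits, optionally with an apostrophe part.
        result.append(
            [re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", s) for s in sentences]
        )
    return result


if __name__ == "__main__":
    sample = "The boy goes to school. He likes it!\n\nSchool is fun."
    for paragraph in lexical_analysis(sample):
        print(paragraph)
```

The output of this step, a sequence of word tokens per sentence, is what a syntactic analyzer would then take as input.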
Steps in NLP
Semantic Analysis - It draws the exact meaning, or the dictionary
meaning, from the text. The text is checked for meaningfulness. This is
done by mapping syntactic structures to objects in the task
domain.

Discourse Integration - The meaning of any sentence depends
upon the meaning of the sentence just before it. In addition, it also
influences the meaning of the sentence that immediately follows.
Steps in NLP
• Pragmatic Analysis - During this step, what was said is
reinterpreted in terms of what was actually meant. It involves
deriving those aspects of language which require real-world
knowledge.

Note:
World knowledge: general knowledge
about the world.
Google Translate
• https://translate.google.com/#zh-CN/ko/my%20name%20is%20asad%2C%20and%20ur%20name
Reference Materials
1. Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing:
An Introduction to Natural Language Processing, Computational Linguistics
and Speech Recognition. Second Edition. Prentice Hall.
2. Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical
Natural Language Processing. MIT Press, Cambridge, MA.
