Diwakar Vishwakarma & Bharti Gupta, MCA II Year, BBAU (A Central University), Lucknow
AI Concept and Definition
Encompasses many definitions
AI involves studying human thought processes
Representing thought processes on machines
The study of how to make computers do things at which, at the moment, people are better (Rich and Knight [1991])
A theory of how the human mind works (Mark Fox)
AI Objectives
Make machines smarter
Understand what intelligence is
Make machines more useful (practical purpose)
Turing Test for Intelligence
A computer can be considered smart only when a human interviewer, conversing with both an unseen human being and an unseen computer, cannot determine which is which.
Major AI Areas
Expert Systems
Natural Language Processing
Speech Understanding
Robotics and Sensory Systems
Computer Vision and Scene Recognition
Neural Computing
Fuzzy Logic
Interaction Level
Natural Language Processing (NLP) is a set of techniques that lets a machine communicate in a more human way, thereby reducing the distance between human beings and machines. Put simply, NLP lets people communicate with machines easily. NLP applications are very useful in everyday life, for example a machine that takes instructions by voice.
Interaction Level
The level at which computer and human interact. Natural language (NL) is used to bring the interaction level closer to the human.
[Figure: interaction levels between computer and human: command line, graphical UI, NL UI]
Natural Language?
Natural language is one of the fundamental aspects of human behavior.
It provides easy interaction with computers.
It refers to the languages spoken by people, e.g. English, Japanese, Hindi, as opposed to artificial languages like C++, Java, etc.
Where does it fit in the CS taxonomy?
Computers
  Databases
  Artificial Intelligence
    Robotics
    Natural Language Processing
      Information Retrieval
      Machine Translation
      Language Analysis
        Semantics
        Parsing
    Expert System
  Algorithms
  Networking
Natural Language Processing
Natural Language Processing is a collection of techniques used to extract the meaning from input in order to perform a useful task as a result.
It is the automatic analysis of human language by computer algorithms.
Why Natural Language Processing?
Huge amounts of data: the Internet has at least 20 billion pages, and the number is increasing exponentially
Applications for processing large amounts of texts require NLP expertise
Application Areas of NLP
Text-based applications
These include searching for a certain topic or keyword in a database, extracting information from a large document, translating one language to another, and summarizing text for different purposes.
Application Areas of NLP
Dialogue-based applications
Typical examples are question-answering systems, services that can be provided over a telephone without an operator, teaching systems, voice-controlled machines (that take instructions by speech), and general problem-solving systems.
Components of Natural Language Processing
Natural Language Understanding
o Mapping the given natural-language input into a useful representation.
o Different levels of analysis are required: morphological analysis, syntactic analysis, semantic analysis, discourse analysis.
Components of Natural Language Processing
Natural Language Generation
o Producing natural-language output from some internal representation.
o Different levels of synthesis are required: deep planning (what to say), syntactic generation.
Natural Language Processing
Natural Language Understanding
The steps in natural language understanding are as follows:
1) Words (input)
2) Morphological Analysis, giving morphologically analyzed words (another step: POS tagging)
3) Syntactic Analysis, giving the syntactic structure
4) Semantic Analysis, giving a context-independent meaning representation
5) Discourse Processing, giving the final meaning representation
MAJOR TASKS INVOLVED IN NATURAL LANGUAGE PROCESSING
Phonology
Morphology
Syntax
Semantics
Pragmatics
Discourse
Phonology
Deals with the interpretation of speech sounds within and across words.
Three types of rules are used in phonological analysis:
1) phonetic rules, for sounds within words;
2) phonemic rules, for variations of pronunciation when words are spoken together;
3) prosodic rules, for fluctuation in stress and intonation across a sentence.
Morphology
Morphology is the first stage of analysis once input has been received. It looks at the ways in which words break down into their components and how that affects their grammatical status.
Morphology
Morphemes are the smallest meaningful units of language.
cars = car+PLU
children = child+PLU
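The plural analysis above can be sketched in a few lines of Python. This is only a toy: the irregular-plural table and the single "strip the final s" rule are assumptions made for illustration, not a real morphological analyzer.

```python
# Toy morphological analyzer: rewrites a plural word as stem+PLU.
# IRREGULAR_PLURALS and the "-s" rule are illustrative assumptions only.
IRREGULAR_PLURALS = {"children": "child", "men": "man", "mice": "mouse"}


def analyze(word):
    """Return 'stem+PLU' if the word looks plural, else the lowercased word."""
    lower = word.lower()
    if lower in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[lower] + "+PLU"
    # Naive regular rule: strip a final "s" (but not "ss", as in "glass").
    if lower.endswith("s") and not lower.endswith("ss") and len(lower) > 2:
        return lower[:-1] + "+PLU"
    return lower


print(analyze("cars"))      # car+PLU
print(analyze("Children"))  # child+PLU
```

A rule this naive would wrongly strip the "s" from words such as "bus", which is why real analyzers rely on much larger lexicons and rule sets.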
Syntax
Syntax involves applying the rules of the target language's grammar. Its task is to determine the role of each word in a sentence and to organize this data into a structure that is more easily manipulated for further analysis.
Issues in Syntax
the dog ate my homework - Who did what?
Identify the part of speech (POS)
dog = noun; ate = verb; homework = noun
English POS tagging accuracy: 95% (can be improved)
Identify collocations: mother in law, hot dog
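A minimal sketch of both ideas, POS tagging and collocation detection, is given here. It assumes a tiny hand-made lexicon and collocation list (both hypothetical); real taggers are trained on annotated corpora rather than looked up from a table.

```python
# Toy lexicon-based POS tagger with greedy collocation merging.
# LEXICON and COLLOCATIONS are tiny hand-made tables (assumptions).
LEXICON = {
    "the": "determiner", "my": "determiner",
    "dog": "noun", "homework": "noun",
    "ate": "verb",
    "hot dog": "noun", "mother in law": "noun",
}
COLLOCATIONS = [("mother", "in", "law"), ("hot", "dog")]


def merge_collocations(tokens):
    """Join known multi-word expressions into single tokens."""
    merged, i = [], 0
    while i < len(tokens):
        for colloc in COLLOCATIONS:
            if tuple(tokens[i:i + len(colloc)]) == colloc:
                merged.append(" ".join(colloc))
                i += len(colloc)
                break
        else:
            # No collocation starts here: keep the single token.
            merged.append(tokens[i])
            i += 1
    return merged


def tag(sentence):
    """Return (token, tag) pairs; unknown words get the tag 'unknown'."""
    tokens = merge_collocations(sentence.lower().split())
    return [(tok, LEXICON.get(tok, "unknown")) for tok in tokens]


print(tag("the dog ate my homework"))
```

Merging collocations first means "hot dog" is tagged as one noun instead of an adjective followed by the animal.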
Issues in Syntax
Full Parsing
Ravindra loves Khusi.
S (Ravindra loves Khusi)
  NP (Ravindra)
    Noun: Ravindra
  VP (loves Khusi)
    Verb: loves
    NP
      Noun: Khusi
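A parse like this one can be produced by a small recursive-descent parser. The sketch assumes a three-rule grammar (S -> NP VP, NP -> Noun, VP -> Verb NP) and a three-word lexicon, both chosen only to cover the example sentence.

```python
# Toy recursive-descent parser for an assumed grammar:
#   S  -> NP VP
#   NP -> Noun
#   VP -> Verb NP
LEXICON = {"Ravindra": "Noun", "Khusi": "Noun", "loves": "Verb"}


def parse_np(tokens, pos):
    """Try to read an NP at position pos; return (tree, next position)."""
    if pos < len(tokens) and LEXICON.get(tokens[pos]) == "Noun":
        return ("NP", ("Noun", tokens[pos])), pos + 1
    return None, pos


def parse_vp(tokens, pos):
    """Try to read a VP (verb followed by an NP) at position pos."""
    if pos < len(tokens) and LEXICON.get(tokens[pos]) == "Verb":
        np, end = parse_np(tokens, pos + 1)
        if np is not None:
            return ("VP", ("Verb", tokens[pos]), np), end
    return None, pos


def parse_sentence(sentence):
    """Return a nested-tuple parse tree, or None if the sentence is rejected."""
    tokens = sentence.rstrip(".").split()
    np, pos = parse_np(tokens, 0)
    if np is None:
        return None
    vp, pos = parse_vp(tokens, pos)
    if vp is None or pos != len(tokens):
        return None
    return ("S", np, vp)


print(parse_sentence("Ravindra loves Khusi."))
```

The nested tuples mirror the tree: the subject NP and the VP hang off S, and the object NP hangs off the VP.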
More Issues in Syntax
Preposition attachment:
I saw the man in the park with a telescope
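To see why attachment is an issue, one can count how many parse trees a simple grammar assigns to this sentence. The sketch below uses a CKY-style chart over an assumed toy grammar in which a prepositional phrase may attach either to a noun phrase or to a verb phrase; the grammar and lexicon are illustrative assumptions.

```python
from collections import defaultdict

# Assumed toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": "NP", "saw": "V", "the": "Det", "a": "Det",
    "man": "N", "park": "N", "telescope": "N",
    "in": "P", "with": "P",
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees for the sentence under the toy grammar."""
    tokens = sentence.split()
    n = len(tokens)
    # chart[(i, j)][X] = number of ways symbol X derives tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, tok in enumerate(tokens):
        chart[(i, i + 1)][LEXICON[tok]] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    chart[(i, j)][parent] += ways
    return chart[(0, n)][start]


print(count_parses("I saw the man in the park with a telescope"))  # 5
```

Under this grammar the sentence has 5 parses, and dropping the last phrase ("with a telescope") leaves 2. Each extra prepositional phrase multiplies the readings a parser must choose between.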
Semantics
Semantics is the examination of the meaning of words and sentences. It conveys useful information relevant to the scenario as a whole.
Issues in Semantics
Understand language! How?
Words are ambiguous:
plant = industrial plant
plant = living organism
Importance of semantics?
Machine Translation: wrong translations
Information Retrieval: wrong information
Issues in Semantics
Learn from annotated examples:
Assume 100 examples containing "plant", previously tagged by a human
Train a learning algorithm
How to choose the learning algorithm?
How to obtain the 100 tagged examples?
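One possible answer to "which learning algorithm?" is a Naive Bayes classifier, sketched below. The six training sentences are invented stand-ins for the human-tagged examples, and the sense labels "industrial" and "living" are names chosen for this sketch.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-tagged examples (a real setup would use 100 or more).
TRAINING = [
    ("the plant closed and workers lost jobs", "industrial"),
    ("the chemical plant produces steel", "industrial"),
    ("workers at the plant went on strike", "industrial"),
    ("water the plant every day", "living"),
    ("the plant grows green leaves", "living"),
    ("this plant needs sun and water", "living"),
]


def train(examples):
    """Collect per-sense word counts, per-sense example counts, and the vocabulary."""
    word_counts = defaultdict(Counter)  # sense -> word frequencies
    sense_counts = Counter()            # sense -> number of examples
    for text, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, sense_counts, vocab


def classify(text, model):
    """Return the sense with the highest Naive Bayes log-score."""
    word_counts, sense_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense in sense_counts:
        # log P(sense) + sum of log P(word | sense), with add-one smoothing
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAINING)
print(classify("the plant needs water", model))   # living
print(classify("workers left the plant", model))  # industrial
```

The context words ("water", "workers") carry the decision, which is why the quality and quantity of the tagged examples matter so much.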
Pragmatics
Pragmatics is the sequence of steps that exposes the overall purpose of the statement being analyzed. The statement is broken down into ambiguous entities, which are then disambiguated to facilitate understanding.
Discourse
Concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.
Issues in Discourse
Anaphora resolution: resolving referring expressions
The dog entered my room. It scared me.
Mary bought a book for Kelly. She didn't like it.
"She" refers to Mary or Kelly? Possibly Kelly.
"It" refers to what? The book.
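A very naive way to resolve the pronouns in the Mary and Kelly example is to pick the most recent earlier mention that agrees with the pronoun. The mention table and the recency heuristic below are assumptions made for illustration; they reproduce the reading given above but fail on many real sentences.

```python
# Toy anaphora resolver: choose the most recent earlier mention whose
# category agrees with the pronoun. MENTIONS and the recency heuristic
# are assumptions for illustration, not a real resolver.
PRONOUN_CATEGORY = {"she": "female", "he": "male", "it": "thing"}
MENTIONS = {"mary": "female", "kelly": "female", "book": "thing"}


def resolve(text):
    """Return (pronoun, guessed antecedent) pairs in order of appearance."""
    seen = []    # known mentions in the order they appear
    links = []   # (pronoun, antecedent) pairs
    for raw in text.lower().split():
        word = raw.strip(".,!?")
        if word in MENTIONS:
            seen.append(word)
        elif word in PRONOUN_CATEGORY:
            wanted = PRONOUN_CATEGORY[word]
            candidates = [m for m in seen if MENTIONS[m] == wanted]
            links.append((word, candidates[-1] if candidates else None))
    return links


print(resolve("Mary bought a book for Kelly. She didn't like it."))
# [('she', 'kelly'), ('it', 'book')]
```

Recency alone is a weak signal: in "The dog entered my room. It scared me." the most recent noun is "room", so a resolver also needs knowledge about what can plausibly scare someone.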
Approaches to Natural Language Processing
Natural language processing approaches fall roughly into 3 categories:
Symbolic Approach
Perform deep analysis of linguistic phenomena
Based on explicit representation of facts about language
Approaches to Natural Language Processing
Statistical Approach
Employ various mathematical techniques
Use large text corpora to develop approximate generalized models of linguistic phenomena
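As a small illustration of the statistical approach, the sketch below estimates bigram probabilities (how likely one word is to follow another) by counting. The three-sentence corpus is made up for the example; a real model would be estimated from a large text corpus and would need smoothing for unseen word pairs.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model would use a large text corpus.
CORPUS = [
    "the dog ate my homework",
    "the dog entered my room",
    "the dog scared me",
]


def train_bigrams(sentences):
    """Count how often each word follows another."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            following[prev][cur] += 1
    return following


def probability(following, prev, cur):
    """Maximum-likelihood estimate of P(cur | prev); 0.0 if prev is unseen."""
    total = sum(following[prev].values())
    return following[prev][cur] / total if total else 0.0


model = train_bigrams(CORPUS)
print(probability(model, "the", "dog"))  # 1.0
print(probability(model, "my", "room"))  # 0.5
```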
Approaches to Natural Language Processing
Connectionist Approach
Develop generalized models from examples of linguistic phenomena
Combine statistical learning with various theories of representation
Research
Microsoft Natural Language Processing Group
The team is broadening the scope of the NLP effort by developing parallel systems in several languages. The languages covered are Chinese, English, French, German, Japanese, Korean and Spanish.
Research
Canon Natural Language Processing Group
Research and development of large-vocabulary speech understanding software for interactive spoken systems.
Applications of NLP
Machine Translation: different strategies
Systran: www.Systransoft.com
Google: Translate.google.com
Question Answering
Information Extraction
Spell Checking
Microsoft Spell Checker
Machine Translation
Machine Translation (MT) is the process of translating source-language text into a target language.
There are 2 types of MT:
Rule-based MT
Statistical MT
Machine Translation
Rule-based MT
Explicit use and manual creation of linguistically informed rules and representations
Statistical MT
Corpus-based, i.e. learned from examples of translations, called parallel or bilingual corpora
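To give a feel for "learned from examples of translations", the sketch below learns a small word-translation lexicon from a parallel corpus using a co-occurrence score (the Dice coefficient). This is a deliberately simple stand-in, not a full statistical MT model, and the four English-Spanish sentence pairs are made up for the example.

```python
from collections import Counter
from itertools import product

# Made-up English-Spanish parallel corpus (illustrative only).
PARALLEL = [
    ("the house", "la casa"),
    ("a house", "una casa"),
    ("the book", "el libro"),
    ("a book", "un libro"),
]


def learn_lexicon(pairs):
    """Map each source word to the target word it co-occurs with most strongly."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint.update(product(src_words, tgt_words))
    lexicon = {}
    for s in src_count:
        # Dice score: 2 * co-occurrences / (count(s) + count(t))
        lexicon[s] = max(
            tgt_count,
            key=lambda t: 2 * joint[(s, t)] / (src_count[s] + tgt_count[t]),
        )
    return lexicon


lexicon = learn_lexicon(PARALLEL)
print(lexicon["house"], lexicon["book"])  # casa libro
```

No translation rule was written by hand: "house" maps to "casa" only because the two words always appear together. The articles stay ambiguous in this tiny corpus ("the" co-occurs equally with "la" and "el"), which hints at why statistical systems need large corpora.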
Applications of Machine Translation
ANGLABHARTI (1991) is a machine-aided translation system, developed at IIT Kanpur, specifically designed for translating English to Indian languages. Anglabharti uses a pseudo-interlingua approach. It analyses English only once and creates an intermediate structure called PLIL (Pseudo Lingua for Indian Languages).
Applications of Machine Translation
The Anusaaraka project (1995) started at IIT Kanpur and is now being continued at IIIT Hyderabad.
Its aim is translation from one Indian language to another.
Anusaarakas have been built from Telugu, Kannada, Bengali, and Marathi to Hindi.
TDIL (Technology Development for Indian Languages) is also working on developing various MT tools.
Question Answering
A question answering system automatically answers questions posed by humans in natural language.
Three steps are involved in question answering:
Question manipulation and classification
Matching
Answer selection
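The three steps can be sketched over a tiny fact list. Everything here is an assumption for illustration: the facts are paraphrased from this presentation, the answer types are guessed from the question word, and matching is plain word overlap.

```python
import re

# Hypothetical mini knowledge base: (sentence, answer type, answer).
FACTS = [
    ("Anglabharti was developed at IIT Kanpur", "PLACE", "IIT Kanpur"),
    ("Anglabharti dates from 1991", "DATE", "1991"),
    ("Anusaaraka is continued at IIIT Hyderabad", "PLACE", "IIIT Hyderabad"),
]
QUESTION_TYPES = {"where": "PLACE", "when": "DATE", "who": "PERSON"}
STOPWORDS = {"is", "was", "the", "a", "at", "in", "of", "from"}


def classify_question(question):
    """Step 1: map the question word to an expected answer type."""
    first = question.lower().split()[0]
    return QUESTION_TYPES.get(first, "OTHER")


def content_words(text):
    """Lowercased words of the text, minus a few stopwords."""
    return set(re.findall(r"\w+", text.lower())) - STOPWORDS


def answer(question):
    wanted = classify_question(question)
    q_words = content_words(question)
    # Step 2: match facts of the expected type by word overlap.
    scored = [
        (len(q_words & content_words(sentence)), ans)
        for sentence, ans_type, ans in FACTS
        if ans_type == wanted
    ]
    # Step 3: select the best-scoring answer, if anything overlaps at all.
    best = max(scored, default=(0, None))
    return best[1] if best[0] > 0 else None


print(answer("Where was Anglabharti developed?"))  # IIT Kanpur
```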
Applications of Question Answering
LUNAR gives access to a database containing information on lunar rock and soil composition obtained during the NASA Apollo-11 moon landing mission. It responds to natural-language queries from geologists, such as "what is the average of the basalt?"
Applications of Question Answering
ELIZA uses the keyword and pattern matching approach. It is based on the use of sentence templates which contain keywords or phrases.
Other famous question answering systems are SHRDLU, GUS, JUPITER, QUALM and BASEBALL.
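The keyword and template idea can be sketched with a few regular expressions. The templates below are hypothetical examples written in the spirit of ELIZA; they are not taken from the original program's script.

```python
import re

# A few hypothetical templates in the spirit of ELIZA; the original
# program's script is much larger and is not reproduced here.
TEMPLATES = [
    (re.compile(r"\bi need (.+)", re.I), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"\b(mother|father)\b", re.I), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance):
    """Reply using the first template whose keyword pattern matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in TEMPLATES:
        match = pattern.search(text)
        if match:
            return template.format(*(g.lower() for g in match.groups()))
    return DEFAULT_REPLY


print(respond("I need a holiday."))  # Why do you need a holiday?
print(respond("My mother called"))   # Tell me more about your mother.
```

The program understands nothing about holidays or mothers; it only slots the matched text into a canned sentence, which is why such systems break down quickly outside their templates.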
Future of NLP
There are many applications we can dream of with NLP techniques. How about robots that understand and follow instructions given by human voice, or driving by talking to the car, as in some science fiction movies? These can all be real one day. Imagine a computer system that can follow simple human instructions and do whatever we want it to do. How convenient would that be? But let's leave all that to the FUTURE...
Conclusions
A lot of research is going into developing new applications and investigating new techniques and approaches that will make statistical NLP more feasible. So we can expect to see improved applications of NLP in the near future.
Thank You