AIML-Unit 3 Notes
DEPARTMENT OF ECE
SYLLABUS
Email filters. Email filters are one of the most basic and initial applications of
NLP online. ...
Smart assistants. ...
Search results. ...
Predictive text. ...
Language translation. ...
Digital phone calls. ...
Data analysis. ...
Text analytics.
Stages of Natural Language Processing
1. Lexical Analysis
Lexical or Morphological Analysis is the initial step in NLP. It entails recognizing and
analyzing word structures. The collection of words and phrases in a language is
referred to as the lexicon. Lexical analysis is the process of breaking down a text file
into paragraphs, phrases, and words.
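As a minimal sketch of lexical analysis, the Python snippet below breaks raw text into paragraphs, sentences, and word tokens using only regular expressions. The function name lexical_analysis and the splitting rules (a blank line ends a paragraph; '.', '!' or '?' followed by a space ends a sentence) are assumptions made for illustration. Real tokenizers treat abbreviations, numbers, and word morphology far more carefully.

```python
import re

def lexical_analysis(text):
    """Split raw text into paragraphs -> sentences -> word tokens."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        # A word token is a run of letters, digits, or apostrophes.
        result.append([re.findall(r"[A-Za-z0-9']+", s) for s in sentences])
    return result

sample = "Sun rises in the east. It sets in the west.\n\nBirds can fly."
structure = lexical_analysis(sample)
print(structure)
```

The result is a nested list: one entry per paragraph, each holding one list of words per sentence.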
2. Syntax Analysis
Syntax Analysis checks that a given piece of text has a correct structure. It tries to
parse the sentence to check for correct grammar at the sentence level, arranging the
words in a manner that shows the relationships among them.
For example:
Correct Syntax: Sun rises in the east.
Incorrect Syntax: Rise in sun the east.
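The two example sentences can be checked with a toy syntax analyser, sketched below. The tiny lexicon, the single-letter tags (N, V, P, D), and the grammar S -> NP VP are all hypothetical simplifications chosen for this example. A real parser would use a full grammar and a part-of-speech tagger.

```python
import re

# Hypothetical toy lexicon: word -> part-of-speech tag
# (N = noun, V = verb, P = preposition, D = determiner).
LEXICON = {"sun": "N", "east": "N", "rises": "V", "rise": "V", "in": "P", "the": "D"}

# Toy grammar written as a regular expression over the tag sequence:
#   S -> NP VP,  NP -> (D) N,  VP -> V (P NP)
SENTENCE_PATTERN = re.compile(r"(D )?N V( P (D )?N)?")

def is_well_formed(sentence):
    """Return True if the tag sequence of the sentence fits the toy grammar."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON.get(word) for word in words]
    if None in tags:          # a word outside the lexicon cannot be parsed
        return False
    return SENTENCE_PATTERN.fullmatch(" ".join(tags)) is not None

print(is_well_formed("Sun rises in the east."))   # True
print(is_well_formed("Rise in sun the east."))    # False
```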
3. Semantic Analysis
Consider the sentence: “The apple ate a banana”. Although the sentence is
syntactically correct, it doesn’t make sense because apples can’t eat. Semantic
analysis looks for meaning in the given sentence. It also deals with combining words
into phrases.
For example, “red apple” provides information regarding one object; hence we treat it
as a single phrase. Similarly, we can group names referring to the same category,
person, object or organisation. “Robert Hill” refers to the same person and not two
separate names – “Robert” and “Hill”.
4. Discourse
5. Pragmatics
The final stage of NLP, Pragmatics, interprets the given text using information from
the previous steps. For example, the sentence “Turn off the lights” is interpreted as
an order or request to switch off the lights.
SEMANTIC NETWORK
A semantic network is a knowledge structure that depicts how concepts are related
to one another and illustrates how they interconnect. Semantic networks use artificial
intelligence (AI) programming to mine data, connect concepts and call attention to
relationships. Semantic networks are a way of representing relationships between
objects and ideas.
Semantic networks are not necessarily hierarchical; they follow a lattice
structure for knowledge representation. A knowledge representation needs built-in
support for natural language understanding, so that it becomes possible to
carry out inference through this representation.
Semantic networks not only represent information but also facilitate the retrieval of
relevant facts. For instance, all the facts about an object “Rajan” are stored with a
pointer directly to one node representing Rajan. Another advantage of semantic
networks is the inheritance of properties. If a semantic network represents the
knowledge “All canaries are yellow” and “Tweety is a canary”, the network
can infer that “Tweety is yellow.” This inference is performed
by a network matcher or retriever. The most advanced inference systems have an
inference engine, which can perform specialized inferences tailored to treat certain
functions, predicates, and constant symbols differently from others.
This is achieved by building into the inference engine certain true sentences, which
involve these symbols, and control is provided to handle these sentences. The
inference engine is able to recognize the special conditions, on which it makes use
of specialized machinery. It becomes possible by coupling the specialized
knowledge to the form of situations that it can deal with.
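The retrieval and inheritance behaviour described above (facts stored at one node, and Tweety inheriting the colour of canaries) can be sketched in Python as follows. The class name SemanticNetwork, its method names, and the fact "Rajan teaches AI" are assumptions made for illustration; this is a simple matcher over labelled arcs, not a full inference engine.

```python
from collections import defaultdict

class SemanticNetwork:
    """Labelled directed graph: node -> list of (relation, target) arcs."""

    INHERITANCE_LINKS = ("is-a", "instance-of", "ako")

    def __init__(self):
        self.arcs = defaultdict(list)

    def add(self, source, relation, target):
        self.arcs[source].append((relation, target))

    def facts_about(self, node):
        # All facts hang off the single node that represents the object.
        return list(self.arcs.get(node, []))

    def lookup(self, node, relation):
        """Find `relation` on `node`, or inherit it from the nearest ancestor."""
        seen, frontier = set(), [node]
        while frontier:
            current = frontier.pop(0)
            if current in seen:
                continue
            seen.add(current)
            for rel, target in self.arcs.get(current, []):
                if rel == relation:
                    return target
            # Not found locally: continue the search along inheritance links.
            frontier.extend(
                target for rel, target in self.arcs.get(current, [])
                if rel in self.INHERITANCE_LINKS
            )
        return None

net = SemanticNetwork()
net.add("Canary", "color", "yellow")        # "All canaries are yellow"
net.add("Tweety", "instance-of", "Canary")  # "Tweety is a canary"
net.add("Rajan", "is-a", "Professor")
net.add("Rajan", "teaches", "AI")           # hypothetical fact

print(net.lookup("Tweety", "color"))   # yellow
print(net.facts_about("Rajan"))
```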
Semantic networks, also called associative networks, model the semantics
and words of the English language. A simple example of a semantic network is shown
in Fig. 1, where ako (a-kind-of), has-parts, color, and has-property are binary
relations.
There are several benefits of using Semantic Networks for representing knowledge:
• They reflect the structure of the part of the world being represented in the knowledge
structure.
• The “is-a” and “is-part-of” relations help in organizing
inheritance-based hierarchies, which are useful for inheritance-based inferences.
• Semantic networks are useful in representing events and natural language
sentences, whose meanings can be very precise. However, the concept of semantic
networks is very general. This causes a problem unless we are clear about the
syntax and semantics in each case.
Unlike predicate logic, semantic networks have no well-accepted syntax and
semantics. The syntax for any given system is determined by the
object and relation primitives chosen and the rules used for connecting the
objects.
1. Lexical part:
3. Semantic part: Meanings (semantics) are associated with the edges and node
labels, whose details depend on the application domains.
4. Procedural part: The constructors are part of the procedural part, they allow for
the creation of new edges (links) and nodes. The destructors allow the deletion of
edges and nodes, the writers allow the creation and alteration of labels, and the
readers can extract answers to questions. Clearly, there is plenty of flexibility in
creating these representations.
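The four kinds of operations in the procedural part can be illustrated with a small Python class. The class name Network, the method names, and the example nodes are hypothetical. The sketch also assumes at most one labelled edge between any ordered pair of nodes, which a real system need not require.

```python
class Network:
    def __init__(self):
        self.nodes = set()
        self.edges = {}   # (source, target) -> label

    # Constructors: create new nodes and edges.
    def add_node(self, name):
        self.nodes.add(name)

    def add_edge(self, source, target, label):
        self.nodes.update((source, target))
        self.edges[(source, target)] = label

    # Destructors: delete edges and nodes.
    def remove_edge(self, source, target):
        self.edges.pop((source, target), None)

    def remove_node(self, name):
        self.nodes.discard(name)
        # Drop every edge that touches the removed node.
        self.edges = {pair: label for pair, label in self.edges.items()
                      if name not in pair}

    # Writer: alter the label of an existing edge.
    def relabel_edge(self, source, target, label):
        if (source, target) not in self.edges:
            raise KeyError((source, target))
        self.edges[(source, target)] = label

    # Reader: answer "which nodes are linked to `source` by `label`?"
    def targets(self, source, label):
        return sorted(t for (s, t), l in self.edges.items()
                      if s == source and l == label)

g = Network()
g.add_edge("Canary", "Bird", "is-a")
g.add_edge("Bird", "Wings", "has-parts")
g.add_edge("Bird", "Feathers", "has-parts")
g.relabel_edge("Canary", "Bird", "ako")
g.remove_node("Feathers")

print(g.targets("Bird", "has-parts"))   # ['Wings']
```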
The word-symbols used for the representation are those which represent object
constants and n-ary relation constants. The network nodes usually represent nouns
(objects) and the arcs represent the relationships between objects. The arrow
points from the first object to the second, in the order in which they appear in the
relation.
Fig. 2 shows an is-a hierarchy representing a semantic network. In set theory
terms, is-a corresponds to the subset relation ‘⊆’, and an instance corresponds to
the membership relation ‘∈’ (an object-class relation) [3]. The commonly used
relations are: member-of, subset-of, ako (a-kind-of), has-parts, instance-of, agent,
attributes, shaped-like, etc. The ‘is-a’ relationship occurs quite often, as in the
sentences “Rajan is a Professor”, “Bill is a student”, “cat is a pet animal”, “Tree is a
plant”, “German shepherd is a dog”, etc. The ‘is-a’ relation is most often used to state
that an object is of a certain type, that an object is a subtype of another, or
that an object is an instance of a class.
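The set-theoretic reading of is-a and instance can be tried out directly with Python sets. The individuals "Rex", "Bruno", and "Tom" are hypothetical names invented for this sketch.

```python
# Each class is modelled as the set of its individuals (hypothetical data).
german_shepherds = {"Rex"}
dogs = {"Rex", "Bruno"}
pets = dogs | {"Tom"}

# is-a between classes corresponds to the subset relation.
is_a_holds = german_shepherds <= dogs        # "German shepherd is a dog"

# instance-of corresponds to set membership.
instance_holds = "Rex" in german_shepherds

# Membership plus subset gives membership in the larger class.
inherited = "Rex" in dogs and "Rex" in pets

print(is_a_holds, instance_holds, inherited)   # True True True
```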
Inheriting a property in this way is known as default reasoning. It is assumed that,
unless there is information to the contrary, it is reasonable to inherit
information from the ancestor nodes. In Fig. 3, Pigeon inherits the property “can
fly” from the vertebrates, while Ostrich has a locally installed attribute “cannot fly”,
hence the property ‘fly’ will not be inherited by it.
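Default reasoning with a local override can be sketched as a lookup that walks up the ancestor chain and stops at the first node that defines the property. Fig. 3 is not reproduced here, so the shape of the hierarchy below (Pigeon and Ostrich directly under Vertebrate) is an assumption, as are the names PARENT, LOCAL, and get_property.

```python
# Assumed hierarchy: child -> parent (None marks the root).
PARENT = {"Pigeon": "Vertebrate", "Ostrich": "Vertebrate", "Vertebrate": None}

# Properties installed locally at each node.
LOCAL = {
    "Vertebrate": {"fly": "can fly"},
    "Pigeon": {},
    "Ostrich": {"fly": "cannot fly"},   # local value blocks inheritance
}

def get_property(node, name):
    """Return the nearest value of `name`, searching from node up to the root."""
    while node is not None:
        if name in LOCAL.get(node, {}):
            return LOCAL[node][name]
        node = PARENT.get(node)
    return None

print(get_property("Pigeon", "fly"))    # can fly
print(get_property("Ostrich", "fly"))   # cannot fly
```

Because the search stops at the first node holding the property, a locally installed value always takes precedence over an inherited default.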
The inference procedures for semantic networks can also parallel those of
propositional and predicate logic.
Definition: A conceptual graph (CG) is a graph representation for logic, based on the
semantic networks of artificial intelligence and the existential graphs of Charles
Sanders Peirce. The conceptual graphs may be regarded as formal building blocks
of semantic networks. When they are linked together, they form a more complex and
useful network.
Benefits: Conceptual graphs are convenient for such methods because they can be
used to represent arbitrary statements in logic, to represent formal models of those
statements, and to represent each step from an initially vague statement to a
complete specification.
A sentence in the semantic network is represented with a verb node, and various
case links to this node represent other participants in carrying out the action. The
complete structure formed is called case-frame.
While this sentence is parsed, the algorithm identifies the verb node, and retrieves
the complete case-frame from the knowledge base.
The given sentence is represented by the conceptual graph and predicate formula as
shown below.
Conceptual graph:
Predicate formula:
∃x∃y(eat(x) ∧ person(Rajan) ∧ food(Noodles) ∧ fork(y) ∧ agent(x, Rajan) ∧ object(x,
Noodles) ∧ inst(x, y))    (1)
Representation as text:
- (2)
Conceptual graph:
Predicate formula:
• In the above CG the four concepts, Person, Go, Mumbai, and Bus have type
labels.
• Two concepts, Person and Destination, have names.
• The verb “Go” is related to the remaining three concepts by the relations
agent, destination, and instrument.
• The complete CG indicates that the person Rajan is an agent of some
instance of “Going”, the city of Mumbai is the destination, and the bus is the
instrument.
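A conceptual graph of this kind can be held as a list of concept nodes plus a list of relation triples, and then translated mechanically into a predicate formula in the style of formula (1). The sketch below is one possible encoding, not the notation of the original figure. The class Concept, the function to_formula, and the type label "City" for Mumbai are assumptions, and the sketch assumes at most one unnamed concept per type.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Concept:
    type_label: str
    name: Optional[str] = None   # None = generic concept (becomes a variable)

go = Concept("Go")
rajan = Concept("Person", "Rajan")
mumbai = Concept("City", "Mumbai")   # type label "City" is assumed
bus = Concept("Bus")

concepts = [go, rajan, mumbai, bus]
relations = [("agent", go, rajan),
             ("destination", go, mumbai),
             ("instrument", go, bus)]

def to_formula(concepts, relations):
    """Translate the graph into an existentially quantified conjunction."""
    variables = {}
    for concept in concepts:
        if concept.name is None:
            variables[concept] = "xyzuvw"[len(variables)]

    def term(concept):
        return concept.name if concept.name is not None else variables[concept]

    atoms = [f"{c.type_label.lower()}({term(c)})" for c in concepts]
    atoms += [f"{rel}({term(a)}, {term(b)})" for rel, a, b in relations]
    prefix = "".join(f"∃{v}" for v in variables.values())
    return f"{prefix}({' ∧ '.join(atoms)})"

formula = to_formula(concepts, relations)
print(formula)
```

The two unnamed concepts (the act of going and the bus) become the quantified variables x and y, while the named concepts Rajan and Mumbai appear as constants.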
(frame-name
<slot-name1 filler1>
<slot-name2 filler2 > …)
• The conditions under which each assignment can be made must also be specified
for each terminal.
• The assignments are usually smaller frames, called “sub-frames”.
• Simple conditions are indicated by markers, which might require a
terminal assignment to be a person, an object of sufficient value, or a pointer
to a sub-frame of a certain type.
• More complex conditions can specify relations among the things assigned to
several terminals.
Inheritance hierarchies allow data to be stored economically. Instead of storing
all the properties of each object, all the objects are structured in a hierarchy, and
only the individual properties are stored in the object itself, while the general
properties are attached to the predecessors and inherited by all the successors.
In Object-centered representations
Every frame is identified by its individual name, the frame name; a frame consists of
a set of attributes (slots) associated with it (see Fig.). For example, in the frame “Person”,
slots may be: name, weight, height, and age; for the frame “Computer”, the slots
may be: model, processor, memory, and price.
1. Relationship: A frame provides the relationship to other frames. The frame
Hotel room (Fig. 7.8) can be a member of another frame class, Room, which in turn can
belong to the class Housing, thus providing the relationship with the other room
types.
2. Slot value: The value of a slot can be numeric, symbolic, or Boolean (True/False).
For example, a frame identified as ‘Person’ is symbolic, with slots named ‘Age’ and
‘Height’, both having float values. Slot values can be assigned dynamically
during a session with the expert system, they can be static, or they can be initialized
when the slot is created.
3. Default value of slot: A slot may contain a default value, used when the true value is not
available and there is no evidence that the chosen default causes any
contradiction. For example, in a frame named Car, when slot values are not
provided, the default values of the slots wheels-count and engine-count can be taken as
4 and 1, respectively.
4. Slot value’s range: The range of a slot’s value is useful for checking the bounds of
the slot value, that is, whether the provided value is within the prescribed limits. For
example, the pressure of a car tire may range over 30–50 psi (pounds per square inch).
5. Slot Procedure: A slot can have a procedure attached to it; when this procedure is
called, it may read a value from the given slot or update the value of the slot
(write a value into it).
It can be used for establishing the attribute value of a frame, it can control end-user
queries, and it can direct the inference engine on how to process the attributes.
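The five slot features listed above (relationship, value, default, range, and attached procedure) can be combined in one small Python sketch. The class names Slot and Frame, the frame MyCar, its color slot, and the use of a plain list as the attached procedure's log are all assumptions made for illustration; expert-system shells provide their own, richer versions of these facilities.

```python
class Slot:
    def __init__(self, value=None, default=None, value_range=None, on_write=None):
        self.value = value
        self.default = default
        self.value_range = value_range   # (low, high) bounds, or None
        self.on_write = on_write         # attached procedure, run after each update

    def read(self):
        # Fall back to the default when no true value is available.
        return self.value if self.value is not None else self.default

    def write(self, value):
        if self.value_range is not None:
            low, high = self.value_range
            if not (low <= value <= high):
                raise ValueError(f"{value} is outside the range {low}-{high}")
        self.value = value
        if self.on_write is not None:
            self.on_write(value)

class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent             # relationship to another frame
        self.slots = slots

    def get(self, slot_name):
        """Read a slot, inheriting it from the parent frame if needed."""
        frame = self
        while frame is not None:
            if slot_name in frame.slots:
                return frame.slots[slot_name].read()
            frame = frame.parent
        raise KeyError(slot_name)

log = []
car = Frame(
    "Car",
    wheels_count=Slot(default=4),
    engine_count=Slot(default=1),
    tire_pressure_psi=Slot(value_range=(30, 50), on_write=log.append),
)
my_car = Frame("MyCar", parent=car, color=Slot(value="red"))

car.slots["tire_pressure_psi"].write(35)
print(my_car.get("wheels_count"))        # 4
print(my_car.get("tire_pressure_psi"))   # 35
```

Writing a pressure of 60 to the same slot would raise ValueError, because it falls outside the 30 to 50 psi range.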
DESCRIPTION LOGIC
• Links are not alike in function or form, which causes confusion between links that assert
relationships and structural links.
The KB comprises two components: the TBox (terminology box) introduces the
terminology, i.e., the vocabulary of an application domain, and the ABox (assertion box)
contains assertions about the named individuals in terms of this vocabulary.
The vocabulary consists of concepts, which denote sets of individuals, and roles, which
denote binary relationships between individuals. In addition to these, a DL system
allows users to build complex descriptions of concepts and roles. The TBox can be
used to assign names to complex descriptions. The description language has a
model-theoretic semantics. Consequently, the statements in the ABox and TBox can be
identified with formulas of FOPL or its extensions. The DL system provides services for
reasoning over the KB, typically to determine whether the terminology is satisfiable. The
reasoning process also checks that the assertions are consistent. With subsumption testing,
it is easy to organize the concepts of a terminology into a hierarchy. In any application, the
KR system is embedded into a larger environment. The other components interact with the
system through queries to the KB
and by modifying it, i.e., by adding or retracting concepts, roles, and assertions. The
basic form of declaration in a TBox is a concept definition, that is, the definition of a new
concept in terms of other previously defined concepts. For example, a woman can be
defined as a female person by writing this declaration:
Woman ≡ Person ⊓ Female
There are some important common assumptions usually made about DL terminologies:
• Definitions are acyclic in the sense that concepts are neither defined in terms of
themselves nor in terms of other concepts that indirectly refer to them.
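The acyclicity assumption can be checked mechanically by treating the TBox as a dependency graph and searching it depth-first for a cycle. In the sketch below, a TBox is represented as a dictionary from each defined concept to the set of concept names used in its definition; this encoding, the function name is_acyclic, and the concepts Mother and Parent are assumptions made for illustration.

```python
def is_acyclic(tbox):
    """Return True if no concept is defined, directly or indirectly, via itself."""
    IN_PROGRESS, DONE = 1, 2
    state = {}

    def visit(concept):
        if state.get(concept) == IN_PROGRESS:   # reached a concept still being expanded
            return False
        if state.get(concept) == DONE:
            return True
        state[concept] = IN_PROGRESS
        for used in tbox.get(concept, ()):       # undefined names are primitive concepts
            if not visit(used):
                return False
        state[concept] = DONE
        return True

    return all(visit(concept) for concept in tbox)

good_tbox = {
    "Woman": {"Person", "Female"},
    "Mother": {"Woman", "Parent"},   # hypothetical second definition
}
bad_tbox = {"A": {"B"}, "B": {"A"}}  # A and B refer to each other

print(is_acyclic(good_tbox))   # True
print(is_acyclic(bad_tbox))    # False
```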
The ABox comprises extensional knowledge about the domain of interest, that is,
assertions about individuals, called membership assertions. For example,
(Female ⊓ Person)(sita)
states that the individual named sita is a female person. Given the above definition of Woman,
one can derive from this assertion that sita is an instance of the concept Woman.
Similarly,
hasChild(sita, luv)
indicates that sita has luv as a child. Assertions of the first kind are also called
concept assertions, while those of the second kind are called role assertions.
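The interplay of TBox and ABox can be sketched with a very small instance checker. It handles only concept definitions that are conjunctions of concept names in an acyclic TBox; practical DL reasoners use much more general algorithms. The dictionary encoding, the function names, and the role name hasChild are assumptions made for this example.

```python
# TBox: Woman is defined as the conjunction of Person and Female.
TBOX = {"Woman": {"Person", "Female"}}

# ABox: concept assertions and role assertions about named individuals.
CONCEPT_ASSERTIONS = {"sita": {"Female", "Person"}}
ROLE_ASSERTIONS = {("hasChild", "sita", "luv")}

def instance_of(individual, concept):
    """True if the individual is asserted, or can be derived, to belong to concept."""
    if concept in CONCEPT_ASSERTIONS.get(individual, set()):
        return True
    definition = TBOX.get(concept)
    if definition is None:        # primitive concept with no matching assertion
        return False
    # A defined concept holds when every conjunct of its definition holds.
    return all(instance_of(individual, part) for part in definition)

def role_fillers(role, individual):
    """Individuals related to `individual` through `role`."""
    return sorted(obj for r, subj, obj in ROLE_ASSERTIONS
                  if r == role and subj == individual)

print(instance_of("sita", "Woman"))       # True
print(role_fillers("hasChild", "sita"))   # ['luv']
```

Nothing in the ABox says directly that sita is a Woman; the answer follows from the two concept assertions together with the TBox definition.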
DL REASONING AND INFERENCES
Advantages
Application of DL