Module - 5
Syllabus
Module – 1
Introduction to AI: history, intelligent systems, foundations and sub-areas of AI, applications, current trends and
development of AI. Problem solving: state space search and control strategies.
Module – 2
Problem reduction and Game playing: Problem reduction, game playing, Bounded look-ahead strategy, alpha-
beta pruning, Two player perfect information games.
Module – 3
Logic concepts and logic Programming: propositional calculus, Propositional logic, natural deduction system,
semantic tableau system, resolution refutation, predicate logic, Logic programming.
Module – 4
Advanced problem solving paradigm: Planning: types of planning system, block world problem, logic based
planning, Linear planning using a goal stack, Means-ends analysis, Non-linear planning strategies, learning plans.
Module – 5
Knowledge Representation, Expert system
Approaches to knowledge representation, knowledge representation using semantic network, extended
semantic networks for KR, Knowledge representation using Frames.
Expert system: introduction, phases, architecture, ES versus traditional systems.
Reference Books:
1. Elaine Rich, Kevin Knight, Artificial Intelligence, Tata McGraw Hill.
2. Nils J. Nilsson, Principles of Artificial Intelligence, Elsevier, 1980.
3. Stuart Russell, Peter Norvig, Artificial Intelligence: A Modern Approach, Pearson Education, 3rd Edition,
2009.
4. George F. Luger, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, Pearson
Education, 5th Edition, 2011.
Introduction
Knowledge representation is an important issue in both cognitive science and AI. In cognitive science, it is
concerned with the way in which information is stored and processed by humans, while in AI, the main focus is
on storing knowledge or information in such a manner that programs can process it and achieve human
intelligence.
In AI, knowledge representation is an important area because intelligent problem solving can be achieved and
simplified by choosing an appropriate knowledge representation technique.
The fundamental goal of knowledge representation is to represent knowledge in a manner that facilitates the
process of inferencing (i.e., drawing conclusions) from it.
Several programming languages oriented to KR have been developed to date. In Prolog (1972), knowledge is
represented in the form of rules and facts as already explained in previous chapters. One can derive conclusions
and prove theorems from known premises or facts.
KL-ONE (1980s) was more specifically aimed at knowledge representation itself. Languages such as XML,
RDF, etc., for handling electronic documents in web systems, have been developed in order to represent the
structure of documents more explicitly.
These languages facilitate processes such as information retrieval, data mining, etc., which are also related to
the field of KR.
Any knowledge representation system should possess properties such as learning, efficiency in acquisition,
representational adequacy, and inferential adequacy. These properties may be defined as follows:
• Learning refers to the capability to acquire new knowledge, behaviours, understanding, etc. It involves
more than adding new facts to a knowledge base: new information may have to be classified before
storage, both to avoid redundancy and replication in the existing knowledge and to enable retrieval.
Knowledge may be gained in a number of ways, such as by reasoning and logic, by experience,
by observing the world, by mathematical proofs, and by scientific methods.
• Efficiency in acquisition refers to the ability to acquire new knowledge using automatic methods
wherever possible rather than relying on human intervention.
• Representational adequacy refers to the ability to represent all the kinds of knowledge required in the
domain.
• Inferential adequacy refers to the ability to manipulate existing knowledge to produce new knowledge.
Knowledge representation is a core component of a number of applications such as expert systems, machine
translation systems, computer-aided maintenance systems, information retrieval systems, and database front-
ends.
In expert systems, there are various ways of representing knowledge, namely, predicate logic, semantic
networks, frames, conceptual dependency, etc., which have originated from theories of human information
processing. In predicate logic, knowledge is represented in the form of rules and facts.
We can easily obtain the answers to the following explicit queries from Table 7.1:
• What is the age of John?
• How much does Mary earn?
• What is the qualification of Mike?
However, we cannot obtain answers to implicit queries such as 'Does a person having a PhD qualification earn
more?' So, inferring new knowledge is not possible from such structures.
Other relations such as {can, has, colour, height} are known as property relations. These have been represented
by dotted lines pointing from the concept to its property. In this structure, property inheritance is easily
achieved. For example, the query Does a parrot breathe? can be easily answered as 'yes' even though this
property is not associated directly with a parrot. It inherits this property from its super class named Living_Thing.
We notice that isa and inst links have well-defined meaning, whereas other links which connect the attributes
of class vary in their interpretations.
The semantic net interpreter cannot understand all the attribute links unless their semantics are encoded into
it. Since all the attributes of a class are its properties, we can give such relations a single name, say,
prop link. The interpretation of the prop link is then simple: the attribute attached to a class is one of its
properties.
The information encoded above can be represented as a directed graph as shown in Fig. 7.2. In this figure, the
square nodes represent concepts or objects connected with isa or inst links, whereas oval nodes represent
property attributes attached to the square nodes. Property links are shown as dotted lines for more clarity.
Semantic net can be implemented in any programming language along with an inheritance procedure
implemented explicitly in that language. Prolog language is very convenient for representing an entire semantic
structure in the form of facts (relation as predicate and nodes as arguments) and inheritance rules. Inheritance
is easily achieved by unification of appropriate arguments in Prolog. The implementation of semantic net shown
in Fig. 7.2 in Prolog is given as follows:
Prolog Facts
The facts in Prolog would be written as shown in Table 7.2.
We know that in a class hierarchy structure, a subclass of a class is also a subclass of all super classes
connected through isa links. For example, if man is a subclass of human, then man is also a subclass of
living_thing.
Similarly, an instance of subclass is also an instance of classes connected by isa link. It should be noted that we
cannot define an instance of an instance, that is, if john is an individual instance, then there cannot be another
individual as an instance of john.
Further, if john is an instance of man, then we can infer that john is human and also living thing.
Similarly, a property of a class can be inherited by lower sub-classes. A simple Prolog rule for handling
inheritance in semantic net is given below. Various queries can be answered by the following inheritance
program and are mentioned in Table 7.3.
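The Prolog program and its tables are not reproduced in this text, so the inheritance idea can be sketched in Python instead. This is a minimal sketch, not the original program: the names ISA, INST, PROP, ancestors, and has_property are hypothetical, and the facts are illustrative stand-ins for the contents of Fig. 7.2.

```python
# Illustrative facts only; they stand in for the lost Fig. 7.2 / Table 7.2.
ISA = {"man": "human", "human": "living_thing", "parrot": "bird", "bird": "living_thing"}
INST = {"john": "man"}                                   # instance -> its class
PROP = {"living_thing": {"breathe"}, "bird": {"fly"}}    # class -> properties (prop links)

def ancestors(node):
    """Classes reachable from node: one optional inst link, then isa links upward."""
    chain = []
    current = INST.get(node) or ISA.get(node)
    while current is not None:
        chain.append(current)
        current = ISA.get(current)
    return chain

def has_property(node, prop):
    """True if prop is attached to node itself or inherited from an ancestor."""
    return any(prop in PROP.get(n, set()) for n in [node] + ancestors(node))

print(ancestors("john"))                  # classes john belongs to
print(has_property("parrot", "breathe"))  # inherited from living_thing
```

The query Does a parrot breathe? succeeds here even though breathe is stored only at living_thing, which is the property inheritance described above.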
The English sentences john gives an apple to mike and john and mike are human may be represented in semantic
network as shown in Fig. 7.3, which shows alternative way of representing semantic net in detail with more
semantic links.
Here, 'E' represents an event which is an act of giving, whose actor is John, the object is an apple, and recipient
is mike. It should be noted that semantic net can hold semantic information about situation such as actor of an
event giving is john and object is apple, in the sentence John gives an apple to mike. The relationships in the
network shown in Fig. 7.3 can be expressed in clausal form of logic as follows:
Therefore, the entire semantic net can be coded using binary representation (two-argument representation).
Such representation is advantageous when additional information is added to the given system. For example,
in the sentence john gave an apple to mike in the kitchen, it is easy to add location(E, kitchen) to the set of facts
given above.
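The benefit of the binary form can be illustrated with a small Python sketch. It assumes the event facts are action, actor, object, recipient, and isa, as described for Fig. 7.3; the triple layout and the query helper are hypothetical choices made for this illustration.

```python
# Each fact is a (relation, subject, value) triple, i.e. relation(subject, value).
facts = {
    ("action", "e", "give"),
    ("actor", "e", "john"),
    ("object", "e", "apple"),
    ("recipient", "e", "mike"),
    ("isa", "john", "human"),
    ("isa", "mike", "human"),
}

def query(relation, subject):
    """Return every value v such that relation(subject, v) is a stored fact."""
    return sorted(v for (r, s, v) in facts if r == relation and s == subject)

# New information about the event is just one more triple;
# none of the existing facts has to be rewritten.
facts.add(("location", "e", "kitchen"))

print(query("location", "e"))
print(query("recipient", "e"))
```

With an n-ary fact such as give(john, mike, apple), the same change would require replacing the fact itself.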
In first-order predicate logic, predicate relation can have n arguments, where n ≥ 1. For example, the sentence
john gives an apple to mike is easily represented in predicate logic by give (john, mike, apple). Here, john, mike,
and apple are arguments, while give represents a predicate relation.
The predicate logic representation has greater advantages compared to semantic net representation as it can
express general propositions, in addition to simple assertions. For example, the sentence john gives an apple to
everyone he likes is expressed in predicate logic by the clause as follows:
give(john, X, apple) ← likes (john, X)
Here, the symbol X is a variable representing any individual. The arrow ← represents the logical connective
'is implied by'. The left side of ← contains the conclusion(s), while the right side contains the condition(s).
Despite all the advantages, it is not convenient to add new information in an n-ary representation of predicate
logic.
For example, if we have the 3-ary relationship give(john, mike, apple) representing john gives an apple to mike,
then to capture the additional information about the kitchen in the sentence john gave an apple to mike in the
kitchen, we would have to replace the 3-ary representation give(john, mike, apple) with a new 4-ary
representation such as give(john, mike, apple, kitchen).
Further, a clause in logic can have several conditions, all of which must hold for the conclusion to be true and
can have several alternative conclusions, at least one of which must hold if all the conditions are true. For
example,
The sentence if john gives something he likes to a person, then he also likes that person can be expressed in
clausal representation in logic as
likes(john, X) ← give(john, X, Y), likes(john, Y)
The sentence every human is either male or female is expressed by the following clause
male(X), female(X) ← human(X)
In conventional semantic network, we cannot express clausal form of logic. To overcome this shortcoming, R
Kowalski and his colleagues (1979) proposed an extended semantic network (ESNet) that combines the
advantages of both logic and semantic networks.
ESNet can be interpreted as a variant syntax for the clausal form of logic; it has the same expressive power as
that of predicate logic with well-defined semantics, inference rules, and a procedural interpretation.
It also incorporates the advantages of using binary relations as in semantic networks rather than the n-ary
relations of logic. Therefore, ESNet is a much more powerful representation as compared to logic and semantic
networks.
In ESNet, the terms are represented by nodes, as in a conventional semantic network; constant,
variable, and functional terms are represented by constant, variable, and functional nodes, respectively.
Binary predicate symbols in clausal logic are represented by labels on arcs of ESNet. An atom of the form love
(john, mary) is an arc labelled as love with its two end nodes representing john and mary. The direction of the
arc (link) indicates the order of the arguments of the predicate symbol which labels the arc as follows:
john --love--> mary
Conclusions and conditions of clausal form are represented in ESNet by different kinds of arcs. The arcs denoting
conditions (negative atoms) are drawn with dotted arrow lines. These are called denial links (- - ->), while the
arcs denoting conclusions (positive atoms) are drawn with continuous arrow lines.
These are known as assertion links (-->). For example, the clausal representation (or rule) grandfather(X, Y) ←
father(X, Z), parent(Z, Y) for grandfather in logic can be represented in ESNet as given in Fig. 7.4. Here, X, Y,
and Z are variables; grandfather(X, Y) is the consequent (conclusion), and father(X, Z) and parent(Z, Y) are the
antecedents (conditions).
Similarly, the clausal rule male(X), female(X) ← human(X) can be represented using binary representation as
isa(X, male), isa(X, female) ← isa(X, human) and subsequently in ESNet as shown in Fig. 7.5.
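One possible in-memory form of such a clause is a list of labelled arcs, each marked as an assertion or a denial link. The sketch below is an assumption made for illustration (the Arc class and clause_to_arcs helper are hypothetical names); it is not a representation prescribed by ESNet itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arc:
    label: str    # binary predicate symbol, e.g. "father"
    source: str   # first argument of the predicate
    target: str   # second argument of the predicate
    kind: str     # "assertion" for a conclusion, "denial" for a condition

def clause_to_arcs(conclusions, conditions):
    """Turn a clause (conclusions <- conditions) into ESNet-style arcs."""
    arcs = [Arc(p, a, b, "assertion") for (p, a, b) in conclusions]
    arcs += [Arc(p, a, b, "denial") for (p, a, b) in conditions]
    return arcs

# grandfather(X, Y) <- father(X, Z), parent(Z, Y)
grandfather = clause_to_arcs(
    conclusions=[("grandfather", "X", "Y")],
    conditions=[("father", "X", "Z"), ("parent", "Z", "Y")],
)

# isa(X, male), isa(X, female) <- isa(X, human): two alternative conclusions
gender = clause_to_arcs(
    conclusions=[("isa", "X", "male"), ("isa", "X", "female")],
    conditions=[("isa", "X", "human")],
)

for arc in grandfather:
    print(arc.kind, arc.label, arc.source, arc.target)
```

The direction of each arc (source to target) records the argument order of the predicate that labels it.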
Inference Rules
Inference rules are embedded in the representation itself.
For example, the inference 'for every action of giving, there is an action of taking' is represented in clausal
logic as action(f(X), take) ← action(X, give). The interpretation of this rule is that the event of taking action
is a function of the event of giving action. In the ESNet representation, functional terms, such as f(X), are
represented by a single node. The representation of the statement action(f(X), take) ← action(X, give) in ESNet
is as shown in Fig. 7.6.
The inference rule that an actor who performs a taking action is also the recipient of this action can be
easily represented in clausal logic and in ESNet as given below. Here, E is a variable representing an event of
an action of taking.
recipient(E, X) ← action(E, take), actor(E, X)
Solution: Here, E is a variable for some event and 'e' is an actual event. ESNet representation of the clauses is
shown in Fig. 7.8. Here, the partition is shown to delimit the clause pictorially by enclosing the clausal rule in an
ellipse.
The hierarchy links isa, inst, and part_of (or prop), which are available in semantic networks, are also
available as special cases in ESNet. It is important to note that the part_of link has a hidden existential
quantifier. For example, the assertion every human has two legs could be represented using the part_of link as
follows:
two_legs --part_of--> human
The interpretation of above representation is that for every human, there exist two legs which are part of that
human.
A contradiction in ESNet can be represented as shown in Fig. 7.9. Here, P part_of X is a conclusion and P part_of
Y is a condition, where Y is linked with X via an isa link. Such a representation is contradictory.
Forward reasoning inference mechanism (also called bottom-up approach) in clausal logic derives new assertion
from old ones, that is, in this mechanism, we start with the given assertions and derive new assertions using
clausal rule.
On the other hand, in the backward reasoning inference mechanism (or top-down approach), we prove the query
from the set of clauses using the resolution refutation method of clausal logic. Both of these inference
mechanisms are also available in ESNet.
Given an ESNet, apply the following reduction (resolution) using the modus ponens rule of logic (i.e., given
(A ← B) and B, conclude A). For example, consider the following set of clauses:
isa(X, human) ← isa(X, man)
isa(john, man)
Using the modus ponens rule of logic, we can easily derive that isa(john, human) holds true. Now, let us derive or
infer isa(john, human) using the ESNet representation. The clauses are represented in ESNet as shown in Table
7.4.
The new assertion isa (john, human) is inferred by elimination of assertion and their denial links with the help
of appropriate substitution. Here isa(X, human) is an assertion link and isa(X, man) is denial link.
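The forward step can be sketched in Python as repeated modus ponens over isa facts. This is a simplified sketch, not a general ESNet reducer: it assumes every rule has the shape isa(X, conclusion) ← isa(X, condition), and the forward function and its data layout are hypothetical.

```python
def forward(facts, rules):
    """Bottom-up inference over isa facts.

    facts: set of (individual, cls) pairs, each meaning isa(individual, cls)
    rules: list of (condition_cls, conclusion_cls) pairs, each meaning
           isa(X, conclusion_cls) <- isa(X, condition_cls)
    """
    derived = set(facts)
    changed = True
    while changed:                      # repeat until no new assertion appears
        changed = False
        for condition, conclusion in rules:
            for individual, cls in list(derived):
                # The denial isa(X, condition) matches the assertion with X = individual.
                if cls == condition and (individual, conclusion) not in derived:
                    derived.add((individual, conclusion))
                    changed = True
    return derived

rules = [("man", "human")]              # isa(X, human) <- isa(X, man)
facts = {("john", "man")}               # isa(john, man)
print(sorted(forward(facts, rules)))
```

Binding X to john removes the matching assertion and denial links and leaves the new assertion isa(john, human).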
In backward inferencing mechanism, we can prove a conclusion or goal from a given ESNet by adding the denial
of the conclusion to the network and show that the resulting set of clauses in the network gives contradiction.
This is done by performing successive steps of resolution until an explicit contradiction is generated. It is similar
to proof by resolution refutation in clausal form of logic. Consider the set of clauses (given in Table 7.5)
represented in clausal form.
We need to prove isa(john, human) using the same network. The proof is shown in Table 7.5.
After adding denial link in ESNet, we get the reduction in ESNet as shown in Table 7.6 by elimination of assertion
and their denial link with the help of appropriate substitution.
Thus, we can conclude that inferencing and inheritance are important and embedded capabilities of ESNet.
The corresponding ESNet representation is shown in Fig. 7.10. Since the variables in different clauses are
different, we choose different variables in ESNet to avoid confusion in the representation. Now, we proceed to
illustrate both inferencing methods, namely, forward reasoning inference and backward reasoning inference,
with the help of suitable examples.
The new assertion john is human can be inferred or derived as discussed in previous example. The steps for
inferring that john is animate are given in Fig. 7.11. The assertion john is a living_thing can be inferred similarly.
Further, John is living_thing can be derived by binding X with John.
Let us solve the query isa(john, living_thing) (i.e., Is John a living thing?) using the backward inferencing
mechanism. In this mechanism, we add the denial of the fact that John is a living_thing to the ESNet and see
whether we can derive an inconsistency or an empty network.
In Fig. 7.12, each part shows the reduction of the network by elimination of assertions and their denial links
with the help of appropriate substitutions. Here, denial links are enclosed in rectangular boxes to distinguish
them from the rest of the network. Note that dotted lines are denial links, while continuous lines are assertion
links. A matching assertion and denial are removed after substitution of the actual value of a variable, if
possible.
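The backward reduction can be approximated in Python by a top-down prover that replaces the denied goal with the conditions of a matching rule until only stored facts remain. This is a sketch under stated assumptions: the clause set (man, human, living_thing) is assumed because the original clauses are not reproduced here, every rule variable is assumed to appear in the rule head, and match_head and prove are hypothetical names.

```python
def match_head(head, goal):
    """Match a rule head against a ground goal; return a binding dict or None."""
    if len(head) != len(goal):
        return None
    binding = {}
    for h, g in zip(head, goal):
        if h[:1].isupper():                      # terms starting uppercase are variables
            if binding.setdefault(h, g) != g:
                return None
        elif h != g:
            return None
    return binding

def prove(goal, facts, rules, trace, depth=10):
    """Top-down proof of a ground goal; trace records each denial that is tried."""
    trace.append(goal)
    if goal in facts:                            # denial meets an assertion: eliminated
        return True
    if depth == 0:
        return False
    for head, body in rules:
        binding = match_head(head, goal)
        if binding is None:
            continue
        subgoals = [tuple(binding.get(t, t) for t in atom) for atom in body]
        if all(prove(sub, facts, rules, trace, depth - 1) for sub in subgoals):
            return True
    return False

facts = {("isa", "john", "man")}
rules = [
    (("isa", "X", "human"), [("isa", "X", "man")]),
    (("isa", "Y", "living_thing"), [("isa", "Y", "human")]),
]
trace = []
print(prove(("isa", "john", "living_thing"), facts, rules, trace))
print(trace)
```

Each entry in trace corresponds to one denial link in the successive reductions; the proof succeeds when the last denial is cancelled by a stored assertion.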
Let us consider another example. The sentences, "Anyone who gives something he likes to a person likes that
person also. John gives an apple to Mike. John likes an apple", can be expressed in both binary clausal form and
ESNet representation as given below.
Clausal Representation
likes(X,Z) ← action(E, give), object(E, Y), actor(E, X), recipient(E, Z), likes(X, Y)
action(e, give).
object(e, apple).
actor(e, john).
recipient(e, mike).
likes(john, apple).
Here, E is a variable which will be bound to an actual event 'e' in the process of resolution.
ESNet Representation
The extended semantic network for the clauses described above is given in Fig. 7.13.
From ESNet given in Fig. 7.13, we will prove likes (john, mike). It can be proved by both forward and backward
reasoning inference mechanisms. Proof of likes (john, mike) is shown below using both the methods.
Here we have to reduce the network by resolving clauses till we obtain the desired assertion. The network is
given in Fig. 7.14. We can easily eliminate assertion and its corresponding denial with likes links by unifying Y =
apple and X = john. The ESNet is reduced to the one shown in Fig. 7.14.
The network shown in Fig. 7.14 is further reduced to the following conclusion by unifying E = e and Z = mike.
We can clearly see that a link likes between john and mike is deduced after eliminating all links which are
assertions and denials. For example, action(e, give) is an assertion and action(E, give) is a denial, which is a
contradiction and is hence eliminated. Similarly, other such eliminations take place and we are left with the
final assertion likes (john, mike).
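The substitutions used in this reduction can be reproduced with a short Python sketch that matches the five conditions of the rule against the stored assertions one after another. The extend and solve helpers are hypothetical names; the clauses are the ones listed under Clausal Representation.

```python
facts = [
    ("action", "e", "give"),
    ("object", "e", "apple"),
    ("actor", "e", "john"),
    ("recipient", "e", "mike"),
    ("likes", "john", "apple"),
]
head = ("likes", "X", "Z")
body = [
    ("action", "E", "give"),
    ("object", "E", "Y"),
    ("actor", "E", "X"),
    ("recipient", "E", "Z"),
    ("likes", "X", "Y"),
]

def extend(binding, pattern, fact):
    """Match one condition against one assertion, extending the binding or failing."""
    if len(pattern) != len(fact):
        return None
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if p[:1].isupper():                  # uppercase initial marks a variable
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def solve(conditions, binding):
    """Yield every binding that eliminates all conditions against the facts."""
    if not conditions:
        yield binding
        return
    for fact in facts:
        extended = extend(binding, conditions[0], fact)
        if extended is not None:
            yield from solve(conditions[1:], extended)

solutions = list(solve(body, {}))
derived = [tuple(b.get(t, t) for t in head) for b in solutions]
print(solutions)
print(derived)
```

Every denial link of the rule is cancelled by an assertion under the binding found, so only the conclusion of the rule, instantiated with that binding, is left.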
Let us now prove likes (john, mike) using backward reasoning method. In order to prove likes (john, mike), add
the denial link to the network and try to reduce the network to empty. The denial link is shown as follows:
john - - - likes - - -> mike
The reduction of the network is shown step by step in Fig. 7.15. Denial link is enclosed in rectangular box.
When E is unified with ‘e’, the entire ESNet is reduced to a contradiction and hence the query likes(john, mike)
is proved using the backward reasoning inference process.
Inheritance
In a conventional semantic network, lower level nodes in isa hierarchy inherit properties from higher level nodes
unless the properties are redefined in the node itself. Consider the following logic program:
These logic clauses are represented as ESNet as shown in Fig. 7.16. We can show using ESNet that john inherits
the property of having two_legs from human. In Fig. 7.17, we have used different variables in different clauses
because the scope of a variable is the clause in which it appears.
In order to show that john has two legs, we add a denial link of john has two legs to the network (Fig. 7.17), and
use general-purpose backward reasoning inference mechanism and try to get a contradiction. In Fig. 7.17, we
get X2 = john after elimination of assertion and its denial links. The reduced ESNet is shown in Fig. 7.18.
Implementation
Implementation of ESNet can be done in any programming language or using a tool which facilitates
implementation of semantic networks. The clauses can be represented either explicitly, by adding them to the
network, or implicitly, by using the structure-sharing method [Boyer and Moore 1972].
The resolvents may likewise be represented either explicitly or implicitly, by pointers to their parents along
with a record of the matching substitution. When a resolvent is created, a parent may be deleted if no other
match exists for the atom being matched in that parent.
A major point of difference between conventional semantic networks and extended semantic networks is as
follows: in a conventional semantic network, procedures are generally written in the host programming
language, whereas in ESNet, procedures are integrated with the rest of the database and are executed by the
same general-purpose mechanism which performs inference in the network.
Consider a sentence john gives an apple to mike and if we wish to ask who takes an apple?, then the knowledge
representation scheme should have some mechanism of storing the knowledge that for every act of giving,
there is a corresponding act of taking.
Such rules can be easily integrated with the rest of the network and the information about its use is incorporated
in the general-purpose inference system associated with ESNet.
Such inferencing is not possible in conventional semantic networks or even in clausal logic unless we write
explicit rule for it. The ESNet can also be considered as an abstract data structure for the implementation of a
proof procedure.
A characteristic feature of ESNet is indexing an argument or predicate where direct access is provided to all
atoms (both conditions and conclusions) containing the given term or predicate.
A frame may be defined as a data structure that is used for representing a stereotyped situation. It consists of
a collection of attributes or slots and associated values that describe some real-world entity.
With the increase in complexity of a problem, the representation should become more structured for more
beneficial use. Therefore, frames may be considered to represent the ways of organizing as well as packaging
knowledge in a more structured form.
Frames are somewhat similar to the concept of a class in the object-oriented paradigm, which also contains
attributes and methods. A frame consists of attributes or slots; slots are described with attribute-value pairs
<slot_name, value>. Slots are generally complex structures that have facets (or fillers) describing their
properties.
The value of a slot may be a primitive, such as a text string, constant or an integer, or it may be another frame.
So, slots may contain value, refer to other frames (relations) or contain methods.
Most of the frame systems allow multiple values for slots and some systems support procedural attachments
as well. These procedural attachments (methods) can be used for computing the slot value whenever required.
Frames may contain triggers for checking consistency or obtaining updates of other slots. Therefore, frames are
basically a machine-usable formalization of concepts or schemata. The general structure of a frame is given in
Table 7.7.
A frame may contain as many <slots-filler> pairs as required to describe an object. Fillers are also known as
facets. Each slot may contain one or more facets or fillers from the list given in Table 7.8.
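A frame with slots and facets can be pictured as nested dictionaries. The sketch below is only an illustration: the slot names, the facet names (value, default, type), and all values are hypothetical, since the frame tables referred to in the text are not reproduced here.

```python
# frame -> slots -> facets; every name and value below is hypothetical.
hospital = {
    "name": "hospital",
    "slots": {
        "sector": {"value": "healthcare"},               # actual value facet
        "beds": {"default": 100, "type": "integer"},     # default and type facets
        "wards": {"value": ["general", "emergency"]},    # a slot may hold several values
    },
}

def slot_names(frame):
    """Names of all slots of a frame, sorted for stable output."""
    return sorted(frame["slots"])

def facets_of(frame, slot_name):
    """Names of the facets stored for one slot, sorted for stable output."""
    return sorted(frame["slots"][slot_name])

print(slot_names(hospital))
print(facets_of(hospital, "beds"))
```

A slot value could equally be the name of another frame, which is how relations between frames would be stored in this layout.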
A class frame generally has certain default values which can be redefined at lower levels. If a class frame
possesses an actual value facet, then descendant frames cannot modify that value; this value then remains
unchanged for all subclasses and instances of that class.
The related frames are linked together into frame systems and are organized into hierarchies or networks of
frames. Each frame in the network is either a class frame or an instance frame. As an example, a frame has been
defined in Table 7.9 (Bedi, 1999).
Frames in a network of frames are connected using the links discussed below. Other required information may
be made available using slot-value pair concept.
• Ako: This link connects two class frames, one of which is a kind of the other class, e.g., the class
child_hospital is a kind of the class hospital. A class can define its own slots and also inherits slot-value
pairs from its super class. It gives a sub-typing hierarchy where all instances of class frame are instances
of super class frames. For example, all child hospitals are hospitals but all hospitals may not be child
hospitals. With the help of this link, knowledge representation becomes more structured and memory
efficient.
• Inst: This link connects a particular instance frame to a class frame, e.g., AIIMS is an instance of the class
frame hospital. An instance frame possesses the same structure as its class frame.
• Part_of: This link connects two class frames, one of which is contained in the other class, e.g., ward is
Part_of the class hospital.
Frame Descriptions
The following are descriptions of some frames of the network.
The graphical representation of a frame network for the class ‘hospital’ described above is shown in Fig. 7.19
A network of frames can be viewed as a three-dimensional figure: one dimension is concerned with the
description of a class along with all the classes contained in it and connected with Part_of link. The second
dimension in the hierarchy is of Ako links, while the third dimension contains the instances of all classes
connected with Inst links. It is important to note that each frame can have at most two links with the following
combinations:
• A frame can be an instance of some frame and a part of another frame. So, it can have
both Inst and Part_of links.
• A frame can be a-kind-of frame of one frame and a part of another frame. So, it can have
Ako and Part_of links.
• However, it is not possible to have a frame which is both an instance and a-kind-of some
class at the same time. So, a frame cannot have both Inst and Ako links.
Figure 7.20 shows the three-dimensional view of the hospital frame structure.
An instance of a hospital is a specific hospital, such as AIIMS, which will have instances of all the classes in the
network of hospital frame, that is, instances of the lab, doctor, and ward.
Further, if we wish to represent the three-dimensional structure of a network of frames in first-order predicate
logic [Chen, et al. 1986], it could be done as follows:
∀X (frame(X) ≡ ∃ Y1, Y2, ..., Yn ((Ako(X, Y1) ∨ Inst(X, Y1)) ∧ Part_of(X, Y2) ∧ slot3(X, Y3) ∧ slot4(X, Y4) ∧ ... ∧
slotn(X, Yn))),
where frame(X) means that X is a frame, slot1 ∈ {Inst, Ako}, slot2 = Part_of, and sloti(X, Yi) means that Yi is
the value of sloti of frame X.
Inheritance in Frames
Inheritance is defined as a mechanism which is utilized for passing knowledge from one frame to other frames
down through the taxonomy from general to specific frame. It is a good way of obtaining information that is
not stored in the place we first looked. It leads to cognitive economy, where information is only stored in one
place, while it can be retrieved from different parts of the network.
Demons allow us to invoke rules within the frames and are considered to be a powerful style of programming.
These can be attached to a slot along with other required facets. For example, an attached demon if_needed
may be used for monitoring the behaviour of the system. Another demon, if_added, may be used to validate
data when the data is added in the value slot. For example, if the value of the slot age is to be entered for a
patient of a child hospital, it will check whether it lies in the specified range.
Another use of attaching demons is that they allow dynamic information retrieval and storage. Here, one may
get the value directly from the slot or may have to dynamically calculate the value of the required slot on the
basis of other values. For example, the doctor frame can have a slot for tax that contains a demon if_needed for
calculating tax using the salary information available with each doctor instance.
There is no need to store the tax calculated for each doctor separately; instead, it can be calculated using the
demon stored in the doctor frame at the tax slot. Let us consider another example. If the date_of_birth slot for
a doctor is given and we are interested in knowing his/her age, the age could be easily calculated by a demon
from the date_of_birth value and the current date (which can be obtained from the system).
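The two kinds of demons can be sketched in Python as functions attached to slots. This is a minimal sketch under assumptions: the Frame class, the flat tax rate, the salary figure, the dates, and the age range 0 to 14 are all hypothetical values chosen for illustration.

```python
from datetime import date

TAX_PERCENT = 10  # hypothetical flat rate, for illustration only

def age_on(date_of_birth, today):
    """Whole years elapsed between date_of_birth and today."""
    birthday_pending = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - int(birthday_pending)

class Frame:
    def __init__(self, name):
        self.name = name
        self.values = {}      # slot name -> stored value
        self.if_needed = {}   # slot name -> demon computing the value on demand
        self.if_added = {}    # slot name -> demon validating a value before storage

    def get(self, slot):
        if slot in self.values:
            return self.values[slot]
        if slot in self.if_needed:
            return self.if_needed[slot](self)   # computed, never stored
        return None

    def put(self, slot, value):
        check = self.if_added.get(slot)
        if check is not None and not check(value):
            raise ValueError(f"{value!r} rejected for slot {slot!r} of {self.name}")
        self.values[slot] = value

doctor = Frame("doctor_1")
doctor.put("salary", 100000)
doctor.put("date_of_birth", date(1980, 6, 15))
doctor.if_needed["tax"] = lambda f: f.get("salary") * TAX_PERCENT // 100
doctor.if_needed["age"] = lambda f: age_on(f.get("date_of_birth"), date.today())

patient = Frame("patient_1")
patient.if_added["age"] = lambda v: 0 <= v <= 14   # hypothetical child-hospital range
patient.put("age", 9)

print(doctor.get("tax"), doctor.get("age"), patient.get("age"))
```

Neither tax nor age is stored for the doctor; both are recomputed from salary and date_of_birth each time they are requested, while the if_added demon rejects an out-of-range patient age before it reaches the slot.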
Instances of the frames can be declared as objects of the relevant classes. Slots can be defined as variables and
procedural attachment as methods. Inheritance is inbuilt in these languages.
Representation of Frames in Prolog
Frames can also be easily implemented using the Prolog language. Each frame is represented as a fact in Prolog
as frame(F_name, [slot1(facet(value), ...), slot2(facet(value), ...), ...]). Let us represent part of the network
of frame system for hospital in Prolog as follows.
An inheritance rule for frames might look like the one given below.
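The Prolog listing is not reproduced in this text. The same lookup idea can be sketched in Python rather than Prolog, assuming a frame either holds a slot itself or obtains it from the frame it is linked to by Inst or Ako. The FRAMES table, its slot values, and the parent_of and lookup helpers are hypothetical.

```python
# Hypothetical frames; slot values are illustrative, not taken from the text.
FRAMES = {
    "hospital": {"ako": None, "slots": {"sector": {"value": "healthcare"},
                                        "beds": {"default": 100}}},
    "child_hospital": {"ako": "hospital", "slots": {"beds": {"default": 40}}},
    "aiims": {"inst": "hospital", "slots": {"beds": {"value": 500}}},
}

def parent_of(name):
    """Follow the Inst link of an instance frame or the Ako link of a class frame."""
    frame = FRAMES[name]
    return frame.get("inst") or frame.get("ako")

def lookup(name, slot):
    """Walk upward from the frame; return the first value or default facet found."""
    current = name
    while current is not None:
        facets = FRAMES[current]["slots"].get(slot)
        if facets is not None:
            if "value" in facets:
                return facets["value"]
            if "default" in facets:
                return facets["default"]
        current = parent_of(current)
    return None

print(lookup("child_hospital", "beds"))    # default redefined at the lower level
print(lookup("child_hospital", "sector"))  # inherited from hospital
print(lookup("aiims", "sector"))           # inherited through the Inst link
```

This sketch takes the nearest frame that mentions the slot and does not enforce the rule that an actual value facet of a class cannot be modified by descendant frames; a fuller version would need that check when values are stored.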
Introduction
One of the goals of AI is to understand the concept of intelligence and develop intelligent computer programs.
An example of a computer program that exhibits intelligent behaviour is an expert system (ES).
Expert systems are meant to solve real-world problems which require specialized human expertise and provide
expert quality advice, diagnoses, and recommendations.
An ES is basically a software program or system that tries to perform tasks similar to human experts in a specific
domain of the problem. It incorporates the concepts and methods of symbolic inference, reasoning, and the
use of knowledge for making these inferences.
Expert systems represent their knowledge and expertise as data and rules within the computer system; these
rules and data can be called upon whenever needed to solve problems.
It should be noted that the term expert systems is often reserved for programs whose knowledge base contains
the knowledge provided by human experts in contrast to knowledge gathered from textbooks or non-experts.
• The system called MYCIN was developed using the expertise of best diagnosticians of
bacterial infections whose performance was found to be better than the average clinician.
• In another real-world case, at a chemical refinery, a knowledge engineer was assigned to produce
an ES to reproduce the expertise of an experienced retired employee to save the company
incurring the loss of the valued knowledge asset that the employee possessed.
The power of an ES lies in its store of knowledge regarding the problem domain; the more knowledge a system
is provided, the more competent it becomes.
Expert systems may or may not possess learning components; however, once they are fully developed, their
performance is evaluated by subjecting them to real-world problem-solving situations.
Representation of knowledge in a computer is not straightforward and requires special expertise. A knowledge
engineer handles the responsibility of extracting this knowledge and building the ES's knowledge base. This
process of gathering knowledge from a domain expert and codifying it according to the formalism is called
knowledge engineering. This phase is known as knowledge acquisition, which is a big area of research.
Generally, an initial prototype based on the information extracted by interviewing the expert is developed. This
prototype is then iteratively refined on the basis of the feedback received from the experts and potential users
of the ES.
The developed system should be able to explain its reasoning to its users and answer questions about the
solution process. Moreover, updating the system should just involve adding or deleting localized regions of
knowledge.
A simple ES primarily consists of a knowledge base and an inference engine.
To be more precise, the different interdependent and overlapping phases involved in building ES may be
categorized as follows:
• Conceptualization Phase: In this phase, the knowledge engineer and the domain expert decide
the concepts, relations, and control mechanisms needed to describe the problem-solving
method. At this stage, the issue of granularity, which refers to the level of detail
required in the knowledge, is also addressed.
• Formalization Phase: This phase involves expressing the key concepts and relations in
some framework supported by ES building tools. Formalized knowledge consists of data
structures, inference rules, control strategies, and languages required for
implementation.
• Implementation Phase: During this phase, formalized knowledge is converted into a
working computer program, initially called the prototype of the whole system.
• Testing Phase: This phase involves evaluating the performance and utility of the prototype
system and revising it, if required. The domain expert evaluates the prototype
system and provides feedback, which helps the knowledge engineer to revise it.
Knowledge Engineering
The whole process of building an ES is often referred to as knowledge engineering. It typically involves a special
form of interaction between ES builder, or the knowledge engineer, one or more domain experts, and potential
users.
Although there are different ways and methods of knowledge engineering, the basic approach remains the
same. The tasks and responsibilities of a knowledge engineer involve the following:
• Ensuring that the computer has all the knowledge needed to solve a problem.
• Choosing one or more forms to represent the required knowledge.
• Ensuring that the computer can use the knowledge efficiently by selecting some of the
reasoning methods.
Figure 8.1 shows the interaction between the knowledge engineer and the domain expert to produce an ES.
The main role of the knowledge engineer begins only once the problem of some domain for developing an ES
is decided. The job of the knowledge engineer involves close collaboration with the domain expert(s) and the
end user(s).
The knowledge engineer may or may not have any knowledge of the application domain initially; however,
he/she must become familiar with the problem domain by reading introductory texts or literature and talking
to the domain expert(s).
The next step of the process involves a more systematic interviewing of the expert. The knowledge engineer
will then extract general rules from the discussion and interview held with expert(s) and get them checked by
the expert(s) for correctness.
The engineer then translates the knowledge into a computer-usable language and designs an inference engine,
which is a reasoning structure that uses the knowledge appropriately.
He/she also determines the mechanism to integrate the use of uncertain knowledge in the reasoning process,
and should know the kinds of explanation that may be useful to the end user.
The domain knowledge, consisting of both formal, textbook knowledge and experiential knowledge (obtained
from the expert's experience), is entered into the program piece by piece.
In the initial stages, the knowledge engineer may encounter a number of problems: the inference engine
may not be suitable, the form of knowledge representation may not be appropriate for the kind of knowledge
needed for the task, or the expert may find some pieces of knowledge incorrect.
The basic development cycle should include the development of an initial prototype and iterative testing and
modification of that prototype by both experts (for checking the validity of the rules) and users (for checking
the performance of the system and explanations for the answers).
In order to develop the initial prototype, the knowledge engineer will have to take provisional decisions
regarding appropriate knowledge representation (e.g., rules, semantic net or frames, etc.) and inference
methods (e.g., forward chaining or backward chaining or both).
To test these basic design decisions, the first prototype may be so designed that it only solves a small part of
the overall problem.
During the initial years of ES development, there were unrealistic expectations about the potential benefits
of these systems. It is now recognized that expert systems built for very complicated problems are often
unsuccessful and may not fulfil expectations.
There are two ways of building an ES: it can be built either from scratch or by using ES shells or tools.
Currently, expert system shells are available and widely used; however, they are often used to solve fairly simple
problems.
Knowledge Representation
The heart of ES is the powerful corpus of knowledge that is accumulated during the system-building phase;
accumulation and codification of knowledge is one of the most important aspects of ES.
Expert knowledge of the problem domain is organized in such a way that this knowledge is separated from
other knowledge possessed by the system such as knowledge about user's interaction, general knowledge
about how to solve a problem, etc.
The collection of domain knowledge is called knowledge base, while the general problem-solving knowledge
may be called inference engine, user interface, etc.
The most common knowledge representation scheme for expert systems consists of production rules, or simply
rules; they are of the form if-then, where the if part contains a set of conditions in some logical combination.
The piece of knowledge represented by the production rule is relevant to the line of reasoning being developed
when if part of the rule is satisfied; consequently, the then part can be concluded. Expert systems in which
knowledge is represented in the form of rules are called rule-based systems.
The rules may have certain conclusions or may have some degree of uncertainty; statistical techniques (such as
probability) are used to handle such rules.
Another widely used representation in ES is called the unit (also known as frame, semantic net, etc.), which is
based upon a more passive view of knowledge. The unit is an assemblage of associated symbolic knowledge
about an entity to be represented.
Typically, a unit consists of a list of properties of an entity and associated values for those properties.
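The idea of a unit as a list of properties with associated values can be sketched in a few lines of Python. This is only an illustration: the entity, its property names, and the helper get_property are invented here and do not come from any particular ES tool.

```python
# A unit (frame) sketched as a plain dictionary: property name -> value.
# The entity and its properties are hypothetical, chosen only for illustration.
patient_unit = {
    "name": "patient_01",
    "is_a": "patient",
    "age": 34,
    "symptoms": ["fever", "cough"],
}


def get_property(unit, prop, default=None):
    """Return the value stored for a property, or a default if it is absent."""
    return unit.get(prop, default)


print(get_property(patient_unit, "age"))
print(get_property(patient_unit, "weight", "unknown"))
```

The unit is passive: it only stores what is known about the entity, and any reasoning over it is left to the inference engine.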
Expert system architecture may be effectively described with the help of a diagram as given in Fig. 8.2, which
contains important components of the system.
As shown in the figure, the user interacts with the system through a user interface which may use menus, natural
language, or any other style of interaction.
Then, an inference engine is used to reason with the expert knowledge as well as the data specific to the problem
being solved.
Case-specific data includes both data provided by the user and partial conclusions along with certainty measures
based on this data.
Most expert systems possess an explanation subsystem, which allows the program to explain its
reasoning to the user.
Some systems also have a knowledge acquisition module that helps the expert or knowledge engineer to easily
update and check the knowledge base.
Knowledge Base
Knowledge base of an ES consists of knowledge regarding problem domain in the form of static and dynamic
databases. Static knowledge consists of rules and facts, or any other form of knowledge representation which
may be compiled as a part of the system and does not change during the execution of the system.
On the other hand, dynamic knowledge consists of facts related to a particular consultation of the system
collected by asking various questions to the user who is consulting the ES.
At the beginning of the consultation, the dynamic knowledge base (often called working memory) is empty. As
the consultation progresses, the dynamic knowledge base (in the form of facts only) grows and is used in decision
making along with static knowledge. Working memory is cleared at the end of the consultation.
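The split between static knowledge and working memory can be sketched as follows. The class name KnowledgeBase and its methods are assumptions made for this illustration; real ES shells organize this differently.

```python
class KnowledgeBase:
    """Minimal sketch: static rules and facts plus a per-consultation working memory."""

    def __init__(self, rules, facts=()):
        # Static knowledge: fixed for the lifetime of the system.
        self.static_rules = tuple(rules)
        self.static_facts = frozenset(facts)
        # Dynamic knowledge: empty at the start of every consultation.
        self.working_memory = set()

    def tell(self, fact):
        """Record a fact obtained from the user during the consultation."""
        self.working_memory.add(fact)

    def known_facts(self):
        """Facts usable for decision making: static and dynamic together."""
        return self.static_facts | self.working_memory

    def end_consultation(self):
        """Discard everything learned during this consultation."""
        self.working_memory.clear()


kb = KnowledgeBase(rules=[({"fever", "cough", "running_nose"}, "measles")])
kb.tell("fever")
print(sorted(kb.known_facts()))
kb.end_consultation()
print(sorted(kb.known_facts()))
```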
Inference Engine
An inference engine developed for an ES consists of inference mechanism as well as control strategy. The term
inference refers to the process of searching through knowledge base and deriving new knowledge.
It involves formal reasoning by matching and unification, similar to that performed by a human expert to solve
problems in a specific area of knowledge, using the modus ponens rule.
An inference rule may be defined as a statement that has two parts, an if clause and a then clause. This rule
enables expert systems to find solutions to diagnostic and prescriptive problems.
The knowledge base of an ES comprises many such inference rules. They are entered as a separate file of rules and
the inference engine uses them together to draw conclusions. Each rule is independent of the others and may be
deleted or added without affecting other rules.
Inference mechanism uses control strategy that determines the order in which rules are applied. There are
mainly two types of reasoning mechanisms that use inference rules: backward chaining and forward chaining.
The process of forward chaining starts with the available data and uses inference rules to conclude more data
until a desired goal is achieved. An inference engine uses facts from static and dynamic knowledge bases and
searches through the rules until it finds one in which the if clause is known to be true. A rule is then said to
succeed.
It then concludes the then clause and adds this information to the dynamic knowledge base. The inference
engine continues to repeat the process until a goal is reached. Since the data available determines which
inference rules are used, this method is also known as data driven method.
Rule 1 If symptoms are headache, sneezing, running_nose and sore throat, then patient has cold.
Rule 2 If symptoms are fever, cough and running_nose, then patient has measles.
Facts are generated in working memory by asking questions to the user whether he has fever, running_nose,
cough, etc.
Thus, in forward chaining, we start with the facts given by the user and try to find an appropriate rule whose if
part is satisfied and subsequently the then part is concluded.
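Forward chaining over Rule 1 and Rule 2 can be sketched as below. The representation of a rule as a (conditions, conclusion) pair and the function name forward_chain are assumptions for this illustration. The sketch also runs until no rule adds anything new instead of stopping at a named goal.

```python
# Rule 1 and Rule 2 from the text, each as (set of if-conditions, then-conclusion).
RULES = [
    ({"headache", "sneezing", "running_nose", "sore_throat"}, "cold"),
    ({"fever", "cough", "running_nose"}, "measles"),
]


def forward_chain(rules, initial_facts):
    """Data-driven inference: fire rules whose if part holds until nothing changes."""
    facts = set(initial_facts)  # working memory, seeded with the user's answers
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule succeeds when every condition is already a known fact.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)  # conclude the then part
                changed = True
    return facts


derived = forward_chain(RULES, {"fever", "cough", "running_nose"})
print("measles" in derived, "cold" in derived)
```

Because only the facts supplied by the user decide which rules fire, the same function works unchanged for any rule list of this shape.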
Backward chaining starts with a list of goals and works backwards to see if there is data which will allow it to
conclude any of these goals. An inference engine using backward chaining would search the inference rules
until it finds one whose then part matches a desired goal. If the if part of that inference rule is not known to be
true, then it is added to the list of goals.
Consider the same example discussed above. In order to satisfy a goal called cold, the inference engine will
select a rule with conclusion as cold and will try to find the facts in the if part of the rule whether the user has
headache, sneezing, running nose, and sore throat.
If yes, then cold is established; otherwise, it tries another rule for the goal, if one exists. If none of the
rules with the goal cold can be satisfied, then other goals such as measles will be tried.
Using rule 2, if the symptoms of the user are fever, running nose, and cough, then measles is concluded. The
inference engine using backward chaining tries to prove conclusion of the rules one by one till it succeeds or all
the rules are exhausted. This method is also known as goal-driven method.
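The goal-driven procedure described above can be sketched as a short recursive function. The names backward_chain and diagnose, the rule representation, and the pending argument (a guard against circular rules) are assumptions made for this illustration. A real system would also ask the user about unknown facts instead of treating them as false.

```python
# Rule 1 and Rule 2 from the text, each as (set of if-conditions, then-conclusion).
RULES = [
    ({"headache", "sneezing", "running_nose", "sore_throat"}, "cold"),
    ({"fever", "cough", "running_nose"}, "measles"),
]


def backward_chain(goal, rules, facts, pending=frozenset()):
    """Goal-driven inference: try to prove `goal` from the facts and the rules."""
    if goal in facts:
        return True
    if goal in pending:  # already trying to prove this goal, so avoid looping
        return False
    pending = pending | {goal}
    for conditions, conclusion in rules:
        # Select a rule whose then part matches the goal, then treat each
        # condition in its if part as a sub-goal.
        if conclusion == goal and all(
            backward_chain(c, rules, facts, pending) for c in conditions
        ):
            return True
    return False


def diagnose(goals, rules, facts):
    """Try the goals one by one and return the first that can be established."""
    for goal in goals:
        if backward_chain(goal, rules, facts):
            return goal
    return None


print(diagnose(["cold", "measles"], RULES, {"fever", "cough", "running_nose"}))
```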
Knowledge Acquisition
Knowledge present in an ES may be obtained from many sources such as textbooks, reports, case studies,
empirical data, and domain experts, who are a prominent source of knowledge.
A knowledge acquisition module allows the system to acquire more knowledge regarding the problem domain
from experts. Interaction between the knowledge engineer and the domain expert involves a prolonged series of
intense, systematic interviews or the use of a questionnaire (carefully designed to elicit expert knowledge).
The knowledge engineer working on a system should be able to extract expert methods, procedures, strategies,
and thumb rules for solving the problem at hand.
Later, the knowledge can be updated (by insertion, deletion, or modification) using the knowledge acquisition
module of the system, which provides these facilities to the expert.
Case History
Case history stores the files created by the inference engine using the dynamic database (created during
different consultations of the system) and is used by the learning module to enrich its knowledge base.
Different cases with their solutions are stored in a case base, and these cases are used for solving problems
using case-based reasoning (CBR).
User Interfaces
User interface of an ES allows user to communicate with the system in an interactive manner and helps the
system in creating working knowledge for the problem that has to be solved.
The function of the interface is to present questions and information to the user and supply the responses of
user to the inference engine.
The end-user usually sees an ES through an interactive dialogue module. We observe from the dialogue module
given in Table 8.1 how the system leads the user through a set of questions, whose purpose is to determine a
set of symptoms.
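A question-and-answer dialogue of this kind can be sketched as a loop that fills working memory. The function name consult, the wording of the questions, and the ask parameter are assumptions for this illustration and are not taken from Table 8.1.

```python
def consult(symptoms_to_ask, ask):
    """Ask about each symptom and collect the confirmed ones in working memory.

    `ask` is any callable that takes a question string and returns the reply,
    for example the built-in `input` in an interactive session.
    """
    working_memory = set()
    for symptom in symptoms_to_ask:
        answer = ask(f"Do you have {symptom}? (y/n) ")
        if answer.strip().lower().startswith("y"):
            working_memory.add(symptom)
    return working_memory


# Scripted replies stand in for a real user so the sketch runs unattended.
replies = iter(["y", "n", "yes"])
facts = consult(["fever", "headache", "cough"], lambda question: next(replies))
print(sorted(facts))
```

In an interactive session, consult(["fever", "headache", "cough"], input) would put the same questions to the user at the keyboard.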
Explanation Module
Most expert systems have explanation facilities that enable users to query the system about why it asked a
particular question and how it reached a particular conclusion; these modules are called Why and How.
The How sub-module tells the user the process through which the system has reached a particular
solution, while the Why sub-module explains the reasoning behind a question that the system asks.
These questions are answered by referring to the system goals, the rules being used, and any existing problem
data.
The knowledge structure of a rule-based ES is the rule: a set of antecedent conditions which, if true, allow the
assertion of a consequent. To illustrate the use of explanation facilities, Table 8.2 shows explanations for
why and how questions.
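One way to support How and Why queries is to record which rule produced each conclusion. The sketch below assumes named rules and invents the functions infer_with_trace, how, and why. The wording of the explanations is illustrative and is not the format used in Table 8.2.

```python
# Named rules: name -> (set of antecedent conditions, consequent).
RULES = {
    "Rule 1": ({"headache", "sneezing", "running_nose", "sore_throat"}, "cold"),
    "Rule 2": ({"fever", "cough", "running_nose"}, "measles"),
}


def infer_with_trace(rules, initial_facts):
    """Forward chaining that also records which rule produced each conclusion."""
    facts = set(initial_facts)
    trace = {}
    changed = True
    while changed:
        changed = False
        for name, (conditions, conclusion) in rules.items():
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace[conclusion] = name
                changed = True
    return facts, trace


def how(fact, rules, trace):
    """Explain how a fact became known."""
    if fact not in trace:
        return f"{fact} was given by the user."
    name = trace[fact]
    conditions, _ = rules[name]
    return f"{fact} was concluded by {name} because {', '.join(sorted(conditions))} are known."


def why(asked_fact, rules):
    """Explain why the system asks about a fact: which rules need it."""
    uses = [
        f"{name} (to conclude {conclusion})"
        for name, (conditions, conclusion) in rules.items()
        if asked_fact in conditions
    ]
    if not uses:
        return f"{asked_fact} is not used by any rule."
    return f"{asked_fact} is asked because it is needed by " + "; ".join(uses) + "."


facts, trace = infer_with_trace(RULES, {"fever", "cough", "running_nose"})
print(how("measles", RULES, trace))
print(why("running_nose", RULES))
```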
Special Interfaces
Special interfaces may be used in ES for performing specialized activities, such as handling uncertainty in
knowledge. These interfaces form a major area of expert system research which involves methods for reasoning
with uncertain data and uncertain knowledge.
A point to be kept in mind regarding knowledge is that it is generally incomplete and uncertain. To deal with
uncertain knowledge, a confidence factor or a weight may be associated with a rule.
The set of methods for using uncertain knowledge in combination with uncertain data in the reasoning process
is called reasoning with uncertainty.
Probability theory is the oldest method used to determine these certainties. Another important subclass of
methods for reasoning with uncertainty is called fuzzy logic and the systems that use them are known as fuzzy
systems.
In traditional applications, problem expertise is encoded in program as well as in the form of data structures.
On the other hand, in the ES approach, all problem-related expertise is encoded in data structures only and not
in the programs.
Traditional computer programs perform tasks using conventional decision-making logic, which is often
embedded as a part of the code in the form of a basic algorithm containing little knowledge for solving that
specific problem. Hence, if the knowledge changes, the program has to be rebuilt.
However, in expert systems, small fragments of human experience are collected into a knowledge base, which
is used to reason through a problem. A different problem, within the domain of the knowledge base, can be
solved using the same program without having to reprogram the system.
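The contrast can be sketched with two small functions, one with the expertise written into the code and one that receives it as data. Both function names are hypothetical, and the rules reuse the cold and measles example from the inference engine discussion.

```python
def diagnose_hardcoded(symptoms):
    """Traditional style: the expertise is written into the code itself."""
    if {"fever", "cough", "running_nose"} <= symptoms:
        return "measles"
    return None


def diagnose_from_rules(rules, symptoms):
    """ES style: the expertise is passed in as data, so this code never changes."""
    for conditions, conclusion in rules:
        if conditions <= symptoms:
            return conclusion
    return None


rules = [({"fever", "cough", "running_nose"}, "measles")]
# Adding knowledge is a change to the data only; no function is edited.
rules.append(({"headache", "sneezing", "running_nose", "sore_throat"}, "cold"))

cold_symptoms = {"headache", "sneezing", "running_nose", "sore_throat"}
print(diagnose_hardcoded(cold_symptoms))
print(diagnose_from_rules(rules, cold_symptoms))
```

To recognize a cold, the hard-coded version would have to be edited and rebuilt, whereas the rule-driven version only needed one more entry in its rule list.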
Another advantage of expert systems over traditional systems is that they allow the use of confidences or
certainty factors. This is similar to human reasoning where one cannot always conclude things with 100%
confidence.
For example, consider the statement "If weather is humid, then it might probably rain." The use of words such as
might and probably indicates that there is some uncertainty involved in the statement. This type of
reasoning can be imitated by using numeric values called confidences in ES.
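The text does not say how confidences are computed. As one possible scheme, the sketch below uses the certainty-factor arithmetic commonly attributed to MYCIN, restricted to positive values between 0 and 1. The function names and the numbers are assumptions chosen for illustration.

```python
def conclude(premise_cfs, rule_cf):
    """Confidence of a conclusion whose premises are ANDed together.

    The weakest premise limits the conclusion, scaled by the rule's own confidence.
    """
    return min(premise_cfs) * rule_cf


def combine(cf1, cf2):
    """Combine two positive confidences that support the same conclusion."""
    return cf1 + cf2 * (1 - cf1)


# "If weather is humid, then it might probably rain": rule confidence 0.7 (assumed),
# and the observation "weather is humid" is itself held with confidence 0.8 (assumed).
rain_from_humidity = conclude([0.8], 0.7)

# A second, independent rule also suggests rain with confidence 0.5 (assumed).
rain_overall = combine(rain_from_humidity, 0.5)
print(rain_from_humidity, rain_overall)
```

With this scheme, two pieces of evidence for the same conclusion reinforce each other, but the combined confidence never exceeds 1.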
Further, conventional programs are designed to always produce correct answers, whereas expert systems are
designed to behave like human experts and may sometimes produce incorrect results.
• Expertise: An ES should exhibit expert performance, have a high level of skill, and possess
adequate robustness. The high-level expertise and skill of an ES aids in problem solving
and makes the system cost effective.
• Self knowledge: A system should be able to explain and examine its own reasoning.
• Learning capability: A system should learn from its mistakes and mature as it grows.
Flexibility provided by the ES helps it grow incrementally.
• Predictive modelling power: This is one of the important features of ES. The system can
act as an information processing model of problem solving. It can explain how a new
situation led to a change, which helps users to evaluate the effect of new facts and
understand their relationship to the solution.
• Does the system make decisions that experts generally agree to?
• Are the inference rules correct and complete?
• Does the control strategy allow the system to consider items in the natural order that the
expert prefers?
• Are relevant questions asked to the user in proper order (otherwise it will be an irritating
process)?
• Are the explanations given by the ES adequate for describing how and why conclusions were reached?
One has to justify the need for developing an ES as it involves a lot of time and money. In order to avoid costly
and embarrassing failures, one should evaluate whether a problem is suitable for an ES solution using the
following guidelines:
• Specialized knowledge problems: If rare, specialized knowledge is involved in
some special domain (say, oil exploration or medicine), then it is worth developing an
ES to solve problems pertaining to the domain, as human experts are scarce and
often unavailable. This implies that we typically need to develop ES for problems that require
highly specialized expertise, which is likely to be lost due to personnel changes or
retirement.
• High payoff: An ES may be developed if the task to be performed has a very high payoff.
The company may need similar expertise at a large number of different physical
locations.
• The type of problem: The problem for which an ES is to be developed must be structured
and must not require common-sense knowledge, as common-sense knowledge is hard
to capture and represent. It is easier to deal with ES developed for highly technical fields.
Advantages
Disadvantages
The basic hypothesis of AI is that intelligent behaviour can be described as symbol manipulation and can be
modelled with the symbol processing capabilities of the computer.
Special programming languages were invented in the late 1950s and they facilitate symbol manipulation. The
most prominent of them is called LISP (LISt Processing), which is based on lambda calculus.
Another AI programming language, known as Prolog (PROgramming in LOGic), was invented in the early 1970s.
Prolog is based on first-order predicate calculus.
A variety of logic-based programming languages have been developed since the development of Prolog causing
the term prolog to become generic.
Nowadays, object-oriented languages (C++, Java, etc.) and even C are used for developing ES.