AI(FinalNotes)
intelligence
o An intelligent agent needs knowledge about the real world for taking decisions
and reasoning to act efficiently.
o Knowledge-based agents are agents that can maintain an internal state of
knowledge, reason over that knowledge, update it after observations, and take
actions. These agents represent the world with some formal representation and
act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
The above diagram represents a generalized architecture for a knowledge-based agent.
The knowledge-based agent (KBA) takes input from the environment by perceiving it.
The input is passed to the inference engine of the agent, which communicates with the
KB to decide on an action according to the knowledge stored in the KB. The learning
element of the KBA regularly updates the KB by learning new knowledge.
Inference system
Inference means deriving new sentences from old ones. The inference system allows us to
add a new sentence to the knowledge base. A sentence is a proposition about the world.
The inference system applies logical rules to the KB to deduce new information.
The inference system generates new facts so that an agent can update the KB. An inference
system works mainly in two modes, which are given as:
o Forward chaining
o Backward chaining
1. TELL: This operation tells the knowledge base what it perceives from the
environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
function KB-AGENT(percept):
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action = ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t = t + 1
    return action
The knowledge-based agent takes a percept as input and returns an action as output. The
agent maintains the knowledge base, KB, which initially has some background knowledge of
the real world. It also has a counter to indicate the time for the whole process, and this
counter is initialized to zero.
Each time the function is called, it performs its three operations:
The MAKE-ACTION-QUERY generates a sentence to ask which action should be done at the
current time.
MAKE-ACTION-SENTENCE generates a sentence which asserts that the chosen action was
executed.
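The KB-AGENT loop can be sketched in Python as follows. This is a minimal illustration, not a full agent: the `KnowledgeBase` class, its tuple-based sentences, and the toy policy inside `ask` (stop on an obstacle, otherwise move forward) are assumptions made for the example and do not come from the notes.

```python
class KnowledgeBase:
    """Toy knowledge base: a list of (kind, value, time) sentences."""

    def __init__(self):
        self.sentences = []

    def tell(self, sentence):
        # TELL: add a sentence to the knowledge base.
        self.sentences.append(sentence)

    def ask(self, query):
        # ASK: hypothetical policy that reacts to the percept stored for time t.
        _, t = query
        for kind, value, time in reversed(self.sentences):
            if kind == "percept" and time == t:
                return "stop" if value == "obstacle" else "forward"
        return "wait"


class KBAgent:
    def __init__(self, kb):
        self.kb = kb
        self.t = 0  # time counter, initially 0

    def __call__(self, percept):
        self.kb.tell(("percept", percept, self.t))   # MAKE-PERCEPT-SENTENCE
        action = self.kb.ask(("action?", self.t))    # MAKE-ACTION-QUERY
        self.kb.tell(("action", action, self.t))     # MAKE-ACTION-SENTENCE
        self.t += 1
        return action


agent = KBAgent(KnowledgeBase())
first = agent("obstacle")
second = agent("clear")
```

Each call stores the percept, asks for an action, records the chosen action, and advances the counter, mirroring the three operations described above.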
Various levels of knowledge-based agent:
A knowledge-based agent can be viewed at different levels which are given below:
1. Knowledge level
The knowledge level is the first level of a knowledge-based agent. At this level, we need
to specify what the agent knows and what the agent's goals are. With these specifications,
we can fix its behavior. For example, suppose an automated taxi agent needs to go from
station A to station B, and it knows the way from A to B; this belongs to the knowledge
level.
2. Logical level:
At this level, we consider how the knowledge is represented and stored. Sentences are
encoded into a logic, that is, knowledge is encoded as logical sentences. At the logical
level, we can expect the automated taxi agent to reach destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level,
the agent performs actions according to the logical and knowledge levels. At this level, the
automated taxi agent actually implements its knowledge and logic so that it can reach the
destination.
However, in the real world, a successful agent can be built by combining both declarative
and procedural approaches, and declarative knowledge can often be compiled into more
efficient procedural code.
representation and reasoning. Hence we can describe Knowledge representation as
following:
What to Represent:
Following are the kinds of knowledge which need to be represented in AI systems:
o Object: All the facts about objects in our world domain. E.g., guitars contain
strings, trumpets are brass instruments.
o Events: Events are the actions which occur in our world.
o Performance: It describes behavior which involves knowledge about how to do
things.
o Meta-knowledge: It is knowledge about what we know.
o Facts: Facts are the truths about the real world and what we represent.
o Knowledge-Base: The central component of a knowledge-based agent is the
knowledge base, represented as KB. The knowledge base is a group of
sentences (here, "sentence" is a technical term and is not identical to a sentence
in the English language).
Types of knowledge
Following are the various types of knowledge:
1. Declarative Knowledge:
2. Procedural Knowledge
3. Meta-knowledge:
4. Heuristic knowledge:
o Heuristic knowledge represents the knowledge of experts in a field or subject.
o Heuristic knowledge consists of rules of thumb based on previous experience and
awareness of approaches, which tend to work well but are not guaranteed.
5. Structural knowledge:
Suppose you meet a person who is speaking a language you don't know; how will you be
able to act on what they say? The same applies to the intelligent behavior of agents.
As we can see in the diagram below, there is one decision maker which acts by sensing the
environment and using knowledge. But if the knowledge part is not present, it cannot
display intelligent behavior.
AI knowledge cycle:
An Artificial intelligence system has the following components for displaying intelligent
behavior:
o Perception
o Learning
o Knowledge Representation and Reasoning
o Planning
o Execution
The above diagram shows how an AI system can interact with the real world and which
components help it to show intelligence. An AI system has a Perception component by which
it retrieves information from its environment. It can be visual, audio, or another form of
sensory input. The Learning component is responsible for learning from data captured by
the Perception component. In the complete cycle, the main components are Knowledge
Representation and Reasoning. These two components are involved in showing human-like
intelligence in machines. They are independent of each other but also coupled together.
Planning and Execution depend on the analysis of Knowledge Representation and Reasoning.
o This approach of knowledge representation is famous in database systems where the
relationship between different entities is represented.
o This approach has little opportunity for inference.
Player1 65 23
Player2 58 18
Player3 75 24
2. Inheritable knowledge:
o In the inheritable knowledge approach, all data must be stored in a hierarchy of
classes.
o All classes should be arranged in a generalized form or a hierarchical manner.
o In this approach, we apply the inheritance property.
o Elements inherit values from other members of a class.
o This approach contains inheritable knowledge which shows a relation between
an instance and a class, and it is called the instance relation.
o Every individual frame can represent a collection of attributes and their values.
o In this approach, objects and values are represented in boxed nodes.
o We use arrows which point from objects to their values.
o Example:
3. Inferential knowledge:
o Inferential knowledge approach represents knowledge in the form of formal logic.
o This approach can be used to derive more facts.
o It guarantees correctness.
o Example: Let's suppose there are two statements:
1. Marcus is a man
2. All men are mortal
Then they can be represented as:
man(Marcus)
∀x man(x) → mortal(x)
4. Procedural knowledge:
o Procedural knowledge approach uses small programs and code which describe how
to do specific things, and how to proceed.
o In this approach, one important rule is used, which is the If-Then rule.
o For this knowledge, we can use various programming languages such as LISP
and Prolog.
o We can easily represent heuristic or domain-specific knowledge using this approach.
o However, we cannot necessarily represent all cases in this approach.
Requirements for knowledge Representation system:
A good knowledge representation system must possess the following properties.
1. Representational Accuracy:
The KR system should have the ability to represent all kinds of required knowledge.
2. Inferential Adequacy:
The KR system should have the ability to manipulate the representational structures
to produce new knowledge corresponding to the existing structure.
3. Inferential Efficiency:
The ability to direct the inferential knowledge mechanism into the most productive
directions by storing appropriate guides.
4. Acquisitional Efficiency:
The ability to acquire new knowledge easily using automatic methods.
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with concrete rules which deals with propositions
and has no ambiguity in representation. Logical representation means drawing a conclusion
based on various conditions. This representation lays down some important communication
rules. It consists of precisely defined syntax and semantics which support sound
inference. Each sentence can be translated into logic using syntax and semantics.
Syntax:
o Syntax is the set of rules which decide how we can construct legal sentences in the
logic.
o It determines which symbols we can use in knowledge representation.
o It determines how to write those symbols.
Semantics:
o Semantics are the rules by which we can interpret a sentence in the logic.
o Semantics also involves assigning a meaning to each sentence.
a. Propositional Logics
b. Predicate logics
Note: We will discuss Propositional Logic and Predicate Logic in later chapters.
Note: Do not confuse logical representation with logical reasoning: logical
representation is a representation language, while reasoning is a process of thinking
logically.
Example: Following are some statements which we need to represent in the form of nodes
and arcs.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.
In the above diagram, we have represented the different types of knowledge in the form of
nodes and arcs. Each object is connected with another object by some relation.
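The Jerry statements above can be stored as labeled arcs between nodes. The sketch below is one possible encoding; the relation names (`is-a`, `owned-by`, `has-color`) and the helper `is_a` are assumptions chosen for illustration, since the original diagram is not reproduced here.

```python
# Each arc of the network is a (subject, relation, object) triple.
ARCS = [
    ("Jerry", "is-a", "Cat"),
    ("Jerry", "is-a", "Mammal"),
    ("Jerry", "owned-by", "Priya"),
    ("Jerry", "has-color", "Brown"),
    ("Mammal", "is-a", "Animal"),
]


def is_a(node, category, arcs=ARCS):
    """Follow is-a arcs transitively to test category membership."""
    frontier, seen = [node], set()
    while frontier:
        current = frontier.pop()
        if current == category:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Expand only along is-a arcs that leave the current node.
        frontier.extend(o for s, r, o in arcs if s == current and r == "is-a")
    return False


jerry_is_animal = is_a("Jerry", "Animal")   # follows Jerry -> Mammal -> Animal
priya_is_animal = is_a("Priya", "Animal")   # no is-a arcs leave Priya
```

The fact "Jerry is an animal" is not stored directly; it is obtained by walking the arcs, which is the kind of inference a semantic network supports.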
3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes and their
values to describe an entity in the world. Frames are an AI data structure which divides
knowledge into substructures by representing stereotyped situations. A frame consists of a
collection of slots and slot values. These slots may be of any type and size. Slots have
names and values, which are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features of frames
which enable us to put constraints on the frames. Example: IF-NEEDED facets are called
when the data of a particular slot is needed. A frame may consist of any number of slots,
a slot may include any number of facets, and a facet may have any number of values. A
frame is also known as slot-filler knowledge representation in artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day classes
and objects. A single frame is not very useful. A frame system consists of a collection of
frames which are connected. In a frame, knowledge about an object or event can be
stored together in the knowledge base. Frames are widely used in various applications,
including natural language processing and machine vision.
Example: 1
Let's take an example of a frame for a book
Slots        Fillers
Year         1996
Page         1152
Example 2:
Let's suppose we are taking an entity, Peter. Peter is an engineer by profession, his
age is 25, he lives in the city of London, and the country is England. Following is the
frame representation for this:
Slots        Fillers
Name         Peter
Profession   Engineer
Age          25
Weight       78
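A frame maps naturally onto a dictionary of slots and fillers. The sketch below is illustrative only: the `is_a` slot name, the parent frame `Person`, and its `Species` slot are hypothetical additions used to show how connected frames can share values.

```python
# Frames as dictionaries: slot name -> filler.
# "is_a" is an assumed slot name that links a frame to its parent frame.
FRAMES = {
    "Person": {"Species": "human"},                      # hypothetical parent frame
    "Peter": {"is_a": "Person", "Name": "Peter", "Age": 25, "Weight": 78},
}


def get_slot(frame_name, slot, frames=FRAMES):
    """Look up a slot, falling back to parent frames when it is missing."""
    frame = frames[frame_name]
    while True:
        if slot in frame:
            return frame[slot]
        parent = frame.get("is_a")
        if parent is None:
            return None          # slot not found anywhere in the chain
        frame = frames[parent]


age = get_slot("Peter", "Age")            # stored directly on the frame
species = get_slot("Peter", "Species")    # inherited from the parent frame
salary = get_slot("Peter", "Salary")      # not defined anywhere
```

This also shows why a single frame is of limited use: the inherited value only becomes available once frames are linked into a system.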
4. Production Rules
A production rule system consists of (condition, action) pairs which mean, "If condition
then action". It has mainly three parts:
In a production rule system, the agent checks for the condition, and if the condition
holds, the production rule fires and the corresponding action is carried out. The condition
part of the rule determines which rule may be applied to a problem, and the action part
carries out the associated problem-solving steps. This complete process is called a
recognize-act cycle.
The working memory contains the description of the current state of problem-solving, and
a rule can write knowledge to the working memory. This knowledge may match and fire other
rules.
If a new situation (state) is generated, multiple production rules may be eligible to fire
together; this set of rules is called the conflict set. In this situation, the agent needs
to select a rule from this set, and this is called conflict resolution.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on the bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
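The bus rules above can be written as (condition set, action) pairs and matched against a working memory. This is a small sketch under assumptions: conditions are plain strings, and the conflict-resolution strategy (prefer the rule with the most conditions) is one common choice picked for illustration, not something the notes prescribe.

```python
# Each production rule is (set of conditions, action).
RULES = [
    ({"at bus stop", "bus arrives"}, "get into the bus"),
    ({"on the bus", "paid", "empty seat"}, "sit down"),
    ({"on the bus", "unpaid"}, "pay charges"),
    ({"bus arrives at destination"}, "get down from the bus"),
]


def conflict_set(working_memory, rules=RULES):
    """Return every rule whose conditions all hold in working memory."""
    return [(cond, action) for cond, action in rules if cond <= working_memory]


def select_rule(candidates):
    """Conflict resolution (assumed strategy): prefer the most specific rule."""
    if not candidates:
        return None
    return max(candidates, key=lambda rule: len(rule[0]))


memory = {"on the bus", "paid", "empty seat", "bus arrives at destination"}
eligible = conflict_set(memory)          # two rules match this state
chosen = select_rule(eligible)           # the three-condition rule wins
```

One pass of matching followed by selecting and firing a rule corresponds to a single recognize-act cycle.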
Example:
a) It is Sunday.
b) The Sun rises from West (False proposition)
c) 3 + 3 = 7 (False proposition)
d) 5 is a prime number.
o Propositional logic is also called Boolean logic as it works on 0 and 1.
o In propositional logic, we use symbolic variables to represent the logic, and we can
use any symbol for representing a proposition, such as A, B, C, P, Q, R, etc.
o A proposition can be either true or false, but it cannot be both.
o Propositional logic consists of proposition symbols and logical connectives.
o These connectives are also called logical operators.
o The propositions and connectives are the basic elements of propositional logic.
o A connective can be described as a logical operator which connects two sentences.
o A propositional formula which is always true is called a tautology, and it is also
called a valid sentence.
o A propositional formula which is always false is called a contradiction.
o A propositional formula which is true under some assignments and false under others
is called a contingency.
o Statements which are questions, commands, or opinions, such as "Where is Rohini",
"How are you", and "What is your name", are not propositions.
a. Atomic Propositions
b. Compound propositions
Example:
Example:
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence
logically. We can create compound propositions with the help of logical connectives. There
are mainly five connectives, which are given as follows:
Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible
scenarios. We can combine all the possible combinations with logical connectives, and the
representation of these combinations in a tabular format is called a truth table. The
following are the truth tables for all logical connectives:
Truth table with three propositions:
We can build a proposition composed of three propositions P, Q, and R. Its truth table is
made up of 2^3 = 8 rows (tuples), as we have taken three proposition symbols.
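A truth table over n proposition symbols can be enumerated mechanically. The following sketch uses `itertools.product` to generate all assignments; the example formula `(P ∧ Q) ∨ R` is an arbitrary choice for illustration.

```python
from itertools import product


def truth_table(symbols, formula):
    """Return a list of (assignment, value) rows for a Boolean formula."""
    rows = []
    for values in product([True, False], repeat=len(symbols)):
        env = dict(zip(symbols, values))
        rows.append((values, formula(**env)))
    return rows


# Example formula chosen for illustration: (P and Q) or R
table = truth_table(["P", "Q", "R"], lambda P, Q, R: (P and Q) or R)

for values, result in table:
    print(values, result)
```

With three symbols the loop visits 2 × 2 × 2 = 8 assignments, one per row of the table.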
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors or
logical operators. This order should be followed while evaluating a propositional problem.
Following is the list of the precedence order for operators:
Precedence Operators
Note: For better understanding, use parentheses to make sure of the correct
interpretation. For example, ¬R ∨ Q is interpreted as (¬R) ∨ Q.
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are said
to be logically equivalent if and only if their columns in the truth table are identical to
each other.
Let's take two propositions A and B; for logical equivalence, we can write A⇔B. In the
truth table below we can see that the columns for ¬A ∨ B and A → B are identical, hence
¬A ∨ B is equivalent to A → B.
Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o De Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
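Logical equivalences such as the properties listed above can be checked by comparing truth-table columns row by row. In this sketch the helper names `equivalent` and `implies` are assumptions, and implication is defined from its truth table rather than from ¬A ∨ B so that the comparison is not circular.

```python
from itertools import product

# Implication defined directly by its truth table.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}


def implies(a, b):
    return IMPLIES[(a, b)]


def equivalent(f, g, n):
    """True if f and g agree on every assignment of n Boolean variables."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))


impl_as_or = equivalent(implies, lambda a, b: (not a) or b, 2)
de_morgan = equivalent(lambda p, q: not (p and q),
                       lambda p, q: (not p) or (not q), 2)
distributive = equivalent(lambda p, q, r: p and (q or r),
                          lambda p, q, r: (p and q) or (p and r), 3)
converse = equivalent(implies, lambda a, b: implies(b, a), 2)
```

The last check is a negative example: an implication is not equivalent to its converse, so the columns differ in at least one row.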
Rules of Inference in Artificial intelligence
Inference:
In artificial intelligence, we need intelligent computers which can derive new logic from
old logic or from evidence; generating conclusions from evidence and facts is termed
inference.
Inference rules:
Inference rules are the templates for generating valid arguments. Inference rules are
applied to derive proofs in artificial intelligence, and a proof is a sequence of
conclusions that leads to the desired goal.
In inference rules, the implication among all the connectives plays an important role.
Following are some terminologies related to inference rules:
Among the above terms, some of the compound statements are equivalent to each other, which
we can prove using a truth table:
Hence from the above truth table, we can prove that P → Q is equivalent to ¬ Q → ¬ P, and
Q→ P is equivalent to ¬ P → ¬ Q.
1. Modus Ponens:
The Modus Ponens rule is one of the most important rules of inference, and it states that if
P and P → Q are true, then we can infer that Q is true. It can be represented as:
Example:
2. Modus Tollens:
The Modus Tollens rule states that if P → Q is true and ¬Q is true, then ¬P is also true.
It can be represented as:
3. Hypothetical Syllogism:
The Hypothetical Syllogism rule states that if P → Q is true and Q → R is true, then P → R
is true. It can be represented with the following notation:
Example:
Statement-1: If you have my home key then you can unlock my home. P→Q
Statement-2: If you can unlock my home then you can take my money. Q→R
Conclusion: If you have my home key then you can take my money. P→R
4. Disjunctive Syllogism:
The Disjunctive Syllogism rule states that if P ∨ Q is true and ¬P is true, then Q will be
true. It can be represented as:
Example:
Proof by truth-table:
5. Addition:
The Addition rule is one of the common inference rules, and it states that if P is true,
then P ∨ Q will be true.
Example:
Proof by Truth-Table:
6. Simplification:
The Simplification rule states that if P ∧ Q is true, then P is true, and likewise Q is
true. It can be represented as:
Proof by Truth-Table:
7. Resolution:
The Resolution rule states that if P ∨ Q and ¬P ∨ R are true, then Q ∨ R will also be true.
It can be represented as
Proof by Truth-Table:
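The truth-table proofs for the inference rules in this section can also be carried out programmatically: a rule is valid when its conclusion is true in every row where all premises are true. The helper `valid` below is a sketch written for this purpose, with premises and conclusions given as Python lambdas.

```python
from itertools import product


def imp(a, b):
    # Material implication: false only when a is true and b is false.
    return (not a) or b


def valid(premises, conclusion, n):
    """Valid if the conclusion holds in every row where all premises hold."""
    for v in product([True, False], repeat=n):
        if all(p(*v) for p in premises) and not conclusion(*v):
            return False
    return True


modus_ponens = valid([lambda p, q: p, lambda p, q: imp(p, q)],
                     lambda p, q: q, 2)
modus_tollens = valid([lambda p, q: imp(p, q), lambda p, q: not q],
                      lambda p, q: not p, 2)
hypothetical = valid([lambda p, q, r: imp(p, q), lambda p, q, r: imp(q, r)],
                     lambda p, q, r: imp(p, r), 3)
disjunctive = valid([lambda p, q: p or q, lambda p, q: not p],
                    lambda p, q: q, 2)
resolution = valid([lambda p, q, r: p or q, lambda p, q, r: (not p) or r],
                   lambda p, q, r: q or r, 3)
# Negative example: from P -> Q and Q we cannot conclude P.
affirming_consequent = valid([lambda p, q: imp(p, q), lambda p, q: q],
                             lambda p, q: p, 2)
```

All five rules come out valid, while the last argument fails on the row where P is false and Q is true.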
To represent the above statements, propositional logic is not sufficient, so we require a
more powerful logic, such as first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence.
It is an extension to propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a
concise way.
o First-order logic is also known as Predicate logic or First-order predicate logic.
First-order logic is a powerful language that expresses information about objects
in an easier way and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains
facts, like propositional logic, but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits,
wumpus, ......
o Relations: A relation can be unary, such as: red, round, is adjacent, or n-ary,
such as: the sister of, brother of, has color, comes between
o Function: Father of, best friend, third inning of, end of, ......
o As a natural language, first-order logic also has two main parts:
o Syntax
o Semantics
Variables x, y, z, a, b,....
Connectives ∧, ∨, ¬, ⇒, ⇔
Equality ==
Quantifier ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These sentences
are formed from a predicate symbol followed by a parenthesis with a sequence of
terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).
Chinky is a cat: => cat (Chinky).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
Consider the statement "x is an integer". It consists of two parts: the first part, x, is
the subject of the statement, and the second part, "is an integer", is known as a predicate.
Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the statement
within its range is true for everything or every instance of a particular thing.
o For all x
o For each x
o For every x.
Example:
All men drink coffee.
Let x be a variable which refers to a man; the statement can then be represented in the
universe of discourse (UOD) as:
∀x man(x) → drink(x, coffee).
It will be read as: For all x, if x is a man, then x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers which express that the statement within
their scope is true for at least one instance of something.
The existential quantifier is denoted by the logical operator ∃, which resembles a reversed
E. When it is used with a predicate variable, it is called an existential quantifier.
Note: In Existential quantifier we always use AND or Conjunction symbol (∧).
If x is a variable, then existential quantifier will be ∃x or ∃(x). And it will be read as:
Example:
Some boys are intelligent.
∃x boys(x) ∧ intelligent(x)
It will be read as: There is some x such that x is a boy and x is intelligent.
Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
o In universal quantifier, ∀x∀y is equivalent to ∀y∀x.
o In existential quantifier, ∃x∃y is equivalent to ∃y∃x.
o ∃x∀y is not equivalent to ∀y∃x.
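Over a finite domain, ∀ corresponds to Python's `all` and ∃ to `any`, which makes the points above easy to test. The small domain and the sets `MAN`, `DRINKS_COFFEE`, `BOY`, and `INTELLIGENT` below are made-up data for illustration only.

```python
DOMAIN = ["Ravi", "Ajay", "Chinky"]
MAN = {"Ravi", "Ajay"}
DRINKS_COFFEE = {"Ravi", "Ajay"}
BOY = {"Ajay"}
INTELLIGENT = {"Ajay", "Chinky"}

# Universal quantifier with implication: for all x, man(x) -> drink(x, coffee)
all_men_drink = all((x not in MAN) or (x in DRINKS_COFFEE) for x in DOMAIN)

# Using AND with the universal quantifier claims that everything is a
# coffee-drinking man, which fails because Chinky is not in MAN.
wrong_reading = all((x in MAN) and (x in DRINKS_COFFEE) for x in DOMAIN)

# Existential quantifier with conjunction: exists x, boy(x) and intelligent(x)
some_boy_intelligent = any((x in BOY) and (x in INTELLIGENT) for x in DOMAIN)

# Quantifier order matters, shown with the relation x == y on {0, 1}.
NUMBERS = [0, 1]
forall_exists = all(any(x == y for x in NUMBERS) for y in NUMBERS)
exists_forall = any(all(x == y for y in NUMBERS) for x in NUMBERS)
```

Every y has some x equal to it, but no single x equals every y, so swapping ∀ and ∃ changes the truth value.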
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the
scope of the quantifier.
Knowledge Engineering in First-order logic
What is knowledge-engineering?
The process of constructing a knowledge base in first-order logic is called knowledge
engineering. In knowledge engineering, someone who investigates a particular domain,
learns the important concepts of that domain, and generates a formal representation of
its objects is known as a knowledge engineer.
In this topic, we will understand the knowledge engineering process in an electronic circuit
domain, which is already familiar. This approach is mainly suitable for creating a
special-purpose knowledge base.
At the first level or highest level, we will examine the functionality of the circuit:
At the second level, we will examine the circuit structure details such as:
3. Decide on vocabulary:
The next step of the process is to select functions, predicates, and constants to represent
the circuits, terminals, signals, and gates. First, we distinguish the gates from each
other and from other objects. Each gate is represented as an object which is named by a
constant, such as Gate(X1). The functionality of each gate is determined by its type,
which is taken as a constant such as AND, OR, XOR, or NOT. Circuits are identified by a
predicate: Circuit(C1).
For a gate input, we use the function In(1, X1) to denote the first input terminal of the
gate, and for the output terminal we use Out(1, X1).
The predicate Arity(c, i, j) is used to denote that circuit c has i inputs and j outputs.
We use a unary predicate On(t), which is true if the signal at a terminal is on.
o If two terminals are connected, then they have the same signal. It can be
represented as:
∀ t1, t2 Terminal(t1) ∧ Terminal(t2) ∧ Connect(t1, t2) → Signal(t1) = Signal(t2).
o Signal at every terminal will have either value 0 or 1, it will be represented as:
∀ g Gate(g) ∧ Type(g) = XOR → Signal(Out(1, g)) = 1 ⇔ Signal(In(1, g)) ≠ Signal(In(2, g)).
o The output of a NOT gate is the inverse of its input:
For the given circuit C1, we can encode the problem instance in atomic sentences as below:
Since the circuit has two XOR gates, two AND gates, and one OR gate, the atomic sentences
for these gates will be:
1. For XOR gate: Type(X1) = XOR, Type(X2) = XOR
2. For AND gate: Type(A1) = AND, Type(A2) = AND
3. For OR gate: Type(O1) = OR.
What combination of inputs would cause the first output of circuit C1 to be 0 and the
second output to be 1?
∃ i1, i2, i3 Signal(In(1, C1)) = i1 ∧ Signal(In(2, C1)) = i2 ∧ Signal(In(3, C1)) = i3
∧ Signal(Out(1, C1)) = 0 ∧ Signal(Out(2, C1)) = 1
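A query like this can be answered by brute force once the wiring is known. The notes list the gates (two XOR, two AND, one OR) but the circuit diagram is not reproduced here, so the connections in `circuit_c1` below are an assumption: they follow the usual one-bit full adder layout, with the first output as the sum bit and the second as the carry bit.

```python
from itertools import product


def circuit_c1(i1, i2, i3):
    """Assumed wiring of C1 (one-bit full adder); returns (out1, out2)."""
    x1 = i1 ^ i2        # XOR gate X1
    x2 = x1 ^ i3        # XOR gate X2 -> first output (sum)
    a1 = i1 & i2        # AND gate A1
    a2 = i3 & x1        # AND gate A2
    o1 = a1 | a2        # OR gate O1 -> second output (carry)
    return x2, o1


# Enumerate all input combinations and keep those with Out1 = 0 and Out2 = 1.
solutions = [bits for bits in product([0, 1], repeat=3)
             if circuit_c1(*bits) == (0, 1)]
print(solutions)
```

Under this assumed wiring, the satisfying assignments are exactly the inputs in which two of the three bits are 1.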
7. Debug the knowledge base:
Now we will debug the knowledge base, and this is the last step of the complete process. In
this step, we will try to debug the issues of knowledge base.
Substitution:
Note: First-order logic is capable of expressing facts about some or all objects in the
universe.
Equality:
First-order logic does not only use predicates and terms for making atomic sentences but
also uses another way, which is equality. We can use the equality symbol to specify that
two terms refer to the same object.
As in the above example, the object referred to by Brother(John) is the same as the object
referred to by Smith. The equality symbol can also be used with negation to represent that
two terms are not the same object.
o Universal Generalization
o Universal Instantiation
o Existential Instantiation
o Existential introduction
1. Universal Generalization:
o Universal generalization is a valid inference rule which states that if premise P(c) is
true for any arbitrary element c in the universe of discourse, then we can have a
conclusion as ∀ x P(x).
Example: Let P(c) represent "A byte contains 8 bits"; then ∀ x P(x), "All bytes
contain 8 bits", will also be true.
2. Universal Instantiation:
o The UI rule states that from ∀ x P(x) we can infer any sentence P(c) obtained by
substituting a ground term c (a constant within the domain of x) for the variable x,
for any object in the universe of discourse.
Example:1.
Example: 2.
"All kings who are greedy are evil." Let our knowledge base contain this statement in the
form of FOL:
So from this information, we can infer any of the following statements using Universal
Instantiation:
3. Existential Instantiation:
Example:
From the given sentence: ∃x Crown(x) ∧ OnHead(x, John),
So we can infer: Crown(K) ∧ OnHead( K, John), as long as K does not appear in the
knowledge base.
4. Existential introduction
Generalized Modus Ponens can be summarized as, " P implies Q and P is asserted to be
true, therefore Q must be True."
According to Generalized Modus Ponens, for atomic sentences pi, pi', and q, where there is a
substitution θ such that SUBST(θ, pi') = SUBST(θ, pi) for all i, it can be represented as:
Example:
We will use this rule for "greedy kings are evil": we find some x such that x is a king
and x is greedy, so we can infer that x is evil.
4. SUBST(θ,q).
What is Unification?
o Unification is a process of making two different logical atomic expressions identical
by finding a substitution. Unification depends on the substitution process.
o It takes two literals as input and makes them identical using substitution.
o Let Ψ1 and Ψ2 be two atomic sentences and 𝜎 be a unifier such that, Ψ1𝜎 = Ψ2𝜎,
then it can be expressed as UNIFY(Ψ1, Ψ2).
o Example: Find the MGU for Unify{King(x), King(John)}
Substitution θ = {John/x} is a unifier for these atoms, and applying this substitution
makes both expressions identical.
o The UNIFY algorithm is used for unification, which takes two atomic sentences and
returns a unifier for those sentences (If any exist).
o Unification is a key component of all first-order inference algorithms.
o It returns fail if the expressions do not match with each other.
o The most general substitution that unifies the expressions is called the Most General
Unifier (MGU).
E.g. Let's say there are two different expressions, P(x, y), and P(a, f(z)).
In this example, we need to make both above statements identical to each other. For this,
we will perform the substitution.
o Substitute x with a, and y with f(z) in the first expression, and it will be represented
as a/x and f(z)/y.
o With both the substitutions, the first expression will be identical to the second
expression and the substitution set will be: [a/x, f(z)/y].
o Predicate symbol must be same, atoms or expression with different predicate symbol
can never be unified.
o Number of Arguments in both expressions must be identical.
o Unification will fail if a variable would have to be bound to a term that contains that
same variable (the occurs check).
Unification Algorithm:
Algorithm: Unify(Ψ1, Ψ2)
For each pair of the following atomic sentences, find the most general
unifier (if one exists).
1. Find the MGU of {p(f(a), g(Y)) and p(X, X)}
5. Find the MGU of {Q(a, g(x, a), f(y)), Q(a, g(f(b), a), x)}
SUBST θ= {b/y}
S1 => {Q(a, g(f(b), a), f(b)); Q(a, g(f(b), a), f(b))}, Successfully Unified.
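The unification exercises above can be reproduced with a compact unifier. This is a sketch, not the exact algorithm from the notes, and its term encoding is an assumption: variables are strings starting with "?", constants are other strings, and compound terms are tuples whose first element is the predicate or function symbol.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow variable bindings until an unbound term is reached."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Occurs check: does var appear inside term under subst?"""
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False


def unify(a, b, subst=None):
    """Return a substitution dict that unifies a and b, or None on failure."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple):
        # Symbols and number of arguments must match.
        if a[0] != b[0] or len(a) != len(b):
            return None
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def substitute(term, subst):
    """Apply a substitution to a term, resolving chained bindings."""
    term = walk(term, subst)
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(arg, subst) for arg in term[1:])
    return term


q1 = ("Q", "a", ("g", "?x", "a"), ("f", "?y"))
q2 = ("Q", "a", ("g", ("f", "b"), "a"), "?x")
theta = unify(q1, q2)                       # binds ?x to f(b) and ?y to b
failed = unify(("p", ("f", "a"), ("g", "?Y")), ("p", "?X", "?X"))
```

For exercise 5, applying `theta` to both atoms yields Q(a, g(f(b), a), f(b)), matching the result shown above; exercise 1 fails because X cannot be both f(a) and g(Y).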
Resolution in FOL
Resolution
Resolution is a theorem proving technique that proceeds by building refutation proofs, i.e.,
proofs by contradictions. It was invented by a Mathematician John Alan Robinson in the year
1965.
Resolution is used when various statements are given and we need to prove a
conclusion from those statements. Unification is a key concept in proofs by resolution.
Resolution is a single inference rule which can efficiently operate on the conjunctive
normal form or clausal form.
Clause: A disjunction of literals is called a clause. A clause that contains a single
literal is known as a unit clause.
Note: To better understand this topic, first learn FOL in AI.
Where li and mj are complementary literals.
This rule is also called the binary resolution rule because it only resolves exactly two
literals.
Example:
We can resolve two clauses which are given below:
Where the two complementary literals are: Loves(f(x), x) and ¬Loves(a, b)
These literals can be unified with the unifier θ = [a/f(x), b/x], and it will generate a
resolvent clause:
To better understand all the above steps, we will take an example in which we will apply
resolution.
Example:
a. John likes all kinds of food.
b. Apples and vegetables are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that:
f. John likes peanuts.
In the first step we will convert all the given statements into first-order logic.
Step-2: Conversion of FOL into CNF
In first-order logic resolution, it is required to convert the FOL into CNF, as CNF makes
resolution proofs easier.
2. food(Apple) Λ food(vegetables)
3. ∀y ∀z ¬ eats(y, z) V killed(y) V food(z)
4. eats (Anil, Peanuts) Λ alive(Anil)
5. ∀w¬ eats(Anil, w) V eats(Harry, w)
6. ∀g killed(g) V alive(g)
7. ∀k ¬ alive(k) V ¬ killed(k)
8. likes(John, Peanuts).
o Eliminate existential quantifiers.
In this step, we will eliminate the existential quantifier ∃, and this process is known
as Skolemization. But in this example problem, since there is no existential
quantifier, all the statements will remain the same in this step.
o Drop universal quantifiers.
In this step we will drop all universal quantifiers, since all remaining variables are
implicitly universally quantified, so we don't need them.
1. ¬ food(x) V likes(John, x)
2. food(Apple)
3. food(vegetables)
4. ¬ eats(y, z) V killed(y) V food(z)
5. eats (Anil, Peanuts)
6. alive(Anil)
7. ¬ eats(Anil, w) V eats(Harry, w)
8. killed(g) V alive(g)
9. ¬ alive(k) V ¬ killed(k)
10. likes(John, Peanuts).
In this step, we will apply negation to the conclusion statement, which will be written
as ¬likes(John, Peanuts)
Step-4: Draw the resolution graph
In this step, we solve the problem with a resolution tree using substitution. For the
above problem, it is given as follows:
Hence the negation of the conclusion leads to a complete contradiction with the
given set of statements, which proves that John likes peanuts.
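The refutation can be sketched in Python. This is a minimal illustration, not the notes' own procedure: it assumes the substitutions {x/Peanuts, y/Anil, z/Peanuts} have already been applied, so the clauses are ground and plain propositional resolution suffices. The string encoding of literals (a leading "~" for negation) and the helper names are assumptions of this sketch.

```python
from itertools import combinations

def negate(lit):
    """Return the complementary literal ("~p" for "p" and vice versa)."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents

def refutes(clauses):
    """Saturate the clause set; True if the empty clause is derived."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:          # empty clause: contradiction found
                    return True
                new.add(r)
        if new.issubset(clauses):  # nothing new: no contradiction exists
            return False
        clauses |= new

# Ground instances of the CNF clauses used in the proof.
KB = [
    frozenset({"~food(Peanuts)", "likes(John,Peanuts)"}),
    frozenset({"~eats(Anil,Peanuts)", "killed(Anil)", "food(Peanuts)"}),
    frozenset({"eats(Anil,Peanuts)"}),
    frozenset({"alive(Anil)"}),
    frozenset({"~alive(Anil)", "~killed(Anil)"}),
]
NEGATED_GOAL = frozenset({"~likes(John,Peanuts)"})

print(refutes(KB + [NEGATED_GOAL]))
```

Adding the negated goal makes the clause set unsatisfiable, so the empty clause is derived; the knowledge base on its own is consistent, so no contradiction is found without it.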
Forward Chaining and Backward Chaining in AI
Forward and backward chaining are important topics in artificial intelligence. Before
looking at them, let us first understand where these two terms come from.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence
which applies logical rules to the knowledge base to infer new information from known facts.
The first inference engines were part of expert systems. An inference engine commonly
proceeds in two modes, which are:
a. Forward chaining
b. Backward chaining
Horn clauses and definite clauses are forms of sentences which enable a knowledge base
to use a more restricted and efficient inference algorithm. Logical inference algorithms use
forward and backward chaining approaches, which require the KB to be in the form of first-order
definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive
literal is known as a definite clause or strict Horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is
known as a Horn clause. Hence all definite clauses are Horn clauses.
Example: (¬p V ¬q V k) has exactly one positive literal, k.
It is equivalent to p ∧ q → k.
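The two definitions can be checked mechanically. The following is a small sketch in Python; representing a clause as a list of literal strings with a leading "~" for negation is an assumption of this example, as are the function names.

```python
def positive_literals(clause):
    """Return the literals of a clause that are not negated."""
    return [lit for lit in clause if not lit.startswith("~")]

def is_horn(clause):
    """A Horn clause has at most one positive literal."""
    return len(positive_literals(clause)) <= 1

def is_definite(clause):
    """A definite clause has exactly one positive literal."""
    return len(positive_literals(clause)) == 1

print(is_definite(["~p", "~q", "k"]))  # the clause equivalent to p ∧ q → k
print(is_horn(["~p", "~q"]))           # Horn, but not definite
print(is_horn(["p", "q"]))             # two positive literals: not Horn
```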
A. Forward Chaining
Forward chaining is also known as forward deduction or forward reasoning when
using an inference engine. It is a form of reasoning which starts with the atomic
sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward
direction to derive more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises
are satisfied, and adds their conclusions to the known facts. This process repeats until the
problem is solved.
Properties of Forward-Chaining:
o It is a bottom-up approach, as it moves from bottom to top.
o It is a process of reaching a conclusion based on known facts or data, starting from
the initial state and reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal
using available data.
o The forward-chaining approach is commonly used in expert systems such as CLIPS, and in
business and production rule systems.
Consider the following famous example which we will use in both approaches:
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations.
Country A, an enemy of America, has some missiles, and all the missiles were sold
to it by Robert, who is an American citizen."
To solve the above problem, first, we will convert all the above facts into first-order definite
clauses, and then we will use a forward-chaining algorithm to reach the goal.
Forward chaining proof:
Step-1:
In the first step, we start with the known facts and choose the sentences which do
not have implications, such as: American(Robert), Enemy(A, America), Owns(A, T1),
and Missile(T1). All these facts are represented as below.
Step-2:
In the second step, we look for the facts that can be inferred from the available facts,
that is, for rules whose premises are satisfied.
Rule-(1) does not have its premises satisfied, so it is not fired in the first iteration.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added, which
follows from the conjunction of facts (2) and (3).
Rule-(5) is satisfied with the substitution {p/T1}, so Weapon(T1) is added, which follows
from fact (3).
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added, which follows from
fact (7).
Step-3:
In step 3, Rule-(1) is satisfied with the substitution {p/Robert, q/T1,
r/A}, so we can add Criminal(Robert), which follows from the facts derived so far. Hence we
have reached our goal statement.
Hence it is proved, using the forward-chaining approach, that Robert is a criminal.
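The proof above can be reproduced with a short Python sketch of forward chaining over definite clauses. It is an illustration under stated assumptions, not a production inference engine: atoms are tuples whose first element is the predicate name, lowercase argument names are variables, capitalized names are constants, and the helper names (`match`, `all_matches`, `forward_chain`) are invented for this example.

```python
def is_var(term):
    """Assumed convention: variables start with a lowercase letter."""
    return term[0].islower()

def match(pattern, fact, theta):
    """Extend substitution theta so that pattern equals fact, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta

def all_matches(premises, facts, theta):
    """Yield every substitution that satisfies all premises."""
    if not premises:
        yield theta
        return
    for fact in facts:
        extended = match(premises[0], fact, theta)
        if extended is not None:
            yield from all_matches(premises[1:], facts, extended)

def forward_chain(rules, facts, goal):
    """Fire rules until the goal is derived or nothing new can be added."""
    facts = set(facts)
    while goal not in facts:
        new = set()
        for premises, conclusion in rules:
            for theta in all_matches(premises, facts, {}):
                derived = tuple(theta.get(t, t) for t in conclusion)
                if derived not in facts:
                    new.add(derived)
        if not new:
            return False
        facts |= new
    return True

RULES = [
    ([("American", "p"), ("Weapon", "q"), ("Sells", "p", "q", "r"),
      ("Hostile", "r")], ("Criminal", "p")),                       # rule (1)
    ([("Missile", "p"), ("Owns", "A", "p")],
     ("Sells", "Robert", "p", "A")),                               # rule (4)
    ([("Missile", "p")], ("Weapon", "p")),                         # rule (5)
    ([("Enemy", "p", "America")], ("Hostile", "p")),               # rule (6)
]
FACTS = [("Owns", "A", "T1"), ("Missile", "T1"),
         ("Enemy", "A", "America"), ("American", "Robert")]

print(forward_chain(RULES, FACTS, ("Criminal", "Robert")))
```

The first pass adds Sells(Robert, T1, A), Weapon(T1), and Hostile(A); the second pass fires rule (1) and adds Criminal(Robert), matching Step-2 and Step-3.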
B. Backward Chaining:
Backward chaining is also known as backward deduction or backward reasoning
when using an inference engine. A backward-chaining algorithm is a form of reasoning
which starts with the goal and works backward, chaining through rules to find known facts
that support the goal.
Example:
In backward chaining, we will use the same example as above and rewrite all the rules.
o American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p) ...(1)
o Owns(A, T1) ...(2)
o Missile(T1) ...(3)
o ∀p Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A) ...(4)
o Missile(p) → Weapon(p) ...(5)
o Enemy(p, America) → Hostile(p) ...(6)
o Enemy(A, America) ...(7)
o American(Robert) ...(8)
Backward-Chaining proof:
In backward chaining, we start with our goal predicate, which is Criminal(Robert), and
then work backward through the rules.
Step-1:
In the first step, we take the goal fact. From the goal fact, we infer other facts,
and at last we prove those facts true. Our goal fact is "Robert is Criminal," so
the following is its predicate.
Step-2:
In the second step, we infer other facts from the goal fact which satisfy the rules. As
we can see in Rule-(1), the goal predicate Criminal(Robert) is present with the substitution
{p/Robert}. So we add all the conjunctive facts below the first level and replace p
with Robert.
Step-3:
In step 3, we extract the further fact Missile(q), which is inferred from Weapon(q), as it
satisfies Rule-(5). Weapon(q) is also true with the substitution of the constant T1 for q.
Step-4:
In step 4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which
satisfies Rule-(4) with the substitution of A in place of r. So these two statements are
proved here.
Step-5:
In step 5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule-(6).
Hence all the statements are proved true using backward chaining.
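The same proof can be sketched as a goal-driven search in Python. This is a minimal depth-first backward chainer written for this example only: it assumes a knowledge base without function symbols and without recursive rules, represents facts as rules with no premises, and treats lowercase argument names as variables. All helper names are assumptions of the sketch.

```python
import itertools

_fresh = itertools.count()  # used to give each rule use its own variable names

def is_var(term):
    """Assumed convention: variables start with a lowercase letter."""
    return term[0].islower()

def walk(term, theta):
    """Follow variable bindings in theta until a constant or free variable."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term

def unify(a, b, theta):
    """Unify two atoms (tuples) under theta; return the new theta or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    theta = dict(theta)
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, theta), walk(y, theta)
        if x == y:
            continue
        if is_var(x):
            theta[x] = y
        elif is_var(y):
            theta[y] = x
        else:
            return None
    return theta

def rename(rule):
    """Standardize a rule apart by suffixing its variables."""
    n = next(_fresh)
    def fix(atom):
        return (atom[0],) + tuple(f"{t}_{n}" if is_var(t) else t for t in atom[1:])
    premises, head = rule
    return [fix(p) for p in premises], fix(head)

def prove(goals, theta, kb):
    """Yield every substitution under which all goals follow from kb."""
    if not goals:
        yield theta
        return
    for rule in kb:
        premises, head = rename(rule)
        extended = unify(goals[0], head, theta)
        if extended is not None:
            yield from prove(premises + goals[1:], extended, kb)

KB = [
    ([("American", "p"), ("Weapon", "q"), ("Sells", "p", "q", "r"),
      ("Hostile", "r")], ("Criminal", "p")),                    # rule (1)
    ([], ("Owns", "A", "T1")),                                  # fact (2)
    ([], ("Missile", "T1")),                                    # fact (3)
    ([("Missile", "p"), ("Owns", "A", "p")],
     ("Sells", "Robert", "p", "A")),                            # rule (4)
    ([("Missile", "p")], ("Weapon", "p")),                      # rule (5)
    ([("Enemy", "p", "America")], ("Hostile", "p")),            # rule (6)
    ([], ("Enemy", "A", "America")),                            # fact (7)
    ([], ("American", "Robert")),                               # fact (8)
]

print(any(prove([("Criminal", "Robert")], {}, KB)))
print([walk("x", t) for t in prove([("Criminal", "x")], {}, KB)])
```

The second query leaves the criminal as a variable, so the search has to discover the binding for `x` by working backward from the goal through rules (1), (5), (4), and (6).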
Difference between backward chaining and forward chaining
The following are the differences between forward chaining and backward chaining:
o Forward chaining, as the name suggests, starts from the known facts and moves
forward by applying inference rules to extract more data, continuing until it
reaches the goal, whereas backward chaining starts from the goal and moves
backward by using inference rules to determine the facts that satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas backward
chaining is called a goal-driven inference technique.
o Forward chaining is known as the bottom-up approach, whereas backward chaining is
known as the top-down approach.
o Forward chaining uses a breadth-first search strategy, whereas backward chaining
uses a depth-first search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process
monitoring, diagnosis, and classification, whereas backward chaining can be
used for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining tries
to avoid unnecessary paths of reasoning.
o In forward chaining there can be various ASK questions to the knowledge base,
whereas in backward chaining there can be fewer ASK questions.
o Forward chaining is slow, as it checks all the rules, whereas backward chaining is
fast, as it checks only the few required rules.
o Forward chaining tests all the available rules, whereas backward chaining tests only
the few required rules.
o Forward chaining can generate an infinite number of possible conclusions, whereas
backward chaining generates a finite number of possible conclusions.
o Forward chaining is aimed at any conclusion, whereas backward chaining is aimed only at
the required data.
Reasoning:
Reasoning is the mental process of deriving logical conclusions and making predictions
from available knowledge, facts, and beliefs. Or we can say, "Reasoning is a way to infer
facts from existing data." It is a general process of thinking rationally to find valid
conclusions.
In artificial intelligence, reasoning is essential so that the machine can think
rationally like a human brain and perform like a human.
Types of Reasoning
In artificial intelligence, reasoning can be divided into the following categories:
o Deductive reasoning
o Inductive reasoning
o Abductive reasoning
o Common Sense Reasoning
o Monotonic Reasoning
o Non-monotonic Reasoning
Note: Inductive and deductive reasoning are both forms of logical reasoning.
1. Deductive reasoning:
Deductive reasoning is deducing new information from logically related known information.
It is the form of valid reasoning, which means the argument's conclusion must be true when
the premises are true.
In AI, deductive reasoning is typically carried out with a formal logic such as propositional
logic, and it requires various rules and facts. It is sometimes referred to as top-down
reasoning and is the opposite of inductive reasoning.
In deductive reasoning, the truth of the premises guarantees the truth of the conclusion.
Deductive reasoning mostly moves from general premises to a specific conclusion,
which can be explained with the example below.
Example:
2. Inductive Reasoning:
Inductive reasoning is a form of reasoning that arrives at a conclusion from a limited set of
facts by the process of generalization. It starts with a series of specific facts or data and
reaches a general statement or conclusion.
In inductive reasoning, the premises provide probable support for the conclusion, so the truth
of the premises does not guarantee the truth of the conclusion.
Example:
Premise: All of the pigeons we have seen in the zoo are white.
3. Abductive reasoning:
Abductive reasoning is a form of logical reasoning which starts with single or multiple
observations then seeks to find the most likely explanation or conclusion for the
observation.
Example:
Conclusion: It is raining.
4. Common Sense Reasoning:
Common sense reasoning simulates the human ability to make presumptions about events
that occur every day.
It relies on good judgment rather than exact logic and operates on heuristic
knowledge and heuristic rules.
Example:
The above two statements are the examples of common sense reasoning which a human
mind can easily understand and assume.
5. Monotonic Reasoning:
In monotonic reasoning, once a conclusion is drawn, it remains the same even if
we add other information to the existing information in our knowledge base. In monotonic
reasoning, adding knowledge does not decrease the set of propositions that can be derived.
To solve monotonic problems, we can derive the valid conclusion from the available facts
only, and it will not be affected by new facts.
Monotonic reasoning is not useful for real-time systems because, in real time, facts
change, so we cannot use monotonic reasoning.
Example:
It is a true fact, and it cannot be changed even if we add another sentence in knowledge
base like, "The moon revolves around the earth" Or "Earth is not round," etc.
6. Non-monotonic Reasoning
In non-monotonic reasoning, some conclusions may be invalidated if we add more
information to our knowledge base.
A logic is said to be non-monotonic if some conclusions can be invalidated by adding more
knowledge to the knowledge base.
"Human perception of various things in daily life" is a general example of non-monotonic
reasoning.
Example: Suppose the knowledge base contains the following knowledge:
So from the above sentences, we can conclude that Pitty can fly.
However, if we add another sentence to the knowledge base, "Pitty is a penguin", we must
conclude "Pitty cannot fly", which invalidates the above conclusion.
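A minimal Python sketch of this behaviour follows. It assumes a hypothetical knowledge base in which Pitty is a bird and birds fly by default unless they are known to be penguins; these facts, the tuple encoding, and the helper `can_fly` are illustrative assumptions, not part of the original example.

```python
def can_fly(name, kb):
    """Default rule: a bird flies unless it is known to be a penguin."""
    if ("penguin", name) in kb:
        return False
    return ("bird", name) in kb

kb = {("bird", "Pitty")}
before = can_fly("Pitty", kb)   # conclusion drawn from the initial knowledge

kb.add(("penguin", "Pitty"))    # new knowledge is added
after = can_fly("Pitty", kb)    # the earlier conclusion is withdrawn

print(before, after)
```

Adding a fact removed a conclusion that was previously derivable, which is exactly what cannot happen in monotonic reasoning.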
o In deductive reasoning, the conclusions are certain, whereas in inductive reasoning
the conclusions are probabilistic.
o Deductive arguments can be valid or invalid: in a valid argument, if the premises are true,
the conclusion must be true. Inductive arguments can be strong or weak, which
means the conclusion may be false even if the premises are true.
The differences between inductive and deductive can be explained using the below diagram
on the basis of arguments:
Comparison Chart:
Structure: Deductive reasoning moves from general facts to specific facts, whereas
inductive reasoning moves from specific facts to general facts.
So to represent uncertain knowledge, where we are not sure about the predicates, we need
uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty to occur in the real world.
Probabilistic reasoning:
Probabilistic reasoning is a way of representing knowledge in which we apply the concept of
probability to indicate the uncertainty in knowledge. In probabilistic reasoning, we combine
probability theory with logic to handle the uncertainty.
In the real world, there are many scenarios where the outcome is not certain,
such as "It will rain today," "behavior of someone for some situations," "A match
between two teams or two players." These are probable sentences: we can assume
that they will happen but cannot be sure, so here we use probabilistic reasoning.
Need of probabilistic reasoning in AI:
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
o Bayes' rule
o Bayesian Statistics
Probability: Probability can be defined as the chance that an uncertain event will occur. It is
the numerical measure of the likelihood that an event will occur. The value of a probability
always lies between 0 and 1, where 0 indicates that the event is impossible and 1 that it is
certain.
We can find the probability of an uncertain event by using the below formula.
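For equally likely outcomes, the classical definition of probability can be written as follows. The notation is this sketch's own; the bounds and the complement rule are standard properties added for completeness.

```latex
P(A) = \frac{\text{number of outcomes favourable to } A}{\text{total number of outcomes}},
\qquad 0 \le P(A) \le 1,
\qquad P(\neg A) = 1 - P(A)
```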
Sample space: The collection of all possible outcomes is called the sample space.
Random variables: Random variables are used to represent the events and objects in the
real world.
Posterior Probability: The probability that is calculated after all evidence or information
has been taken into account. It is a combination of the prior probability and new information.
Conditional probability:
Conditional probability is the probability of an event occurring when another event has already
happened.
Suppose we want to calculate the probability of event A when event B has already occurred, "the
probability of A under the conditions of B". It can be written as:
P(A|B) = P(A ⋀ B) / P(B)
where P(A ⋀ B) is the joint probability of A and B, and P(B) is the marginal probability of B.
If the probability of A is given and we need to find the probability of B, then it is given
as:
P(B|A) = P(A ⋀ B) / P(A)
This can be explained using the Venn diagram below: since B has occurred, the sample
space is reduced to the set B, and we can calculate event A given that event B has
already occurred by dividing P(A ⋀ B) by P(B).
Example:
In a class, 70% of the students like English and 40% of the students
like both English and mathematics. What percentage of the students who like
English also like mathematics?
Solution:
Let A be the event that a student likes mathematics and B the event that a student likes
English.
P(A|B) = P(A ⋀ B) / P(B) = 0.4 / 0.7 ≈ 0.57 = 57%
Hence, 57% of the students who like English also like mathematics.
In probability theory, Bayes' theorem relates the conditional probability and marginal
probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian
inference is an application of Bayes' theorem, which is fundamental to Bayesian statistics.
Bayes' theorem allows updating the probability prediction of an event by observing new
information of the real world.
Example: If cancer corresponds to one's age then by using Bayes' theorem, we can
determine the probability of cancer more accurately with the help of age.
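Bayes' theorem can be derived using the product rule and the conditional probability of event A
with known event B.
From the product rule we can write:
P(A ⋀ B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ...(a)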
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis
of most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate; it is read as the probability
of hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming that the hypothesis is true, we
calculate the probability of the evidence.
P(A) is called the prior probability, the probability of the hypothesis before considering the
evidence.
P(B) is called the marginal probability, the probability of the evidence alone.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes' rule
can be written as:
P(Ai|B) = P(Ai) P(B|Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
Example-1:
Question: What is the probability that a patient with a stiff neck has
meningitis?
Given Data:
A doctor is aware that meningitis causes a patient to have a stiff neck
80% of the time. He is also aware of some more facts, which are given as follows:
o The known probability that a patient has meningitis is 1/30000.
o The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the
patient has meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) × P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.00133
Hence, we can assume that 1 out of 750 patients with a stiff neck has meningitis.
Example-2:
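Question: From a standard deck of playing cards, a single card is drawn. The
probability that the card is a king is 4/52. Calculate the posterior probability
P(King|Face), that is, the probability that a drawn face card is a king.
Solution:
P(King|Face) = P(Face|King) × P(King) / P(Face)
P(King) = 4/52 = 1/13, P(Face) = 12/52 = 3/13, and P(Face|King) = 1, since every king is a
face card.
So P(King|Face) = (1 × 1/13) / (3/13) = 1/3.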
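Both examples can be checked with a few lines of Python. This is a small sketch using exact fractions; the function name `bayes` is an assumption of this example, and for the card problem it relies on two facts about a standard deck: 12 of the 52 cards are face cards, and every king is a face card.

```python
from fractions import Fraction

def bayes(likelihood, prior, evidence):
    """Posterior P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Example-1: P(meningitis | stiff neck)
meningitis = bayes(Fraction(8, 10), Fraction(1, 30000), Fraction(2, 100))

# Example-2: P(King | Face), with P(Face|King) = 1 and P(Face) = 12/52
king = bayes(Fraction(1), Fraction(4, 52), Fraction(12, 52))

print(meningitis, king)
```

Using `Fraction` instead of floating-point numbers keeps the results exact, so they can be compared directly with the hand calculations.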
Application of Bayes' theorem in Artificial intelligence:
Following are some applications of Bayes' theorem:
o It is used to calculate the next step of the robot when the already executed step is
given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
"A Bayesian network is a probabilistic graphical model which represents a set of variables
and their conditional dependencies using a directed acyclic graph."
Bayesian networks are probabilistic because they are built from a probability
distribution and use probability theory for prediction and anomaly detection.
Real-world applications are probabilistic in nature, and to represent the relationship between
multiple events, we need a Bayesian network. It can also be used in various tasks
including prediction, anomaly detection, diagnostics, automated insight, reasoning,
time series prediction, and decision making under uncertainty.
A Bayesian network can be used for building models from data and expert opinions, and it
consists of two parts:
The generalized form of a Bayesian network that represents and solves decision problems
under uncertain knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:
o Each node corresponds to a random variable, and a variable can
be continuous or discrete.
o Arcs or directed arrows represent the causal relationships or conditional probabilities
between random variables. These directed links or arrows connect pairs of nodes
in the graph.
These links represent that one node directly influences the other node; if there is
no directed link, neither node directly influences the other.
o In the above diagram, A, B, C, and D are random variables
represented by the nodes of the network graph.
o If we are considering node B, which is connected with node A by a
directed arrow, then node A is called the parent of Node B.
o Node C is independent of node A.
Note: The Bayesian network graph does not contain any directed cycle. Hence, it is
known as a directed acyclic graph or DAG.
A Bayesian network has two main components:
o Causal component
o Actual numbers
Each node in the Bayesian network has a conditional probability distribution
P(Xi | Parents(Xi)), which determines the effect of the parents on that node.
For variables x1, x2, x3, ..., xn, the joint probability P[x1, x2, x3, ..., xn] can be written
in the following way:
In general for each variable Xi, we can write the equation as:
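The two relations can be sketched in standard notation as follows. This assumes the variables are ordered so that every node's parents come before it in the conditioning set, and that each node is conditionally independent of its other predecessors given its parents, which is the usual Bayesian-network assumption.

```latex
P(x_1, x_2, \ldots, x_n)
  = P(x_1 \mid x_2, \ldots, x_n)\, P(x_2 \mid x_3, \ldots, x_n) \cdots P(x_{n-1} \mid x_n)\, P(x_n)

P(X_i \mid X_{i-1}, \ldots, X_1) = P\big(X_i \mid \mathrm{Parents}(X_i)\big)
\quad\Longrightarrow\quad
P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P\big(x_i \mid \mathrm{Parents}(X_i)\big)
```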
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm
responds reliably to a burglary but also responds to minor earthquakes. Harry
has two neighbors, David and Sophia, who have taken responsibility for informing Harry at
work when they hear the alarm. David always calls Harry when he hears the alarm, but
sometimes he confuses it with the phone ringing and calls then too. On the other
hand, Sophia likes to listen to loud music, so sometimes she fails to hear the alarm. Here
we would like to compute the probability of the burglary alarm.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary
nor an earthquake has occurred, and both David and Sophia have called Harry.
Solution:
o The Bayesian network for the above problem is given below. The network structure
shows that burglary and earthquake are the parent nodes of the alarm and directly
affect the probability of the alarm going off, whereas David's and Sophia's calls depend on
the alarm probability.
o The network represents the assumptions that the neighbors do not directly perceive the
burglary, do not notice the minor earthquake, and do not confer
before calling.
o The conditional distributions for each node are given as a conditional probability table,
or CPT.
o Each row in the CPT must sum to 1 because all the entries in the row represent
an exhaustive set of cases for the variable.
o In a CPT, a boolean variable with k boolean parents contains 2^k probabilities. Hence, if
there are two parents, then the CPT will contain 4 probability values.
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of the problem statement in the form of the probability
P[D, S, A, B, E], and rewrite it using the joint probability distribution:
Let's take the observed probability for the Burglary and earthquake component:
P(E = False) = 0.999, which is the probability that an earthquake has not occurred.
B = True, E = False: P(A = True) = 0.95, P(A = False) = 0.05
The conditional probability that David will call depends on the probability of Alarm.
The conditional probability that Sophia calls depends on its parent node "Alarm."
From the formula of the joint distribution, we can write the problem statement in the form of a
probability distribution:
P(S, D, A, ¬B, ¬E) = P(S|A) × P(D|A) × P(A|¬B ∧ ¬E) × P(¬B) × P(¬E)
= 0.00068045.
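The calculation can be sketched in Python. Only P(E = False) = 0.999 and the final result appear in the text; the other four table entries below are assumptions of this sketch, chosen as the values commonly used with this example because they reproduce the stated result. The variable and function names are also illustrative.

```python
# Conditional probability table entries needed for the query.
P_NOT_E = 0.999                 # P(E = False), given in the text
P_NOT_B = 0.998                 # P(B = False), assumed
P_A_GIVEN_NOT_B_NOT_E = 0.001   # P(A = True | B = False, E = False), assumed
P_D_GIVEN_A = 0.91              # P(D = True | A = True), assumed
P_S_GIVEN_A = 0.75              # P(S = True | A = True), assumed

def joint(p_s_a, p_d_a, p_a_be, p_b, p_e):
    """P(S, D, A, not B, not E) as a product of one CPT entry per node."""
    return p_s_a * p_d_a * p_a_be * p_b * p_e

result = joint(P_S_GIVEN_A, P_D_GIVEN_A, P_A_GIVEN_NOT_B_NOT_E,
               P_NOT_B, P_NOT_E)
print(round(result, 8))
```

Each factor is read from the table of the corresponding node given its parents, which is why only five numbers are needed instead of the full joint table of 32 entries.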
Hence, a Bayesian network can answer any query about the domain by using the joint
distribution.
There are two ways to understand the semantics of a Bayesian network, which are given
below:
o Machine Learning
o Deep Learning
o Natural Language processing
o Expert System
o Robotics
o Machine Vision
o Speech Recognition
Note: Among all of the above, Machine learning plays a crucial role in AI. Machine
learning and deep learning are the ways of achieving AI in real life.
Machine Learning
Machine learning is a part of AI which gives machines the ability to
learn automatically from experience without being explicitly programmed.
o It is primarily concerned with the design and development of algorithms that allow
the system to learn from historical data.
o Machine Learning is based on the idea that machines can learn from past data,
identify patterns, and make decisions using algorithms.
o Machine learning algorithms are designed in such a way that they can learn and
improve their performance automatically.
o Machine learning helps in discovering patterns in data.
Types of Machine Learning
o Supervised learning:
Supervised learning is a type of machine learning in which the machine learns from
known datasets (sets of training examples) and then predicts the output. A supervised
learning agent needs to find the function that matches a given sample set.
Supervised learning can be further classified into two categories of algorithms:
1. Classification
2. Regression
o Reinforcement learning:
Reinforcement learning is a type of learning in which an AI agent is trained by giving
it some commands, and for each action the agent gets a reward as feedback. Using
this feedback, the agent improves its performance.
Reward feedback can be positive or negative: for each good action, the
agent receives a positive reward, while for a wrong action it gets a negative reward.
Reinforcement learning is of two types:
1. Positive Reinforcement learning
2. Negative Reinforcement learning
o Unsupervised learning:
Unsupervised learning is associated with learning without supervision or training. In
unsupervised learning, the algorithms are trained with data which is neither labeled
nor classified. In unsupervised learning, the agent needs to learn from patterns
without corresponding output values.
Unsupervised learning can be classified into two categories of algorithms:
1. Clustering
2. Association
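As an illustration of the supervised case (regression), the following Python sketch finds the function that matches a given sample set by fitting a straight line with ordinary least squares. The training data and the function names `fit_line` and `predict` are made up for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to labelled examples by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Use the learned line to predict the output for an unseen input."""
    slope, intercept = model
    return slope * x + intercept

# Training examples: inputs with their known outputs (generated by y = 2x + 1).
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(model, predict(model, 5))
```

The labelled pairs play the role of the known dataset; in an unsupervised setting the outputs would be absent and the algorithm could only look for structure in the inputs.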
Natural Language processing
Natural language processing is a subfield of computer science and artificial intelligence. NLP
enables a computer system to understand and process human language such as English.
NLP plays an important role in AI: without NLP, an AI agent cannot work on human
instructions, but with the help of NLP, we can instruct an AI system in our own language.
Today AI and NLP are all around us, and we can easily ask Siri, Google, or Cortana to help
us in our language.
A natural language processing application enables a user to communicate with the system
directly in their own words.
o Speech
o Text
Deep Learning
Deep learning is a subset of machine learning which gives machines the ability to
perform human-like tasks without human involvement. It gives an AI agent the ability
to mimic the human brain. Deep learning can use both supervised and unsupervised
learning to train an AI agent.
o Deep learning is implemented through a neural network architecture and is hence also
called a deep neural network.
o Deep learning is the primary technology behind self-driving cars, speech recognition,
image recognition, automatic machine translation, etc.
o The main challenge for deep learning is that it requires lots of data with lots of
computational power.
o The hidden layers perform mathematical operations on the inputs, and the
processed data is forwarded to the output layer.
o The output layer returns the output to the user.
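The flow from input through hidden layer to output can be sketched as a forward pass in plain Python. This is only an illustration: the single hidden layer, the sigmoid activation, and all weights and names are assumptions of the sketch, and no training is shown.

```python
import math

def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: weighted sum of the inputs plus bias, then activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer values pass through the hidden layer to the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)

# Two inputs, two hidden units, one output unit (hypothetical weights).
output = forward([1.0, 2.0],
                 hidden_w=[[0.5, -0.25], [0.1, 0.3]], hidden_b=[0.0, -0.1],
                 out_w=[[1.0, -1.0]], out_b=[0.2])
print(output)
```

A deep network repeats the `layer` step many times; learning consists of adjusting the weights and biases, which is omitted here.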
Expert Systems
o An expert system is an application of artificial intelligence. In artificial
intelligence, expert systems are the computer programs that rely on
obtaining the knowledge of human experts and programming that
knowledge into a system.
o Expert systems emulate the decision-making ability of human experts.
These systems are designed to solve complex problems through bodies of
knowledge rather than conventional procedural code.
o One example of an expert system is the suggestion of spelling
corrections while typing in the Google search box.
o Following are some characteristics of expert systems:
o High performance
o Reliable
o Highly responsive
o Understandable
Robotics
o Robotics is a branch of artificial intelligence and engineering which is used for
designing and manufacturing of robots.
o Robots are the programmed machines which can perform a series of actions
automatically or semi-automatically.
o AI can be applied to robots to make intelligent robots which can perform the task
with their intelligence. AI algorithms are necessary to allow a robot to perform more
complex tasks.
o Nowadays, AI and machine learning are being applied to robots to build
intelligent robots which can also interact socially like humans. One of the best
examples of AI in robotics is the Sophia robot.
Machine Vision
o Machine vision is an application of computer vision which enables a machine to
recognize objects.
o Machine vision captures and analyses visual information using one or more video
cameras, analog-to-digital conversion, and digital signal processing.
o Machine vision systems are programmed to perform narrowly defined tasks such as
counting objects, reading serial numbers, etc.
o Computer systems do not see in the same way as human eyes, but they are also
not bound by human limitations, such as being unable to see through walls.
o With the help of machine learning and machine vision, an AI agent may be able to
see through walls.
Speech Recognition:
Speech recognition is a technology which enables a machine to understand spoken
language and translate it into a machine-readable format. It is also known as automatic
speech recognition or computer speech recognition. It is a way to talk with a
computer, and on the basis of that command, a computer can perform a specific
task.
Some speech recognition software has a limited vocabulary of words and
phrases. This software requires unambiguous spoken language to understand and perform a
specific task. Today there are various software products and devices which contain speech
recognition technology, such as Cortana, Google virtual assistant, Apple Siri, etc.
We need to train our speech recognition system to understand our language. Earlier,
these systems were designed only to convert speech to text, but now there are
various devices which can directly convert speech into commands.
Speech recognition systems can be used in the following areas:
1. Speaker Dependent
2. Speaker Independent
Note: You will study the above topics in detail in later chapters.
The expert system is a part of AI. The first ES was developed in the year 1970 and
was the first successful approach to artificial intelligence. It solves the most complex issues
as an expert would, by extracting the knowledge stored in its knowledge base. The system helps
in decision making for complex problems using both facts and heuristics, like a human
expert. It is called so because it contains the expert knowledge of a specific domain and
can solve complex problems of that particular domain. These systems are designed for a
specific domain, such as medicine, science, etc.
The performance of an expert system is based on the expert's knowledge stored in its
knowledge base. The more knowledge stored in the KB, the better the system
performs. One of the common examples of an ES is the suggestion of spelling corrections while
typing in the Google search box.
Below is the block diagram that represents the working of an expert system:
Note: It is important to remember that an expert system is not used to replace the
human experts; instead, it is used to assist the human in making a complex decision.
These systems do not have human capabilities of thinking and work on the basis of
the knowledge base of the particular domain.
o High Performance: The expert system provides high performance for solving any
type of complex problem of a specific domain with high efficiency and accuracy.
o Understandable: It responds in a way that can be easily understandable by the
user. It can take input in human language and provides the output in the same way.
o Reliable: It is highly reliable in generating efficient and accurate output.
o Highly responsive: ES provides the result for any complex query within a very
short period of time.
o User Interface
o Inference Engine
o Knowledge Base
1. User Interface
With the help of a user interface, the expert system interacts with the user, takes queries as
an input in a readable format, and passes it to the inference engine. After getting the
response from the inference engine, it displays the output to the user. In other words, it is
an interface that helps a non-expert user to communicate with the expert system
to find a solution.
2. Inference Engine
o The inference engine applies inference rules to the knowledge base to
derive a conclusion or deduce new information. It helps in deriving an error-free
solution to queries asked by the user.
o With the help of an inference engine, the system extracts the knowledge from the
knowledge base.
o There are two types of inference engine:
o Deterministic inference engine: The conclusions drawn from this type of
inference engine are assumed to be true. It is based on facts and rules.
o Probabilistic inference engine: This type of inference engine contains uncertainty
in its conclusions and is based on probability.
o The inference engine derives solutions in one of two modes:
o Forward chaining: It starts from the known facts and rules, and applies the
inference rules to add their conclusions to the known facts.
o Backward chaining: It is a backward reasoning method that starts from the goal
and works backward to find the known facts that support it.
3. Knowledge Base
o The knowledge base is a type of storage that stores knowledge acquired from
different experts of the particular domain. It is considered a large store of
knowledge. The larger and more accurate the knowledge base, the more precise the
expert system will be.
o It is similar to a database that contains information and rules of a particular domain
or subject.
o One can also view the knowledge base as a collection of objects and their attributes.
For example, a lion is an object, and its attributes are that it is a mammal, it is not a
domestic animal, etc.
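As a rough sketch of the object-and-attribute view, the knowledge base can be modelled as a dictionary of objects, each holding a dictionary of attributes. The attribute names, the extra "cow" entry, and the helper find_objects are assumptions made for this example only.

```python
# Hypothetical object-attribute store. The "lion" entry echoes the example in the
# text; the "cow" entry is added only to make the query meaningful.
knowledge_base = {
    "lion": {"is_mammal": True, "is_domestic": False},
    "cow": {"is_mammal": True, "is_domestic": True},
}


def find_objects(kb, **required):
    """Return the names of objects whose attributes match every required value."""
    return sorted(
        name
        for name, attrs in kb.items()
        if all(attrs.get(key) == value for key, value in required.items())
    )


print(find_objects(knowledge_base, is_mammal=True, is_domestic=False))
```

A real knowledge base would also store rules alongside such facts, but this view is enough to answer simple attribute queries.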
Development of Expert System
Here, we will explain the working of an expert system by taking the example of the MYCIN ES.
Below are the steps to build a system like MYCIN:
o Firstly, the ES should be fed with expert knowledge. In the case of MYCIN, human
experts specialized in the medical field of bacterial infection provide information
about the causes, symptoms, and other knowledge in that domain.
o Once the KB of MYCIN has been updated, the doctor tests it by providing a new
problem. The problem is to identify the presence of the bacteria by
inputting the details of a patient, including the symptoms, current condition, and
medical history.
o The ES will need a questionnaire to be filled in by the patient to obtain general
information about the patient, such as gender, age, etc.
o Once the system has collected all the information, it will find the solution to the
problem by applying if-then rules using the inference engine and the facts
stored within the KB.
o In the end, it will provide a response to the patient by using the user interface.
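The steps above can be sketched as a small pipeline. This is a simplified illustration under stated assumptions, not MYCIN's actual implementation: the Rule class, the function names, and the placeholder symptoms and organisms (symptom_a, organism_x, and so on) are hypothetical and carry no medical meaning.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict  # attribute name -> value required for the rule to fire
    conclusion: str


# Step 1: knowledge supplied by domain experts (placeholder rules, not medical facts).
KB = [
    Rule({"symptom_a": True, "symptom_b": True}, "organism_x suspected"),
    Rule({"symptom_c": True, "age_group": "adult"}, "organism_y suspected"),
]


def collect_patient_facts(answers):
    """Steps 2-3: turn the patient details and questionnaire answers into facts."""
    return dict(answers)


def infer(facts, kb):
    """Step 4: apply every if-then rule whose conditions all hold for the facts."""
    return [
        rule.conclusion
        for rule in kb
        if all(facts.get(key) == value for key, value in rule.conditions.items())
    ]


def respond(conclusions):
    """Step 5: format the result for display through the user interface."""
    if not conclusions:
        return "No rule matched the supplied details."
    return "Findings: " + "; ".join(conclusions)


answers = {
    "symptom_a": True,
    "symptom_b": True,
    "symptom_c": False,
    "age_group": "adult",
}
print(respond(infer(collect_patient_facts(answers), KB)))
```

Keeping the rules in KB separate from the infer function mirrors the split between knowledge base and inference engine: the rules can be updated without touching the reasoning code.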
Before using any technology, we should understand why it is needed, and the same holds
for the ES. Since human experts are available in every field, why develop a
computer-based system at all? The points below describe
the need for the ES:
1. No memory limitations: It can store as much data as required and can recall
it whenever it is needed. Human experts, however, are limited in how much they
can remember at any given time.
2. High Efficiency: If the knowledge base is updated with the correct knowledge, then
it provides a highly efficient output, which may not be possible for a human.
3. Expertise in a domain: There are lots of human experts in each domain, and they
all have different skills and different experiences, so it is not easy to
get a final output for the query. But if we put the knowledge gained from human
experts into the expert system, then it provides an efficient output by combining all the
facts and knowledge.
4. Not affected by emotions: These systems are not affected by human conditions
such as fatigue, anger, depression, anxiety, etc. Hence the performance remains
constant.
5. High security: These systems provide high security to resolve any query.
6. Considers all the facts: To respond to any query, it checks and considers all the
available facts and provides the result accordingly. A human
expert, however, may overlook some facts for one reason or another.
7. Regular updates improve the performance: If there is an issue in the result
provided by the expert system, we can improve the performance of the system by
updating the knowledge base.
Capabilities of the Expert System
o Advising: It is capable of advising a human on any query within the domain
of the particular ES.
o Provide decision-making capabilities: It provides the capability of decision
making in its domain, such as making financial decisions, decisions in
medical science, etc.
o Demonstrate a device: It is capable of demonstrating a new product, including its
features, specifications, how to use the product, etc.
o Problem-solving: It has problem-solving capabilities.
o Explaining a problem: It is also capable of providing a detailed description of an
input problem.
o Interpreting the input: It is capable of interpreting the input given by the user.
o Predicting results: It can be used for the prediction of a result.
o Diagnosis: An ES designed for the medical field is capable of diagnosing a disease
without using multiple components as it already contains various inbuilt medical
tools.
Limitations of Expert System
o Unlike a human being, it cannot produce a creative output for different scenarios.
o Its maintenance and development costs are very high.
o Knowledge acquisition for designing it is very difficult.
o For each domain, we require a specific ES, which is one of its big limitations.
o It cannot learn by itself and hence requires manual updates.