Inference in First Order Logic
JSS MAHAVIDYAPEETHA
The following are the basic inference rules for quantifiers in first-order logic:
o Universal Generalization
o Universal Instantiation
o Existential Instantiation
o Existential Introduction
1. Universal Generalization:
Universal Generalization is a valid inference rule which states that if premise P(c)
is true for an arbitrary element c of the universe of discourse, then we can conclude
∀x P(x).
Module 4: Inference in First Order Logic
2. Universal Instantiation:
Universal Instantiation is a valid inference rule which states that from ∀x P(x) we
can infer P(c) for any ground term c; that is, any object of the universe of discourse
may be substituted for the universally quantified variable. So from a universally
quantified sentence we can infer any of its ground instances using Universal
Instantiation.
3. Existential Instantiation:
o This rule states that one can infer P(c) from the formula given in the form of ∃x P(x)
for a new constant symbol c.
o The restriction with this rule is that the c used in the rule must be a new constant
symbol that does not appear elsewhere in the knowledge base.
Example:
From ∃x Crown(x) ∧ OnHead(x, John) we can infer Crown(K) ∧ OnHead(K, John), as
long as K does not appear elsewhere in the knowledge base.
In the rule for Existential Instantiation, the variable is replaced by a single new constant
symbol.
The formal statement is as follows: for any sentence α, variable v, and constant symbol k
that does not appear elsewhere in the knowledge base,
from ∃v α, infer SUBST({v/k}, α).
Thus, instantiating with the new constant C1 gives Crown(C1) ∧ OnHead(C1, John), as
long as C1 does not appear elsewhere in the knowledge base.
Basically, the existential sentence says there is some object satisfying a condition, and
applying the existential instantiation rule just gives a name to that object. Of course, that
name must not already belong to another object.
4. Existential Introduction
o This rule states that if there is some element c in the universe of discourse which has
a property P, then we can infer that there exists something in the universe which has
the property P; that is, from P(c) we can infer ∃x P(x).
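The quantifier rules are easiest to see in a finite universe, where ∀ reduces to a conjunction and ∃ to a disjunction over the elements. The sketch below (the names `universe` and `P` are invented purely for illustration) checks Universal Instantiation and Existential Introduction directly; the other two rules are noted in the comments.

```python
# A finite-universe illustration of the quantifier rules. The universe and
# the predicate P are made-up names for this sketch only.
universe = ["a", "b", "c"]
P = {"a": True, "b": True, "c": True}  # P holds of every element

def forall(pred):
    # In a finite domain, "for all x, P(x)" is a conjunction over the domain.
    return all(pred(x) for x in universe)

def exists(pred):
    # Likewise, "there exists x, P(x)" is a disjunction over the domain.
    return any(pred(x) for x in universe)

# Universal Instantiation: from forall x P(x), infer P(c) for any constant c.
if forall(lambda x: P[x]):
    assert P["b"]

# Existential Introduction: from P(c) for a known c, infer exists x P(x).
if P["a"]:
    assert exists(lambda x: P[x])

# Universal Generalization and Existential Instantiation go the other way:
# UG concludes forall x P(x) from P(c) for an *arbitrary* c, and EI names
# the witness of exists x P(x) with a *fresh* constant not used elsewhere.
```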
Generalized Modus Ponens is a lifted version of Modus Ponens: it raises Modus Ponens
from ground (variable-free) propositional logic to first-order logic. The key advantage of
lifted inference rules over propositionalization is that they make only those substitutions
that are required to allow particular inferences to proceed.
Unification:
Unification means making different logical expressions look identical. This is done through
the process of substitution.
Lifted inference rules require finding substitutions that make different logical expressions
look identical. This process is called unification and is a key component of all first-order
inference algorithms. The UNIFY algorithm takes two sentences and returns a unifier for
them if one exists:
UNIFY(p, q)= θ where SUBST(θ, p)= SUBST(θ, q) .
Here are some examples of how UNIFY should behave.
A query AskVars(Knows(John, x)): whom does John know? Answers to this query can be
found by finding all sentences in the knowledge base that unify with Knows(John, x).
The results of unification with four different sentences that might be in the knowledge base:
UNIFY(Knows(John, x), Knows(John, Jane)) = {x/Jane}
UNIFY(Knows(John, x), Knows(y, Bill)) = {x/Bill, y/John}
UNIFY(Knows(John, x), Knows(y, Mother(y))) = {y/John, x/Mother(John)}
UNIFY(Knows(John, x), Knows(x, Elizabeth)) = fail.
The last unification fails because x cannot take on the values John and Elizabeth at the same
time.
The last failure is unsatisfying: Knows(x, Elizabeth) means "Everyone knows Elizabeth," so
we should be able to infer that John knows Elizabeth.
The problem arises only because the two sentences happen to use the same variable name,
x. The problem can be avoided by standardizing apart one of the two sentences being
unified, which means renaming its variables to avoid name clashes. For example, rename x
in Knows(x,Elizabeth) to x1 (a new variable name) without changing its meaning. Now the
unification will work: UNIFY(Knows(John, x), Knows(x1,Elizabeth)) = {x/Elizabeth,
x1/John} .
UNIFY should return a substitution that makes the two arguments look the same. But there
could be more than one such unifier. For example, UNIFY(Knows(John, x), Knows(y, z))
could return {y/John, x/z} or {y/John, x/John, z/John}. The first unifier gives Knows(John,
z) as the result of unification, whereas the second gives Knows(John, John). The second
result could be obtained from the first by an additional substitution {z/John}; we say that
the first unifier is more general than the second, because it places fewer restrictions on the
values of the variables. It turns out that, for every unifiable pair of expressions, there is a
single most general unifier (MGU) that is unique up to
renaming and substitution of variables. (For example, {x/John} and {y/John} are
considered equivalent, as are {x/John, y/John} and {x/John, y/x}.) In this case it is {y/John,
x/z}.
An algorithm for computing most general unifiers
The algorithm works by comparing the structures of the inputs, element by element. The
substitution θ that is the argument to UNIFY is built up along the way and is used to make
sure that later comparisons are consistent with bindings that were established earlier. In a
compound expression such as F(A, B), the OP field picks out the function symbol F and the
ARGS field picks out the argument list (A, B).
The process is simple: recursively explore the two expressions simultaneously “side by
side,” building up a unifier along the way, but failing if two corresponding points in the
structures do not match. There is one expensive step: when matching a variable against a
complex term, one must check whether the variable itself occurs inside the term; if it does,
the match fails because no consistent unifier can be constructed. For example, S(x) cannot
unify with S(S(x)). This so-called occur check makes the complexity of the entire algorithm
quadratic in the size of the expressions being unified.
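The algorithm just described can be sketched in Python. The term encoding below (variables as lowercase strings, constants as capitalized strings, a compound term F(A, B) as the tuple `('F', 'A', 'B')` with the OP first and the ARGS after it) is an assumption of this sketch, not part of the original algorithm.

```python
# A minimal UNIFY with the occur check, following
# UNIFY(p, q) = theta where SUBST(theta, p) = SUBST(theta, q).

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def occurs(v, t, s):
    """Occur check: does variable v appear inside term t under subst s?"""
    if t == v:
        return True
    if is_var(t) and t in s:
        return occurs(v, s[t], s)
    if isinstance(t, tuple):
        return any(occurs(v, arg, s) for arg in t[1:])
    return False

def unify(p, q, s=None):
    """Return a most general unifier extending s, or None on failure."""
    if s is None:
        s = {}
    if p == q:
        return s
    if is_var(p):
        return unify_var(p, q, s)
    if is_var(q):
        return unify_var(q, p, s)
    if isinstance(p, tuple) and isinstance(q, tuple) and len(p) == len(q):
        for a, b in zip(p, q):        # OP first, then ARGS element by element
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def unify_var(v, t, s):
    if v in s:
        return unify(s[v], t, s)
    if is_var(t) and t in s:
        return unify(v, s[t], s)
    if occurs(v, t, s):               # e.g. x cannot unify with S(x)
        return None
    return {**s, v: t}

def subst(s, t):
    """SUBST: apply substitution s to term t, following binding chains."""
    if is_var(t) and t in s:
        return subst(s, s[t])
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(s, a) for a in t[1:])
    return t

# The Knows examples from the text:
a, b = ('Knows', 'John', 'x'), ('Knows', 'y', ('Mother', 'y'))
theta = unify(a, b)
assert subst(theta, a) == subst(theta, b) == ('Knows', 'John', ('Mother', 'John'))
assert unify(('Knows', 'John', 'x'), ('Knows', 'x', 'Elizabeth')) is None  # clash on x
assert unify('x', ('S', 'x')) is None                                      # occur check
```

Note that the stored bindings may be triangular (x bound to Mother(y), y bound to John); applying SUBST resolves the chain, which is exactly why the defining property SUBST(θ, p) = SUBST(θ, q) holds on the asserted example.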
What is Unification?
For example, the substitution θ = {John/x} is a unifier for the atoms P(x) and P(John):
after applying it, both expressions are identical.
o The UNIFY algorithm is used for unification, which takes two atomic sentences and
returns a unifier for those sentences (If any exist).
o Unification is a key component of all first-order inference algorithms.
o It returns fail if the expressions do not match with each other.
o The substitution returned by UNIFY is called the Most General Unifier (MGU) when it
places the fewest possible restrictions on the values of the variables.
E.g., let's say there are two different expressions, P(x, y) and P(a, f(z)).
In this example, we need to make both statements identical to each other. For this, we
will perform a substitution.
o Substitute x with a, and y with f(z) in the first expression, and it will be represented
as a/x and f(z)/y.
o With both the substitutions, the first expression will be identical to the second
expression and the substitution set will be: [a/x, f(z)/y].
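The substitution step can be sketched directly; the tuple encoding of terms below is invented for illustration, with 'x' and 'y' as the variables and 'a' and 'z' standing for the other symbols of the example.

```python
# Applying the substitution set [a/x, f(z)/y] to P(x, y).

theta = {"x": "a", "y": ("f", "z")}   # a/x and f(z)/y

def apply_subst(theta, term):
    if isinstance(term, tuple):       # compound term: substitute in arguments
        return (term[0],) + tuple(apply_subst(theta, t) for t in term[1:])
    return theta.get(term, term)      # variable (replaced) or constant (kept)

p1 = ("P", "x", "y")                  # P(x, y)
p2 = ("P", "a", ("f", "z"))           # P(a, f(z))
assert apply_subst(theta, p1) == p2   # the two expressions are now identical
```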
Unification Algorithm:
For each pair of the following atomic sentences, find the most general unifier (if one
exists).
5. Find the MGU of Q(a, g(x, a), f(y)) and Q(a, g(f(b), a), x).
SUBST θ = {f(b)/x, b/y}
S1 => {Q(a, g(f(b), a), f(b)); Q(a, g(f(b), a), f(b))}, successfully unified.
Predicate indexing makes it possible to fetch just those facts that unify with the query.
For other queries, such as Employs(IBM, y), we would need to have indexed the facts by
combining the predicate with the first argument.
Therefore, facts can be stored under multiple index keys, rendering them instantly
accessible to various queries that they might unify with.
Given a sentence to be stored, it is possible to construct indices for all possible queries that
unify with it. For the fact Employs(IBM , Richard), the queries are
Employs(IBM, Richard): Does IBM employ Richard?
Employs(x, Richard): Who employs Richard?
Employs(IBM, y): Whom does IBM employ?
Employs(x, y): Who employs whom?
These queries form a subsumption lattice.
(a) The subsumption lattice whose lowest node is Employs(IBM, Richard).
(b) The subsumption lattice for the sentence Employs(John, John).
The lattice has some interesting properties. For example, the child of any node in the lattice
is obtained from its parent by a single substitution; and the “highest” common descendant
of any two nodes is the result of applying their most general unifier.
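The multiple-key storage scheme can be sketched directly: each stored fact is indexed under every node of its subsumption lattice, so any of the four queries retrieves it in a single lookup. The `'_'` wildcard and the function names here are conventions invented for this sketch.

```python
# Storing Employs(IBM, Richard) under all four query keys of its
# subsumption lattice; '_' marks a variable position in a query.
from collections import defaultdict

index = defaultdict(list)

def store(pred, a, b):
    fact = (pred, a, b)
    # one index key per node of the subsumption lattice
    for key in [(pred, a, b), (pred, '_', b), (pred, a, '_'), (pred, '_', '_')]:
        index[key].append(fact)

def fetch(pred, a='_', b='_'):
    return index.get((pred, a, b), [])

store('Employs', 'IBM', 'Richard')
assert fetch('Employs', 'IBM', 'Richard')  # Does IBM employ Richard?
assert fetch('Employs', '_', 'Richard')    # Who employs Richard?
assert fetch('Employs', 'IBM')             # Whom does IBM employ?
assert fetch('Employs')                    # Who employs whom?
```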
Forward Chaining:
A forward-chaining algorithm is simple: start with the atomic sentences in the knowledge
base and apply Modus Ponens in the forward direction, adding new atomic sentences, until
no further inferences can be made.
Consider the following problem: The law says that it is a crime for an American to sell
weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and
all of its missiles were sold to it by Colonel West, who is American. Prove that West is a
criminal.
First, represent these facts as first-order definite clauses.
“. . . it is a crime for an American to sell weapons to hostile nations”:
American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x) . (1)
The first forward-chaining algorithm: Starting from the known facts, it triggers all the rules
whose premises are satisfied, adding their conclusions to the known facts. The process
repeats until the query is answered (assuming that just one answer is required) or no new
facts can be added.
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations.
Country A, an enemy of America, has some missiles, and all the missiles were sold to
it by Robert, who is an American citizen."
To solve the above problem, first, we will convert all the above facts into first-order definite
clauses, and then we will use a forward-chaining algorithm to reach the goal.
Step-1:
In the first step we will start with the known facts and will choose the sentences which do
not have implications, such as: American(Robert), Enemy(A, America), Owns(A, T1),
and Missile(T1). All these facts will be represented as below.
Step-2:
In the second step, we consider the rules whose premises are satisfied by the available
facts and add their conclusions.
Rule-(1) does not yet have its premises satisfied, so it adds nothing in the first iteration.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added; it
follows from the conjunction of facts (2) and (3).
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added; it follows from
fact (7).
At step-3, Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add
Criminal(Robert), which follows from all the available facts. Hence we have reached our
goal statement.
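The steps above can be run mechanically. The sketch below is a toy forward chainer; the clause set reconstructs the standard rules for this exercise (the page listing the remaining rules is not shown here, so the exact numbering in the comments is an assumption), and the lowercase-variable tuple encoding is invented for the sketch.

```python
# A toy forward chainer for the Robert / Country A example. Variables are
# lowercase strings; facts are ground tuples.

def is_var(t):
    return isinstance(t, str) and t.islower()

def subst(s, t):
    if is_var(t):
        return s.get(t, t)
    if isinstance(t, tuple):
        return tuple(subst(s, x) for x in t)
    return t

def match(pattern, fact, s):
    """One-way matching of a pattern against a ground fact."""
    if is_var(pattern):
        if pattern in s:
            return s if s[pattern] == fact else None
        return {**s, pattern: fact}
    if isinstance(pattern, tuple):
        if not isinstance(fact, tuple) or len(pattern) != len(fact):
            return None
        for p, f in zip(pattern, fact):
            s = match(p, f, s)
            if s is None:
                return None
        return s
    return s if pattern == fact else None

def match_all(premises, facts, s):
    """Yield every substitution satisfying all premises against the facts."""
    if not premises:
        yield s
        return
    for f in facts:
        s2 = match(subst(s, premises[0]), f, s)
        if s2 is not None:
            yield from match_all(premises[1:], facts, s2)

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                    # repeat until a fixed point is reached
        changed = False
        for premises, conclusion in rules:
            for s in list(match_all(premises, facts, {})):
                new = subst(s, conclusion)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

facts = [('American', 'Robert'), ('Enemy', 'A', 'America'),
         ('Owns', 'A', 'T1'), ('Missile', 'T1')]
rules = [
    ([('American', 'p'), ('Weapon', 'q'), ('Sells', 'p', 'q', 'r'),
      ('Hostile', 'r')], ('Criminal', 'p')),                 # Rule-(1)
    ([('Missile', 'p'), ('Owns', 'A', 'p')],
     ('Sells', 'Robert', 'p', 'A')),                         # Rule-(4)
    ([('Missile', 'p')], ('Weapon', 'p')),                   # Rule-(5)
    ([('Enemy', 'p', 'America')], ('Hostile', 'p')),         # Rule-(6)
]
result = forward_chain(facts, rules)
assert ('Criminal', 'Robert') in result   # the goal is derived
```

The run adds Weapon(T1), Sells(Robert, T1, A), and Hostile(A) in the first iteration and Criminal(Robert) in the next, matching the hand trace above.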
Notice that no new inferences are possible because every sentence that could be concluded
by forward chaining is already contained explicitly in the KB. Such a knowledge base is
called a fixed point of the inference process. Fixed points reached by forward chaining with
first-order definite clauses are similar to those for propositional forward chaining. The
principal difference is that a first order fixed point can include universally quantified atomic
sentences.
Performance Analysis
First, forward chaining with first-order definite clauses is sound, because every inference
is just an application of Generalized Modus Ponens, which is sound. Second, it is complete
for definite clause knowledge bases; that is, it answers every query whose answers are
entailed by any knowledge base of definite clauses. For Datalog knowledge bases, which
contain no function symbols, the proof of completeness is fairly easy.
The forward-chaining algorithm is designed for ease of understanding rather than for
efficiency of operation. There are three possible sources of inefficiency.
First, the “inner loop” of the algorithm involves finding all possible unifiers such that the
premise of a rule unifies with a suitable set of facts in the knowledge base. This is often
called pattern matching and can be very expensive.
Second, the algorithm rechecks every rule on every iteration to see whether its premises
are satisfied, even if very few additions are made to the knowledge base on each iteration.
Finally, the algorithm might generate many facts that are irrelevant to the goal. We address
each of these issues in turn.
Backward chaining
Backward-chaining algorithms work backward from the goal, chaining through rules to
find known facts that support the proof. Backward chaining is also known as backward
deduction or backward reasoning when used with an inference engine: the algorithm
starts with the goal and works backward through the rules to the known facts that
support it.
A backward-chaining algorithm
FOL-BC-ASK(KB, goal) will be proved if the knowledge base contains a clause of the form
lhs ⇒ goal, where lhs (left-hand side) is a list of conjuncts. An atomic fact like
American(West) is considered a clause whose lhs is the empty list. FOL-BC-ASK is
implemented as a generator: a function that returns multiple times, each time giving one
possible result.
Backward chaining is a kind of AND/OR search—the OR part because the goal query can be
proved by any rule in the knowledge base, and the AND part because all the conjuncts in
the lhs of a clause must be proved. FOL-BC-OR works by fetching all clauses that might unify
with the goal, standardizing the variables in the clause to be brand-new variables, and then,
if the rhs of the clause does indeed unify with the goal, proving every conjunct in the lhs,
using FOL-BC-AND. That function in turn works by proving each of the conjuncts in turn,
keeping track of the accumulated substitution.
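The AND/OR structure can be sketched as a pair of mutually recursive generators. The clause encoding ((lhs, rhs) pairs, variables as lowercase strings) is an assumption of this sketch; the crime knowledge base is included so the query can actually run, and standardizing apart is done by appending a fresh suffix to every variable in a clause.

```python
# FOL-BC-OR / FOL-BC-AND as generators. A clause is (lhs, rhs), where lhs
# is a list of conjuncts and an atomic fact has lhs == [].
import itertools

counter = itertools.count()

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def subst(s, t):
    if is_var(t) and t in s:
        return subst(s, s[t])
    if isinstance(t, tuple):
        return tuple(subst(s, x) for x in t)
    return t

def unify(p, q, s):
    if p == q:
        return s
    if is_var(p):
        return unify_var(p, q, s)
    if is_var(q):
        return unify_var(q, p, s)
    if isinstance(p, tuple) and isinstance(q, tuple) and len(p) == len(q):
        for a, b in zip(p, q):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def unify_var(v, t, s):                   # occur check omitted for brevity
    if v in s:
        return unify(s[v], t, s)
    return {**s, v: t}

def rename(t, suffix):
    """Standardize apart: give every variable in t a brand-new name."""
    if is_var(t):
        return t + suffix
    if isinstance(t, tuple):
        return tuple(rename(x, suffix) for x in t)
    return t

def fol_bc_or(kb, goal, theta):
    # OR: the goal can be proved by any clause whose rhs unifies with it.
    for lhs, rhs in kb:
        suffix = '_%d' % next(counter)
        theta2 = unify(rename(rhs, suffix), subst(theta, goal), dict(theta))
        if theta2 is not None:
            yield from fol_bc_and(kb, [rename(p, suffix) for p in lhs], theta2)

def fol_bc_and(kb, goals, theta):
    # AND: every conjunct in the lhs must be proved in turn.
    if not goals:
        yield theta
        return
    for theta2 in fol_bc_or(kb, subst(theta, goals[0]), theta):
        yield from fol_bc_and(kb, goals[1:], theta2)

KB = [
    ([('American', 'p'), ('Weapon', 'q'), ('Sells', 'p', 'q', 'r'),
      ('Hostile', 'r')], ('Criminal', 'p')),
    ([('Missile', 'p'), ('Owns', 'A', 'p')], ('Sells', 'Robert', 'p', 'A')),
    ([('Missile', 'p')], ('Weapon', 'p')),
    ([('Enemy', 'p', 'America')], ('Hostile', 'p')),
    ([], ('American', 'Robert')),
    ([], ('Owns', 'A', 'T1')),
    ([], ('Missile', 'T1')),
    ([], ('Enemy', 'A', 'America')),
]

# the goal query is proved if the generator yields at least one substitution
assert next(fol_bc_or(KB, ('Criminal', 'Robert'), {}), None) is not None
```

Since no clause here is recursive, the depth-first search always terminates; in general, backward chaining on recursive rules can loop.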
o The backward-chaining method mostly uses a depth-first search strategy for proofs.
Proof tree constructed by backward chaining to prove that West is a criminal. The tree
should be read depth first, left to right. To prove Criminal(West), we have to prove the four
conjuncts below it. Some of these are in the knowledge base, and others require further
backward chaining. Bindings for each successful unification are shown next to the
corresponding subgoal. Note that once one subgoal in a conjunction succeeds, its
substitution is applied to subsequent subgoals. Thus, by the time FOL-BC-ASK gets to the
last conjunct, originally Hostile(z), z is already bound to Nono.
Backward-Chaining proof:
In Backward chaining, we will start with our goal predicate, which is Criminal(Robert),
and then infer further rules.
Step-1:
In the first step, we take the goal fact. From the goal fact we will infer other facts, and at
last we will prove those facts true. Our goal fact is "Robert is Criminal," so its predicate
form is Criminal(Robert).
Step-2:
In the second step, we infer other facts from the goal fact which satisfy the rules. As we
can see in Rule-(1), the goal predicate Criminal(Robert) is present with the substitution
{Robert/p}. So we add all the conjunctive facts below the first level and replace p with
Robert.
Step-3: At step-3, we extract the further fact Missile(q), which is inferred from
Weapon(q), as it satisfies Rule-(5). Weapon(q) is also true with the substitution of the
constant T1 for q.
Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r),
which satisfies Rule-(4) with the substitution of A in place of r. So these two statements
are proved here.
Step-5: At step-5, the subgoal Hostile(A) is reduced to the known fact Enemy(A, America),
which satisfies Rule-(6). Hence all the statements are proved true using backward
chaining.
Following is the difference between the forward chaining and backward chaining:
o Forward chaining, as the name suggests, starts from the known facts and moves
forward by applying inference rules to extract more data, continuing until it reaches
the goal, whereas backward chaining starts from the goal and moves backward,
using inference rules to determine the facts that satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas backward
chaining is called a goal-driven inference technique.
o Forward chaining is known as the bottom-up approach, whereas backward chaining
is known as the top-down approach.
o Forward chaining uses breadth-first search strategy, whereas backward chaining
uses depth-first search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process
monitoring, diagnosis, and classification, whereas backward chaining can be used
for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining tries
to avoid the unnecessary path of reasoning.
o In forward-chaining there can be various ASK questions from the knowledge base,
whereas in backward chaining there can be fewer ASK questions.
o Forward chaining is slow as it checks for all the rules, whereas backward chaining is
fast as it checks few required rules only.
o Forward chaining tests all the available rules, whereas backward chaining tests only
the few required rules.
o Forward chaining is aimed at reaching any conclusion, whereas backward chaining
is aimed only at the required data.
Resolution in FOL
Resolution
Resolution is a theorem-proving technique that proceeds by building refutation proofs,
i.e., proofs by contradiction. It is used when various statements are given and we need to
prove a conclusion from those statements. Unification is a key concept in proofs by
resolution.
Resolution is a single inference rule which can efficiently operate on the conjunctive
normal form or clausal form.
Conjunctive Normal Form: A sentence represented as a conjunction of clauses is said to
be conjunctive normal form or CNF.
The resolution rule for first-order logic is simply a lifted version of the propositional rule.
Resolution can resolve two clauses if they contain complementary literals, which are
assumed to be standardized apart so that they share no variables.
This rule is also called the binary resolution rule because it only resolves exactly two
literals.
Example:
Here the two complementary literals are Loves(f(x), x) and ¬Loves(a, b).
These literals can be unified with the unifier θ = [a/f(x), b/x], which generates the
resolvent clause.
Example:
a. John likes all kind of food.
b. Apple and vegetable are food
c. Anything anyone eats and not killed is food.
d. Anil eats peanuts and still alive
e. Harry eats everything that Anil eats.
Prove by resolution that:
f. John likes peanuts.
In the first step, we convert all the given statements into first-order logic.
For resolution in first-order logic, it is required to convert the FOL statements into CNF,
as the CNF form makes resolution proofs easier.
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
g. ¬ eats(Anil, w) V eats(Harry, w)
h. killed(g) V alive(g)
i. ¬ alive(k) V ¬ killed(k)
j. likes(John, Peanuts).
o Distribute conjunction ∧ over disjunction ∨.
This step will not make any change in this problem.
o In this step, we apply negation to the conclusion statement, which will be written
as ¬likes(John, Peanuts).
Now in this step, we will solve the problem by resolution tree using substitution. For the
above problem, it will be given as follows:
Hence the negation of the conclusion has been proved as a complete contradiction with the
given set of statements.
o In the first step of the resolution graph, ¬likes(John, Peanuts) and likes(John, x)
get resolved by the substitution {Peanuts/x}, and we are left with ¬food(Peanuts).
o In the second step of the resolution graph, ¬food(Peanuts) and food(z) get
resolved (canceled) by the substitution {Peanuts/z}, and we are left with
¬eats(y, Peanuts) ∨ killed(y).
o In the third step of the resolution graph, ¬ eats(y, Peanuts) and eats (Anil,
Peanuts) get resolved by substitution {Anil/y}, and we are left with Killed(Anil) .
o In the fourth step of the resolution graph, killed(Anil) and ¬killed(k) get resolved
by the substitution {Anil/k}, and we are left with ¬alive(Anil).
o In the last step of the resolution graph ¬ alive(Anil) and alive(Anil) get resolved.
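Once the substitutions above are applied, every clause is ground, so each step is a simple propositional resolution on complementary literals. The sketch below replays the five steps; the string encoding (with `~` standing in for ¬) is invented for this illustration.

```python
# Replaying the peanuts refutation with ground clauses. A clause is a
# frozenset of literal strings; '~' marks negation.

def resolve(c1, c2):
    """All resolvents of two ground clauses on complementary literals."""
    resolvents = []
    for lit in c1:
        comp = lit[1:] if lit.startswith('~') else '~' + lit
        if comp in c2:
            resolvents.append((c1 - {lit}) | (c2 - {comp}))
    return resolvents

C = frozenset
s1 = resolve(C({'~likes(John,Peanuts)'}),
             C({'~food(Peanuts)', 'likes(John,Peanuts)'}))[0]
assert s1 == C({'~food(Peanuts)'})
s2 = resolve(s1, C({'~eats(Anil,Peanuts)', 'killed(Anil)', 'food(Peanuts)'}))[0]
s3 = resolve(s2, C({'eats(Anil,Peanuts)'}))[0]
assert s3 == C({'killed(Anil)'})
s4 = resolve(s3, C({'~alive(Anil)', '~killed(Anil)'}))[0]
s5 = resolve(s4, C({'alive(Anil)'}))[0]
assert s5 == C(set())   # the empty clause: a contradiction is derived
```

Deriving the empty clause confirms that the negated goal ¬likes(John, Peanuts) contradicts the knowledge base, so likes(John, Peanuts) holds.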