AI Final

The document discusses different techniques for knowledge representation in artificial intelligence, including logical representation, semantic network representation, frame representation, and production rule representation. It provides details on each technique, such as how knowledge is structured and represented, examples to illustrate each technique, and advantages and disadvantages of each approach. The key goal of knowledge representation is to represent information about the real world in a way that enables an AI system to utilize that knowledge to solve problems and behave intelligently.


Knowledge and Reasoning

Knowledge: Knowledge is awareness or familiarity gained through experience of facts, data, and situations.

Reasoning is the mental process of deriving logical conclusions and making predictions from available knowledge, facts, and beliefs. In other words, "Reasoning is a way to infer facts from existing data." It is the general process of thinking rationally to find valid conclusions.

In artificial intelligence, reasoning is essential so that a machine can think rationally, as a human brain does, and can perform like a human.

What is knowledge representation?

Humans are best at understanding, reasoning, and interpreting knowledge. Humans know things (knowledge), and based on that knowledge they perform various actions in the real world. How machines do all these things comes under knowledge representation and reasoning. Hence we can describe knowledge representation as follows:

Knowledge representation and reasoning (KR, KRR) is the part of artificial intelligence concerned with how AI agents think and how thinking contributes to their intelligent behavior.

It is responsible for representing information about the real world so that a computer can understand it and can utilize this knowledge to solve complex real-world problems, such as diagnosing a medical condition or communicating with humans in natural language.

It also describes how we can represent knowledge in artificial intelligence. Knowledge representation is not just storing data in some database; it also enables an intelligent machine to learn from that knowledge and experience so that it can behave intelligently like a human.

What to Represent:

Following are the kind of knowledge which needs to be represented in AI systems:

Object: All the facts about objects in our world domain. E.g., guitars contain strings; trumpets are brass instruments.

Events: Events are the actions which occur in our world.

Performance: It describes behavior which involves knowledge about how to do things.

Meta-knowledge: It is knowledge about what we know.


Facts: Facts are the truths about the real world and what we represent.

Knowledge-Base: The central component of a knowledge-based agent is its knowledge base, represented as KB. A knowledge base is a group of sentences (here "sentence" is used as a technical term; it is not identical to a sentence in English).

Techniques of knowledge representation

There are mainly four ways of knowledge representation which are given as follows:

 Logical Representation
 Semantic Network Representation
 Frame Representation
 Production Rules

1. Logical Representation

Logical representation is a language with some concrete rules which deals with propositions and has no ambiguity in representation. Logical representation means drawing conclusions based on various conditions. This representation lays down some important communication rules. It consists of precisely defined syntax and semantics which support sound inference. Each sentence can be translated into logic using syntax and semantics.

Syntax:

 Syntax consists of the rules which decide how we can construct legal sentences in the logic.
 It determines which symbols we can use in knowledge representation.
 It also determines how to write those symbols.

Semantics:

 Semantics are the rules by which we can interpret the sentence in the logic.
 Semantic also involves assigning a meaning to each sentence.

Logical representation can be categorized into mainly two logics:

1. Propositional Logics
2. Predicate logics

Advantages of logical representation:

 Logical representation enables us to do logical reasoning.
 Logical representation is the basis for programming languages.
Disadvantages of logical Representation:

 Logical representations have some restrictions and are challenging to work with.
 Logical representation technique may not be very natural, and inference may not be so
efficient.

2. Semantic Network Representation

Semantic networks are an alternative to predicate logic for knowledge representation. In semantic networks, we can represent our knowledge in the form of graphical networks. Such a network consists of nodes representing objects and arcs which describe the relationships between those objects. Semantic networks can categorize objects in different forms and can also link those objects. Semantic networks are easy to understand and can be easily extended.

This representation consists of mainly two types of relations:

1. IS-A relation (inheritance)
2. Kind-of relation

Example: Following are some statements which we need to represent in the form of nodes and
arcs.

Statements:

Jerry is a cat.

Jerry is a mammal.

Jerry is owned by Priya.

Jerry is brown colored.

All mammals are animals.


In the above diagram, we have represented different types of knowledge in the form of nodes and arcs. Each object is connected with another object by some relation.
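To make this concrete, here is a minimal sketch in Python of how these statements could be stored as nodes and labeled arcs and queried through the IS-A hierarchy (the helper names add_fact and is_a are illustrative assumptions, not part of any standard library):

from collections import defaultdict

# A tiny semantic network: nodes are strings, arcs are labeled edges.
arcs = defaultdict(list)  # node -> list of (relation, node) pairs

def add_fact(subject, relation, obj):
    arcs[subject].append((relation, obj))

# The statements above ("Jerry is a mammal" then follows by inheritance):
add_fact("Jerry", "is-a", "Cat")
add_fact("Cat", "is-a", "Mammal")
add_fact("Mammal", "is-a", "Animal")
add_fact("Jerry", "owned-by", "Priya")
add_fact("Jerry", "has-color", "Brown")

def is_a(node, category):
    # Follow IS-A arcs transitively, i.e. traverse the inheritance chain.
    return any(rel == "is-a" and (tgt == category or is_a(tgt, category))
               for rel, tgt in arcs[node])

print(is_a("Jerry", "Animal"))  # True: Jerry -> Cat -> Mammal -> Animal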

Drawbacks in Semantic representation:

Semantic networks take more computational time at runtime as we need to traverse the
complete network tree to answer some questions. It might be possible in the worst case
scenario that after traversing the entire tree, we find that the solution does not exist in this
network.

Semantic networks try to model human-like memory (which has on the order of 10^15 neurons and links) to store the information, but in practice, it is not possible to build such a vast semantic network.

These types of representations are inadequate as they do not have any equivalent quantifier,
e.g., for all, for some, none, etc.

Semantic networks do not have any standard definition for the link names.

These networks are not intelligent and depend on the creator of the system.

Advantages of Semantic network:

Semantic networks are a natural representation of knowledge.

Semantic networks convey meaning in a transparent manner.

These networks are simple and easily understandable.


3. Frame Representation

A frame is a record-like structure which consists of a collection of attributes and their values to describe an entity in the world. Frames are the AI data structure which divides knowledge into substructures by representing stereotyped situations. A frame consists of a collection of slots and slot values. These slots may be of any type and size. Slots have names and values, which are called facets.

Facets: The various aspects of a slot are known as facets. Facets are features of frames which enable us to put constraints on the frames.

Example: IF-NEEDED facets are called when the data of a particular slot is needed. A frame may consist of any number of slots, a slot may include any number of facets, and facets may have any number of values. A frame is also known as slot-filler knowledge representation in artificial intelligence.

Frames are derived from semantic networks and later evolved into our modern-day classes and objects. A single frame is not much use by itself; a frame system consists of a collection of connected frames. In a frame, knowledge about an object or event can be stored together in the knowledge base. Frames are widely used in various applications, including natural language processing and machine vision.

Example 1:

Let's take an example of a frame for a book:

Slots     Fillers
Title     Artificial Intelligence
Genre     Computer Science
Author    Peter Norvig
Edition   Third Edition
Year      1996
Page      1152

Example 2:

Let's suppose we are taking an entity, Peter. Peter is an engineer by profession, his age is 25, and he lives in the city of London in England. The following is the frame representation for this:

Slots            Fillers
Name             Peter
Profession       Engineer
Age              25
Marital status   Single
Weight           78
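As a minimal sketch (plain Python dictionaries; the helper get_slot is an illustrative assumption), these frames could be represented and queried as follows:

book_frame = {
    "Title": "Artificial Intelligence",
    "Genre": "Computer Science",
    "Author": "Peter Norvig",
}

peter_frame = {
    "Name": "Peter",
    "Profession": "Engineer",
    "Age": 25,
    "Marital status": "Single",
    "Weight": 78,
}

def get_slot(frame, slot, default=None):
    # Defaults make it easy to handle missing values, one of the
    # advantages of frame representation noted below.
    return frame.get(slot, default)

print(get_slot(peter_frame, "Profession"))       # Engineer
print(get_slot(peter_frame, "City", "Unknown"))  # Unknown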

Advantages of frame representation:

The frame knowledge representation makes the programming easier by grouping the related
data.

The frame representation is comparably flexible and used by many applications in AI.

It is very easy to add slots for new attributes and relations.

It is easy to include default data and to search for missing values.

Frame representation is easy to understand and visualize.

Disadvantages of frame representation:

In a frame system, the inference mechanism cannot be easily processed or smoothly carried out.

Frame representation has a very generalized approach.

4. Production Rules

A production rules system consists of (condition, action) pairs, which mean "if condition, then action". It has mainly three parts:

 The set of production rules
 Working memory
 The recognize-act cycle
In a production rules system, the agent checks for the condition, and if the condition holds, the production rule fires and the corresponding action is carried out. The condition part of a rule determines which rule may be applied to a problem, and the action part carries out the associated problem-solving steps. This complete process is called the recognize-act cycle.

The working memory contains the description of the current state of problem solving, and rules can write knowledge to the working memory. This knowledge may then match and fire other rules.

If a new situation (state) is generated, multiple production rules may be triggered together; this set of rules is called the conflict set. In this situation, the agent needs to select a single rule from the set, and this selection is called conflict resolution.

Example:

IF (at bus stop AND bus arrives) THEN action (get into the bus)

IF (on the bus AND paid AND empty seat) THEN action (sit down).

IF (on bus AND unpaid) THEN action (pay charges).

IF (bus arrives at destination) THEN action (get down from the bus).
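A minimal recognize-act cycle over the bus rules above can be sketched in Python as follows (the set-based state encoding and first-match conflict resolution are illustrative assumptions):

# Each rule is a (condition, action) pair over a working-memory state.
rules = [
    (lambda s: "at bus stop" in s and "bus arrives" in s,
     lambda s: s - {"at bus stop"} | {"on bus", "unpaid"}),
    (lambda s: "on bus" in s and "unpaid" in s,
     lambda s: s - {"unpaid"} | {"paid"}),
    (lambda s: "on bus" in s and "paid" in s and "empty seat" in s,
     lambda s: s | {"seated"}),
]

state = {"at bus stop", "bus arrives", "empty seat"}

# Recognize-act cycle: fire the first matching rule until nothing changes.
# (Taking the first match is a very simple conflict-resolution strategy.)
changed = True
while changed:
    changed = False
    for condition, action in rules:
        new_state = action(state) if condition(state) else state
        if new_state != state:
            state, changed = new_state, True
            break

print(sorted(state))  # ['bus arrives', 'empty seat', 'on bus', 'paid', 'seated']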

Advantages of Production rule:

The production rules are expressed in natural language.

The production rules are highly modular, so we can easily remove, add or modify an individual
rule.

Disadvantages of Production rule:

A production rule system does not exhibit any learning capabilities, as it does not store the results of solved problems for future use.

During the execution of a program, many rules may be active; hence, rule-based production systems can be inefficient.

Propositional logic in Artificial intelligence

Propositional logic (PL) is the simplest form of logic where all the statements are made by
propositions. A proposition is a declarative statement which is either true or false. It is a
technique of knowledge representation in logical and mathematical form.
Example:

a) It is Sunday.

b) The Sun rises from West (False proposition)

c) 3+3= 7(False proposition)

d) 5 is a prime number.

Following are some basic facts about propositional logic:

 Propositional logic is also called Boolean logic as it works on 0 and 1.
 In propositional logic, we use symbolic variables to represent the logic, and we can use any symbol to represent a proposition, such as A, B, C, P, Q, R, etc.
 Propositions can be either true or false, but not both.
 Propositional logic is built from propositions and logical connectives; these connectives are also called logical operators.
 The propositions and connectives are the basic elements of propositional logic.
 A connective is a logical operator which connects two sentences.
 A proposition formula which is always true is called a tautology, and it is also called a valid sentence.
 A proposition formula which is always false is called a contradiction.
 A proposition formula which can take both true and false values is called a contingency.
 Statements which are questions, commands, or opinions are not propositions: "Where is Rohini?", "How are you?", and "What is your name?" are not propositions.

Syntax of propositional logic:

The syntax of propositional logic defines the allowable sentences for the knowledge
representation. There are two types of Propositions:

 Atomic Propositions
 Compound propositions

Atomic Proposition: Atomic propositions are simple propositions. Each consists of a single proposition symbol. These are the sentences which must be either true or false.

Example:

a) 2+2 is 4, it is an atomic proposition as it is a true fact.

b) "The Sun is cold" is also a proposition as it is a false fact.


Compound proposition: Compound propositions are constructed by combining simpler or atomic propositions, using parentheses and logical connectives.

Example:

a) "It is raining today, and street is wet."

b) "Ali is a doctor, and his clinic is in Mumbai."

Logical Connectives:

Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can create compound propositions with the help of logical connectives. There are mainly five connectives, which are given as follows:

Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal or a negative literal.

Conjunction: A sentence which has ∧ connective such as, P ∧ Q is called a conjunction.

Example: Rohan is intelligent and hardworking. It can be written as,

P = Rohan is intelligent,

Q = Rohan is hardworking, so the sentence is written as P ∧ Q.

Disjunction: A sentence which has the ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are propositions.

Example: "Ritika is a doctor or engineer."

Here P = Ritika is a doctor and Q = Ritika is an engineer, so we can write it as P ∨ Q.

Implication: A sentence such as P → Q is called an implication. Implications are also known as if-then rules. For example:

If it is raining, then the street is wet.

Let P = It is raining and Q = The street is wet; then it is represented as P → Q.

Biconditional: A sentence such as P ⇔ Q is a biconditional sentence. Example: If I am breathing, then I am alive.

P = I am breathing, Q = I am alive; it can be represented as P ⇔ Q.

The following is the summarized table for the propositional logic connectives:

Connective   Technical term   Example   Meaning
∧            Conjunction      P ∧ Q     AND
∨            Disjunction      P ∨ Q     OR
→            Implication      P → Q     if-then
⇔            Biconditional    P ⇔ Q     if and only if
¬            Negation         ¬P        NOT
Truth Table:

In propositional logic, we need to know the truth values of propositions in all possible scenarios. We can combine all possible combinations with logical connectives, and the representation of these combinations in a tabular format is called a truth table. The following is the truth table for all logical connectives:

P       Q       ¬P      P ∧ Q   P ∨ Q   P → Q   P ⇔ Q
True    True    False   True    True    True    True
True    False   False   False   True    False   False
False   True    True    False   True    True    False
False   False   True    False   False   True    True
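One quick way to verify such a table is to enumerate the models programmatically. The sketch below (plain Python, no external libraries) prints the same table:

from itertools import product

# Evaluate each connective for every truth assignment of P and Q.
print("P      Q      ~P     P^Q    PvQ    P->Q   P<->Q")
for p, q in product([True, False], repeat=2):
    row = (p, q, not p, p and q, p or q, (not p) or q, p == q)
    print("".join(str(v).ljust(7) for v in row))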
The Wumpus World in Artificial intelligence

The Wumpus world is a simple world used to illustrate the worth of a knowledge-based agent and to demonstrate knowledge representation. It was inspired by the video game Hunt the Wumpus by Gregory Yob, released in 1973.

The Wumpus world is a cave which has 4x4 rooms connected by passageways, so there are a total of 16 rooms connected to each other. We have a knowledge-based agent who will go forward in this world. The cave has a room with a beast called the Wumpus, which eats anyone who enters that room. The Wumpus can be shot by the agent, but the agent has a single arrow. In the Wumpus world, there are some bottomless pit rooms; if the agent falls into a pit, he will be stuck there forever. The exciting thing about this cave is that in one room there is a possibility of finding a heap of gold. So the agent's goal is to find the gold and climb out of the cave without falling into a pit or being eaten by the Wumpus. The agent gets a reward if he comes out with the gold, and he gets a penalty if he is eaten by the Wumpus or falls into a pit.

Note: Here Wumpus is static and cannot move.

Following is a sample diagram representing the Wumpus world. It shows some rooms with pits, one room with the Wumpus, and one agent at the (1, 1) square location of the world.

There are also some components which can help the agent navigate the cave. These components are given as follows:

 The rooms adjacent to the Wumpus room are smelly, so they contain a stench.
 The rooms adjacent to a pit have a breeze, so if the agent reaches a room near a pit, he will perceive the breeze.
 There will be glitter in a room if and only if the room has the gold.
 The Wumpus can be killed by the agent if the agent is facing it, and the Wumpus will emit a horrible scream which can be heard anywhere in the cave.

PEAS description of Wumpus world:

To explain the Wumpus world, we give its PEAS description below:

Performance measure:

 +1000 reward points if the agent comes out of the cave with the gold.
 -1000 points penalty for being eaten by the Wumpus or falling into the pit.
 -1 for each action, and -10 for using an arrow.
 The game ends if either the agent dies or comes out of the cave.

Environment:

 A 4*4 grid of rooms.
 The agent is initially in room [1, 1], facing toward the right.
 The locations of the Wumpus and the gold are chosen randomly, excluding the first square [1,1].
 Each square of the cave other than the first square can be a pit with probability 0.2.

Actuators:

 Left turn,
 Right turn
 Move forward
 Grab
 Release
 Shoot.

Sensors:

 The agent will perceive the stench if he is in the room adjacent to the Wumpus. (Not
diagonally).
 The agent will perceive breeze if he is in the room directly adjacent to the Pit.
 The agent will perceive the glitter in the room where the gold is present.
 The agent will perceive the bump if he walks into a wall.
 When the Wumpus is shot, it emits a horrible scream which can be perceived anywhere
in the cave.

These percepts can be represented as a five-element list, in which we have a different indicator for each sensor.

For example, if the agent perceives a stench and a breeze, but no glitter, no bump, and no scream, this is represented as:

[Stench, Breeze, None, None, None].
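As a sketch (the tuple layout mirrors the five-element list above; the helper name adjacent is an assumption), an agent could record percepts per room and infer that a breeze marks the neighboring rooms as candidate pits:

# Percept layout: (Stench, Breeze, Glitter, Bump, Scream)
percepts = {(2, 1): ("Stench", "Breeze", None, None, None)}

def adjacent(x, y):
    # Rooms directly (not diagonally) adjacent within the 4x4 grid.
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(a, b) for a, b in candidates if 1 <= a <= 4 and 1 <= b <= 4]

possible_pits = set()
for room, percept in percepts.items():
    if percept[1] == "Breeze":
        possible_pits.update(adjacent(*room))

print(sorted(possible_pits))  # [(1, 1), (2, 2), (3, 1)] - candidate pit rooms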

The Wumpus world Properties:

Partially observable: The Wumpus world is partially observable because the agent can only perceive its immediate surroundings, such as the adjacent rooms.

Deterministic: It is deterministic, as the result and outcome of the world are already known.

Sequential: The order is important, so it is sequential.

Static: It is static as Wumpus and Pits are not moving.

Discrete: The environment is discrete.

One agent: The environment is a single agent as we have one agent only and Wumpus is not
considered as an agent.

Exploring the Wumpus world:

Now we will explore the Wumpus world and will determine how the agent will find its goal by
applying logical reasoning.

Agent's First step:

Initially, the agent is in the first room, on square [1,1], and we already know that this room is safe for the agent. To represent on diagram (a) that the room is safe, we add the symbol OK. The symbol A is used to represent the agent, B for a breeze, G for glitter or gold, V for a visited room, P for a pit, and W for the Wumpus.

In room [1,1] the agent does not perceive any breeze or any stench, which means the adjacent squares are also OK.

Agent's second step:

Now the agent needs to move forward, so it will move to either [1,2] or [2,1]. Let's suppose the agent moves to room [2,1]. In this room the agent perceives a breeze, which means there is a pit nearby. The pit can be in [3,1] or [2,2], so we add the symbol P? to mark these as possible pit rooms.

Now the agent will stop and think, and will not make any harmful move. The agent will go back to room [1,1]. Rooms [1,1] and [2,1] have been visited by the agent, so we use the symbol V to represent the visited squares.

Agent's third step:

At the third step, the agent moves to room [1,2], which is OK. In room [1,2] the agent perceives a stench, which means there must be a Wumpus nearby. But the Wumpus cannot be in room [1,1], by the rules of the game, and not in [2,2] either (the agent did not detect any stench when he was in [2,1]). Therefore the agent infers that the Wumpus is in room [1,3]. In the current room there is no breeze, which means there is no pit and no Wumpus in [2,2]. So [2,2] is safe, we mark it OK, and the agent moves on to [2,2].

Agent's fourth step:

In room [2,2] no stench and no breeze are present, so let's suppose the agent decides to move to [2,3]. In room [2,3] the agent perceives glitter, so it should grab the gold and climb out of the cave.
First-Order Logic in Artificial intelligence

In the topic of propositional logic, we have seen how to represent statements using propositional logic. But unfortunately, in propositional logic we can only represent facts, which are either true or false. PL is not sufficient to represent complex sentences or natural language statements. Propositional logic has very limited expressive power. Consider the following sentences, which we cannot represent using PL:

"Some humans are intelligent", or

"Ali likes cricket."

To represent the above statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:

First-order logic is another way of knowledge representation in artificial intelligence. It is an extension of propositional logic.

FOL is sufficiently expressive to represent natural language statements in a concise way.

First-order logic is also known as predicate logic or first-order predicate logic. It is a powerful language that expresses information about objects in a natural way and can also express the relationships between those objects.

First-order logic (like natural language) does not only assume that the world contains facts like
propositional logic but also assumes the following things in the world:

Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......

Relations: These can be unary relations such as: red, round, is adjacent; or n-ary relations such as: the sister of, brother of, has color, comes between.

Function: Father of, best friend, third inning of, end of, ......

As a natural language, first-order logic also has two main parts:

 Syntax
 Semantics

Syntax of First-Order logic:

The syntax of FOL determines which collection of symbols is a logical expression in first-order
logic. The basic syntactic elements of first-order logic are symbols. We write statements in
short-hand notation in FOL.

Basic Elements of First-order logic:

Following are the basic elements of FOL syntax:

 Constants: 1, 2, A, John, Mumbai, cat, ....
 Variables: x, y, z, a, b, ....
 Predicates: Brother, Father, >, ....
 Functions: sqrt, LeftLegOf, ....
 Connectives: ∧, ∨, ¬, ⇒, ⇔
 Equality: ==
 Quantifiers: ∀, ∃
Atomic sentences:

Atomic sentences are the most basic sentences of first-order logic. These sentences are formed
from a predicate symbol followed by a parenthesis with a sequence of terms.

We can represent atomic sentences as Predicate (term1, term2, ......, term n).

Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).

Chinky is a cat: => cat (Chinky).

Complex Sentences:

Complex sentences are made by combining atomic sentences using connectives.

First-order logic statements can be divided into two parts:

 Subject: The subject is the main part of the statement.
 Predicate: A predicate can be defined as a relation which binds two atoms together in a statement.

Consider the statement "x is an integer." It consists of two parts: the first part, x, is the subject of the statement, and the second part, "is an integer," is known as the predicate.

Quantifiers in First-order logic:

A quantifier is a language element which generates quantification, and quantification specifies the quantity of specimens in the universe of discourse.

Quantifiers are the symbols that permit us to determine or identify the range and scope of a variable in a logical expression.

There are two types of quantifier:

 Universal quantifier (for all, everyone, everything)
 Existential quantifier (for some, at least one)

Universal Quantifier:

Universal quantifier is a symbol of logical representation, which specifies that the statement
within its range is true for everything or every instance of a particular thing.

The Universal quantifier is represented by a symbol ∀, which resembles an inverted A.

Note: In universal quantifier we use implication "→".


If x is a variable, then ∀x is read as:

For all x

For each x

For every x.

Example:

All men drink coffee.

Let the variable x refer to a man; then the statement can be represented in the universe of discourse (UOD) as below:

∀x man(x) → drink(x, coffee).

It is read as: for all x, if x is a man, then x drinks coffee.

Existential Quantifier:

Existential quantifiers are the type of quantifiers which express that the statement within their scope is true for at least one instance of something.

The existential quantifier is denoted by the logical operator ∃, which resembles a reversed E. When it is used with a predicate variable, it is called an existential quantifier.

Note: In Existential quantifier we always use AND or Conjunction symbol (∧).

If x is a variable, then existential quantifier will be ∃x or ∃(x). And it will be read as:

There exists a 'x.'

For some 'x.'

For at least one 'x.'

Example:

Some boys are intelligent.

∃x: boys(x) ∧ intelligent(x)

It is read as: there exists an x such that x is a boy and x is intelligent.

Points to remember:

 The main connective with the universal quantifier ∀ is implication →.
 The main connective with the existential quantifier ∃ is conjunction ∧.

Properties of Quantifiers:

 With the universal quantifier, ∀x∀y is equivalent to ∀y∀x.
 With the existential quantifier, ∃x∃y is equivalent to ∃y∃x.
 ∃x∀y is not equivalent to ∀y∃x.

Some examples of FOL using quantifiers:

1. All birds fly.

In this question the predicate is "fly(bird)."

Since all birds fly, it is represented as follows:

∀x bird(x) → fly(x).

2. Every man respects his parent.

In this question, the predicate is "respect(x, y)," where x=man, and y= parent.

Since there is every man so will use ∀, and it will be represented as follows:

∀x man(x) → respects (x, parent).

3. Some boys play cricket.

In this question, the predicate is "play(x, y)," where x= boys, and y= game. Since there are some
boys so we will use ∃, and it will be represented as:

∃x boys(x) ∧ play(x, cricket).

4. Not all students like both Mathematics and Science.

In this question, the predicate is "like(x, y)," where x= student, and y= subject.

Since there are not all students, so we will use ∀ with negation, so following representation for
this:

¬∀(x) [ student(x) → like(x, Mathematics) ∧ like(x, Science)].

5. Only one student failed in Mathematics.

In this question, the predicate is "failed(x, y)," where x= student, and y= subject.
Since there is only one student who failed in Mathematics, so we will use following
representation for this:

∃(x) [ student(x) ∧ failed(x, Mathematics) ∧ ∀(y) [¬(x==y) ∧ student(y) → ¬failed(y, Mathematics)]].
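Over a finite domain, such quantified formulas can be checked directly with Python's all and any. This sketch (the domain and facts are made-up assumptions) encodes examples 1 and 3:

# A toy finite domain for checking quantified statements.
domain = ["Tweety", "Polly", "Rohan", "Ajay"]
birds = {"Tweety", "Polly"}
flies = {"Tweety", "Polly"}
boys = {"Rohan", "Ajay"}
plays_cricket = {"Rohan"}

# 1. All birds fly:  forall x, bird(x) -> fly(x)
print(all((x not in birds) or (x in flies) for x in domain))      # True

# 3. Some boys play cricket:  exists x, boy(x) AND play(x, cricket)
print(any((x in boys) and (x in plays_cricket) for x in domain))  # True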

Free and Bound Variables:

The quantifiers interact with variables which appear in a suitable way. There are two types of
variables in First-order logic which are given below:

Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope
of the quantifier.

Example: ∀x ∃(y)[P (x, y, z)], where z is a free variable.

Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the
scope of the quantifier.

Example: ∀x∀y [A(x) ∧ B(y)]; here x and y are the bound variables.

Forward and Backward Chaining:

Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal. It is a bottom-up approach. Forward chaining is known as a data-driven inference technique, as we reach the goal using the available data. Forward chaining reasoning applies a breadth-first search strategy, whereas backward chaining reasoning applies a depth-first search strategy. Forward chaining tests all the available rules. It is suitable for planning, monitoring, control, and interpretation applications. Forward chaining can generate an infinite number of possible conclusions. It operates in the forward direction and is aimed at any conclusion that follows.

Backward chaining starts from the goal and works backward through inference rules to find the facts that support the goal. It is a top-down approach. Backward chaining is known as a goal-driven technique, as we start from the goal and divide it into sub-goals to extract the facts. Backward chaining only tests the few rules that are required. It is suitable for diagnostic, prescription, and debugging applications. Backward chaining generates a finite number of possible conclusions. It operates in the backward direction and is aimed only at the required data.
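A minimal data-driven forward-chaining loop can be sketched in Python as follows (the rules and facts are illustrative; each rule is a (premises, conclusion) pair):

# Rules: if all premises are already known facts, add the conclusion.
rules = [
    ({"croaks", "eats flies"}, "frog"),
    ({"frog"}, "green"),
]
facts = {"croaks", "eats flies"}

# Forward chaining: keep firing rules until no new fact appears.
added = True
while added:
    added = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            added = True

print(sorted(facts))  # ['croaks', 'eats flies', 'frog', 'green']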
Semantic network

"A semantic network is a graphic notation for representing knowledge in patterns of


interconnected nodes. Semantic networks became popular in artificial intelligence and natural
language processing only because it represents knowledge or supports reasoning."

"A semantic network is a graphic notation for representing knowledge in patterns of


interconnected nodes. Semantic networks became popular in artificial intelligence and natural
language processing only because it represents knowledge or supports reasoning. These act as
another alternative for predicate logic in a form of knowledge representation.

The structural idea is that knowledge can be stored in the form of graphs, with nodes
representing objects in the world, and arcs representing relationships between those objects.

Semantic nets consist of nodes, links and link labels.

In these network diagrams, nodes appear as circles, ellipses, or even rectangles, and represent objects such as physical objects, concepts, or situations.

Links appear as arrows to express the relationships between objects, and link labels specify the relations.

Relationships provide the basic structure needed for organizing knowledge, so the objects and relations involved need not be concrete.

Semantic nets are also referred to as associative nets, as the nodes are associated with other nodes.

Semantic Networks Are Majorly Used For

 Representing data
 Revealing structure (relations, proximity, relative importance)
 Supporting conceptual editing
 Supporting navigation

Main Components Of Semantic Networks

Lexical component: nodes denote physical objects, links denote relationships between objects, and labels denote specific objects and relationships.

Structural component: the links and nodes form a directed diagram.

Semantic component: definitions are related only to the links and the labels of nodes, whereas facts depend on the application area.

Procedural part: constructors permit the creation of new links and nodes, while destructors permit the removal of links and nodes.

Advantages Of Using Semantic Nets

 The semantic network is more natural than logical representation.
 The semantic network permits the use of efficient inference algorithms (graphical algorithms).
 Semantic networks are simple and can be easily implemented and understood.
 The semantic network can be used as a typical connection application among various fields of knowledge, for instance, between computer science and anthropology.
 The semantic network permits a simple approach to investigating the problem space.
 The semantic network gives an approach to making branches of related components.
 The semantic network also resonates with the way people process data.
 The semantic network is characterized by greater cognitive adequacy compared to logic-based formalisms.
 The semantic network has greater expressiveness compared to logic.

Disadvantages Of Using Semantic Nets

 There is no standard definition for link names.
 Semantic nets are not intelligent; they depend on the creator of the system.
 Links are not alike in function or form, causing confusion between links that assert relationships and structural links.
 Nodes that represent classes are undistinguished from nodes that represent individual objects.
 Links on objects represent only binary relations.

Search tree

In computer science, a search tree is a tree data structure used for locating specific keys from
within a set. In order for a tree to function as a search tree, the key for each node must be
greater than any keys in subtrees on the left, and less than any keys in subtrees on the right.

The advantage of search trees is their efficient search time, given that the tree is reasonably balanced, which is to say the leaves at either end are of comparable depths. Various search-tree data structures exist, several of which also allow efficient insertion and deletion of elements; these operations then have to maintain tree balance.

Search trees are often used to implement an associative array. The search tree algorithm uses the key from the key-value pair to find a location, and then the application stores the entire key-value pair at that particular location.

Types of Trees

Binary search tree

A Binary Search Tree is a node-based data structure where each node contains a key and two
subtrees, the left and right. For all nodes, the left subtree's key must be less than the node's
key, and the right subtree's key must be greater than the node's key. These subtrees must all
qualify as binary search trees.

The worst-case time complexity for searching a binary search tree is the height of the tree,
which can be as small as O(log n) for a tree with n elements.
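A minimal recursive BST search can be sketched in Python as follows (the Node class is an illustrative assumption):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def search(node, key):
    # Smaller keys live in the left subtree, larger keys in the right,
    # so each comparison discards one whole subtree.
    if node is None:
        return False
    if key == node.key:
        return True
    return search(node.left if key < node.key else node.right, key)

root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
print(search(root, 6), search(root, 7))  # True False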

B-tree

B-trees are generalizations of binary search trees in that they can have a variable number of subtrees at each node. While the number of child nodes has a pre-defined range, the nodes will not necessarily be filled with data, meaning B-trees can potentially waste some space. The advantage is that B-trees do not need to be re-balanced as frequently as other self-balancing trees.

Due to the variable number of keys per node, B-trees are optimized for systems that read large blocks of data; they are also commonly used in databases.

The time complexity for searching a B-tree is O(log n).


(a,b)-tree

An (a,b)-tree is a search tree where all of its leaves are the same depth. Each node has at least a
children and at most b children, while the root has at least 2 children and at most b children.

The time complexity for searching an (a,b)-tree is O(log n).

Ternary search tree

A ternary search tree is a type of tree in which each node can have three children: a low child, an equal child, and a high child. Each node stores a single character, and the tree itself is ordered the same way a binary search tree is, with the exception of the possible third child.

Searching a ternary search tree involves passing in a string to test whether any path contains it.

The time complexity for searching a balanced ternary search tree is O(log n).
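For contrast with the binary case, here is a sketch of ternary-search-tree insertion and lookup in Python (the TSTNode class and helper names are illustrative assumptions):

class TSTNode:
    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False  # marks the last character of a stored string

def tst_insert(node, word, i=0):
    ch = word[i]
    if node is None:
        node = TSTNode(ch)
    if ch < node.ch:
        node.lo = tst_insert(node.lo, word, i)      # low child
    elif ch > node.ch:
        node.hi = tst_insert(node.hi, word, i)      # high child
    elif i + 1 < len(word):
        node.eq = tst_insert(node.eq, word, i + 1)  # equal child: next char
    else:
        node.is_end = True
    return node

def tst_search(node, word, i=0):
    if node is None:
        return False
    ch = word[i]
    if ch < node.ch:
        return tst_search(node.lo, word, i)
    if ch > node.ch:
        return tst_search(node.hi, word, i)
    if i + 1 == len(word):
        return node.is_end
    return tst_search(node.eq, word, i + 1)

root = None
for w in ["cat", "cap", "car"]:
    root = tst_insert(root, w)
print(tst_search(root, "cap"), tst_search(root, "ca"))  # True False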

Frame

Frames are an artificial intelligence data structure used to divide knowledge into substructures
by representing "stereotyped situations". They were proposed by Marvin Minsky in his 1974
article "A Framework for Representing Knowledge". Frames are the primary data structure used
in artificial intelligence frame language; they are stored as ontologies of sets.

Frames are also an extensive part of knowledge representation and reasoning schemes. They were originally derived from semantic networks and are therefore part of structure-based knowledge representations. According to Russell and Norvig's "Artificial Intelligence: A Modern Approach", structural representations assemble "facts about particular object and event types and arrange the types into a large taxonomic hierarchy analogous to a biological taxonomy".

The frame contains information on how to use the frame, what to expect next, and what to do when these expectations are not met. Some information in the frame is generally unchanged, while other information, stored in "terminals", usually changes. Terminals can be considered as variables. Top-level frames carry information that is always true about the problem at hand; terminals, however, do not have to be true. Their values might change as new information is encountered. Different frames may share the same terminals.

Each piece of information about a particular frame is held in a slot. The information can contain:

Frame structure

 Facts or data values (called facets)
 Procedures (also called procedural attachments): IF-NEEDED (deferred evaluation) and IF-ADDED (updates linked information)
 Default values (for data, for procedures)
 Other frames or subframes

A frame's terminals are already filled with default values, which is based on how the human
mind works. For example, when a person is told "a boy kicks a ball", most people will visualize a
particular ball (such as a familiar soccer ball) rather than imagining some abstract ball with no
attributes.

One particular strength of frame based knowledge representations is that, unlike semantic
networks, they allow for exceptions in particular instances. This gives frames an amount of
flexibility that allow representations of real world phenomena to be reflected more accurately.

Like semantic networks, frames can be queried using spreading activation. Following the rules of inheritance, any value given to a slot that is inherited by subframes will be updated (IF-ADDED) in the corresponding slots in the subframes, and any new instance of a particular frame will feature that new value as the default.

Features and advantages

Worth noticing here is the easy analogical reasoning (comparison) that can be done between a
boy and a monkey just by having similarly named slots.

Slot       Value            Type
BOY        -                (this frame)
ISA        Person           (parent frame)
SEX        Male             (instance frame)
AGE        Under 12 years   (procedural attachment - sets constraint)
HOME       A place          (frame)
NUM-LEGS   Default = 2      (default, inherited from Person frame)
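A sketch of this slot inheritance (the Frame class and its get method are illustrative assumptions, not a standard frame library):

class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name, self.parent, self.slots = name, parent, slots

    def get(self, slot):
        # Look the slot up locally, then fall back to the parent frame,
        # which is how defaults like NUM-LEGS are inherited.
        if slot in self.slots:
            return self.slots[slot]
        return self.parent.get(slot) if self.parent else None

person = Frame("Person", **{"NUM-LEGS": 2})
boy = Frame("BOY", parent=person, SEX="Male", HOME="A place")

print(boy.get("SEX"))       # Male (local slot)
print(boy.get("NUM-LEGS"))  # 2 (default inherited from the Person frame)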
Frame language

A frame language is a technology used for knowledge representation in artificial intelligence. Frame languages are similar to class hierarchies in object-oriented languages, although their fundamental design goals are different. Frames are focused on explicit and intuitive representation of knowledge, whereas objects focus on encapsulation and information hiding. Frames originated in AI research and objects primarily in software engineering. However, in practice, the techniques and capabilities of frame and object-oriented languages overlap significantly.
Scripts

A script is a structure that prescribes a set of circumstances which could be expected to follow
on from one another. It is similar to a thought sequence or a chain of situations which could be
anticipated. It could be considered to consist of a number of slots or frames but with more
specialised roles.

Scripts are beneficial because:

Events tend to occur in known runs or patterns.

Causal relationships between events exist.

Entry conditions exist which allow an event to take place.

Prerequisites exist for events taking place, e.g., when a student progresses through a degree scheme or when a purchaser buys a house.

The components of a script include:

Entry Conditions

-- these must be satisfied before events in the script can occur.

Results

-- Conditions that will be true after events in script occur.

Props

-- Slots representing objects involved in events.

Roles

-- Persons involved in the events.

Track

-- Variations on the script. Different tracks may share components of the same script.

Scenes

-- The sequence of events that occur. Events are represented in conceptual dependency form.

Scripts are useful in describing certain situations such as robbing a bank.

This might involve:


Getting a gun.

Holding up the bank.

Escaping with the money.

Here the Props might be

Gun, G.

Loot, L.

Bag, B

Get away car, C.

The Roles might be:

Robber, S.

Cashier, M.

Bank Manager, O.

Policeman, P.

The Entry Conditions might be:

S is poor.

S is destitute.

The Results might be:

S has more money.

O is angry.

M is in a state of shock.

P is shot.
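A sketch of this robbery script as a Python data structure (the dictionary layout and the applicable helper are illustrative assumptions):

robbery_script = {
    "entry_conditions": ["S is poor", "S is destitute"],
    "props": {"G": "Gun", "L": "Loot", "B": "Bag", "C": "Get away car"},
    "roles": {"S": "Robber", "M": "Cashier",
              "O": "Bank Manager", "P": "Policeman"},
    "scenes": ["Getting a gun", "Holding up the bank",
               "Escaping with the money"],
    "results": ["S has more money", "O is angry",
                "M is in a state of shock", "P is shot"],
}

def applicable(script, known_facts):
    # A script's events can occur only once its entry conditions hold.
    return all(cond in known_facts for cond in script["entry_conditions"])

print(applicable(robbery_script, {"S is poor", "S is destitute"}))  # True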

Conceptual Dependency:

Roger C. Schank developed the Conceptual Dependency structure in 1977. Conceptual Dependency is used to represent knowledge in artificial intelligence. It should be powerful enough to represent the concepts underlying sentences of natural language, and it states that different sentences which have the same meaning should have a single, unique representation.

There are 5 types of states in Conceptual Dependency:

1. Entities

2. Actions

3. Conceptual cases

4. Conceptual dependencies

5. Conceptual tense

Main Goals of Conceptual Dependency:

1. It captures the implicit concept of a sentence and makes it explicit.

2. It helps in drawing inferences from sentences.

3. For any two or more sentences that are identical in meaning, there should be only one representation of meaning.

4. It provides a means of representation which are language independent.

5. It develops language conversion packages.

Rules of Conceptual Dependency:

Rule-1: It describes the relationship between an actor and the event he or she causes.

Rule-2: It describes the relationship between a PP (picture producer, an object) and a PA (picture aider, an attribute) that is asserted to describe it.

Rule-3: It describes the relationship between two PPs, one of which belongs to the set defined
by the other.

Rule-4: It describes the relationship between a PP and an attribute that has already been
predicated on it.

Rule-5: It describes the relationship between two PPs one of which provides a particular kind of
information about the other.

Rule-6: It describes the relationship between an ACT and the PP that is the object of that ACT.
Rule-7: It describes the relationship between an ACT and the source and the recipient of the
ACT.

Rule-8: It describes the relationship between an ACT and the instrument with which it is
performed. This instrument must always be a full conceptualization, not just a single physical
object.

Rule-9: It describes the relationship between an ACT and its physical source and destination.

Rule-10: It represents the relationship between a PP and a state in which it started and another
in which it ended.

Rule-11: It represents the relationship between one conceptualization and another that causes
it.

Rule-12: It represents the relationship between conceptualization and the time at which the
event occurred is described.

Rule-13: It describes the relationship between one conceptualization and another, that is the
time of the first.

Rule-14: It describes the relationship between conceptualization and the place at which it
occurred.
Introduction to Production System in AI

A production system in AI is a program used to provide some form of artificial intelligence. It comprises a set of rules that define the system's characteristic behaviour, together with a mechanism to apply those rules and respond accordingly. That set of rules is referred to as productions, and it is a fundamental representation for action selection, expert systems, and automated planning. The upcoming sections of the article explain the features, rules, advantages, and limitations of the production system in artificial intelligence.

Features of the Production System in AI

The fundamental components of the production system in artificial intelligence include a global database, a set of production rules, and a control system. The global database is the central data structure used in the operation of the production system. The production rules operate on the global database: if a rule's precondition is satisfied, the rule can be executed, and executing a rule changes or updates the database. The control system selects the applicable rule and continues the computation until a terminating condition on the database is satisfied. The control system also has the ability to manage conflicts if multiple rules are applicable simultaneously.

The architecture of every rule in a production system is uniform and simple: the entire system executes IF-THEN statements in every execution cycle. This is a readable form of knowledge representation. Hence the system is user-friendly and can be managed without complexities or difficulties, as the rules are less prone to error in challenging tasks.

The production rules and their related knowledge are kept in distinct units, so the information can be accessed without dependencies; the rules form a collection of independent facts that can be edited easily without side effects on the rest of the production system. This modularity makes the production system flexible to any modifications in the system.

Editing or altering the rules is easy: production rules can be developed in a skeletal format and then refined for the specific application so that they execute accurately, without delay or imperfections.

The knowledge base of the production system is knowledge-intensive and does not contain corrupted data or false information. The data is stored in a pure declarative format and does not comprise any control strategy or programming information. Each production rule is stated as a simple sentence in English, which reduces semantic problems in the representation.
Rules of Production System in AI

The rules in the production system fall into two broad categories: abductive inference rules and deductive inference rules. The representation of rules in the production system is important, as the functioning of the entire system depends on the rules. The rules are fed into the operation of the database and control system and can be written as follows.

It is based on the IF-THEN condition:

IF (condition) THEN (action)

This form is also called an antecedent-consequent pair: a pairing of feedback and result, condition and action, pattern and action, or situation and response.

Advantages and Disadvantages of the Production System

The representation of rules in the production system is natural and expressed in a simple format. It has a rapid recognize-act cycle, thanks to the separation of control and knowledge. Data-driven or goal-driven reasoning maps naturally onto state-space search.

The modularity and adaptability of the production rules are efficient and user-friendly. The
flexibility to any modification in the rules is high without affecting the production system.

The production system executes pattern-directed control, which is more adaptable than algorithmic control. It enables exploratory, hierarchical control of search if any complexities occur.

The troubleshooting methods in the production system are reliable, and it takes minimum time
to find the affected parts and provides simple tracing of the systems. It provides a generic
control and informative rules to manage the challenging tasks.

It is a reliable model because of the state-driven attitude of the intelligent machines and
behaves as a reasonable design to the decision making and problem-solving act of humans. It is
robust and provides a rapid response in real-time applications.

Apart from this, the significant limitations of the production system include inefficiency, opaqueness, lack of learning ability, and the need for conflict resolution.

Opaqueness occurs due to poor prioritization of rules. It arises when two or more production rules are merged or combined. If the priority of rules is predetermined, the probability of opaqueness is lower.
Most production systems are prone to inefficiency in the applied environment, but well-designed control methodologies minimize this kind of problem. In particular, when a program is executed, multiple rules may become active and be executed; this happens because there are many predefined rules in the production system, and a complex search is carried out hierarchically through every set of rules on every iteration of the control program.

A production system that depends only on rules does not store the outcomes of solved problems, which would help solve future issues. Instead, it computes a new solution for the same particular problem each time and does not exhibit any kind of learning capability. Hence the lack of learning capability in production systems in artificial intelligence needs to be improved for better efficiency and operation.

The rules in the production system should not get involved in any conflicting operations. If the database is updated with new rules, the system should check that no conflicts arise between the existing rules and the newly updated rules.

What is an Expert System?

An expert system is a computer program that is designed to solve complex problems and to
provide decision-making ability like a human expert. It performs this by extracting knowledge
from its knowledge base using the reasoning and inference rules according to the user queries.

The expert system is a part of AI, and the first ES was developed around 1970; it was among the first successful applications of artificial intelligence. It solves the most complex issues as an expert would, by extracting the knowledge stored in its knowledge base. The system helps in decision making for complex problems using both facts and heuristics, like a human expert. It is called an expert system because it contains the expert knowledge of a specific domain and can solve any complex problem of that particular domain. These systems are designed for a specific domain, such as medicine, science, etc.

The performance of an expert system is based on the expert's knowledge stored in its
knowledge base. The more knowledge stored in the KB, the more that system improves its
performance. One of the common examples of an ES is a suggestion of spelling errors while
typing in the Google search box.

Below are some popular examples of the Expert System:

DENDRAL: It was an artificial intelligence project that was made as a chemical analysis expert
system. It was used in organic chemistry to detect unknown organic molecules with the help of
their mass spectra and knowledge base of chemistry.
MYCIN: It was one of the earliest backward chaining expert systems that was designed to find
the bacteria causing infections like bacteraemia and meningitis. It was also used for the
recommendation of antibiotics and the diagnosis of blood clotting diseases.

PXDES: It is an expert system used to determine the type and level of lung cancer. To determine the disease, it takes a picture of the upper body, which looks like a shadow; this shadow identifies the type and degree of harm.

CaDeT: The CaDet expert system is a diagnostic support system that can detect cancer at early
stages.

Characteristics of Expert System

High Performance: The expert system provides high performance for solving any type of
complex problem of a specific domain with high efficiency and accuracy.

Understandable: It responds in a way that can be easily understood by the user. It can take input in human language and provides the output in the same way.

Reliable: It is highly reliable for generating efficient and accurate output.

Highly responsive: ES provides the result for any complex query within a very short period of
time.

Components of Expert System

An expert system mainly consists of three components:

 User Interface
 Inference Engine
 Knowledge Base

1. User Interface

With the help of a user interface, the expert system interacts with the user, takes queries as an
input in a readable format, and passes it to the inference engine. After getting the response
from the inference engine, it displays the output to the user. In other words, it is an interface
that helps a non-expert user to communicate with the expert system to find a solution.

2. Inference Engine (Rules of Engine)

The inference engine is known as the brain of the expert system as it is the main processing unit
of the system. It applies inference rules to the knowledge base to derive a conclusion or deduce
new information. It helps in deriving an error-free solution of queries asked by the user.
With the help of an inference engine, the system extracts the knowledge from the knowledge
base.

There are two types of inference engine:

 Deterministic Inference engine: The conclusions drawn from this type of inference
engine are assumed to be true. It is based on facts and rules.
 Probabilistic Inference engine: This type of inference engine contains uncertainty in its conclusions and is based on probability.

Inference engine uses the below modes to derive the solutions:

 Forward Chaining: It starts from the known facts and rules, and applies the inference
rules to add their conclusion to the known facts.
 Backward Chaining: It is a backward reasoning method that starts from the goal and
works backward to prove the known facts.

3. Knowledge Base

The knowledge base is a type of storage that stores knowledge acquired from the different experts of a particular domain. It is considered a big store of knowledge. The larger the knowledge base, the more precise the expert system will be.

It is similar to a database that contains information and rules of a particular domain or subject.

One can also view the knowledge base as a collection of objects and their attributes. For example, a lion is an object, and its attributes are that it is a mammal, it is not a domestic animal, etc.

Components of Knowledge Base

Factual Knowledge: The knowledge which is based on facts and accepted by knowledge
engineers comes under factual knowledge.

Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experiences.

Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base
using the If-else rules.

Knowledge Acquisitions: It is the process of extracting, organizing, and structuring the domain
knowledge, specifying the rules to acquire the knowledge.
Development of Expert System

Here, we will explain the working of an expert system by taking an example of MYCIN ES. Below
are some steps to build an MYCIN:

Firstly, the ES should be fed with expert knowledge. In the case of MYCIN, human experts specialized in the medical field of bacterial infection provide information about the causes, symptoms, and other knowledge of that domain.

Once the KB of MYCIN is updated successfully, the doctor provides a new problem to it in order to test it. The problem is to identify the presence of the bacteria by inputting the details of a patient, including the symptoms, current condition, and medical history.

The ES will need a questionnaire to be filled by the patient to know the general information
about the patient, such as gender, age, etc.

Now the system has collected all the information, so it will find the solution for the problem by
applying if-then rules using the inference engine and using the facts stored within the KB.

In the end, it will provide a response to the patient by using the user interface.
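A toy sketch of this loop in Python (the rules, symptoms, and the diagnose function are simplified illustrations, not MYCIN's actual rules):

# Hypothetical diagnostic rules: a set of symptoms -> candidate diagnosis.
rules = [
    ({"fever", "stiff neck"}, "possible meningitis"),
    ({"fever", "low blood pressure"}, "possible bacteraemia"),
]

def diagnose(patient_symptoms):
    # Inference engine: apply the if-then rules to the facts collected
    # from the patient's questionnaire and details.
    return [d for premises, d in rules if premises <= patient_symptoms]

# User interface step: collect the questionnaire answers, then report.
answers = {"fever", "stiff neck", "headache"}
print(diagnose(answers))  # ['possible meningitis']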

Participants in the development of Expert System

There are three primary participants in the building of Expert System:

Expert: The success of an ES much depends on the knowledge provided by human experts.
These experts are those persons who are specialized in that specific domain.

Knowledge Engineer: Knowledge engineer is the person who gathers the knowledge from the
domain experts and then codifies that knowledge to the system according to the formalism.

End-User: This is a particular person or a group of people, not necessarily experts, who use
the expert system to obtain solutions or advice for their complex queries.

Why Expert System?

Before using any technology, we must have an idea of why to use it, and the same holds for
the ES. If we have human experts in every field, what is the need to develop a computer-based
system?
So below are the points that are describing the need of the ES:

No memory limitations: It can store as much data as required and recall it whenever needed,
whereas human experts are limited in how much they can memorize and recall at any time.

High Efficiency: If the knowledge base is updated with the correct knowledge, then it provides a
highly efficient output, which may not be possible for a human.

Expertise in a domain: There are lots of human experts in each domain, and they all have
different skills and different experiences, so it is not easy to get a single final answer to a
query. But if we put the knowledge gained from human experts into the expert system, it
provides an efficient output by combining all the facts and knowledge.

Not affected by emotions: These systems are not affected by human emotions such as fatigue,
anger, depression, and anxiety, so the performance remains constant.

High security: These systems provide high security to resolve any query.

Considers all the facts: To respond to any query, it checks and considers all the available facts
and provides the result accordingly, whereas a human expert may overlook some facts for
various reasons.

Regular updates improve the performance: If there is an issue in the result provided by the
expert system, we can improve its performance by updating the knowledge base.

Capabilities of the Expert System

Below are some capabilities of an Expert System:

Advising: It is capable of advising a human being on queries in the domain of the
particular ES.

Provide decision-making capabilities: It provides the capability of decision making in any
domain, such as for making financial decisions, decisions in medical science, etc.

Demonstrate a device: It is capable of demonstrating any new products such as its features,
specifications, how to use that product, etc.

Problem-solving: It has problem-solving capabilities.

Explaining a problem: It is also capable of providing a detailed description of an input problem.


Interpreting the input: It is capable of interpreting the input given by the user.

Predicting results: It can be used for the prediction of a result.

Diagnosis: An ES designed for the medical field is capable of diagnosing a disease without using
multiple components, as it already contains various inbuilt medical tools.

Advantages of Expert System

These systems are highly reproducible.

They can be used for risky places where the human presence is not safe.

Error possibilities are less if the KB contains correct knowledge.

The performance of these systems remains steady, as it is not affected by emotions, tension, or
fatigue.

They respond to a particular query with very high speed.

Limitations of Expert System

The response of the expert system may be wrong if the knowledge base contains wrong
information.

Unlike a human being, it cannot produce creative output for different scenarios.

Its maintenance and development costs are very high.

Knowledge acquisition for designing an expert system is quite difficult.

For each domain, we require a specific ES, which is one of the big limitations.

It cannot learn from itself and hence requires manual updates.

Applications of Expert System

In designing and manufacturing domain

It can be broadly used for designing and manufacturing physical devices such as camera lenses
and automobiles.

In the knowledge domain

These systems are primarily used for publishing relevant knowledge to users. Popular expert
systems in this domain include advisory systems such as tax advisors.
In the finance domain

In the finance industry, it is used to detect any type of possible fraud or suspicious activity and
to advise bankers on whether they should provide loans to a business.

In the diagnosis and troubleshooting of devices

Expert systems are used in medical diagnosis, which was the first area where these systems
were applied.

Planning and Scheduling

Expert systems can also be used for planning and scheduling particular tasks in order to
achieve the goals of those tasks.

Artificial Neural Network

The Artificial Neural Network tutorial provides basic and advanced concepts of ANNs. It is
developed for beginners as well as professionals.

The term "Artificial neural network" refers to a biologically inspired sub-field of artificial
intelligence modeled after the brain. An Artificial neural network is usually a computational
network based on biological neural networks that construct the structure of the human brain.
Similar to a human brain has neurons interconnected to each other, artificial neural networks
also have neurons that are linked to each other in various layers of the networks. These
neurons are known as nodes.

Artificial neural network tutorial covers all the aspects related to the artificial neural network.
In this tutorial, we will discuss ANNs, Adaptive resonance theory, Kohonen self-organizing map,
Building blocks, unsupervised learning, Genetic algorithm, etc.
What is Artificial Neural Network?

The term "Artificial Neural Network" is derived from Biological neural networks that develop
the structure of a human brain. Similar to the human brain that has neurons interconnected to
one another, artificial neural networks also have neurons that are interconnected to one
another in various layers of the networks. These neurons are known as nodes.

(Figures omitted: the typical diagram of a biological neural network, and the typical diagram of
an artificial neural network.)

Dendrites from Biological Neural Network represent inputs in Artificial Neural Networks, cell
nucleus represents Nodes, synapse represents Weights, and Axon represents Output.

Relationship between biological neural network and artificial neural network:

Biological Neural Network    Artificial Neural Network
Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output

An artificial neural network attempts to mimic the network of neurons that makes up the
human brain so that computers have the option to understand things and make decisions in a
human-like manner. The artificial neural network is designed by programming computers to
behave simply like interconnected brain cells.

There are on the order of 100 billion neurons in the human brain. Each neuron has an
association point somewhere in the range of 1,000 to 100,000. In the human brain, data is
stored in a distributed manner, and we can extract more than one piece of this data from our
memory in parallel when necessary. We can say that the human brain is made up of incredibly
amazing parallel processors.

We can understand the artificial neural network with the example of a digital logic gate that
takes inputs and gives an output. Consider an "OR" gate, which takes two inputs. If one or both
inputs are "On," the output is "On." If both inputs are "Off," the output is "Off." Here the
output depends directly on the input. Our brain does not work this way: the relationship
between outputs and inputs keeps changing because the neurons in our brain are "learning."

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we first have to understand
what a neural network consists of: a large number of artificial neurons, termed units, arranged
in a sequence of layers. Let us look at the various types of layers available in an artificial
neural network.

Artificial Neural Network primarily consists of three layers:

Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the
programmer.

Hidden Layer:

The hidden layer sits between the input and output layers. It performs all the calculations
to find hidden features and patterns.

Output Layer:

The input goes through a series of transformations using the hidden layer, which finally results
in output that is conveyed using this layer.

The artificial neural network takes the inputs, computes their weighted sum, and includes a
bias. This computation is represented in the form of a transfer function. The weighted total is
then passed as input to an activation function to produce the output. Activation functions
decide whether a node should fire or not; only the nodes that fire contribute to the output
layer. There are various activation functions available that can be applied depending on the
sort of task we are performing.
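
A minimal sketch of this computation, assuming arbitrary illustrative weights, bias, and inputs, and a sigmoid activation (any other activation function could be substituted):

import math

# One artificial neuron: a weighted sum of the inputs plus a bias
# (the transfer function), passed through an activation function.
def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum + bias
    return 1.0 / (1.0 + math.exp(-z))                       # sigmoid activation

# Illustrative values only.
print(neuron(inputs=[0.5, 0.3], weights=[0.8, -0.2], bias=0.1))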

Advantages of Artificial Neural Network (ANN)

Parallel processing capability:

Artificial neural networks can perform more than one task simultaneously.

Storing data on the entire network:

Unlike traditional programming, where data is stored in a database, data in an ANN is stored
across the whole network. The disappearance of a couple of pieces of data in one place does
not prevent the network from working.

Capability to work with incomplete knowledge:

After training, an ANN may produce output even from incomplete data. The loss of
performance depends on the significance of the missing data.

Having a memory distribution:

For an ANN to be able to adapt, it is important to determine suitable examples and to train the
network toward the desired output by demonstrating these examples to it. The success of the
network is directly proportional to the chosen instances; if the event cannot be shown to the
network in all its aspects, the network can produce false output.

Having fault tolerance:

Corruption of one or more cells of an ANN does not prevent it from generating output, and this
feature makes the network fault-tolerant.

Disadvantages of Artificial Neural Network:

Assurance of proper network structure:

There is no particular guideline for determining the structure of artificial neural networks. The
appropriate network structure is accomplished through experience, trial, and error.

Unrecognized behavior of the network:

This is the most significant issue with ANNs. When an ANN produces a solution, it does not
provide insight into why and how the solution was reached, which decreases trust in the network.

Hardware dependence:

Artificial neural networks need processors with parallel processing power, in accordance with
their structure. Therefore, realizing the network depends on having suitable hardware.

Difficulty of showing the issue to the network:

ANNs can work only with numerical data, so problems must be converted into numerical values
before being introduced to the network. The chosen representation mechanism directly
impacts the performance of the network and relies on the user's abilities.

The duration of training is unknown:

Training is stopped when the network's error is reduced to a specific value, but this value does
not guarantee optimal results.

Perceptron

A perceptron is a neural network unit (an artificial neuron) that performs certain computations
to detect features or business intelligence in the input data. This perceptron tutorial will give
you in-depth knowledge of the perceptron and its activation functions.

The perceptron was introduced by Frank Rosenblatt in 1957. He proposed a perceptron
learning rule based on the original MCP (McCulloch-Pitts) neuron. A perceptron is an algorithm
for supervised learning of binary classifiers. This algorithm enables neurons to learn and
process elements in the training set one at a time.
There are two types of Perceptrons: Single layer and Multilayer.

 Single layer - Single-layer perceptrons can learn only linearly separable patterns.
 Multilayer - Multilayer perceptrons, or feedforward neural networks with two or more
layers, have greater processing power.

The Perceptron algorithm learns the weights for the input signals in order to draw a linear
decision boundary.

This enables you to distinguish between the two linearly separable classes +1 and -1.

Note: Supervised Learning is a type of Machine Learning used to learn models from labeled
training data. It enables output prediction for future or unseen data. Let us focus on the
Perceptron Learning Rule in the next section.

Perceptron Learning Rule

Perceptron Learning Rule states that the algorithm would automatically learn the optimal
weight coefficients. The input features are then multiplied with these weights to determine if a
neuron fires or not.

The Perceptron receives multiple input signals, and if the sum of the input signals exceeds a
certain threshold, it either outputs a signal or does not return an output. In the context of
supervised learning and classification, this can then be used to predict the class of a sample.

Perceptron Function

The perceptron is a function that maps its input "x," which is multiplied by the learned weight
coefficients, to an output value "f(x)." The mapping can be written as:

f(x) = 1 if w · x + b > 0, and 0 otherwise

In the equation given above:


“w” = vector of real-valued weights

“b” = bias (an element that adjusts the boundary away from origin without any dependence on
the input value)

“x” = vector of input x values

“m” = number of inputs to the Perceptron

The output can be represented as “1” or “0.” It can also be represented as “1” or “-1”
depending on which activation function is used.

Let us learn the inputs of a perceptron in the next section.

Inputs of a Perceptron

A perceptron accepts inputs, moderates them with certain weight values, and then applies the
transformation function to output the final result. Consider, for example, a perceptron with a
Boolean output.

A Boolean output is based on inputs such as salaried, married, age, past credit profile, etc. It has
only two values: Yes and No, or True and False. The summation function "∑" multiplies all
inputs "x" by their weights "w" and then adds them up as follows:

∑ = x1w1 + x2w2 + ... + xmwm

In the next section, let us discuss the activation functions of perceptrons.

Activation Functions of Perceptron

The activation function applies a step rule (converting the numerical output into +1 or -1) to
check whether the output of the weighting function is greater than zero.

For example:

If ∑ wi xi > 0, then final output "o" = 1 (issue bank loan);

else, final output "o" = -1 (deny bank loan).

 The step function gets triggered above a certain value of the neuron output; otherwise it
outputs zero.
 The sign function outputs +1 or -1 depending on whether the neuron output is greater
than zero or not.
 The sigmoid function is the S-curve and outputs a value between 0 and 1.
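
A minimal sketch of these three activation functions (the threshold value is an illustrative assumption):

import math

def step(z, threshold=0.0):
    # Triggered above a certain value of the neuron output; else 0.
    return 1 if z > threshold else 0

def sign(z):
    # +1 or -1 depending on whether the neuron output is greater than zero.
    return 1 if z > 0 else -1

def sigmoid(z):
    # The S-curve; outputs a value between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

for f in (step, sign, sigmoid):
    print(f.__name__, f(0.7), f(-0.7))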

Output of Perceptron

Perceptron with a Boolean output:


Inputs: x1…xn

Output: o(x1….xn)

Weights: wi=> contribution of input xi to the Perceptron output;

w0=> bias or threshold

If ∑w.x > 0, output is +1, else -1. The neuron gets triggered only when weighted input reaches a
certain threshold value.

An output of +1 specifies that the neuron is triggered. An output of -1 specifies that the neuron
did not get triggered.

“sgn” stands for sign function with output +1 or -1.


Error in Perceptron

In the perceptron learning rule, the predicted output is compared with the known output. If
they do not match, the error is fed back so that the weights can be adjusted.

Let us discuss the decision function of Perceptron in the next section.

Perceptron: Decision Function

A decision function φ(z) of the perceptron is defined to take a linear combination of the x and
w vectors.

The value z in the decision function is given by:

z = w1x1 + w2x2 + ... + wmxm = wTx

The decision function is +1 if z is greater than or equal to a threshold θ, and it is -1 otherwise.

This is the Perceptron algorithm.

Bias Unit

For simplicity, the threshold θ can be brought to the left side and represented as w0x0, where
w0 = -θ and x0 = 1.

The value w0 is called the bias unit.

The decision function then becomes:

φ(z) = +1 if z ≥ 0, and -1 otherwise, where z = w0x0 + w1x1 + ... + wmxm = wTx

Output:

The decision function squashes wTx to either +1 or -1, which can then be used to discriminate
between two linearly separable classes.
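
Putting the pieces together, here is a compact sketch of the perceptron decision function and learning rule. The toy dataset, learning rate, and epoch count are invented for illustration; the decision function returns +1 when z = wTx ≥ 0, as derived above.

# Perceptron sketch: decision function plus the learning rule.
# w[0] plays the role of the bias unit w0 (with x0 = 1).
def predict(w, x):
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if z >= 0 else -1   # sgn-style decision function

def train(data, epochs=10, eta=0.1):
    w = [0.0] * (len(data[0][0]) + 1)
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(w, x)   # 0 when the prediction is correct
            w[0] += eta * error              # adjust the bias unit
            for i, xi in enumerate(x):
                w[i + 1] += eta * error * xi # adjust each weight
    return w

# A linearly separable toy set (an AND-like problem); values are invented.
data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w = train(data)
print([predict(w, x) for x, _ in data])  # [-1, -1, -1, 1]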

Natural Language Processing

Natural Language Processing, or NLP for short, is broadly defined as the automatic
manipulation of natural language, like speech and text, by software.

The study of natural language processing has been around for more than 50 years and grew out
of the field of linguistics with the rise of computers.

Natural language refers to the way we, humans, communicate with each other.

Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial
intelligence concerned with the interactions between computers and human language, in
particular how to program computers to process and analyze large amounts of natural language
data. The goal is a computer capable of "understanding" the contents of documents, including
the contextual nuances of the language within them. The technology can then accurately
extract information and insights contained in the documents as well as categorize and organize
the documents themselves.

History

Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published
an article titled "Computing Machinery and Intelligence" which proposed what is now called the
Turing test as a criterion of intelligence, though at the time that was not articulated as a
problem separate from artificial intelligence. The proposed test includes a task that involves the
automated interpretation and generation of natural language.

Symbolic NLP (1950s – early 1990s)

The premise of symbolic NLP is well-summarized by John Searle's Chinese room experiment:
Given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers),
the computer emulates natural language understanding (or other NLP tasks) by applying those
rules to the data it confronts.

1950s: The Georgetown experiment in 1954 involved fully automatic translation of more than
sixty Russian sentences into English. The authors claimed that within three to five years,
machine translation would be a solved problem. However, real progress was much slower, and
after the ALPAC report in 1966, which found that ten-year-long research had failed to fulfill the
expectations, funding for machine translation was dramatically reduced. Little further research
in machine translation was conducted until the late 1980s when the first statistical machine
translation systems were developed.

1960s: Some notably successful natural language processing systems developed in the 1960s
were SHRDLU, a natural language system working in restricted "blocks worlds" with restricted
vocabularies, and ELIZA, a simulation of a Rogerian psychotherapist, written by Joseph
Weizenbaum between 1964 and 1966. Using almost no information about human thought or
emotion, ELIZA sometimes provided a startlingly human-like interaction. When the "patient"
exceeded the very small knowledge base, ELIZA might provide a generic response, for example,
responding to "My head hurts" with "Why do you say your head hurts?".

1970s: During the 1970s, many programmers began to write "conceptual ontologies", which
structured real-world information into computer-understandable data. Examples are MARGIE
(Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin (Meehan, 1976),
QUALM (Lehnert, 1977), Politics (Carbonell, 1979), and Plot Units (Lehnert 1981). During this
time, many of the first chatterbots were written (e.g., PARRY).

1980s: The 1980s and early 1990s mark the heyday of symbolic methods in NLP. Focus areas of
the time included research on rule-based parsing (e.g., the development of HPSG as a
computational operationalization of generative grammar), morphology (e.g., two-level
morphology), semantics (e.g., the Lesk algorithm), reference (e.g., within Centering Theory),
and other areas of natural language understanding (e.g., Rhetorical Structure Theory). Other
lines of research were continued, e.g., the development of chatterbots with Racter and
Jabberwacky. An important development (that eventually led to the statistical turn in the
1990s) was the rising importance of quantitative evaluation in this period.[5]

Statistical NLP (1990s–2010s)

Up to the 1980s, most natural language processing systems were based on complex sets of
hand-written rules. Starting in the late 1980s, however, there was a revolution in natural
language processing with the introduction of machine learning algorithms for language
processing. This was due to both the steady increase in computational power (see Moore's law)
and the gradual lessening of the dominance of Chomskyan theories of linguistics (e.g.
transformational grammar), whose theoretical underpinnings discouraged the sort of corpus
linguistics that underlies the machine-learning approach to language processing.[6]

1990s: Many of the notable early successes on statistical methods in NLP occurred in the field
of machine translation, due especially to work at IBM Research. These systems were able to
take advantage of existing multilingual textual corpora that had been produced by the
Parliament of Canada and the European Union as a result of laws calling for the translation of
all governmental proceedings into all official languages of the corresponding systems of
government. However, most other systems depended on corpora specifically developed for the
tasks implemented by these systems, which was (and often continues to be) a major limitation
in the success of these systems. As a result, a great deal of research has gone into methods of
more effectively learning from limited amounts of data.

2000s: With the growth of the web, increasing amounts of raw (unannotated) language data
have become available since the mid-1990s. Research has thus increasingly focused on
unsupervised and semi-supervised learning
algorithms. Such algorithms can learn from data that has not been hand-annotated with the
desired answers or using a combination of annotated and non-annotated data. Generally, this
task is much more difficult than supervised learning, and typically produces less accurate results
for a given amount of input data. However, there is an enormous amount of non-annotated
data available (including, among other things, the entire content of the World Wide Web),
which can often make up for the inferior results if the algorithm used has a low enough time
complexity to be practical.

Neural NLP (present)

In the 2010s, representation learning and deep neural network-style machine learning methods
became widespread in natural language processing. That popularity was due partly to a flurry of
results showing that such techniques can achieve state-of-the-art results in many natural
language tasks, e.g., in language modeling and parsing. This is increasingly important in
medicine and healthcare, where NLP helps analyze notes and text in electronic health records
that would otherwise be inaccessible for study when seeking to improve care.

Formal grammar

Formal grammar is a set of rules for rewriting strings, along with a "start symbol" from which
rewriting starts. Therefore, a grammar is usually thought of as a language generator. However,
it can also sometimes be used as the basis for a "recognizer"—a function in computing that
determines whether a given string belongs to the language or is grammatically incorrect. To
describe such recognizers, formal language theory uses separate formalisms, known as
automata theory. One of the interesting results of automata theory is that it is not possible to
design a recognizer for certain formal languages.[1]

Parsing is the process of recognizing an utterance (a string in natural language) by breaking it
down into a set of symbols and analyzing
each one against the grammar of the language. Most languages have the meanings of their
utterances structured according to their syntax—a practice known as compositional semantics.
As a result, the first step to describing the meaning of an utterance in language is to break it
down part by part and look at its analyzed form (known as its parse tree in computer science,
and as its deep structure in generative grammar).
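
As a hedged sketch, a small set of rewriting rules with a start symbol, together with a recognizer, can be written using the NLTK library (assuming NLTK is installed; the toy grammar and sentence are invented for illustration):

import nltk

# Rewriting rules plus the start symbol S define a toy formal grammar.
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the'
    N  -> 'dog' | 'cat'
    V  -> 'chased'
""")

# A chart parser acts as a recognizer: it yields a parse tree for each
# way the grammar can derive the string, and nothing if it cannot.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased the cat".split()):
    print(tree)
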
WordNet

WordNet is a lexical database of semantic relations between words in more than 200
languages. WordNet links words into semantic relations including synonyms, hyponyms, and
meronyms. The synonyms are grouped into synsets with short definitions and usage examples.
WordNet can thus be seen as a combination and extension of a dictionary and thesaurus. While
it is accessible to human users via a web browser, its primary use is in automatic text analysis
and artificial intelligence applications. WordNet was first created in the English language[4] and
the English WordNet database and software tools have been released under a BSD-style license
and are freely available for download from the WordNet website.
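
A small example of that automatic use, assuming NLTK is installed and the WordNet corpus has been downloaded (via nltk.download('wordnet')); the query word is arbitrary:

from nltk.corpus import wordnet as wn

# Synsets group synonyms and carry short definitions; hypernyms are
# the inverse of the hyponym ("is-a") relation mentioned above.
for synset in wn.synsets("car")[:2]:
    print(synset.name(), "-", synset.definition())
    print("  synonyms: ", [lemma.name() for lemma in synset.lemmas()])
    print("  hypernyms:", [h.name() for h in synset.hypernyms()])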
