Comprehensive Guide to Artificial Intelligence
Amar Jukuntla
Contents
1 Introduction to Artificial Intelligence 1
1.1 Human Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Limitations of Human Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Need for Artificial Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.4 Philosophy of AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.5 Defining AI: Thinking vs. Acting, Humanly vs. Rationally . . . . . . . . . . . . . . . . 2
1.6 History of AI in NASA and DARPA (2000s) . . . . . . . . . . . . . . . . . . . . . . . . 3
1.6.1 NASA’s AI Contributions in the 2000s . . . . . . . . . . . . . . . . . . . . . . . 3
1.6.2 DARPA’s AI Contributions in the 2000s . . . . . . . . . . . . . . . . . . . . . . 3
1.7 Brief History of Artificial Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.8 Present State of AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.9 Types of AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.10 Examples of AI Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.11 AI Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.12 Examples of AI Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Problem Solving 24
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4 Knowledge Representation 70
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2 The Role of Knowledge Representation in AI . . . . . . . . . . . . . . . . . . . . . . . 70
4.3 Knowledge vs. Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4 Importance of Logic in Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 Types of Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.6 Characteristics of a Good Knowledge Representation System . . . . . . . . . . . . . . . 73
4.7 Knowledge Representation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.7.1 Logic-Based Knowledge Representation . . . . . . . . . . . . . . . . . . . . . . 75
4.7.2 Semantic Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.7.3 Frames and Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.7.4 Production Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.8 Logic in Knowledge Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.8.1 Propositional Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.8.2 First-Order Logic (FOL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.8.3 Other Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.9 Inference Mechanisms in KR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.10 Practical Applications and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.11 Challenges and Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5 Propositional Logic 81
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2 Syntax and Semantics of Propositional Logic . . . . . . . . . . . . . . . . . . . . . . . 81
7 Planning 105
7.1 Planning Agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 What is Planning? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3 Why Planning? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.3.1 Key Characteristics of Planning . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.3.2 Application Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.4 Planning Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.5 The Language of Planning Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.5.1 Representation of States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.5.2 Representation of Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.5.3 Representation of Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.6 Example: Shopping Planning Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.6.1 Solution Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.6.2 Operator Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.7 Types of Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7.7.1 Classical Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7.7.2 Non-Classical Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7.8 Planning Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.8.1 State Space Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.8.2 Partial Order Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
• Cognitive Biases: Humans are prone to biases, such as confirmation bias, where they favor
information aligning with existing beliefs. For instance, a person might ignore scientific evidence
contradicting their views on a topic.
• Limited Processing Capacity: The human brain processes information slowly compared to
computers. For example, calculating the square root of a large number mentally is time-consuming
and error-prone.
• Memory Constraints: Humans often forget details or misremember events, unlike computers with
precise data storage. A student might struggle to recall specific historical dates during an exam.
• Fatigue and Emotional Influence: Human performance degrades with tiredness or emotional
stress. A tired surgeon may make errors during a complex procedure.
These limitations highlight the need for systems that can augment or surpass human capabilities in specific
domains.
CHAPTER 1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE 2
• Scalability: AI can analyze millions of data points, such as medical records, in seconds, far beyond
human capability.
• Precision: AI systems, like those used in autonomous vehicles, make split-second decisions based
on sensor data, reducing human error.
• Automation: Repetitive tasks, such as assembly line quality control, benefit from AI’s ability to
work tirelessly.
• Exploration: AI enables discoveries in fields like astronomy, where algorithms identify patterns in
star data that humans might miss.
For example, IBM’s Watson can diagnose diseases by analyzing patient data against thousands of medical
studies, assisting doctors in making informed decisions.
1.4 Philosophy of AI
The philosophy of AI explores fundamental questions about intelligence, consciousness, and ethics.
Drawing from Russell and Norvig, AI can be viewed through four perspectives: systems that think
like humans, act like humans, think rationally, or act rationally. Philosophers like John Searle question
whether AI can truly understand (e.g., the Chinese Room argument), while others, like Russell, emphasize
designing AI that aligns with human values. Ethical concerns include ensuring AI systems are fair and
transparent. For instance, biased AI in hiring algorithms could unfairly exclude candidates based on
flawed training data.
• Thinking Humanly: AI systems that mimic human thought processes, such as cognitive models or
neural networks. For example, early AI like ELIZA simulated human-like conversation, though it
lacked true understanding.
• Acting Humanly: AI systems that perform tasks indistinguishable from humans, as tested by the
Turing Test. For instance, modern chatbots like Siri aim to converse naturally, passing as human in
limited contexts.
• Thinking Rationally: AI systems that use logical reasoning to solve problems, such as expert
systems. For example, MYCIN diagnosed infections using rule-based logic, achieving accuracy
comparable to human experts.
• Acting Rationally: AI systems that act to achieve goals efficiently, regardless of human-like thought.
Autonomous vehicles, for instance, navigate roads by optimizing safety and efficiency, not by
mimicking human reasoning.
These distinctions guide AI development, balancing human-like behavior with rational efficiency. The
following diagram illustrates these approaches:
[Diagram: AI Definitions (thinking vs. acting, humanly vs. rationally)]
DARPA's Grand Challenge (2004-2005) spurred AI techniques like machine learning, probabilistic robotics, and real-time collision avoidance. These
advancements influenced modern self-driving cars. DARPA also funded the Urban Challenge (2007),
where autonomous vehicles navigated urban environments, obeying traffic rules, using AI for real-time
decision-making and sensor fusion. Additionally, DARPA’s investment in natural language processing led
to technologies like Siri, a spinoff from SRI International’s AI research.
[Diagram: AI in the 2000s, showing NASA and DARPA contributions]
• 1943: Warren McCulloch and Walter Pitts proposed a model of artificial neurons, laying the
foundation for neural networks.
• 1950: Alan Turing introduced the concept of a machine that could simulate any human intelligence,
proposing the famous Turing Test to evaluate machine intelligence.
• 1956: The term “Artificial Intelligence” was coined at the Dartmouth Conference, considered the
birth of AI as a field.
• 1980s: The AI Winter period due to reduced funding and unmet expectations; however, significant
progress in machine learning and knowledge representation continued.
• 1997: IBM’s Deep Blue defeated the reigning world chess champion Garry Kasparov, showcasing
AI’s ability to tackle complex strategy games.
• 2010s: Breakthroughs in deep learning using GPUs led to rapid progress in image and speech
recognition.
• Present and Future: AI continues to grow in diverse applications such as autonomous vehicles,
natural language processing, and medical diagnostics.
• Generative AI: Tools like ChatGPT and DALL-E create human-like text and images, used in
creative industries and customer service. For example, AI-generated art is now featured in galleries.
• Autonomous Systems: Self-driving cars (e.g., Waymo) and drones use AI for real-time navigation
and decision-making, building on DARPA’s Grand Challenge.
• Healthcare: AI systems like AlphaFold solve complex problems like protein folding, while
diagnostic tools outperform human doctors in detecting certain cancers.
• Explainable AI: DARPA’s XAI program (2017–2021) developed techniques to make AI decisions
transparent, crucial for military and medical applications.
• Ethical Challenges: Bias in AI models and job displacement remain concerns. For instance, facial
recognition systems have faced scrutiny for misidentification errors.
NASA continues to use AI for Earth observation (e.g., NISAR satellite, launched 2025, for monitoring
surface changes) and space exploration, while DARPA’s AI Next campaign (2018–present) invests in
trustworthy AI, such as the ACE program, which tested AI-piloted F-16s in 2024. The following diagram
illustrates current AI trends:
[Diagram: AI trends in 2025]
1.9 Types of AI
AI can be classified into types based on its capabilities and functionality, ranging from narrow systems
designed for specific tasks to theoretical systems with broad cognitive abilities. Drawing from Russell
and Norvig and contemporary classifications, the main types of AI are Narrow AI, General AI, and
Superintelligent AI, each with distinct characteristics and examples.
• Narrow AI (Weak AI): Narrow AI is designed to perform specific tasks with high proficiency,
lacking general intelligence. It operates within a predefined scope and excels in specialized domains.
Examples include:
• Voice Assistants: Siri and Amazon’s Alexa use natural language processing to interpret and
respond to voice commands, such as setting reminders or answering queries about the weather.
• Recommendation Systems: Netflix and Spotify employ AI to analyze user preferences and
suggest movies or music. For instance, Netflix’s algorithm uses viewing history to recommend
shows like *Stranger Things*.
• Image Recognition: Google’s Vision AI identifies objects in images, such as detecting diabetic
retinopathy in medical scans with accuracy rivaling human experts.
• Autonomous Navigation: NASA’s Perseverance rover uses Narrow AI for terrain navigation,
selecting paths on Mars to avoid obstacles.
• Fraud Detection: PayPal’s AI systems analyze transaction patterns to flag suspicious activities,
preventing unauthorized payments in real-time.
• General AI (Strong AI): General AI aims to replicate human-like intelligence, capable of performing
any intellectual task a human can do across diverse domains. While still theoretical, research is
advancing toward this goal. Examples of progress include:
• Multitask Learning Models: Models like DeepMind’s Gato can perform multiple tasks, such
as playing games, answering questions, and controlling robotic arms, showing early versatility.
• Large Language Models: Systems like Grok (developed by xAI) demonstrate broad capabili-
ties in understanding and generating text across contexts, from answering scientific queries to
writing poetry, though they remain task-limited.
• Game-Playing AI: DeepMind’s AlphaCode solves complex programming problems, adapting
to new challenges in a way that mimics human problem-solving flexibility.
• Superintelligent AI: Superintelligent AI, a hypothetical future stage, surpasses human intelligence
in all domains, including creativity and problem-solving. It remains speculative but raises ethical
concerns, as discussed by Russell regarding AI alignment with human values. No concrete examples
exist, but theoretical scenarios include:
[Diagram: types of AI (Narrow, General, Superintelligent)]
• Image Recognition and Classification: AI can identify objects, people, or patterns in images
with high accuracy. For example, Google’s Vision AI can analyze medical images to detect
diabetic retinopathy, matching or surpassing human specialists’ accuracy. In retail, AI systems like
Amazon’s Rekognition identify products in photos, enabling automated inventory management.
• Natural Language Processing (NLP): AI excels at understanding and generating human language.
Chatbots like Grok (developed by xAI) answer complex queries in real-time, providing information
or assistance. Another example is Google Translate, which translates text across languages with
near-human fluency, using neural machine translation.
• Predictive Analytics: AI predicts outcomes based on historical data. In finance, algorithms like
those used by PayPal detect fraudulent transactions by analyzing patterns in payment data, flagging
anomalies in milliseconds. In weather forecasting, AI models like IBM’s GRAF predict storms
with high precision by processing satellite and radar data.
• Game Playing and Strategy: AI systems excel in strategic tasks. DeepMind’s AlphaGo defeated
world champion Go players by learning optimal strategies through reinforcement learning. In video
games, AI controls non-player characters, adapting to player actions in real-time.
[Diagram: examples of AI tasks, including navigation]
1.11 AI Applications
AI has transformed numerous fields. Key applications include:
• Healthcare: AI systems like DeepMind’s AlphaFold solve protein folding problems, aiding drug
discovery.
• Finance: Fraud detection algorithms analyze transaction patterns to flag suspicious activity, e.g.,
detecting unauthorized credit card use.
• Education: Adaptive learning platforms tailor content to individual student needs, improving
engagement.
• Entertainment: Recommendation systems, like Netflix’s, suggest content based on user preferences.
[Diagram: AI application areas]
• Recommendation Engines: Services like Netflix, YouTube, Amazon, and Spotify use machine
learning models to analyze user behavior and preferences. Based on viewing or purchase history,
these systems recommend content or products, improving user engagement.
• Autonomous Vehicles: Companies like Tesla, Waymo, and Uber use AI to process data from
cameras, LiDAR, GPS, and sensors to navigate safely. These systems use deep learning for lane
detection, obstacle avoidance, and real-time decision-making.
• AI in Games: Games like Chess, Go, and StarCraft have seen AI competitors outperforming
humans. AlphaGo (by DeepMind) defeated world champions in Go, showcasing AI’s capability in
strategic planning and complex decision-making.
• Chatbots and Virtual Agents: Used in customer service (e.g., Drift, Intercom, IBM Watson
Assistant), these systems handle inquiries, book tickets, or offer product support by processing user
input through NLP and responding contextually.
• Healthcare AI Systems: AI is used for medical imaging analysis, predicting patient deterioration,
drug discovery, and robotic surgeries. Systems like IBM Watson for Oncology assist doctors in
diagnosis and treatment planning.
• Finance and Banking AI: Applications include fraud detection, credit scoring, algorithmic trading,
and chatbots for customer queries (e.g., Erica by Bank of America).
• Smart Home Devices: Thermostats (Nest), lighting systems, and security devices use AI to learn
usage patterns and optimize energy or security.
• AI in Education: Platforms like Coursera and Khan Academy use AI to personalize learning paths.
Intelligent tutoring systems assess student performance and adapt lesson delivery.
• Agricultural AI: AI-driven drones and IoT sensors monitor crops, predict yields, and detect diseases.
Platforms like Blue River Technology use computer vision to apply herbicides precisely.
Real-time AI systems process data and make decisions instantaneously, critical in applications like:
• Autonomous Vehicles: Self-driving cars, such as Tesla’s, use AI to process sensor data (e.g.,
LIDAR, cameras) to navigate roads in real-time, avoiding obstacles.
• Smart Assistants: Devices like Amazon’s Alexa process voice commands instantly, using natural
language processing to respond to user queries.
For example, in autonomous vehicles, AI interprets sensor inputs, predicts pedestrian movements, and
adjusts speed within milliseconds.
Chapter 2
Rational Agents and Environments
2.1 Rationality
Rationality in AI refers to the ability of an agent to make decisions that maximize the expected outcome
of its actions, given its knowledge and goals. A rational agent acts to achieve the best possible results or,
when uncertainty is present, the best expected results, as defined by a performance measure. Rationality is
context-dependent, relying on the agent’s perceptions, available actions, and environment. For example, a
chess-playing AI like Deep Blue acts rationally by selecting moves that maximize its chances of winning,
based on a predefined evaluation function.
Definition of Rationality
In Artificial Intelligence, rationality is the property of an agent that chooses actions expected to
maximize its performance measure, based on the percepts and knowledge it has at the time of acting.
A rational agent is not omniscient but always aims to make the best possible decision given the
available information.
• Autonomous Vehicles: A self-driving car, such as Waymo’s, uses AI to choose driving actions (e.g.,
braking, turning) to maximize safety and efficiency. It processes sensor data (LIDAR, cameras) to
avoid obstacles and follow traffic rules, optimizing a performance measure of safe arrival time.
• Medical Diagnosis Systems: IBM’s Watson diagnoses diseases by analyzing patient data against
medical literature. It selects the diagnosis with the highest probability of correctness, maximizing
patient outcomes. For instance, Watson might recommend cancer treatment based on genetic
markers, optimizing survival rates.
• Game-Playing AI: DeepMind’s AlphaGo selects moves in Go to maximize its winning probability.
It evaluates board states using a neural network, choosing actions that lead to optimal long-term
outcomes, as demonstrated in its 2016 victory over Lee Sedol.
CHAPTER 2. RATIONAL AGENTS AND ENVIRONMENTS 11
• Logistics Optimization: Amazon’s warehouse robots use AI to optimize package delivery paths.
They choose routes that minimize travel time and energy use, ensuring efficient order fulfillment.
2.3 Agent
An agent is anything that perceives its environment through sensors and acts upon it through actuators
to achieve goals. In AI, agents are computational systems designed to interact with their surroundings
intelligently. For example, a thermostat is a simple agent that senses room temperature and adjusts heating
to maintain a target temperature.
Definition of an Agent in AI
In Artificial Intelligence, an agent is an entity that perceives its environment through sensors and
acts upon that environment through actuators.
An agent’s behavior is defined by the agent function, which maps a sequence of percepts (the
percept history) to an action:
𝑓 : 𝑃∗ → 𝐴
where 𝑃∗ is the set of all percept sequences and 𝐴 is the set of actions.
Examples:
• A software agent that perceives keyboard input and acts by displaying responses.
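As a toy illustration of the agent function for the keyboard example above, the sketch below maps a percept history to an action (the inputs and responses are invented for illustration):

```python
# Sketch of an agent function f : P* -> A.
# The agent sees the full percept history and returns an action.

def agent_function(percept_history):
    """Map a sequence of percepts (keystrokes) to an action (a response)."""
    last = percept_history[-1]  # most recent percept
    if last == "hello":
        return "display: Hi there!"
    elif last == "quit":
        return "display: Goodbye."
    else:
        return "display: I did not understand."

# The percept history grows with every interaction:
history = []
for key_input in ["hello", "weather?", "quit"]:
    history.append(key_input)
    action = agent_function(history)
```

Here only the latest percept matters, but because the function formally receives the whole history, richer agents could condition on past percepts as well.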
Agent Architecture
The architecture of an agent defines its internal structure, integrating sensors, actuators, and decision-
making components. A typical agent architecture includes:
• Sensors: Devices or algorithms that perceive the environment (e.g., cameras in a robot).
• Actuators: Components that carry out actions in the environment (e.g., motors, steering, brakes).
• Processor: The computational unit that processes perceptions and selects actions based on an agent
program.
For example, in a self-driving car, cameras and LIDAR are sensors, steering and brakes are actuators, and
the onboard computer processes data to navigate.
Agent Program
The agent program is the software that implements the agent’s decision-making process, mapping
perceptions to actions. It defines how the agent behaves in response to environmental inputs. For example,
a chess AI’s agent program evaluates board positions and selects moves based on a minimax algorithm,
optimizing for a win. The program can be rule-based, learning-based, or a hybrid, depending on the
agent’s design.
Definition of Autonomy
Autonomy refers to the degree to which an AI agent can operate independently, making decisions
without human intervention based on its perceptions and internal state.
An autonomous agent is capable of performing tasks and achieving goals on its own, while adapting
to environmental changes and improving its behavior through learning and experience.
• Full autonomy: The agent operates completely independently (e.g., NASA’s Perseverance rover).
• Semi-autonomy: The agent acts independently but may require occasional human guidance (e.g.,
robotic surgery systems).
• Self-driving car: Perceives road conditions, navigates traffic, and makes driving decisions
independently.
• Intelligent personal assistants: Interpret commands and perform tasks without human intervention
(e.g., Siri, Alexa).
• Critical for applications like space exploration, military drones, and industrial automation.
• Sensors: Infrared sensors, bump sensors, and dirt detectors perceive the environment (e.g., detecting
walls or dirt patches).
• Agent program: Processes sensor data to navigate rooms, avoid obstacles, and clean efficiently.
For example, it may follow a spiral pattern to cover a dirty area or reverse when hitting a wall.
• Goal: Maximize floor coverage and cleanliness while minimizing energy use.
• Autonomy: Operates without human intervention, dynamically adapting to room layouts and
obstacles.
This agent acts rationally by selecting actions (e.g., turning, cleaning) that optimize its performance
measure (clean floor area).
• Fully Observable: The agent’s sensors capture the complete state of the environment. For
example, a chess-playing AI has full access to the board state, enabling precise decision-making.
• Partially Observable: The agent has incomplete information. A self-driving car operates in a
partially observable environment, as it cannot see around corners or detect hidden pedestrians,
requiring probabilistic reasoning.
• Deterministic: The next state is fully determined by the current state and action. A robotic
arm assembling parts in a factory operates in a deterministic environment, where actions like
“move left” have predictable outcomes.
• Stochastic: Outcomes are uncertain. A weather forecasting AI deals with a stochastic envi-
ronment, as weather patterns involve randomness (e.g., predicting rain based on probabilistic
models).
• Static: The environment remains unchanged while the agent deliberates. A crossword-solving
AI works in a static environment, as the puzzle does not change during processing.
• Dynamic: The environment changes over time. A robotic vacuum cleaner operates in a
dynamic environment, as furniture may be moved or new dirt appears while it cleans.
• Discrete: The environment has a finite number of states and actions. A tic-tac-toe AI operates
in a discrete environment, with a limited set of board configurations.
• Continuous: The environment involves continuous variables. A drone navigating airspace
deals with continuous positions and velocities, requiring real-time control adjustments.
• Single-Agent: Only one agent interacts with the environment. A thermostat adjusting room
temperature is a single-agent system.
• Multi-Agent: Multiple agents interact, either cooperatively or competitively. In autonomous
vehicle fleets, cars coordinate to optimize traffic flow, acting as a multi-agent system.
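The environment properties above can be tabulated in code; the sketch below classifies a few example tasks along the standard dimensions (the labels are illustrative judgments, not definitive classifications):

```python
# Illustrative classification of example task environments along the
# standard dimensions (observability, determinism, static, discrete, agents).
ENVIRONMENTS = {
    "chess":        {"observable": "fully",     "deterministic": True,
                     "static": True,  "discrete": True,  "agents": "multi"},
    "self-driving": {"observable": "partially", "deterministic": False,
                     "static": False, "discrete": False, "agents": "multi"},
    "crossword":    {"observable": "fully",     "deterministic": True,
                     "static": True,  "discrete": True,  "agents": "single"},
    "robot vacuum": {"observable": "partially", "deterministic": False,
                     "static": False, "discrete": False, "agents": "single"},
}

def hardest(envs):
    """Count 'hard' properties; more of them means a harder environment."""
    def score(p):
        return sum([p["observable"] == "partially",
                    not p["deterministic"],
                    not p["static"],
                    not p["discrete"],
                    p["agents"] == "multi"])
    return max(envs, key=lambda name: score(envs[name]))
```

Partially observable, stochastic, dynamic, continuous, multi-agent environments (like driving) are the hardest case, which is why autonomous vehicles are such a demanding benchmark.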
Key Components:
• Condition-Action Rules: A rule base that maps percepts to actions (e.g., IF condition THEN
action).
Working Mechanism:
Formal Function:
Agent Function: 𝑓 : 𝑃 → 𝐴
SIMPLE-REFLEX-AGENT Algorithm
FUNCTION SIMPLE-REFLEX-AGENT(percept) RETURNS an action
PERSISTENT: rules, a set of condition–action rules
state ← INTERPRET-INPUT(percept)
rule ← RULE-MATCH(state, rules)
action ← rule.ACTION
RETURN action
Limitations:
Example: A vacuum robot turns left when it detects an obstacle in front; it does not remember whether
the obstacle was encountered before.
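The SIMPLE-REFLEX-AGENT pseudocode translates almost directly into Python; the condition-action rules below are invented for the vacuum example:

```python
# Direct Python rendering of SIMPLE-REFLEX-AGENT; rules are illustrative.

def interpret_input(percept):
    """Turn a raw percept into an abstract state description."""
    return percept  # here the percept already is the state, e.g. "dirty"

def rule_match(state, rules):
    """Return the action of the first rule whose condition matches."""
    for condition, action in rules:
        if condition == state:
            return action
    return "no-op"  # no rule fires

RULES = [("dirty", "suck"),
         ("obstacle-ahead", "turn-left"),
         ("clean", "move-forward")]

def simple_reflex_agent(percept, rules=RULES):
    state = interpret_input(percept)
    return rule_match(state, rules)
```

Note that the agent keeps no state between calls: the same percept always produces the same action, which is exactly the limitation discussed above.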
What is a Model?
In the context of AI agents, a model is a representation of how the environment works. It includes:
• Rules for predicting future percepts based on past actions and observations.
Working Principle
• The agent receives a percept from the environment.
• It chooses an action based on the current percept and the internal state derived from the model.
• Model: A map of the building and known rules about movement (e.g., “moving north from room A
leads to room B unless there is a wall”).
MODEL-BASED-REFLEX-AGENT Algorithm
FUNCTION MODEL-BASED-REFLEX-AGENT(percept) RETURNS an action
PERSISTENT:
state, the agent’s current conception of the world state
transition model, how the next state depends on the current state and action
sensor model, how the current world state is reflected in the agent’s percepts
rules, a set of condition–action rules
action, the most recent action (initially none)

state ← UPDATE-STATE(state, action, percept, transition model, sensor model)
rule ← RULE-MATCH(state, rules)
action ← rule.ACTION
RETURN action
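A minimal Python sketch of the same idea for the building-navigation example, with a hand-coded transition model (the room names and the toy action rule are illustrative):

```python
# Model-based reflex agent sketch: the agent tracks a world state it
# cannot always observe directly, using a transition model.

TRANSITIONS = {("A", "north"): "B",   # moving north from room A reaches B
               ("B", "south"): "A"}

class ModelBasedAgent:
    def __init__(self, initial_room):
        self.state = initial_room     # internal conception of the world
        self.last_action = None

    def update_state(self, percept):
        # Predict the new state from the model, then correct it with the
        # percept if the percept reveals where the agent actually is.
        predicted = TRANSITIONS.get((self.state, self.last_action), self.state)
        self.state = percept if percept is not None else predicted

    def act(self, percept):
        self.update_state(percept)
        action = "north" if self.state == "A" else "south"  # toy rule
        self.last_action = action
        return action
```

Even with no percepts at all (percept = None), the agent can keep acting sensibly because the model predicts how its own actions changed the world.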
Key Characteristics
• Uses a goal to evaluate actions.
• Involves search and planning to determine a sequence of actions that leads to the goal.
Architecture
A goal-based agent typically includes:
• Action Selector: Chooses actions that lead toward the goal using search or planning algorithms.
• Planning: Uses A* or Dijkstra’s algorithm to find the optimal route avoiding no-fly zones.
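The route-planning step can be sketched with Dijkstra's algorithm on a toy graph (waypoint names and distances are invented; no-fly zones are modeled simply by omitting their edges):

```python
import heapq

def dijkstra(graph, start, goal):
    """Return (cost, path) of a cheapest route, or (inf, []) if none exists."""
    frontier = [(0, start, [start])]   # priority queue of (cost, node, path)
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, step in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(frontier, (cost + step, neighbor, path + [neighbor]))
    return float("inf"), []

# Toy airspace graph: each edge is (neighbor, distance).
GRAPH = {"base": [("wp1", 4), ("wp2", 1)],
         "wp2":  [("wp1", 1), ("target", 6)],
         "wp1":  [("target", 3)]}
```

A* works the same way but adds a heuristic estimate of the remaining distance to the priority, which usually expands far fewer nodes.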
• Model-Based Agent: Maintains internal state but may still use fixed rules.
Advantages
• More flexible than rule-based agents.
Limitations
• Planning can be computationally expensive.
What is Utility?
Utility is a numerical representation of an agent’s preferences over possible outcomes. It helps the agent
to:
How It Works
• The agent perceives the environment.
• Utility Function: Combines safety, speed, comfort, fuel efficiency, and traffic rules into a single
score.
• Action: Choose route and speed that maximizes the total utility.
• Goal-Based Agent: Chooses actions to achieve goals, but doesn’t compare outcomes.
• Utility-Based Agent: Considers multiple goals and uncertainties, selecting the best possible action.
Advantages
• Makes informed decisions under uncertainty.
Limitations
• Defining an accurate and meaningful utility function is difficult.
Example: A stock-trading AI chooses actions that maximize expected returns while minimizing risk.
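A utility-based choice can be sketched as a weighted sum over factors like those in the driving example (all weights and factor scores below are invented for illustration):

```python
# Utility-based action selection: score each candidate action by a
# weighted sum of factors, then pick the maximum. All numbers invented.

WEIGHTS = {"safety": 0.5, "speed": 0.2, "comfort": 0.15, "fuel": 0.15}

def utility(outcome):
    """Combine per-factor scores (each in 0..1) into a single number."""
    return sum(WEIGHTS[f] * outcome[f] for f in WEIGHTS)

def choose_action(candidates):
    """candidates: {action_name: expected factor scores for that action}."""
    return max(candidates, key=lambda a: utility(candidates[a]))

routes = {
    "highway":  {"safety": 0.9, "speed": 0.9, "comfort": 0.8, "fuel": 0.6},
    "downtown": {"safety": 0.7, "speed": 0.4, "comfort": 0.5, "fuel": 0.7},
}
```

The hard part in practice is not the argmax but choosing the weights, which is exactly the difficulty of defining a meaningful utility function noted above.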
Figure 2.5: A model-based, utility-based agent. It uses a model of the world, along with a utility function
that measures its preferences among states of the world. Then it chooses the action that leads to the
best expected utility, where expected utility is computed by averaging over all possible outcome states,
weighted by the probability of the outcome.
• Learning Element: Responsible for improving the agent’s performance by learning from experience.
• Performance Element: Executes actions and makes decisions based on current knowledge.
• Problem Generator: Suggests new actions that lead to better exploration and learning.
• Initially, the agent has no knowledge of the game strategy and plays randomly.
• The learning element updates a table of state-action values based on the outcome.
• Over time, the agent learns to prefer actions that lead to winning and avoids those that result in
losing.
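The table-update step described above can be sketched as a bare-bones temporal-difference rule (the learning rate and exploration probability are illustrative choices, not fixed by the text):

```python
import random

# State-action value table for a learning agent; it starts empty, so the
# agent effectively plays randomly at first.
Q = {}
ALPHA = 0.5   # learning rate (illustrative)

def choose(state, actions, explore=0.1):
    """Pick the best-known action, with occasional random exploration."""
    if random.random() < explore or not any((state, a) in Q for a in actions):
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def learn(state, action, reward):
    """Move the stored value toward the observed reward (win=+1, loss=-1)."""
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward - old)
```

Repeated wins pull a state-action value toward +1 and losses toward -1, so over many games the `choose` step increasingly prefers winning moves, as described above.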
Each architecture type supports different capabilities. The choice depends on the environment
complexity, observability, and the need for adaptability.
Advantages
• Adaptability to dynamic or unknown environments.
• Applicable in various domains such as robotics, autonomous vehicles, and recommendation systems.
Types of AI Agents
AI agents are commonly categorized as follows:
• Simple Reflex Agents: Act solely based on the current percept, without considering the history of
percepts.
• Model-Based Reflex Agents: Maintain an internal model of the world to handle partially observable
environments.
• Utility-Based Agents: Select actions based on a utility function that evaluates different possible
outcomes.
• Learning Agents: Improve their performance over time by learning from experience.
Key Components
An intelligent agent typically consists of:
AI agents are foundational in various applications, including robotics, autonomous vehicles, recom-
mendation systems, and personal assistants. As research advances, agents become more autonomous,
adaptive, and capable of operating in complex, dynamic environments.
Chapter 3
Problem Solving
3.1 Introduction
In Artificial Intelligence (AI), many problems can be formulated and solved using the concept of a State
Space. This representation provides a formal model of the problem, allowing AI algorithms to explore
possible solutions through various search techniques.
4. Transition Model (𝑇 (𝑠, 𝑎) → 𝑠′): Describes the result of applying action 𝑎 in state 𝑠.
5. Goal Test: A function that checks whether a given state is a goal state.
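These components can be bundled into a single problem definition; the sketch below uses generic names (not from any particular library) and a deliberately trivial two-state example:

```python
# Generic search-problem formulation: initial state, actions, transition
# model T(s, a) -> s', and a goal test. Names are generic placeholders.

class SearchProblem:
    def __init__(self, initial, actions, transition, goal_states):
        self.initial = initial
        self._actions = actions          # state -> list of applicable actions
        self._transition = transition    # (state, action) -> next state
        self._goals = set(goal_states)

    def actions(self, state):
        return self._actions.get(state, [])

    def result(self, state, action):
        return self._transition[(state, action)]

    def goal_test(self, state):
        return state in self._goals

# A trivial example: move "right" from state 0 to the goal state 1.
problem = SearchProblem(
    initial=0,
    actions={0: ["right"]},
    transition={(0, "right"): 1},
    goal_states=[1],
)
```

Any of the search algorithms in this chapter only ever touch a problem through these four methods, which is what makes them reusable across domains.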
CHAPTER 3. PROBLEM SOLVING 25
     Start
     /   \
    A     B
   / \   / \
  C   D E   F
Each node represents a state. The edges show actions or transitions. Tree search algorithms like BFS
and DFS explore this space.
Applications in AI
State space problems are foundational to many AI applications, such as:
These algorithms systematically explore the state space to find paths from the initial state to a goal
state.
Working Principle
Breadth-First Search (BFS) starts from the initial node (root), explores all of its neighbors, then moves to the neighbors’ neighbors, and so on.
It uses a FIFO (First-In-First-Out) queue to keep track of nodes to be explored next.
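As a minimal sketch, this FIFO-queue loop can be written as follows (the graph literal matches the small example used in this section, with nodes S, A, B, C, G; function names are illustrative):

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: explore level by level using a FIFO queue."""
    frontier = deque([[start]])      # queue of paths, oldest first
    explored = {start}
    while frontier:
        path = frontier.popleft()    # FIFO: shallowest path first
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in explored:
                explored.add(neighbor)
                frontier.append(path + [neighbor])
    return None

graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": ["G"]}
print(bfs(graph, "S", "G"))  # ['S', 'B', 'G']
```

Because BFS expands level by level, the two-edge path through B is found before the three-edge path through A and C.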
Properties:
• Time Complexity: O(b^d), where b is the branching factor and d is the depth of the shallowest goal.
Limitations:
Example graph:
• Nodes: S, A, B, C, G.
• Edges: S to A, S to B, A to C, B to G, C to G.
Step-by-Step Process:
Step-by-Step Example
Consider the following graph/tree: a root node expanding to B and C, where B expands to D and E, and C expands to F and G.
Examples
Example 1: 8-Puzzle
The 8-puzzle is a 3x3 grid with 8 numbered tiles and a blank space, where the goal is to rearrange
the tiles to a target configuration by sliding the blank.
Problem Setup
• States: All possible tile arrangements (9!/2 = 181, 440 solvable states).
• Initial State:
2 8 3
1 6 4
7 _ 5
• Goal State:
1 2 3
4 5 6
7 8 _
• Actions: Move the blank up, down, left, or right (if valid).
• Path Cost: Each move costs 1.
Step-by-Step BFS
1. Depth 0: Start with initial state (2, 8, 3, 1, 6, 4, 7, _, 5). Not the goal. Enqueue successors:
• Move blank up: (2, 8, 3, 1, _, 4, 7, 6, 5).
• Move blank left: (2, 8, 3, 1, 6, 4, _, 7, 5).
• Move blank right: (2, 8, 3, 1, 6, 4, 7, 5, _).
Frontier: [(2, 8, 3, 1, _, 4, 7, 6, 5), (2, 8, 3, 1, 6, 4, _, 7, 5), (2, 8, 3, 1, 6, 4, 7, 5, _)].
2. Depth 1: Dequeue (2, 8, 3, 1, _, 4, 7, 6, 5). Not the goal. Generate successors:
• Move blank up: (2, _, 3, 1, 8, 4, 7, 6, 5).
• Move blank down: (2, 8, 3, 1, 6, 4, 7, _, 5) (already explored, skip).
• Move blank left: (2, 8, 3, _, 1, 4, 7, 6, 5).
• Move blank right: (2, 8, 3, 1, 4, _, 7, 6, 5).
Frontier: [(2, 8, 3, 1, 6, 4, _, 7, 5), (2, 8, 3, 1, 6, 4, 7, 5, _), (2, _, 3, 1, 8, 4, 7, 6, 5), (2, 8, 3, _, 1, 4, 7, 6, 5), (2, 8, 3, 1, 4, _, 7, 6, 5)].
3. Continue exploring all depth 1 nodes, then depth 2, and so on.
4. Suppose the goal is found at depth 10 (a typical solution length for a scrambled 8-puzzle).
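The successor generation used at each depth can be sketched as follows (states are 9-tuples read row by row, with "_" marking the blank; the function name is illustrative):

```python
def successors(state):
    """Generate 8-puzzle successors by sliding the blank ('_')
    up, down, left, or right within the 3x3 grid."""
    i = state.index("_")
    row, col = divmod(i, 3)
    moves = {"up": (row - 1, col), "down": (row + 1, col),
             "left": (row, col - 1), "right": (row, col + 1)}
    result = {}
    for name, (r, c) in moves.items():
        if 0 <= r < 3 and 0 <= c < 3:    # discard moves that leave the grid
            j = 3 * r + c
            s = list(state)
            s[i], s[j] = s[j], s[i]      # swap the blank with the adjacent tile
            result[name] = tuple(s)
    return result

start = (2, 8, 3, 1, 6, 4, 7, "_", 5)
for move, s in successors(start).items():
    print(move, s)                       # up, left, and right are legal here
```

With the blank on the bottom edge, "down" is rejected, leaving exactly the three successors enqueued at depth 0.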
Example 2: Water Jug Problem
The water jug problem involves measuring exactly 2 liters using a 4-liter jug (A) and a 3-liter jug (B).
Problem Setup
• States: (𝑎, 𝑏), where 𝑎 ∈ [0, 4] is water in Jug A, 𝑏 ∈ [0, 3] is water in Jug B.
• Initial State: (0, 0).
• Goal State: (2, 𝑏) or (𝑎, 2).
• Actions:
1. Fill A: (𝑎, 𝑏) → (4, 𝑏).
2. Fill B: (𝑎, 𝑏) → (𝑎, 3).
3. Empty A: (𝑎, 𝑏) → (0, 𝑏).
4. Empty B: (𝑎, 𝑏) → (𝑎, 0).
5. Pour A to B: (𝑎, 𝑏) → (𝑎 − min(𝑎, 3 − 𝑏), 𝑏 + min(𝑎, 3 − 𝑏)).
6. Pour B to A: (𝑎, 𝑏) → (𝑎 + min(𝑏, 4 − 𝑎), 𝑏 − min(𝑏, 4 − 𝑎)).
• Path Cost: Each action costs 1.
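A complete BFS solver for this formulation can be sketched as follows (function and action names are illustrative; the action set is exactly the six actions listed above):

```python
from collections import deque

def water_jug_bfs(cap_a=4, cap_b=3, target=2):
    """BFS over (a, b) states; goal: either jug holds `target` liters."""
    def actions(a, b):
        pour_ab = min(a, cap_b - b)      # amount moved when pouring A into B
        pour_ba = min(b, cap_a - a)      # amount moved when pouring B into A
        return [("Fill A", (cap_a, b)), ("Fill B", (a, cap_b)),
                ("Empty A", (0, b)), ("Empty B", (a, 0)),
                ("Pour A to B", (a - pour_ab, b + pour_ab)),
                ("Pour B to A", (a + pour_ba, b - pour_ba))]

    start = (0, 0)
    frontier = deque([(start, [])])      # (state, action sequence so far)
    explored = {start}
    while frontier:
        (a, b), plan = frontier.popleft()
        if a == target or b == target:
            return plan, (a, b)
        for name, nxt in actions(a, b):
            if nxt not in explored:
                explored.add(nxt)
                frontier.append((nxt, plan + [name]))
    return None

plan, state = water_jug_bfs()
print(plan, state)                       # a 4-action plan reaching (4, 2)
```

Because BFS explores by depth, the returned plan (Fill B, Pour B to A, Fill B, Pour B to A) is a shortest action sequence for this action set.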
Step-by-Step BFS
1. Depth 0: Start with (0, 0). Not the goal. Enqueue successors:
• Fill A: (4, 0).
• Fill B: (0, 3).
(The empty and pour actions leave (0, 0) unchanged.)
Frontier: [(4, 0), (0, 3)].
2. Depth 1: Dequeue (4, 0). Not the goal. New successors:
• Fill B: (4, 3).
• Pour A to B: (1, 3).
(The other actions revisit (4, 0) or (0, 0).)
Frontier: [(0, 3), (4, 3), (1, 3)].
3. Dequeue (0, 3). Not the goal. New successor:
• Pour B to A: (3, 0).
(Fill A yields (4, 3), already in the frontier.)
Frontier: [(4, 3), (1, 3), (3, 0)].
4. Depth 2: Dequeue (4, 3). Not the goal. Every successor has already been generated. Frontier: [(1, 3), (3, 0)].
5. Dequeue (1, 3). Not the goal. New successor:
• Empty B: (1, 0).
Frontier: [(3, 0), (1, 0)].
6. Dequeue (3, 0). Not the goal. New successor:
• Fill B: (3, 3).
Frontier: [(1, 0), (3, 3)].
7. Depth 3: Dequeue (1, 0). Not the goal. New successor:
• Pour A to B: (0, 1).
Frontier: [(3, 3), (0, 1)].
8. Dequeue (3, 3). Pouring B to A yields (4, 2), which satisfies the goal (b = 2).
Result: BFS finds the four-move solution (0, 0) → (0, 3) → (3, 0) → (3, 3) → (4, 2): Fill B, Pour B to A, Fill B, Pour B to A.
Consider a 4x4 grid maze with the agent starting at (0,0) and aiming for (3,3). Walls block some
paths (e.g., at (1,1), (2,1)).
Problem Setup
Step-by-Step BFS
Uniform Cost Search (UCS) is a blind (uninformed) search algorithm used in Artificial Intelligence
to find the optimal path from a start node to a goal node in a weighted graph or state space. It is an
extension of Breadth-First Search (BFS) that accounts for varying costs of edges, ensuring the path
with the lowest total cost is found. UCS is particularly useful in problems where the cost of actions
varies, such as pathfinding in navigation systems or resource allocation problems.
UCS explores nodes in the order of their cumulative path cost from the start node. Unlike BFS,
which assumes uniform edge costs, UCS prioritizes nodes based on the total cost of the path from
the start node to the current node. It uses a priority queue to always expand the node with the lowest
cumulative cost, ensuring optimality in terms of path cost.
Key Features
• Optimality: UCS guarantees the optimal (least-cost) path to the goal in a weighted graph,
provided all edge costs are non-negative.
• Completeness: UCS is complete, meaning it will find a solution if one exists, assuming the
graph is finite and all costs are positive.
• Exploration Strategy: It explores nodes in order of increasing path cost, using a priority
queue.
• Uninformed: UCS does not use heuristic information about the goal, making it a blind search
algorithm.
Algorithm
Explanation of Steps
• Priority Queue: The priority queue 𝑄 stores nodes with their cumulative path costs 𝑔(𝑛),
where 𝑔(𝑛) is the total cost from the start node to node 𝑛. Nodes are dequeued in order of
increasing 𝑔(𝑛).
• Visited Set: The set 𝑉 tracks explored nodes to avoid cycles and redundant exploration.
• Cost Function: The cost of a path to a node 𝑚 through node 𝑛 is computed as 𝑔(𝑚) =
𝑔(𝑛) + cost(𝑛, 𝑚), where cost(𝑛, 𝑚) is the edge cost between 𝑛 and 𝑚.
• Optimality Check: When a node is dequeued, its cost is the lowest possible for that node,
ensuring the first time the goal is reached, the path is optimal.
Properties of UCS
• Completeness: Guaranteed if the graph is finite and all edge costs are positive (cost(𝑛, 𝑚) > 0).
• Optimality: Finds the least-cost path if all edge costs are non-negative (cost(𝑛, 𝑚) ≥ 0).
• Time Complexity: O(b^(1+⌊C/ε⌋)), where b is the branching factor, C is the cost of the optimal path, and ε is the minimum edge cost. In practice, it can be exponential in the worst case.
• Space Complexity: O(b^(1+⌊C/ε⌋)), as it stores all nodes in the priority queue.
Consider a graph representing cities and roads with distances as edge costs. The goal is to find the
shortest path from city 𝑆 (Start) to city 𝐺 (Goal).
Graph Description
The graph has the following nodes and edges with costs:
• 𝑆 → 𝐴: Cost = 2
• 𝑆 → 𝐵: Cost = 5
• 𝐴 → 𝐵: Cost = 1
• 𝐴 → 𝐺: Cost = 6
• 𝐵 → 𝐺: Cost = 3
[Figure: the graph with weighted edges S → A (2), S → B (5), A → B (1), A → G (6), B → G (3).]
Step-by-Step Execution
Result
The optimal path is S → A → B → G with a total cost of 2 + 1 + 3 = 6. Note that the path S → A → G has a cost of 8, which is suboptimal.
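A sketch of UCS in Python (using a binary heap as the priority queue; names are illustrative) finds the least-cost path on this example graph:

```python
import heapq

def ucs(graph, start, goal):
    """Uniform Cost Search: always expand the frontier node with the
    lowest cumulative path cost g(n), via a priority queue."""
    frontier = [(0, start, [start])]     # entries: (g, node, path)
    best_g = {start: 0}
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path               # first dequeue of the goal is optimal
        for neighbor, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(frontier, (new_g, neighbor, path + [neighbor]))
    return None

graph = {"S": [("A", 2), ("B", 5)], "A": [("B", 1), ("G", 6)], "B": [("G", 3)]}
print(ucs(graph, "S", "G"))  # (6, ['S', 'A', 'B', 'G'])
```

Note how the cheaper route to B (via A, cost 3) displaces the direct edge S → B (cost 5) before G is ever expanded.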
UCS can also be applied to state-space search problems, such as the 8-puzzle, where the goal is to
rearrange tiles from an initial configuration to a goal configuration. Each move (sliding a tile into
the blank space) has a cost based on the tile moved.
Problem Description
Initial State:
1 2 3
4 0 5
6 7 8
Goal State:
1 2 3
4 5 8
6 7 0
Each move’s cost is equal to the number on the tile moved (e.g., moving tile 5 costs 5).
Step-by-Step Execution
1. Initialize: Start with the initial state, 𝑔(initial) = 0. 𝑄 = [(initial, 0)], 𝑉 = {}.
2. Dequeue initial state. Possible moves: Move tile 4 (cost 4) or tile 5 (cost 5) into the blank
space. New states:
• State 1 (move 4 left):
1 2 3
0 4 5
6 7 8
with g = 4.
• State 2 (move 5 right):
1 2 3
4 5 0
6 7 8
with g = 5.
Q = [(State 1, 4), (State 2, 5)].
3. Dequeue State 1 (𝑔 = 4). Expand to new states (e.g., move tile 1, 4, or 6). Continue exploring.
4. Eventually, State 2 leads to the goal state by moving tile 8 (cost 8): 𝑔 = 5 + 8 = 13.
Result
The optimal sequence is: Move tile 5 right (cost 5), then move tile 8 down (cost 8). Total cost = 13.
Applications of UCS
Advantages
Limitations
Depth-First Search (DFS) is a graph traversal algorithm that explores a path as deep as possible
before backtracking. It uses a stack (either explicitly or via recursion) to keep track of the nodes to
visit next.
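A sketch of DFS with an explicit stack (the graph literal is illustrative, arranged so the alphabetical expansion order used later in this section is respected):

```python
def dfs(graph, start, goal):
    """Depth-first search with an explicit LIFO stack of paths."""
    stack = [[start]]
    visited = set()
    while stack:
        path = stack.pop()               # LIFO: most recently pushed path first
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # push neighbors in reverse so they are explored in listed order
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                stack.append(path + [neighbor])
    return None

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "E": ["G"], "F": ["G"]}
print(dfs(graph, "A", "G"))  # ['A', 'B', 'E', 'G']
```

DFS commits to the A → B branch first and only backtracks after exhausting D, so G is reached through E rather than through C and F.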
Algorithm Steps
Key Characteristics
Maze grid (S = start, G = goal, W = wall):
S . . .
. W . .
. W . .
. . . G
Graph Structure
[Figure: example graph with nodes A through G, rooted at A with children B and C.]
DFS Setup
• Start Node: A
• Goal Node: G
• Traversal Order: Alphabetical (e.g., B before C, E before F)
• Data Structure: Stack
Step-by-Step Execution
Result
Notes
Depth-Limited Search (DLS) is a variation of Depth-First Search that limits the depth of the search
tree. It explores paths up to a depth limit ℓ.
Key idea: Prevents infinite loops in deep or infinite graphs by stopping exploration beyond a given
depth.
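A recursive sketch of DLS (the graph literal is illustrative, with the goal G placed at depth 4 so that a limit of 3 produces a cutoff, mirroring the example below):

```python
def dls(graph, node, goal, limit, path=None):
    """Depth-limited DFS: recurse only while depth budget remains.
    Returns the path to the goal, 'cutoff' if the limit was hit, else None."""
    if path is None:
        path = [node]
    if node == goal:
        return path
    if limit == 0:
        return "cutoff"                  # depth limit reached on this branch
    cutoff = False
    for neighbor in graph.get(node, []):
        result = dls(graph, neighbor, goal, limit - 1, path + [neighbor])
        if result == "cutoff":
            cutoff = True
        elif result is not None:
            return result
    return "cutoff" if cutoff else None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["F"], "F": ["G"]}
print(dls(graph, "A", "G", 3))  # 'cutoff'  (G lies at depth 4)
print(dls(graph, "A", "G", 4))  # ['A', 'C', 'E', 'F', 'G']
```

Distinguishing "cutoff" from plain failure matters: it tells the caller that a deeper limit might still succeed.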
Algorithm Overview
Example Graph
[Figure: example graph rooted at A with children B and C; deeper levels include D, E, and F, with the goal G at depth 4 below F.]
Problem Setup
• Start node: A
• Goal node: G
• Depth limit: ℓ = 3
• Expansion order: Left to right (e.g., B before C)
7. Pop F (depth 3); not the goal. Pushing G (child of F) would require depth 4, so it is cut off: the depth limit is exceeded and G is not explored. Stack: ∅.
8. Search ends. Goal not found within depth 3.
Outcome
Key Properties
When to Use
What is IDS?
Iterative Deepening Search (IDS) is a search algorithm that performs repeated depth-limited
searches with increasing depth limits.
It combines:
Example Graph
[Figure: example graph rooted at A with children B and C; the goal G lies at depth 4 along the path A → C → E → F → G.]
Assumptions
Step-by-Step Iterations
Final Path
A → C → E → F → G
• Complete: Yes
• Optimal: Yes (if step costs are uniform)
• Time Complexity: O(b^d), same as BFS
• Space Complexity: O(d), same as DFS
• Tradeoff: Repeats nodes in early iterations but saves memory
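IDS can be sketched as repeated depth-limited searches with growing limits (the graph literal is illustrative, arranged so the goal path matches A → C → E → F → G shown above):

```python
def ids(graph, start, goal, max_depth=20):
    """Iterative deepening: run depth-limited DFS with limits 0, 1, 2, ..."""
    def dls(node, limit, path):
        if node == goal:
            return path
        if limit == 0:
            return None
        for neighbor in graph.get(node, []):
            found = dls(neighbor, limit - 1, path + [neighbor])
            if found:
                return found
        return None

    for limit in range(max_depth + 1):   # increasing depth limits
        result = dls(start, limit, [start])
        if result:
            return result, limit
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["F"], "F": ["G"]}
print(ids(graph, "A", "G"))  # (['A', 'C', 'E', 'F', 'G'], 4)
```

Iterations 0 through 3 re-explore the shallow nodes and fail; iteration 4 finds the goal while never storing more than one path, which is the memory tradeoff noted above.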
Bidirectional Search is an algorithm to find the shortest path between a start and goal node by
simultaneously performing two searches:
The algorithm stops when the two searches meet in the middle.
Example Graph
[Figure: graph with edges A–B, A–G, B–C, G–D, G–F, C–E, F–E; start node A, goal node E.]
Step-by-Step Execution
Forward search from A:
• Level 0: A
• Level 1: B, G
• Level 2: C, D, F
Backward search from E (the goal):
• Level 0: E
• Level 1: C, F
• Level 2: B, G
Final Path
Full Path: 𝐴 → 𝐺 → 𝐹 → 𝐸
Key Properties
Challenges
When to Use
Definition
An informed search algorithm uses a heuristic function ℎ(𝑛) to estimate the cost of reaching the
goal from a node 𝑛. This allows the algorithm to prioritize exploring nodes that appear closer to the
goal, reducing the search space compared to uninformed methods like breadth-first search (BFS) or
depth-first search (DFS).
• Heuristic Function: ℎ(𝑛) estimates the cheapest path cost from node 𝑛 to the goal. For
example, in a route-finding problem, ℎ(𝑛) might be the straight-line distance (SLD) from node
𝑛 to the goal.
• Efficiency: By prioritizing promising paths, informed search reduces the number of nodes
explored.
• Examples: Greedy Best-First Search (GBFS), A* Search, Hill Climbing, Beam Search.
Key Characteristics
Informed search algorithms typically maintain a priority queue to store nodes, where the priority
is determined by the heuristic function or a combination of heuristic and path cost. The general
process is:
Heuristic Function
The heuristic function h(n) is problem-specific and critical to the algorithm’s performance. Desirable
properties include:
• Admissibility: ℎ(𝑛) ≤ ℎ∗ (𝑛), where ℎ∗ (𝑛) is the true cost to the goal. An admissible heuristic
never overestimates the cost, ensuring optimality in algorithms like A*.
• Consistency (Monotonicity): For any nodes 𝑛 and 𝑛′, where 𝑛′ is a successor of 𝑛, ℎ(𝑛) ≤
𝑐(𝑛, 𝑛′) + ℎ(𝑛′), where 𝑐(𝑛, 𝑛′) is the cost of moving from 𝑛 to 𝑛′. Consistency ensures that
the heuristic is well-behaved across transitions.
• Informativeness: A heuristic closer to ℎ∗ (𝑛) reduces the search space more effectively.
Example Heuristics
• Route Finding: Straight-line distance (SLD) to the goal. For a city 𝑛 and goal 𝑔, ℎ(𝑛) =
SLD(𝑛, 𝑔). This is admissible because the SLD is always less than or equal to the actual path
distance (triangle inequality).
• 8-Puzzle:
• Manhattan Distance: Sum of the distances (in rows and columns) each tile is from its
goal position.
• Misplaced Tiles: Number of tiles not in their goal position.
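These two 8-puzzle heuristics can be sketched directly (0 denotes the blank; the sample state is this chapter's earlier initial configuration, paired with the standard goal):

```python
def manhattan(state, goal):
    """Sum over tiles of |row - goal_row| + |col - goal_col| (blank excluded)."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:                    # 0 denotes the blank: not a tile
            continue
        g = goal.index(tile)
        total += abs(idx // 3 - g // 3) + abs(idx % 3 - g % 3)
    return total

def misplaced(state, goal):
    """Number of tiles (blank excluded) not in their goal position."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
state = (2, 8, 3, 1, 6, 4, 7, 0, 5)
print(manhattan(state, goal), misplaced(state, goal))  # 9 6
```

Both heuristics are admissible; Manhattan distance (9 here) dominates the misplaced-tiles count (6), so it prunes more of the search space.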
We describe two prominent informed search algorithms: Greedy Best-First Search (GBFS) and A*
Search, with detailed pseudocode and examples.
Greedy Best First Search is a graph traversal algorithm that uses a heuristic function to decide the
order in which nodes are explored.
f(n) = h(n)
where h(n) is the estimated cost of the cheapest path from node n to the goal.
Note: It does not consider the path cost from the start node to the current node, unlike A*.
Algorithm Steps
1. Initialize the open list with the start node.
2. While the open list is not empty:
• Choose the node 𝑛 with the lowest h(n) value.
• If 𝑛 is the goal node, return the path.
• Else, expand 𝑛 and add its neighbors to the open list.
3. If the open list is empty and the goal is not found, return failure.
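The steps above can be sketched with a priority queue keyed on h(n). The graph edges below follow the road-map example later in this section; the heuristic values were not given explicitly here, so the ones below are assumed for illustration:

```python
import heapq

def gbfs(graph, h, start, goal):
    """Greedy Best-First Search: always expand the open node with the
    smallest heuristic value h(n); path cost g(n) is ignored."""
    open_list = [(h[start], start, [start])]
    visited = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, _cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(open_list, (h[neighbor], neighbor, path + [neighbor]))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("C", 2)], "B": [("C", 2)],
         "C": [("D", 2), ("E", 3)], "D": [("G", 2)], "E": [("G", 1)]}
h = {"S": 7, "A": 5, "B": 4, "C": 3, "D": 3, "E": 2, "G": 0}  # assumed values
print(gbfs(graph, h, "S", "G"))  # ['S', 'B', 'C', 'E', 'G']
```

With these assumed values, GBFS greedily follows the smallest h(n) at every step and returns a path whose actual cost (4 + 2 + 3 + 1 = 10) is not the cheapest, illustrating its lack of optimality.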
Example
[Figure: example graph rooted at A with children B and C; deeper nodes include D, E, and F, with goal G.]
Node ℎ(𝑛)
A 6
B 4
C 3
D 4
E 2
F 1
G 0
Properties
• Completeness: No (can get stuck in loops)
• Optimality: No
• Time Complexity: O(b^m)
• Space Complexity: O(b^m)
Example 2
Consider the following graph where the task is to find a path from city S (Start) to city G (Goal).
[Figure: graph with edge costs S → A: 1, S → B: 4, A → C: 2, B → C: 2, C → D: 2, C → E: 3, D → G: 2, E → G: 1.]
The numbers on edges represent actual costs between nodes. The heuristic values ℎ(𝑛) (straight-line
distances to goal) are given as:
Algorithm Steps
Result
𝑆→𝐵→𝐶→𝐸 →𝐺
This path is found quickly because GBFS prioritizes the smallest heuristic value at each step.
However, note that this path is not guaranteed to be the shortest in terms of actual cost.
Observations
• GBFS is fast because it ignores the cost so far 𝑔(𝑛) and only considers ℎ(𝑛).
• It may find suboptimal paths because it is purely greedy.
• Time complexity depends heavily on the accuracy of ℎ(𝑛).
• If ℎ(𝑛) is admissible and consistent, using A* is usually better for optimality.
A* (pronounced “A star”) is an informed search algorithm used for pathfinding and graph traversal.
It is widely used because it is both complete (it will always find a solution if one exists) and optimal
(it will find the least-cost path), provided the heuristic is admissible.
A* combines:
• g(n): the exact cost of the path from the start node to n.
• h(n): the heuristic estimate of the cost from n to the goal.
It evaluates nodes by f(n) = g(n) + h(n).
If h(n) is admissible (it never overestimates the true cost to the goal), A* is guaranteed to find an optimal solution.
Pseudocode:
function AStar(start, goal):
    open_set = {start}; g(start) = 0; f(start) = h(start)
    while open_set is not empty:
        current = node in open_set with lowest f value
        if current == goal:
            return reconstruct_path(current)
        remove current from open_set
        for each neighbor of current:
            tentative_g = g(current) + cost(current, neighbor)
            if neighbor already has g(neighbor) <= tentative_g:
                continue
            if neighbor not in open_set or tentative_g < g(neighbor):
                parent[neighbor] = current
                g(neighbor) = tentative_g
                f(neighbor) = g(neighbor) + h(neighbor)
                if neighbor not in open_set:
                    add neighbor to open_set
    return failure
[Figure: graph with edge costs S → A: 1, S → B: 4, A → C: 2, B → C: 2, C → D: 2, C → E: 3, D → G: 2, E → G: 1.]
Step-by-step Execution:
Goal found!
Path:
S → A → C → D → G
Total cost: 1 + 2 + 2 + 2 = 7
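A runnable sketch of A* on this example (edge costs as in the graph above; the heuristic values were not listed explicitly here, so the admissible values below are assumed for illustration):

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search: expand the open node with the smallest f(n) = g(n) + h(n)."""
    open_list = [(h[start], 0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return g, path
        for neighbor, cost in graph.get(node, []):
            tentative_g = g + cost
            if tentative_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = tentative_g
                heapq.heappush(open_list,
                               (tentative_g + h[neighbor], tentative_g,
                                neighbor, path + [neighbor]))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("C", 2)], "B": [("C", 2)],
         "C": [("D", 2), ("E", 3)], "D": [("G", 2)], "E": [("G", 1)]}
h = {"S": 7, "A": 6, "B": 6, "C": 4, "D": 2, "E": 1, "G": 0}  # admissible (assumed)
print(a_star(graph, h, "S", "G"))  # (7, ['S', 'A', 'C', 'D', 'G'])
```

Unlike the greedy search, accounting for g(n) steers A* through A rather than B and yields the least-cost path of total cost 7.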
Properties of A*:
• Complete: Yes, if branching factor is finite and step costs are positive.
• Optimal: Yes, if heuristic is admissible and consistent.
• Time Complexity: Exponential in worst case; depends on heuristic quality.
• Space Complexity: High (keeps all generated nodes in memory).
Advantages:
Disadvantages:
The AO* (And-Or Star) algorithm is a best-first search method for solving problems represented
as AND–OR graphs. It generalizes the A* search algorithm to handle problem decomposition where
some tasks must be solved together (AND) and others require choosing between alternatives (OR).
Motivation Many AI problems can be broken into subproblems that are either:
• OR nodes: Only one of the successors needs to be solved to solve the parent.
• AND nodes: All successors must be solved to solve the parent.
AO* searches for an optimal solution graph—a minimal-cost subgraph connecting the start node to
solved terminal states.
[Figure: an AND–OR graph rooted at Start; Start can be solved via node A, which decomposes into subgoals G1 and G2 (an AND arc), or directly via the alternative G3 (an OR branch).]
Algorithm Steps
Pseudocode
Advantages
Disadvantages
Applications
This ensures that only promising branches are expanded, reducing the search space.
Algorithm
Example cost matrix (pairwise costs between A, B, C, D):
    A  B  C  D
A   0  2  9  3
B   2  0  6  4
C   9  6  0  1
D   3  4  1  0
Properties
• Optimality: Always finds the optimal solution if lower bounds are valid.
• Memory: Requires only 𝑂 (𝑑) stack space, where 𝑑 is search depth.
• Efficiency: Strongly depends on quality of lower bound heuristics.
Applications
Terminal states are evaluated based on outcomes (e.g., +1 for MAX’s win, −1 for MIN’s win, 0 for a draw). The goal is to select the move that maximizes a player’s chances of winning, assuming the opponent plays optimally.
According to Russell and Norvig (2010), the primary game search algorithms for two-player,
zero-sum games with perfect information include:
• Minimax Algorithm: Evaluates all possible moves to a fixed depth, assuming optimal play by
both players, and selects the move that maximizes the minimum payoff (best outcome against
the opponent’s best response).
• Alpha-Beta Pruning: An optimization of Minimax that prunes branches of the game tree that
cannot affect the final decision, reducing computation while maintaining optimality.
Other techniques, such as Monte Carlo Tree Search, are used for complex or imperfect-information
games, but Minimax and Alpha-Beta Pruning are the core algorithms for classical AI game playing.
The "Hill Climbing algorithm" is a foundational local search method used in artificial intelligence
for optimization. Its name comes from the analogy of climbing a hill: starting from an initial
solution, the algorithm iteratively moves in small steps toward neighboring states that improve the
solution, aiming to reach the peak (i.e., the global maximum)
How It Works
• Simple Hill Climbing: Moves to the first better neighbor encountered; fast but prone to getting
stuck in suboptimal peaks.
• Steepest-Ascent Hill Climbing: Evaluates all neighbors and moves to the best one; more
thorough but computationally heavier.
• Stochastic Hill Climbing: Randomly selects among improving neighbors, introducing
variability that may help escape poor local maxima.
• Local Maximum: The algorithm may stop at a peak that isn’t globally optimal. Random
restarts—restarting from different initial states—help better explore the search space.
• Plateau: Flat regions make it difficult to detect improvement directions. Random jumps
introduce exploration to escape stalling.
• Ridge: Narrow, slanted peaks may mislead the algorithm into suboptimal paths. Multi-directional search can help navigate these tricky landscapes.
Example: Optimize the function f(x, y) = −(x² + y²) (to be maximized; its global maximum is at (0, 0)). Start at (1, 1), with neighbors at steps of 0.5.
(1, 1), f = −2 → (0.5, 0.5), f = −0.5 → (0, 0), f = 0
Step-by-Step Process:
1. Start at (1, 1), where f = −2.
2. Evaluate the neighbors of (1, 1); the best is (0.5, 0.5) with f = −0.5.
3. Move to (0.5, 0.5), f = −0.5.
4. Evaluate neighbors: (0.5, 0.5), 𝑓 = −0.5; (0, 1), 𝑓 = −1; (-0.5, 0.5), 𝑓 = −0.5; (0, 0), 𝑓 = 0.
Move to (0, 0), 𝑓 = 0.
5. No better neighbors. Stop at (0, 0), global maximum.
The diagram (Figure 3.3) shows Hill Climbing moving from (1, 1) to (0, 0), reaching the global
maximum for this simple function.
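The walk above can be reproduced with a steepest-ascent sketch (8-directional neighbors at the given step size; function names are illustrative):

```python
def steepest_ascent(f, start, step=0.5, max_iters=100):
    """Steepest-ascent hill climbing on a 2-D function: at each iteration,
    evaluate all 8 neighboring points and move to the best one."""
    x, y = start
    for _ in range(max_iters):
        neighbors = [(x + dx, y + dy)
                     for dx in (-step, 0, step) for dy in (-step, 0, step)
                     if (dx, dy) != (0, 0)]
        best = max(neighbors, key=lambda p: f(*p))
        if f(*best) <= f(x, y):          # no improving neighbor: local maximum
            break
        x, y = best
    return (x, y), f(x, y)

f = lambda x, y: -(x**2 + y**2)
print(steepest_ascent(f, (1.0, 1.0)))    # reaches (0.0, 0.0) with f = 0
```

For this bowl-shaped function every climb reaches the global maximum; on multimodal functions the same loop would stop at whichever local peak it first reaches, which is why random restarts are used in practice.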
Figure 3.4: Hill climbing on a fitness landscape. A greedy climb (red) gets stuck at a local maximum, while random-restart (green) can reach the global maximum.
Practical Notes: Hill Climbing is simple and memory-efficient, suitable for optimization problems like scheduling or parameter tuning. However, random-restart or stochastic variants are often needed to escape local maxima.
Overview: The Minimax algorithm is designed for two-player, zero-sum games with perfect
information, such as tic-tac-toe or chess. It constructs a game tree, evaluates leaf nodes using an
evaluation function, and propagates values upward to choose the optimal move. The maximizing
player (MAX) aims to maximize the payoff, while the minimizing player (MIN) aims to minimize it.
It’s like playing tic-tac-toe, where you choose the move that ensures the best outcome, assuming
your opponent counters optimally.
Mechanics: Minimax explores the game tree to a fixed depth (due to computational limits). At leaf
nodes, an evaluation function assigns values (e.g., +10 for MAX’s win, -10 for MIN’s win, 0 for
a draw). At MAX nodes, it selects the child with the highest value; at MIN nodes, it selects the
lowest. Values propagate up the tree, alternating between MAX and MIN layers, to determine the best move from the root (Russell and Norvig, 2010).
Properties:
• Completeness: Complete for finite game trees, as it evaluates all possible outcomes.
• Optimality: Optimal against an optimal opponent, as it assumes both players choose the best
move.
• Time Complexity: O(b^m), where b is the branching factor (average number of moves) and m is the search depth.
• Space Complexity: O(bm), as it uses depth-first exploration, storing only the current path.
Limitations:
• Computationally expensive for deep trees or high branching factors (e.g., chess, with 𝑏 ≈ 35).
• Requires a reliable evaluation function for non-terminal states.
• Impractical for real-time games without optimizations like pruning (Russell and Norvig, 2010).
Example: Consider a simplified tic-tac-toe game tree where MAX (X) moves first, and MIN (O)
responds. The tree is evaluated at depth 2, with leaf node values (+10 for MAX win, -10 for MIN
win, 0 for draw).
[Game tree: the root (MAX) has moves A, B, and C, each leading to a MIN node with two leaves: A → (3, 5), B → (7, −2), C → (0, 6).]
Step-by-Step Process:
The diagram (Figure 3.5) shows Minimax selecting Move A, as it guarantees the best worst-case outcome (value 3) against MIN’s optimal play (Russell and Norvig, 2010).
Algorithm (Pseudocode):
function Minimax(node, depth, maximizingPlayer):
    if depth == 0 or node is terminal:
        return Evaluate(node)
    if maximizingPlayer:
        value = -infinity
        for each child in node.children:
            value = max(value, Minimax(child, depth-1, false))
        return value
    else: // minimizingPlayer
        value = +infinity
        for each child in node.children:
            value = min(value, Minimax(child, depth-1, true))
        return value
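This recursion can be realized concretely on the depth-2 example tree above (the depth parameter is omitted since the tree is tiny; leaf values as in the example):

```python
def minimax(node, maximizing):
    """Minimax over a tree given as nested lists; leaves are numbers."""
    if isinstance(node, (int, float)):   # terminal node: return its value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Root (MAX) offers moves A, B, C; each leads to a MIN node with two leaves.
tree = {"A": [3, 5], "B": [7, -2], "C": [0, 6]}
values = {move: minimax(children, False) for move, children in tree.items()}
best = max(values, key=values.get)
print(values, best)  # {'A': 3, 'B': -2, 'C': 0} A
```

MIN drives each move to its smaller leaf, so MAX compares the backed-up values 3, −2, and 0 and picks Move A, matching the worked example.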
Practical Notes: Minimax is foundational for game AI but is computationally intensive for large games like chess. It assumes optimal opponent play, making it robust but slow without optimizations (Russell and Norvig, 2010).
The term ‘pruning’ refers to the removal of unnecessary branches and leaves. In Artificial Intelligence, alpha-beta pruning involves eliminating redundant branches in decision trees. The technique was discovered independently by several researchers in the late 1950s and 1960s.
Alpha-beta pruning is a search optimization technique that enhances the performance of the minimax
algorithm. The minimax algorithm is a decision-making process commonly applied in two-player,
zero-sum games such as chess. In these games, one player aims to maximize their score while the
opponent seeks to minimize it.
The minimax algorithm recursively explores all possible game states, represented as a tree structure,
and assigns values to leaf nodes based on potential game outcomes. These values are then propagated
up the tree to determine the optimal move. However, as game complexity increases, the number of
possible states grows exponentially, resulting in high computational costs.
Alpha-beta pruning addresses this challenge by reducing the number of nodes evaluated by the
minimax algorithm, pruning branches that cannot influence the final decision. This simplification
enables faster and more efficient evaluations, making it practical for real-time applications like
game-playing AI, where speed and efficiency are critical.
The core principle of alpha-beta pruning is to avoid evaluating branches of the game tree that
cannot affect the final decision, based on values already discovered during the search. It utilizes two
parameters: alpha and beta.
• Alpha: Represents the best (highest) value that the maximizing player (typically the AI) can
guarantee at that point. It serves as a lower bound, initialized to −∞.
• Beta: Represents the best (lowest) value that the minimizing player (the opponent) can
guarantee at that point. It serves as an upper bound, initialized to +∞.
• As the AI traverses the tree, it tracks alpha and beta values. When evaluating a node, it
compares the node’s value against these bounds.
• If alpha becomes greater than or equal to beta, the current branch will not influence the final
decision, as the opponent will choose a better path. This branch is pruned, and the algorithm
proceeds to the next branch.
• This process allows the algorithm to skip large sections of the tree, significantly reducing the
number of nodes evaluated.
Worked Example. Consider a game tree with root A (MAX), MIN nodes B and C, MAX nodes D and E (under B) and F and G (under C), and terminal values D: (2, 3), E: (5, 9), F: (0, 1), G: (7, 5). The search begins at A with α = −∞ and β = +∞. In outline: evaluating D gives B the value 3 and the bound β = 3; at E, the first leaf (5) raises α to 5 ≥ β = 3, so E’s remaining leaf (9) is pruned; A then updates α = 3; F evaluates to 1, giving C the bound β = 1 ≤ α = 3, so G (leaves 7 and 5) is pruned; A’s final value is 3. The steps are detailed below.
1. Initialize the worst-case scenario with 𝛼 = −∞ and 𝛽 = +∞. If 𝛼 ≥ 𝛽, the node is pruned.
2. Since the initial 𝛼 < 𝛽, no pruning occurs. For the MAX player’s turn at node D, compute
𝛼 = max(2, 3) = 3.
3. At node B (MIN’s turn), set 𝛽 = min(3, ∞) = 3. Thus, at node B, 𝛼 = −∞, 𝛽 = 3. Pass these
values to node E.
4. At node E (MAX’s turn), compute α = max(−∞, 5) = 5. Now α = 5 and β = 3 (inherited from B). Since α ≥ β,
prune the right successor of node E.
5. Return to node B, then to node A. Update node A’s 𝛼 = max(−∞, 3) = 3. Pass 𝛼 = 3, 𝛽 = +∞
to node C and node F.
6. At node F (MAX’s turn, with inherited α = 3), evaluate its children 0 and 1: F’s backed-up value is max(0, 1) = 1.
7. Node F passes value 1 to node C. For MIN’s turn, compute 𝛽 = min(+∞, 1) = 1. At node C,
𝛼 = 3, 𝛽 = 1. Since 𝛼 ≥ 𝛽, prune node C’s successor, node G.
8. Node C returns value 1 to node A. Compute max(1, 3) = 3 at node A.
The resulting tree includes only computed nodes, with an optimal value of 3 for the maximizer.
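The walkthrough above can be verified with a short implementation (the tree is encoded as nested lists matching the example; the `visited` list records which leaves were actually evaluated):

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Alpha-beta pruning over a tree given as nested lists; leaves are numbers."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)         # record each leaf actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:            # beta cutoff: prune remaining children
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta, visited))
            beta = min(beta, value)
            if alpha >= beta:            # alpha cutoff
                break
        return value

# A (MAX) -> B, C (MIN); B -> D, E (MAX); C -> F, G (MAX); leaves as in the example
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
leaves = []
print(alphabeta(tree, True, visited=leaves))  # 3
print(leaves)  # [2, 3, 5, 0, 1] -- the leaves 9, 7, 5 are pruned
```

Only five of the eight leaves are evaluated, yet the root value (3) is identical to plain Minimax, illustrating that pruning preserves optimality.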
1. Game AI: In strategic games like chess, checkers, or Go, alpha-beta pruning enables AI to
evaluate millions of possible moves in real time. Systems like Stockfish (chess) and AlphaGo
(Go) rely on it as a core component.
2. Autonomous Systems and Robotics: Decision trees guide real-time decisions in robotics for
movement, navigation, and task execution. Alpha-beta pruning accelerates path and strategy
evaluation in dynamic environments.
3. Financial and Optimization Models: AI systems in financial forecasting, portfolio opti-
mization, and supply chain management use decision trees to evaluate scenarios. Alpha-beta
pruning enhances timely decision-making by processing large datasets efficiently.
2. Alpha-Beta Pruning: Using the same tic-tac-toe setup as Problem 1, apply Alpha-Beta
Pruning to select MAX’s move. Identify which branches are pruned and explain why. Provide
the step-by-step process and a diagram showing the pruned branches.
3. Hill Climbing Optimization: Optimize the function f(x) = −x² + 4x (maximize) using Hill
Climbing. Start at 𝑥 = 0, with neighbors at steps of 0.5 (i.e., 𝑥 ± 0.5). Show the step-by-step
process, including function values at each step, and draw a diagram of the moves. Does the
algorithm reach the global maximum? Explain.
4. Game Tree Analysis: Given a game tree with root (MAX) and three children A, B, C (MIN),
each with two leaf nodes valued as follows: A (4, 2), B (6, -1), C (3, 5). Apply Minimax to
find the best move. Then, apply Alpha-Beta Pruning, showing which nodes are evaluated and
which are pruned. Provide diagrams for both.
5. Hill Climbing Variants: For the function f(x, y) = −(x − 2)² − (y − 1)², apply Steepest-Ascent
Hill Climbing from (0, 0) with step size 0.5. Then, describe how Stochastic Hill Climbing
might differ in one iteration. Show the steps and explain if the global maximum is reached.
Chapter 4
Knowledge Representation
4.1 Overview
Knowledge Representation (KR) is a foundational pillar of Artificial Intelligence (AI) that focuses
on encoding information about the world in a structured, machine-understandable format to enable
reasoning, problem-solving, and decision-making. KR bridges the gap between raw data and
intelligent behavior by providing mechanisms to represent facts, rules, and relationships in a way
that AI systems can process. This chapter explores the role of KR in AI, differentiates knowledge
from data, highlights the importance of logic in reasoning, categorizes types of knowledge, outlines
the characteristics of effective KR systems, and surveys key KR methods, including logic-based
approaches, semantic networks, frames, scripts, and production rules.
• Reasoning and Inference: KR enables systems to derive new knowledge from existing facts
using logical inference mechanisms, such as deducing that a room needs cleaning based on its
state.
• Domain Modeling: KR captures the entities, relationships, and rules of a specific domain
(e.g., medical diagnosis, robotic navigation), enabling specialized reasoning.
• Decision-Making: Structured knowledge supports informed decisions, such as a home
automation system deciding to activate cooling based on temperature data.
• Learning and Adaptation: KR provides a foundation for machine learning by organizing
knowledge in a way that models can interpret and refine.
• Communication: KR facilitates natural language processing (NLP) by encoding linguistic
rules and ontologies, enabling human-like interaction.
70
Example: Consider a multi-agent vacuum cleaner system (as discussed in prior contexts). The
knowledge base might include facts like “Room1 is dirty” (Dirty(Room1)) and rules like “If a room is
dirty and an agent is present, clean the room” (∀𝑥(Room(𝑥) ∧Dirty(𝑥) ∧At(Agent, 𝑥) → Clean(𝑥))).
This representation allows agents to reason about which rooms to clean and coordinate tasks
efficiently.
Key Difference: Knowledge involves structuring data into formats like logical statements, graphs, or
rules that support reasoning. For instance, a knowledge base might encode the rule “If temperature
exceeds 24°C, activate cooling” to act on temperature data.
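The distinction can be made concrete: data is the raw reading, while knowledge is the encoded rule that acts on it. A minimal sketch (the threshold and function names are illustrative):

```python
def cooling_decision(temperature_c, threshold=24.0):
    """Knowledge (a rule) applied to data (a sensor reading) yields a decision."""
    # Rule: "If temperature exceeds 24 degrees C, activate cooling."
    return "activate cooling" if temperature_c > threshold else "no action"

print(cooling_decision(26.5))  # activate cooling
print(cooling_decision(21.0))  # no action
```

The number 26.5 on its own is mere data; only together with the rule does it support a decision.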
Example from Practice: In a medical diagnosis system, raw data like “patient temperature: 38°C”
becomes knowledge when interpreted as “The patient has a fever, indicating a possible infection,”
which can trigger further reasoning about diagnosis and treatment.
• Deduction: Deriving specific conclusions from general rules. Example: “All humans are
mortal; Socrates is human → Socrates is mortal.”
• Induction: Generalizing from specific observations. Example: “Room1 and Room2 are dirty
→ All rooms may be dirty.”
• Abduction: Inferring the best explanation for observations. Example: “The room is clean →
An agent likely cleaned it.”
Logical Framework:
• Syntax: Rules for forming valid expressions using symbols (e.g., propositions, predicates)
and connectives (e.g., ∧ (AND), ∨ (OR), ¬ (NOT), → (IMPLIES), ↔ (IF AND ONLY IF)).
• Semantics: Assigns meaning to expressions, defining their truth values in a given model.
• Inference Rules: Mechanisms like modus ponens (from 𝑃 → 𝑄 and 𝑃, infer 𝑄) enable
derivation of new knowledge.
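Modus ponens can be applied repeatedly to a knowledge base, a process known as forward chaining. A minimal sketch using atoms from this chapter's examples (facts and rule names are illustrative; predicates are treated as opaque strings rather than parsed formulas):

```python
def forward_chain(facts, rules):
    """Repeatedly apply modus ponens: from P1 ∧ ... ∧ Pn -> Q and all Pi, infer Q."""
    facts = set(facts)
    changed = True
    while changed:                       # iterate until no rule adds a new fact
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)    # the rule fires: conclusion inferred
                changed = True
    return facts

rules = [(["Human(Socrates)"], "Mortal(Socrates)"),
         (["Dirty(Room1)", "At(Agent, Room1)"], "Clean(Room1)")]
derived = forward_chain(["Human(Socrates)", "Dirty(Room1)", "At(Agent, Room1)"], rules)
print("Mortal(Socrates)" in derived, "Clean(Room1)" in derived)  # True True
```

This ground-atom version sidesteps unification; a full first-order system would additionally bind variables such as x in ∀x(Human(x) → Mortal(x)).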
Types of Logic:
• Propositional Logic: Deals with simple true/false statements. Example: “Room1 is dirty”
(𝑃) and “If Room1 is dirty, clean it” (𝑃 → 𝑄).
• First-Order Logic (FOL): Extends propositional logic with predicates, variables, and
quantifiers (∀, ∃). Example: ∀𝑥(Human(𝑥) → Mortal(𝑥)).
• Temporal Logic: Handles time-based reasoning. Example: ◇(AllRoomsClean) (“Eventually, all rooms will be clean”).
• Fuzzy Logic: Manages uncertainty with degrees of truth. Example: “Room is somewhat
dirty” (truth value 0.6).
• Modal Logic: Deals with necessity and possibility. Example: □𝑃 (“It is necessary that the
room is clean”).
• Description Logic: Used for ontologies, representing concepts and roles. Example: “DirtyRoom ≡ Room ⊓ ∃hasState.Dirty.”
Example: In the vacuum cleaner scenario, logic enables the system to represent rules like “If a
room is dirty, clean it” (Dirty(Room1) → Clean(Room1)) and infer actions based on current states.
• Declarative Knowledge:
• Definition: Factual knowledge about the world, expressed as static assertions or statements.
• Characteristics: Represents “what is true” without specifying actions.
• Example: “Room1 is dirty” (Dirty(Room1)) or “All birds have feathers” (∀𝑥(Bird(𝑥) →
HasFeathers(𝑥))).
• Use: Forms the basis of knowledge bases in expert systems, ontologies, and databases.
• Procedural Knowledge:
• Definition: Knowledge about how to perform tasks or processes, often encoded as rules
or algorithms.
• Characteristics: Focuses on “how to do” something, guiding actions.
• Example: “To clean a room, move to the room, activate the vacuum, and remove dirt” or
a rule like ∀𝑥, 𝑦(Agent(𝑥) ∧ Room(𝑦) ∧ Assigned(𝑥, 𝑦) → Clean(𝑥, 𝑦)).
• Use: Critical for planning, robotic actions, and process automation.
• Meta-Knowledge:
• Definition: Knowledge about the structure, organization, or use of other knowledge.
• Characteristics: Enables systems to reflect on their own knowledge or reasoning
processes.
• Example: Knowing that a knowledge base uses first-order logic or that certain rules
prioritize cleaning dirty rooms.
• Use: Optimizes reasoning strategies, supports learning, and enhances system adaptability.
• Adequacy (Expressiveness):
• Definition: The ability to capture all necessary aspects of a domain, including complex
relationships, hierarchies, and nuances.
• Example: A KR system for medical diagnosis must represent symptoms, diseases, and
their interconnections accurately, such as “Fever and cough indicate flu” (Fever(𝑥) ∧
Cough(𝑥) → Flu(𝑥)).
• Trade-Off: Highly expressive systems (e.g., first-order logic) may be computationally
intensive.
• Inferability (Reasoning Efficiency):
• Definition: The ability to perform fast and accurate inference to derive new knowledge
from the knowledge base.
• Example: In a vacuum cleaner system, the system must quickly infer which rooms to
clean based on rules like Dirty(Room1) → Clean(Room1).
• Trade-Off: Efficiency may be compromised in highly expressive systems, requiring
optimization.
• Clarity:
• Definition: Representations should be unambiguous and easy to understand, both for
humans and machines.
• Example: A semantic network with nodes for “Room1” and “Dirt” linked by “is-dirty” is
intuitive and clear.
• Importance: Clarity reduces errors in reasoning and maintenance.
• Extensibility:
• Definition: The system should allow easy updates or additions to the knowledge base as
new information is acquired.
• Example: Adding a new room to a vacuum cleaner’s knowledge base without disrupting
existing rules.
• Importance: Ensures adaptability to changing domains.
• Inferencing Capability:
• Definition: The system must support robust inference mechanisms, including deduction,
induction, and abduction.
• Example: Deducing “Socrates is mortal” from “All humans are mortal” and “Socrates is
human” using modus ponens.
• Importance: Enables the system to generate new knowledge dynamically.
• Additional Characteristics:
• Scalability: Handles large knowledge bases efficiently, critical for real-world applications
like ontologies.
• Consistency: Avoids contradictions within the knowledge base to ensure reliable reason-
ing.
• Semantic Networks:
• Description: Represents knowledge as a graph with nodes (concepts or objects) and edges (relationships).
• Key Features:
• Nodes represent entities (e.g., “Room1,” “Dirt”).
• Edges represent relationships (e.g., “is-dirty,” “is-a”).
• Supports inheritance, where properties of a general class apply to specific instances.
• Example: A semantic network might link “Room1” to “Dirt” via “is-dirty” and “Agent A” to
“Room1” via “is-in,” representing the vacuum cleaner’s environment.
• Advantages:
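As a sketch, the example network above (Room1, Dirt, AgentA) can be encoded as a Python dictionary of labeled edges; the node and relation names are illustrative, not a standard API:

```python
# Minimal semantic network: nodes are strings, edges are (relation, target) pairs.
network = {
    "Room1":  [("is-a", "Room"), ("is-dirty", "Dirt")],
    "AgentA": [("is-a", "Agent"), ("is-in", "Room1")],
}

def related(node, relation):
    """Return all targets linked to `node` by `relation`."""
    return [t for (r, t) in network.get(node, []) if r == relation]

def is_a(node, cls):
    """Follow 'is-a' edges (inheritance) to test class membership."""
    if node == cls:
        return True
    return any(is_a(parent, cls) for parent in related(node, "is-a"))

print(related("Room1", "is-dirty"))   # ['Dirt']
print(is_a("AgentA", "Agent"))        # True
```

Inheritance falls out of following “is-a” edges recursively, which is exactly how property inheritance works in a semantic network.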
• Frames:
• Description: Structured templates with slots for attributes, representing objects or
concepts.
• Key Features: Slots hold values or default values (e.g., a “Room” frame with slots for
“size,” “dirt level”).
• Example: A “Room” frame might specify “size: small, dirt level: high” for Room1.
• Advantages: Intuitive for structured knowledge; supports inheritance and defaults.
• Disadvantages: Limited for dynamic or procedural knowledge.
• Applications: Expert systems, cognitive modeling, object-oriented systems.
• Scripts:
• Description: Represent sequences of events or processes, typically for procedural
knowledge.
• Key Features: Describe stereotypical scenarios (e.g., a “cleaning script” with steps:
move to room, vacuum, check dirt).
• Example: A script for a vacuum cleaner might outline the sequence: “Enter room, check
dirt, vacuum if dirty, move to next room.”
• Advantages: Effective for modeling procedural knowledge and expectations.
• Disadvantages: Less flexible for novel situations or non-standard scenarios.
• Applications: Story understanding, AI planning, dialogue systems.
• Production Rules:
• Description: Represents knowledge as “if-then” rules that trigger actions based on conditions, often implemented as Horn clauses or definite clauses.
• Key Features:
• Format: “IF condition THEN action” (e.g., “IF Room1 is dirty THEN clean Room1”).
• Supports forward chaining (data-driven) and backward chaining (goal-driven).
• Example: A rule like Dirty(Room1) → Clean(Room1) guides the vacuum cleaner’s actions.
• Advantages:
• Simple and modular, easy to update or extend.
• Efficient for rule-based systems and decision-making.
• Disadvantages:
• Risk of rule conflicts or combinatorial explosion in large systems.
• Limited expressiveness for complex relationships or hierarchies.
• Applications: Rule-based expert systems, decision-making systems, control systems.
• Definition: A branch of logic dealing with propositions (statements) that are either true or
false, combined using connectives like ∧, ∨, ¬, →, ↔.
• Syntax: Atomic propositions (e.g., 𝑃: “Room1 is dirty”) are combined into compound
propositions (e.g., 𝑃 ∧ 𝑄).
• Semantics: Truth values are determined via truth tables.
• Example:
• 𝑃: “It is raining.” 𝑄: “The ground is wet.”
• Rule: 𝑃 → 𝑄 (“If it rains, the ground is wet”).
• Truth Table:
𝑃 𝑄 𝑃→𝑄
T T T
T F F
F T T
F F T
• Applications: Simple rule-based systems (e.g., home automation), circuit design, game-playing
AI.
• Limitations: Cannot represent relationships, uncertainty, or time-based sequences; scalability
issues for large systems.
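The implication truth table above can be reproduced programmatically; this is a minimal sketch using Python's `itertools`:

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# Reproduce the truth table for P -> Q, row by row.
for p, q in product([True, False], repeat=2):
    print(p, q, implies(p, q))
```

The printed rows match the table: the only false row is P true, Q false.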
• Definition: Extends propositional logic with predicates, variables, constants, and quantifiers
(∀, ∃), enabling representation of complex relationships.
• Components:
• Predicates: Functions returning true/false (e.g., Parent(𝑥, 𝑦)).
• Variables: Represent objects (e.g., 𝑥, 𝑦).
• Constants: Specific objects (e.g., John).
• Quantifiers: ∀ (“for all”), ∃ (“there exists”).
• Example: ∀𝑥(Human(𝑥) → Mortal(𝑥)) (“All humans are mortal”). Given Human(Socrates),
infer Mortal(Socrates).
• Inference Mechanisms:
• Unification: Matches predicates (e.g., unifying Likes(𝑥, Cake) with Likes(Alice, 𝑦)
yields 𝑥 = Alice, 𝑦 = Cake).
• Lifting: Generalizes rules to variables (e.g., applying Parent(𝑥, 𝑦) → Loves(𝑥, 𝑦) to any
parent-child pair).
• Resolution: Proves statements by contradiction, using conjunctive normal form (CNF).
• Applications: Knowledge bases, NLP, robotic planning.
• Forward Chaining:
• Description: Data-driven reasoning that starts with known facts and applies rules to
generate new facts.
• Expert Systems: MYCIN uses production rules to represent medical knowledge for diagnosis
(e.g., “IF fever AND cough THEN flu”).
• Robotics: A vacuum cleaner agent uses a knowledge base with facts (Dirty(Room1)), rules
(Dirty(𝑥) → Clean(𝑥)), and semantic networks to navigate and clean.
• NLP: Semantic networks and description logic encode linguistic rules and ontologies for text
understanding.
• Semantic Web: Ontologies represent structured knowledge for information sharing (e.g.,
“Human ⊑ Mortal”).
• Knowledge Base:
• Facts: Dirty(Room1), At(AgentA, Room1).
• Rules: ∀𝑥(Room(𝑥) ∧ Dirty(𝑥) ∧ At(Agent, 𝑥) → Clean(𝑥)).
• Representation Methods:
• Logic-Based: FOL rules for cleaning actions.
• Semantic Network: Nodes for “Room1,” “Dirt,” “AgentA” with edges like “is-dirty,”
“is-in.”
• Frame: A “Room” frame with slots for “dirt level,” “location.”
Future Directions:
Chapter 5
Propositional Logic
5.1 Overview
Propositional Logic (PL), also known as Boolean logic, is a fundamental formalism in Artificial
Intelligence (AI) for representing and reasoning about knowledge using declarative statements that
are either true or false. PL provides a structured way to encode facts and rules, enabling AI systems
to perform logical reasoning, make decisions, and solve problems. This chapter explores the syntax
and semantics of PL, inference mechanisms (entailment, validity, satisfiability, and rules like modus
ponens and modus tollens), the resolution method (including conversion to conjunctive normal form
and solving puzzles), and reasoning strategies such as forward chaining and backward chaining.
These concepts are critical for applications like expert systems, automated decision-making, and
puzzle-solving in AI.
The syntax of PL defines the rules for constructing valid logical expressions, known as well-formed
formulas (WFFs).
• Propositions: Atomic statements that represent basic facts or conditions with a binary truth
value (true or false).
• Example: 𝑃: “It is raining.” 𝑄: “The ground is wet.”
• Logical Connectives: Operators that combine propositions to form compound propositions.
• Negation (¬): Inverts the truth value of a proposition (e.g., ¬𝑃: “It is not raining”).
• Conjunction (∧): True if both propositions are true (e.g., 𝑃 ∧ 𝑄: “It is raining and the
ground is wet”).
CHAPTER 5. PROPOSITIONAL LOGIC 82
• Disjunction (∨): True if at least one proposition is true (e.g., 𝑃 ∨ 𝑄: “It is raining or the
ground is wet”).
• Implication (→): True unless the antecedent is true and the consequent is false (e.g.,
𝑃 → 𝑄: “If it is raining, then the ground is wet”).
• Biconditional (↔): True if both propositions have the same truth value (e.g., 𝑃 ↔ 𝑄:
“It is raining if and only if the ground is wet”).
• Compound Propositions: Formed by combining atomic propositions with connectives.
• Example: (𝑃 ∧ ¬𝑄) → 𝑅, where 𝑅: “Turn on the heater.”
• Precedence of Connectives: To avoid ambiguity, connectives are evaluated in the following
order:
1. ¬ (highest precedence)
2. ∧
3. ∨
4. →
5. ↔ (lowest precedence)
Parentheses are used to enforce specific evaluation orders (e.g., (𝑃 ∨ 𝑄) ∧ 𝑅).
The semantics of PL define the meaning of propositions and compound expressions by specifying
their truth values under different assignments.
𝑃 𝑄 𝑃∧𝑄
T T T
T F F
F T F
F F F
𝑃 𝑄 𝑅 (𝑃 ∨ 𝑄) ∧ 𝑅
T T T T
T T F F
T F T T
T F F F
F T T T
F T F F
F F T F
F F F F
5.2.3 Applications
• Home automation: “If it is hot and windows are closed, turn on the AC” ((𝑃 ∧ ¬𝑄) → 𝑅).
• Game-playing AI: Representing game states and rules (e.g., chess moves).
• Circuit design: Modeling logical gates (e.g., AND gate as 𝐴 ∧ 𝐵).
• Validity (Tautology): A proposition 𝜙 is valid if it is true under all possible truth assignments.
• Example: 𝑃 ∨ ¬𝑃 (Law of Excluded Middle).
𝑃 𝑃 ∨ ¬𝑃
T T
F T
• Testing Validity: Construct a truth table; if all rows are true, the proposition is valid.
• Satisfiability: A proposition 𝜙 is satisfiable if there exists at least one truth assignment that
makes it true. It is unsatisfiable (a contradiction) if no such assignment exists.
• Example (Satisfiable): 𝑃 ∧ 𝑄.
• True when 𝑃 = 𝑇, 𝑄 = 𝑇.
• Example (Unsatisfiable): 𝑃 ∧ ¬𝑃.
𝑃 𝑃 ∧ ¬𝑃
T F
F F
• Testing Satisfiability: Use truth tables or SAT solvers (e.g., DPLL algorithm) to find a
satisfying assignment.
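The truth-table tests for validity and satisfiability can be sketched by brute-force enumeration; this is practical only for a handful of symbols (real systems use SAT solvers such as DPLL):

```python
from itertools import product

def assignments(symbols):
    """Yield every truth assignment over the given proposition symbols."""
    for values in product([True, False], repeat=len(symbols)):
        yield dict(zip(symbols, values))

def is_valid(formula, symbols):
    """Tautology: true under every assignment."""
    return all(formula(a) for a in assignments(symbols))

def is_satisfiable(formula, symbols):
    """Satisfiable: true under at least one assignment."""
    return any(formula(a) for a in assignments(symbols))

# P ∨ ¬P is valid; P ∧ ¬P is unsatisfiable; P ∧ Q is satisfiable.
print(is_valid(lambda a: a["P"] or not a["P"], ["P"]))          # True
print(is_satisfiable(lambda a: a["P"] and not a["P"], ["P"]))   # False
print(is_satisfiable(lambda a: a["P"] and a["Q"], ["P", "Q"]))  # True
```

Formulas are passed as Python functions over an assignment dictionary, a representational convenience for this sketch.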
Inference rules are formal methods for deriving new propositions from existing ones. Two key rules
are:
• Modus Ponens:
• Form: From 𝑃 → 𝑄 and 𝑃, infer 𝑄.
• Example: Given “If it is raining, the ground is wet” (𝑃 → 𝑄) and “It is raining” (𝑃),
infer “The ground is wet” (𝑄).
• Modus Tollens:
• Form: From 𝑃 → 𝑄 and ¬𝑄, infer ¬𝑃.
• Example: Given “If it is raining, the ground is wet” (𝑃 → 𝑄) and “The ground is not
wet” (¬𝑄), infer “It is not raining” (¬𝑃).
• Other Rules:
• Disjunctive Syllogism: From 𝑃 ∨ 𝑄 and ¬𝑃, infer 𝑄.
• Conjunction Introduction: From 𝑃 and 𝑄, infer 𝑃 ∧ 𝑄.
• Disjunction Introduction: From 𝑃, infer 𝑃 ∨ 𝑄.
Resolution proves a proposition by assuming its negation and deriving a contradiction (empty
clause).
• Process:
1. Convert all premises and the negation of the goal to CNF.
2. Repeatedly apply the resolution rule: From two clauses containing complementary literals
(e.g., 𝑃 ∨ 𝐴 and ¬𝑃 ∨ 𝐵), derive a new clause (𝐴 ∨ 𝐵).
3. Continue until an empty clause ({}) is derived (indicating a contradiction) or no new
clauses can be derived.
• Example: Solving a Simple Puzzle:
• Puzzle: Given:
• 𝑃 ∨ 𝑄 (“Either it is raining or sunny”).
• ¬𝑃 (“It is not raining”).
• Prove: 𝑄 (“It is sunny”).
• Resolution Steps:
1. Premises in CNF: 𝑃 ∨ 𝑄, ¬𝑃.
2. Negated goal: ¬𝑄.
3. Resolve 𝑃 ∨ 𝑄 with ¬𝑃: yields 𝑄.
4. Resolve 𝑄 with ¬𝑄: yields the empty clause ({}), a contradiction.
5. Conclusion: the premises entail 𝑄 (“It is sunny”).
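The resolution refutation procedure can be sketched in Python, with clauses as sets of literals and negation encoded by a `~` prefix (a representational assumption for this sketch):

```python
def negate(lit):
    """Flip a literal: 'P' <-> '~P'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (sets of literals)."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def entails(clauses, goal):
    """Resolution refutation: add the negated goal, search for the empty clause."""
    clauses = {frozenset(c) for c in clauses} | {frozenset({negate(goal)})}
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a != b:
                    for r in resolve(a, b):
                        if not r:            # empty clause: contradiction found
                            return True
                        new.add(frozenset(r))
        if new <= clauses:                   # no progress: goal not entailed
            return False
        clauses |= new

# Premises: P ∨ Q and ¬P; prove Q.
print(entails([{"P", "Q"}, {"~P"}], "Q"))   # True
```

Because the set of clauses over a fixed set of literals is finite, the loop always terminates.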
Algorithm
• Input: A knowledge base of facts (propositions) and rules (implications in the form 𝐴∧𝐵 → 𝐶).
• Process:
1. Identify rules whose antecedents (conditions) are satisfied by current facts.
2. Apply these rules to add their consequents (new facts) to the knowledge base.
3. Repeat until the goal is derived or no new facts can be added.
• Example:
• Facts: 𝑃: “Patient has fever.” 𝑄: “Patient has sore throat.”
• Rule: 𝑃 ∧ 𝑄 → 𝑅, where 𝑅: “Patient has flu.”
• Step 1: Check facts 𝑃 and 𝑄 are true.
• Step 2: Apply rule to infer 𝑅.
• Conclusion: The patient has flu.
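The forward-chaining loop above can be sketched as follows; rules are encoded as (antecedents, consequent) pairs, an assumed representation for this sketch:

```python
def forward_chain(facts, rules, goal):
    """Data-driven reasoning: fire rules until the goal appears or nothing changes."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if set(antecedents) <= facts and consequent not in facts:
                facts.add(consequent)   # rule fires: add the new fact
                changed = True
    return goal in facts

# Facts: fever (P) and sore throat (Q); rule: P ∧ Q → flu (R).
rules = [(["P", "Q"], "R")]
print(forward_chain(["P", "Q"], rules, "R"))   # True
```

The loop terminates because each iteration either adds a fact (from a finite set of consequents) or stops.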
Properties
Use Cases
• Rule-Based Expert Systems: Diagnosing diseases (e.g., MYCIN infers flu from symptoms).
• Troubleshooting: Identifying faults in systems based on observed conditions.
• Prediction Systems: Forecasting outcomes based on current data.
Algorithm
Properties
Use Cases
• Prolog-Style Reasoning: Used in logic programming languages like Prolog for query
resolution.
• Expert Systems: Verifying diagnoses by checking if symptoms support a disease.
• Decision-Making Models: Determining if a goal (e.g., “clean room”) is achievable based on
current conditions.
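A minimal backward-chaining sketch over definite clauses; for simplicity, the negated condition ¬𝑄 is treated as a separate atom `"NotQ"` (an encoding assumption, since plain propositional backward chaining works with positive atoms):

```python
def backward_chain(goal, facts, rules, seen=frozenset()):
    """Goal-driven reasoning: prove `goal` by reducing it to sub-goals."""
    if goal in facts:
        return True
    if goal in seen:                 # avoid infinite recursion on cyclic rules
        return False
    for antecedents, consequent in rules:
        if consequent == goal and all(
            backward_chain(a, facts, rules, seen | {goal}) for a in antecedents
        ):
            return True
    return False

# Goal R ("turn on AC") with facts P ("hot") and NotQ ("windows closed").
rules = [(["P", "NotQ"], "R")]
print(backward_chain("R", {"P", "NotQ"}, rules))   # True
```

The recursion mirrors Prolog-style query resolution: match the goal against a rule head, then prove each antecedent as a sub-goal.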
• Expert Systems: Representing rules for diagnosis (e.g., “If fever and sore throat, then flu”
(𝑃 ∧ 𝑄 → 𝑅)).
• Game-Playing AI: Modeling game states and rules (e.g., chess strategies).
• Home Automation: Decision rules like “If hot and windows closed, turn on AC” ((𝑃 ∧¬𝑄) →
𝑅).
• Puzzle-Solving: Using resolution to solve logical puzzles, as shown in the example above.
• Knowledge Base:
• Facts: 𝑃: “It is hot.” 𝑄: “Windows are open.”
• Rule: (𝑃 ∧ ¬𝑄) → 𝑅, where 𝑅: “Turn on AC.”
• Forward Chaining:
• Given 𝑃 = 𝑇, 𝑄 = 𝐹, apply rule to infer 𝑅 = 𝑇 (turn on AC).
• Backward Chaining:
• Goal: Prove 𝑅. Check sub-goals 𝑃 and ¬𝑄. Since 𝑃 = 𝑇, ¬𝑄 = 𝑇, conclude 𝑅 = 𝑇.
• Resolution:
• Premises: 𝑃, ¬𝑄, ¬𝑃 ∨ 𝑄 ∨ 𝑅 (CNF of (𝑃 ∧ ¬𝑄) → 𝑅).
• Negate goal: ¬𝑅.
• Resolve: Derive empty clause, proving 𝑅.
• Limited Expressiveness: Cannot represent relationships or hierarchies (e.g., “All rooms are
dirty” requires first-order logic).
• No Uncertainty Handling: Works only with binary truth values, not probabilities.
• Scalability Issues: Large knowledge bases lead to complex truth tables or resolution processes.
• No Temporal Reasoning: Cannot model time-based sequences (e.g., “Room will be clean
later”).
Chapter 6
First Order Logic
6.1 Overview
First-Order Logic (FOL), also known as predicate logic, is a powerful formalism in Artificial
Intelligence (AI) that extends propositional logic by enabling the representation of complex
relationships and generalizations about objects in a domain. Unlike propositional logic, which is
limited to simple true/false statements, FOL introduces predicates, variables, constants, functions,
and quantifiers, making it suitable for modeling real-world scenarios in knowledge bases, natural
language processing, and robotic planning. This chapter explores the syntax and semantics of FOL,
reasoning patterns (universal instantiation, existential instantiation, generalized modus ponens),
resolution (including skolemization and unification), and reasoning strategies such as forward
chaining and backward chaining, with applications in knowledge bases and Prolog-style reasoning.
Whereas propositional logic assumes the world contains facts, first-order logic (like natural language)
assumes the world contains
– Objects: people, houses, numbers, colors, baseball games, wars, . . .
– Relations: red, round, prime, brother of, bigger than, part of, comes between, . . .
– Functions: father of, best friend, one more than, plus, . . .
CHAPTER 6. FIRST ORDER LOGIC 91
Example
The semantics of FOL define the meaning of formulas by specifying their truth values in a model,
which consists of a domain and an interpretation.
• Satisfiability: A formula is satisfiable if there exists at least one model where it is true.
• Example: ∃𝑥(Parent(𝑥, John)) is satisfiable if there is a model where someone is a parent of John.
• Validity: A formula is valid (a tautology) if it is true in all possible models.
• Example: ∀𝑥(Human(𝑥) → Human(𝑥)) is valid (true for all interpretations).
• Definition: From a universally quantified formula ∀𝑥𝑃(𝑥), infer 𝑃(𝑐) for any specific constant
𝑐 in the domain.
• Example:
• Given: ∀𝑥(Human(𝑥) → Mortal(𝑥)).
• Instantiate with 𝑐 = Socrates: Human(Socrates) → Mortal(Socrates).
• If Human(Socrates) is true, infer Mortal(Socrates).
• Use: Applies general rules to specific instances in knowledge bases.
• Definition: From an existentially quantified formula ∃𝑥𝑃(𝑥), infer 𝑃(𝑐) for some new constant
𝑐 (not previously used in the proof).
• Example:
• Given: ∃𝑥(Parent(𝑥, John)).
• Introduce a new constant 𝑐: Parent(𝑐, John).
• Definition: A generalized form of modus ponens that applies to predicates with variables,
using unification to match terms.
• Form: Given ∀𝑥(𝑃1 (𝑥) ∧ · · · ∧ 𝑃𝑛 (𝑥) → 𝑄(𝑥)) and facts 𝑃1 (𝑎), . . ., 𝑃𝑛 (𝑎), infer 𝑄(𝑎).
• Example:
• Rule: ∀𝑥(Fever(𝑥) ∧ SoreThroat(𝑥) → Flu(𝑥)).
• Facts: Fever(John), SoreThroat(John).
• Unify 𝑥 = John, infer: Flu(John).
• Use: Enables reasoning with general rules and specific instances in knowledge bases.
Skolemization
Unification Algorithm
• Definition: The process of finding a substitution that makes two logical expressions (predicates)
identical, enabling resolution.
• Algorithm:
1. Compare two predicates (e.g., P(𝑥, John) and P(Mary, 𝑦)).
2. If predicates have the same name and arity, find a substitution for variables to make
arguments match.
3. Example: Unify P(𝑥, John) and P(Mary, 𝑦).
• Substitution: 𝑥 = Mary, 𝑦 = John.
• Result: P(Mary, John).
4. Handle conflicts: If arguments cannot be unified (e.g., P(John, 𝑦) and P(Mary, 𝑦)),
unification fails.
• Use: Matches rules and facts in resolution and chaining.
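A simplified unification sketch for flat argument lists, using the convention that lowercase names are variables and capitalized names are constants; it omits nested terms and the occurs check:

```python
def is_var(t):
    """Convention here: lowercase names are variables, capitalized are constants."""
    return t[0].islower()

def unify(args1, args2, subst=None):
    """Unify two same-arity argument lists; return a substitution dict or None."""
    subst = dict(subst or {})
    for a, b in zip(args1, args2):
        a, b = subst.get(a, a), subst.get(b, b)   # apply bindings found so far
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:                                     # two distinct constants: fail
            return None
    return subst

# Unify P(x, John) with P(Mary, y): x = Mary, y = John.
print(unify(["x", "John"], ["Mary", "y"]))   # {'x': 'Mary', 'y': 'John'}
# P(John, y) vs P(Mary, y): constants clash, unification fails.
print(unify(["John", "y"], ["Mary", "y"]))   # None
```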
• Definition: Proves a formula by assuming its negation, converting all formulas to conjunctive
normal form (CNF), and deriving a contradiction (empty clause) using resolution.
• Process:
1. Put the premises or axioms into clause form.
2. Add the negation of what is to be proved, in clause form, to the set of axioms.
3. Resolve these clauses together, producing new clauses that logically follow from them.
4. Produce a contradiction by generating the empty clause.
5. The substitutions used to produce the empty clause are those under which the opposite of
the negated goal is true.
• Example:
• Premises: ∀𝑥(Human(𝑥) → Mortal(𝑥)), Human(Socrates).
• Goal: Mortal(Socrates).
• Step 1: Convert to CNF:
• ∀𝑥(Human(𝑥) → Mortal(𝑥)) becomes ¬Human(𝑥) ∨ Mortal(𝑥).
• Human(Socrates).
• Negated goal: ¬Mortal(Socrates).
• Step 2: Resolve:
• ¬Human(𝑥)∨Mortal(𝑥) and Human(Socrates) with 𝑥 = Socrates: Yields Mortal(Socrates).
• Mortal(Socrates) and ¬Mortal(Socrates): Yields empty clause ({}).
• Conclusion: The premises entail Mortal(Socrates).
• Use: Automated theorem proving, knowledge base querying, planning.
An example resolution refutation (clause numbers refer to the premises):
1. Resolve ¬𝑃 ∨ ¬𝑄 ∨ 𝑅 (clause 2) with the negated goal ¬𝑅: yields ¬𝑃 ∨ ¬𝑄.
2. Resolve ¬𝑃 ∨ ¬𝑄 with 𝑃 (clause 1): yields ¬𝑄.
3. Resolve ¬𝑇 ∨ 𝑄 (clause 4) with ¬𝑄: yields ¬𝑇.
4. Resolve ¬𝑇 with 𝑇 (clause 5): yields the empty clause ∅, completing the proof.
Lucky Student
1. Anyone passing his history exams and winning the lottery is happy.
2. Anyone who studies or is lucky can pass all his exams.
3. John did not study but he is lucky.
4. Anyone who is lucky wins the lottery.
Exciting Life
1. All people that are not poor and are smart are happy.
2. Those people that read are not stupid.
3. John can read and is wealthy.
4. Happy people have exciting lives.
Resolution Example
Clause Form
Algorithm
• Input: A knowledge base of facts (e.g., Fever(John)) and rules (e.g., ∀𝑥(Fever(𝑥) ∧
SoreThroat(𝑥) → Flu(𝑥))).
• Process:
1. Identify rules whose antecedents are satisfied by current facts, using unification to match
variables.
2. Apply these rules to add their consequents to the knowledge base.
3. Repeat until the goal is derived or no new facts can be added.
• Example:
• Facts: Fever(John), SoreThroat(John).
• Rule: ∀𝑥(Fever(𝑥) ∧ SoreThroat(𝑥) → Flu(𝑥)).
• Step 1: Unify 𝑥 = John, check antecedents Fever(John), SoreThroat(John) are true.
• Step 2: Infer Flu(John).
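The algorithm above can be sketched with facts as predicate tuples and a simple pattern matcher standing in for full unification (an illustrative simplification: arguments are atoms, lowercase names are variables):

```python
def match(pattern, fact, subst):
    """Match a predicate pattern (lowercase args are variables) against a ground fact."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    subst = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        p = subst.get(p, p)          # apply existing bindings
        if p[0].islower():           # unbound variable: bind it
            subst[p] = f
        elif p != f:                 # constant mismatch
            return None
    return subst

def forward_chain(facts, rules):
    """Repeatedly fire rules, adding new ground facts, until a fixed point."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            substs = [{}]
            for prem in premises:    # satisfy each premise against known facts
                substs = [s2 for s in substs for f in facts
                          for s2 in [match(prem, f, s)] if s2 is not None]
            for s in substs:
                derived = tuple(s.get(t, t) for t in conclusion)
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new

facts = {("Fever", "John"), ("SoreThroat", "John")}
rules = [((("Fever", "x"), ("SoreThroat", "x")), ("Flu", "x"))]
print(("Flu", "John") in forward_chain(facts, rules))   # True
```

Each premise filters the set of candidate substitutions, which is how unification constrains rule firing across multiple conditions.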
• Sound and complete for first-order definite clauses (proof similar to the propositional proof).
• Datalog = first-order definite clauses with no functions (e.g., the crime KB).
• Forward chaining terminates for Datalog in polynomially many iterations: at most 𝑝 · 𝑛^𝑘 literals.
• It may not terminate in general if 𝛼 is not entailed; this is unavoidable, since entailment with definite clauses is semidecidable.
• Incremental matching: there is no need to match a rule on iteration 𝑘 unless a premise was added on iteration 𝑘 − 1, so match only rules whose premises contain a newly added literal.
• Matching itself can be expensive: database indexing allows 𝑂(1) retrieval of known facts (e.g., the query Missile(x) retrieves Missile(M1)), but matching conjunctive premises against known facts is NP-hard.
• Forward chaining is widely used in deductive databases.
The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono,
an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West,
who is American.
The knowledge base, in first-order definite clauses:
• American(West)
• Owns(Nono, M1) ∧ Missile(M1)
• Missile(x) ⇒ Weapon(x)
• Enemy(Nono, America)
Forward chaining proof of Criminal(West):
1. From Missile(M1) and Missile(x) ⇒ Weapon(x): infer Weapon(M1).
2. From Owns(Nono, M1) ∧ Missile(M1): infer Sells(West, M1, Nono).
3. From Enemy(Nono, America): infer Hostile(Nono).
4. From American(West), Weapon(M1), Sells(West, M1, Nono), and Hostile(Nono): infer Criminal(West).
• Diagnosis Systems: Inferring diseases from symptoms (e.g., MYCIN infers flu from fever and
sore throat).
• Planning: Deriving action sequences in robotic systems (e.g., inferring cleaning actions for a
vacuum cleaner).
• Knowledge Base Updates: Automatically adding new facts based on rules (e.g., updating a
medical database with inferred diagnoses).
Algorithm
• Input: A knowledge base of facts and rules, and a goal (e.g., Flu(John)).
• Process:
1. Start with the goal and identify rules whose consequents match it, using unification.
2. Treat the antecedents as sub-goals and recursively verify them against facts or other rules.
3. Continue until all sub-goals are satisfied (goal is proven) or no supporting facts are found
(goal fails).
• Example:
• Goal: Flu(John).
• Rule: ∀𝑥(Fever(𝑥) ∧ SoreThroat(𝑥) → Flu(𝑥)).
• Step 1: Unify goal with rule consequent: 𝑥 = John.
• Step 2: Sub-goals: Fever(John), SoreThroat(John).
• Step 3: Verify both sub-goals are true in the knowledge base.
• Conclusion: Flu(John) is proven.
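The same diagnosis query can be sketched as a Prolog-style recursive prover; goals and facts are predicate tuples, lowercase names are variables, and goals are assumed ground (a simplification — there is no occurs check or cycle detection):

```python
def match(pattern, term, subst):
    """Bind lowercase variables in `pattern` so it equals the ground `term`."""
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    subst = dict(subst)
    for p, t in zip(pattern[1:], term[1:]):
        p = subst.get(p, p)
        if p[0].islower():
            subst[p] = t
        elif p != t:
            return None
    return subst

def prove(goal, facts, rules, subst=None):
    """Backward chaining: reduce the goal to sub-goals via matching rule heads."""
    subst = subst or {}
    goal = tuple(subst.get(t, t) for t in goal)   # ground the goal
    if goal in facts:                             # base case: known fact
        return True
    for premises, conclusion in rules:
        s = match(conclusion, goal, {})           # unify the rule head with the goal
        if s is not None and all(prove(p, facts, rules, s) for p in premises):
            return True
    return False

facts = {("Fever", "John"), ("SoreThroat", "John")}
rules = [((("Fever", "x"), ("SoreThroat", "x")), ("Flu", "x"))]
print(prove(("Flu", "John"), facts, rules))   # True
```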
Backward chaining starts from the goal and works down to known facts:
1. Initial goal: Criminal(West).
2. Goal decomposition: American(West), Weapon(y), Sells(West, y, z), Hostile(z).
3. American(West) is a known fact; Weapon(M1) follows from Missile(M1); Sells(West, M1, Nono) follows from Owns(Nono, M1) ∧ Missile(M1); Hostile(Nono) follows from Enemy(Nono, America).
4. With all sub-goals satisfied, Criminal(West) is proven.
• Knowledge Bases: Representing and querying structured knowledge (e.g., family trees,
medical databases).
• Natural Language Processing: Modeling linguistic relationships (e.g., Parent(𝑥, 𝑦) for family
relations in text).
• Robotic Planning: Coordinating actions (e.g., vacuum cleaner agent rules like ∀𝑥(Room(𝑥) ∧
Dirty(𝑥) → Clean(𝑥))).
• Knowledge Base:
• Facts: Fever(John), SoreThroat(John).
• Rule: ∀𝑥(Fever(𝑥) ∧ SoreThroat(𝑥) → Flu(𝑥)).
• Forward Chaining:
• Unify 𝑥 = John, apply rule, infer Flu(John).
• Backward Chaining:
• Goal: Flu(John).
• Sub-goals: Fever(John), SoreThroat(John).
• Verify sub-goals, conclude Flu(John).
• Resolution:
• Convert to CNF: ¬Fever(𝑥) ∨ ¬SoreThroat(𝑥) ∨ Flu(𝑥), Fever(John), SoreThroat(John).
• Negate goal: ¬Flu(John).
• Resolve to derive empty clause, proving Flu(John).
Purpose of Planning
The purpose of planning is to find a sequence of actions that achieves a given goal when performed
starting in a given state. In other words, given a set of operator instances (defining the possible
primitive actions by the agent), an initial state description, and a goal state description or predicate,
the planning agent computes a plan. This process is depicted in Figure 7.2.
CHAPTER 7. PLANNING 106
Figure 7.2: Planning process from Start/Final through Operator Instances to a Plan.
Planning enables agents to make decisions and achieve objectives in complex environments. It involves reasoning about actions, their preconditions, and their effects to construct a feasible path from the starting point to the goal.
The primary goal of planning is to generate a plan, which is a sequence or partially ordered set
of actions that, when executed, achieves the desired outcome. For example, in robotics, planning
might involve determining a sequence of movements for a robot to navigate from one location to
another while avoiding obstacles.
• Initial State: The starting configuration of the system, describing the current world.
• Goal State: The desired configuration or condition the system aims to achieve.
• Actions/Operators: Operations that transform the state of the system, each with preconditions
(what must be true before the action) and effects (what changes after the action).
• Initial State: A complete description of the starting conditions, often represented as a set of
facts or predicates. For example, in a blocks world, the initial state might include facts like
on(A, Table) and clear(B).
• Goal State: A partial or complete description of the desired outcome. For instance, on(A,
B) and on(B, C) in the blocks world.
• Actions/Operators: A set of possible actions, each defined by:
• Preconditions: Conditions that must hold for the action to be executable.
• Effects: Changes to the state after the action is performed, including additions (new facts)
and deletions (facts that are no longer true).
• Initial State: Block A is on the table, Block B is on the table, and both are clear.
• Goal State: Block A is on Block B.
• Action: move(A, Table, B)
• Preconditions: clear(A), clear(B), on(A, Table).
• Effects: on(A, B), not on(A, Table), not clear(B).
(Figure: the initial state, with blocks A and B side by side on the table, and the goal state, with A stacked on B.)
Planners decompose the world into logical conditions and represent a state as a conjunction of
positive literals.
Example: In propositional logic, a state is a conjunction such as Poor ∧ Unknown; a disjunction such as 𝑃 ∨ 𝑄 is not a valid state description.
This is called an action schema, meaning it represents multiple actions that can be derived by instantiating variables (p, From, To).
Action schemas generally have three parts:
Illustration of an Action
Starting in state 𝑠, executing an applicable action 𝑎 results in a new state 𝑠′ that is the same as 𝑠
except for the changes described by the action’s effects.
To illustrate how planning problems can be modeled, consider the following scenario.
• Agent is at home
• Has flour
Tasks To Do:
• Invite friends
• Buy butter
• Buy sugar
• Buy balloons
• Decorate house
• Bake cake
Start State
At(Home) ∧ Have(Flour) ∧ Loc(SM) ∧ Loc(HWS)
Operators
Buy(x)
PRE: At(store) ∧ Sells(store, x)
EFF: Have(x)
Go(x, y)
PRE: At(x) ∧ Loc(y)
EFF: At(y) ∧ ¬At(x)
Goal
Initial state:
Notes:
Goal state:
Variables in goals: Goals may also contain variables. For example, being at a store that
sells milk can be expressed as:
An operator schema is a general action with variables, which becomes a family of specific actions
when the variables are instantiated. Each variable must have a value, and the schema specifies:
• Effects: Describe how the state changes when the action is executed.
Classical planning assumes a deterministic, fully observable environment where the outcomes of
actions are predictable, and the agent has complete knowledge of the state. This is the focus of this
chapter. Key assumptions include:
For example, in a puzzle like the 8-puzzle, the initial configuration, possible moves (actions), and
desired configuration (goal) are fully known, and each move has a predictable outcome.
For this lecture, we focus on classical planning due to its foundational role and simplicity.
State space search explores the space of possible states by applying actions to transition from the
initial state to the goal state. It can be performed using:
• Forward Search (Progression): Start from the initial state and apply actions to reach the goal.
This is like a breadth-first or depth-first search.
• Backward Search (Regression): Start from the goal state and work backward to find actions
that lead to the initial state.
State space search is intuitive but can suffer from large state spaces, making it computationally
expensive. Heuristics, such as A* search, are often used to guide the search efficiently.
Partial order planning (POP) constructs plans by defining a partial order of actions rather than a strict
sequence. It starts with an empty plan and iteratively adds actions to satisfy preconditions, allowing
flexibility in the order of execution. POP is particularly useful when actions can be performed in
parallel or when the exact order is not critical.
For example, in a logistics problem, delivering two packages to different locations can be done in
any order, as long as both are delivered. POP allows such flexibility.
Planning graphs are a data structure used to efficiently solve planning problems. A planning graph
consists of alternating layers of states (facts) and actions, capturing all possible actions and their
effects at each step. The graph is used to:
Planning graphs are particularly effective for problems with many parallel actions and help reduce
the search space.
(Figure: a planning graph with alternating layers — State Layer 0, Action Layer 1, State Layer 1, Action Layer 2 — leading to the goal state.)
A state describes the configuration of the world at a specific moment. In the STRIPS framework, states are represented as a conjunction of positive ground literals—function-free atomic propositions that are true in that state. For example, a state might be Poor ∧ Unknown, or in a delivery robot domain, At(Robot, Lab) ∧ ¬HasCoffee ∧ WantsCoffee(Sam) ∧ ¬MailWaiting ∧ HasMail. The closed-world assumption applies: any unmentioned literal is considered false. This is illustrated in Figure 7.7, where the state is depicted as a set of active literals with unmentioned ones implicitly false.
States can also be viewed as assignments to features or variables, such as RLoc = lab and RHC = false. Advanced representations like ADL allow negative literals, enabling partial specifications.
Figure 7.7: Representation of a state with active literals (At(Robot, Lab), HasMail) and implicitly false literals (¬HasCoffee, ¬WantsCoffee(Sam)).
A goal represents the desired world configuration that the planning agent aims to achieve. It is typically a conjunction of positive ground literals, such as Rich ∧ Famous or At(Plane2, Tahiti). A state satisfies a goal if it includes all specified literals, as shown in Figure 7.8.
Figure 7.8: A goal with target literals At(Plane2, Tahiti) and Rich, satisfied by a state containing Rich ∧ At(Plane2, Tahiti) ∧ Famous.
Goals can be achievement goals (true in the final state), maintenance goals (true throughout), transient goals (true at some point), or resource goals (optimizing resources). ADL extends this with disjunctions (e.g., P ∨ Q), negations, and quantified variables.
Actions transform states and are defined using schemas in STRIPS, consisting of:
- Action Name and Parameters: e.g., Fly(p, from, to).
- Preconditions: literals that must hold, e.g., At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to).
- Effects: changes to the state, e.g., ¬At(p, from) ∧ At(p, to), split into add and delete lists.
An action schema represents a family of ground actions obtained by substituting constants for its parameters (e.g., Fly(Plane1, JFK, LAX)). An action is applicable when all its preconditions are true in the current state; the resulting state is obtained by adding the positive effects (the add list) and removing the literals in the delete list.
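The applicability and successor rules can be sketched in Python. The set-of-strings state representation and the literal names below (taken from the Fly example) are illustrative simplifications, not a full STRIPS implementation:

```python
# A sketch of STRIPS ground-action application: an action is applicable
# when all preconditions hold in the state; the successor removes the
# delete list and adds the add list.

def applicable(state, action):
    return action["pre"] <= state

def apply_action(state, action):
    if not applicable(state, action):
        raise ValueError("preconditions not satisfied")
    return (state - action["delete"]) | action["add"]

# Ground instance Fly(Plane1, JFK, LAX) of the chapter's Fly schema.
fly = {
    "pre":    {"At(Plane1, JFK)", "Plane(Plane1)", "Airport(JFK)", "Airport(LAX)"},
    "add":    {"At(Plane1, LAX)"},
    "delete": {"At(Plane1, JFK)"},
}
state = {"At(Plane1, JFK)", "Plane(Plane1)", "Airport(JFK)", "Airport(LAX)"}
new_state = apply_action(state, fly)
print("At(Plane1, LAX)" in new_state, "At(Plane1, JFK)" in new_state)
# → True False
```

Note that untouched literals such as Plane(Plane1) and Airport(JFK) carry over unchanged, which is exactly how STRIPS sidesteps the frame problem; the blocks-world stack(x, y) action below can be encoded the same way.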
Several classical representation problems arise:
- Frame Problem: unchanged elements persist unless deleted (see Figure 7.10).
- Qualification Problem: all preconditions must be listed.
- Ramification Problem: indirect effects are hard to capture.
Figure 7.10: Illustration of the frame problem where unchanged literals persist.
Extensions and Enhancements
- ADL: supports negative preconditions, conditional effects, and types.
- PDDL: adds numeric fluents and durative actions.
Example in Blocks World
- Action: stack(x, y)
- Precond: holding(x) ∧ clear(y)
- Effect: ¬holding(x) ∧ ¬clear(y) ∧ on(x, y) ∧ clear(x) ∧ handempty
Figure 7.11 shows this action’s transformation.
Figure 7.11: The stack(x, y) action: preconditions holding(x) ∧ clear(y); effects on(x, y) ∧ ¬clear(y).
8.1 Introduction
Machine learning is like teaching a computer to learn from examples, much like how you learn
by practicing or observing. There are different ways machines learn, called learning paradigms,
each suited for specific tasks, like recognizing images, grouping similar items, or making decisions
in games. This chapter explains four main types—supervised, unsupervised, semi-supervised,
and reinforcement learning—in a way that’s easy for students to understand. We’ll use real-world
examples, including recent trends involving ChatGPT and ShellGPT, to show how these methods
work. Key terms and tips are highlighted in colorful boxes to make them stand out.
Machine Learning
Machine learning is a part of artificial intelligence where computers learn patterns from data
to make predictions or decisions, like recognizing a photo or suggesting a movie, without
being explicitly told what to do.
Why Learning Types Matter
Each learning type fits different problems. For example, if you have clear examples with answers (like labeled photos), supervised learning is best. If you're exploring data without labels, like grouping customers, unsupervised learning helps. Choosing the right type saves time and improves results.
Supervised Learning
Supervised learning uses a dataset with inputs and their correct answers (labels) to train a
model. The model learns to predict answers for new inputs, like identifying if an email is
spam based on examples.
CHAPTER 8. FORMS OF LEARNING 118
Need for Labeled Data
Supervised learning needs lots of labeled data, which can be expensive to create. For example, labeling medical images for cancer detection requires doctors' expertise, making it time-consuming.
The model looks at examples, finds patterns, and creates a rule to predict answers. For instance, to
classify emails, it might notice that spam emails often contain words like "win" or "free." Common
tools include decision trees (like a flowchart), neural networks (like a brain), and logistic regression
(for yes/no predictions).
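As a toy illustration of learning from labeled examples, the following naive-Bayes-style filter scores a message by how well its words match each class; the training emails are invented for illustration:

```python
# A toy supervised spam filter: learn word counts from labeled emails,
# then classify a new email by Laplace-smoothed log-likelihoods.
import math
from collections import Counter

# Invented labeled training data: (email text, label).
train = [
    ("win free money now",       "spam"),
    ("free prize win",           "spam"),
    ("meeting notes attached",   "ham"),
    ("lunch tomorrow at noon",   "ham"),
]

counts = {"spam": Counter(), "ham": Counter()}
for text, label in train:
    counts[label].update(text.split())

def predict(text):
    vocab = len(set(counts["spam"]) | set(counts["ham"]))
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        # Smoothed log-likelihood of each word under this class.
        scores[label] = sum(math.log((c[w] + 1) / (total + vocab))
                            for w in text.split())
    return max(scores, key=scores.get)

print(predict("win a free prize"))       # → spam
print(predict("notes for the meeting"))  # → ham
```

Notice that the model has learned exactly the pattern described above: words like "win" and "free" push the score toward spam.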
8.2.2 Examples
1. ChatGPT for Text Classification (2025): ChatGPT, built by OpenAI, uses supervised learning to classify text, like detecting whether a tweet is positive or negative. It is trained on massive datasets of labeled text (e.g., reviews marked as "happy" or "sad"). In 2025, ChatGPT Plus subscribers use this for real-time sentiment analysis on social media.
2. ShellGPT for Command Prediction (2025): ShellGPT, a command-line tool, predicts Linux commands from natural-language inputs, like "find all text files." It is trained on labeled pairs of user prompts and correct commands, helping developers work faster. For example, a natural-language request to rename a file is translated into the corresponding mv command.
3. Image Recognition: Apps like Google Photos use supervised learning to tag photos as "beach"
or "party" by training on labeled images.
4. House Price Prediction: Zillow predicts house prices using features like size and location,
trained on past sale prices.
5. Spam Email Filtering: Gmail filters spam by learning from emails users mark as spam or not.
6. Medical Diagnosis: Models predict diseases (e.g., diabetes) from labeled patient data like
blood tests.
7. Stock Price Prediction: Predicting stock prices based on historical data and trends.
8. Fraud Detection: Banks flag suspicious credit card transactions using labeled fraud data.
9. Handwriting Recognition: Converting handwritten notes to text, trained on labeled hand-
writing samples.
10. Weather Forecasting: Predicting tomorrow’s temperature using past weather data.
Unsupervised Learning
Unsupervised learning analyzes data without labels to find patterns, like grouping similar
customers or reducing data complexity, without being told what to look for.
No Labels Needed
Unsupervised learning is perfect when you don’t have labeled data or want to explore unknown
patterns. For example, retailers use it to find customer groups without knowing what those
groups are.
The model groups data (clustering) or simplifies it (dimensionality reduction). For clustering, it
might group customers by shopping habits. For dimensionality reduction, it shrinks data while
keeping important details, like turning 100 features into 2 for visualization. Common tools include
K-Means (for grouping) and PCA (for simplifying data).
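A minimal clustering sketch: the pure-Python K-Means below groups invented 1-D customer-spending figures into two clusters (in practice a library such as scikit-learn would be used):

```python
# A minimal K-Means sketch on 1-D data: repeatedly assign each point to
# its nearest center, then move each center to its cluster's mean.

def kmeans(points, k, iters=20):
    centers = points[:k]                    # naive initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as its cluster mean (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Invented monthly spending (dollars) for six customers: two clear groups.
points = [10, 12, 11, 90, 95, 92]
centers, clusters = kmeans(points, 2)
print(sorted(round(c) for c in centers))   # → [11, 92]
```

The algorithm discovers the two spending groups without ever being told labels, which is the defining trait of unsupervised learning.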
8.3.2 Examples
1. ChatGPT for Topic Discovery (2025): ChatGPT uses unsupervised learning to find topics in large text datasets, like grouping news articles into politics or sports without labels. In 2025, it helps researchers analyze trends in social media posts.
2. ShellGPT for Log Analysis (2025): ShellGPT analyzes server logs without labels to find patterns, like grouping similar errors. For example, piping a server log into sgpt "find error patterns" groups errors such as "connection timeout" for developers to fix.
3. Customer Segmentation: Amazon groups shoppers by purchase habits for targeted ads.
4. Anomaly Detection: Cybersecurity tools spot unusual network activity, like hacks.
5. Recommendation Systems: Netflix suggests movies by grouping users with similar tastes.
6. Image Compression: Reducing image file sizes while keeping quality.
7. Market Basket Analysis: Finding items often bought together, like bread and butter.
8. Genomic Clustering: Grouping genes with similar patterns for medical research.
9. Social Network Analysis: Finding friend groups in social media networks.
10. Topic Modeling: Grouping blog posts into themes like travel or tech.
Semi-Supervised Learning
Semi-supervised learning uses a small set of labeled data and a large set of unlabeled data to
train a model, combining the strengths of supervised and unsupervised learning to improve
predictions.
Cost-Effective
This method saves time and money by needing fewer labels. For example, labeling a few images for a self-driving car can guide the model to learn from thousands of unlabeled images.
The model starts with labeled data to learn basic patterns, then uses unlabeled data to refine them.
It might assume similar data points have similar labels (smoothness) or form clusters. Techniques
include self-training (guessing labels for unlabeled data) and graph-based methods (spreading labels
like a network).
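The self-training idea can be sketched with a deliberately simple 1-D threshold classifier; the data points and the confidence margin below are invented for illustration:

```python
# A sketch of self-training (semi-supervised): fit a threshold on a few
# labeled 1-D points, pseudo-label the confidently classified unlabeled
# points, and refit on the enlarged training set.

labeled = [(1.0, 0), (2.0, 0), (8.0, 1)]      # (value, class) — invented
unlabeled = [1.5, 2.5, 7.0, 7.5, 9.0]          # no labels available

def fit_threshold(data):
    """Midpoint between the two class means — a deliberately simple model."""
    m0 = sum(x for x, y in data if y == 0) / sum(1 for _, y in data if y == 0)
    m1 = sum(x for x, y in data if y == 1) / sum(1 for _, y in data if y == 1)
    return (m0 + m1) / 2

t = fit_threshold(labeled)
# Step 2: pseudo-label only points far from the boundary (margin of 1.0),
# trusting the "similar points have similar labels" assumption.
confident = [(x, int(x > t)) for x in unlabeled if abs(x - t) > 1.0]
# Step 3: refit using both the true labels and the pseudo-labels.
t = fit_threshold(labeled + confident)
print(round(t, 2))   # → 4.81
```

The refit threshold reflects all eight points even though only three were ever labeled by hand, which is the cost advantage described above.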
8.4.2 Examples
1. ChatGPT for Text Labeling (2025): ChatGPT uses semi-supervised learning to classify text with few labeled examples. For instance, given a handful of labeled reviews, it labels thousands of unlabeled ones, used in 2025 for analyzing product feedback.
2. ShellGPT for Command Suggestions (2025): ShellGPT learns from a few labeled command-prompt pairs and many unlabeled user inputs to suggest commands, like suggesting ls for "list files."
3. Image Segmentation: Labeling a few pixels in medical images to segment tumors in many
images.
4. Speech Recognition: Using a small set of transcribed audio to improve recognition on
unlabeled audio.
5. Web Page Classification: Labeling a few websites to categorize many others.
6. Protein Structure Prediction: Using a few known protein structures to predict others.
7. Document Classification: Labeling key documents to classify an archive.
8. Medical Image Analysis: Labeling a few scans to guide analysis of many.
Reinforcement Learning
Reinforcement learning involves an agent learning to make decisions by trying actions in an
environment, receiving rewards or penalties, and optimizing for the highest total reward.
Learning by Doing
This method mimics trial-and-error learning, like a child learning to ride a bike. It's powerful for tasks like games or robotics but needs careful reward design to avoid bad habits.
The agent observes the environment’s state, chooses an action, and gets a reward. It learns a strategy
(policy) to pick actions that maximize future rewards. Tools include Q-Learning (tracking action
values) and Deep Q-Networks (using neural networks for complex tasks).
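A tabular Q-learning sketch: the agent below learns to walk right along an invented five-state corridor, rewarded only on reaching the goal state; the environment and hyperparameters are illustrative choices, not a prescribed setup:

```python
# Tabular Q-learning on a tiny 5-state corridor: action 0 moves left,
# action 1 moves right, and only reaching state 4 yields a reward.
import random

random.seed(0)
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action]
alpha, gamma, epsilon = 0.5, 0.9, 0.2       # learning rate, discount, exploration

for _ in range(200):                        # training episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda act: Q[s][act])
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: move Q toward reward + discounted best next value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned policy (best action per state) should always move right.
policy = [max((0, 1), key=lambda act: Q[s][act]) for s in range(GOAL)]
print(policy)   # → [1, 1, 1, 1]
```

Note how the reward propagates backward from the goal through the discounted updates, so states far from the goal eventually learn the right action too.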
8.5.2 Examples
1. ChatGPT for Interactive Learning (2025): ChatGPT's conversational abilities are enhanced by reinforcement learning, where it learns from user feedback (e.g., a thumbs-up on good responses) to improve answers. In 2025, this powers real-time voice chats in its app.
2. ShellGPT for Workflow Optimization (2025): ShellGPT learns to suggest efficient command sequences by trial and error, rewarded for faster task completion, like optimizing a script for file processing.
3. Game Playing: AlphaGo learned to play Go by playing millions of games, rewarded for wins.
4. Robotics: A robot arm learns to pick up objects, rewarded for success.
5. Autonomous Driving: Cars learn to navigate safely, rewarded for avoiding crashes.
6. Stock Trading: Agents trade stocks, rewarded for profit.
7. Recommendation Systems: Suggesting products based on user clicks.
8. Drone Navigation: Drones learn to avoid obstacles, rewarded for reaching goals.
Uniform Cost Search (UCS) differs from Breadth-First Search (BFS) in that it accounts for varying edge costs, guaranteeing that the path with the lowest total cost is found. While BFS assumes uniform edge costs and is optimal only for unweighted graphs, UCS's ability to find the least-cost path makes it particularly useful in navigation systems and resource allocation problems where costs vary.
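UCS can be sketched with a priority queue ordered by path cost, so the first time the goal is popped, the least-cost path has been found; the weighted graph below is invented for illustration:

```python
# A sketch of Uniform Cost Search: expand nodes in order of cheapest
# accumulated path cost using a min-heap frontier.
import heapq

def ucs(graph, start, goal):
    frontier = [(0, start, [start])]         # (cost so far, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:                     # cheapest path to goal found
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for nbr, step in graph.get(node, []):
            heapq.heappush(frontier, (cost + step, nbr, path + [nbr]))
    return None

# Invented weighted directed graph: the direct A→C edge is a costly trap.
graph = {
    "A": [("B", 1), ("C", 5)],
    "B": [("C", 1), ("D", 6)],
    "C": [("D", 2)],
}
print(ucs(graph, "A", "D"))   # → (4, ['A', 'B', 'C', 'D'])
```

BFS would return a path with the fewest edges (A→C→D, total cost 7), while UCS finds the cheaper three-edge route, illustrating exactly the distinction drawn above.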
AI applications enhance decision-making in healthcare by improving diagnostic capabilities, such as using AI systems like AlphaFold to solve complex problems like protein folding, and developing diagnostic tools that outperform human doctors in detecting certain cancers, thus increasing accuracy and speed in patient care.
Autonomous navigation systems, like NASA's Perseverance rover and Waymo's self-driving cars, navigate complex environments by processing sensor data to avoid obstacles and select paths. These systems use AI to analyze real-time data for decision-making and to facilitate safe and efficient travel.
AI in facial recognition systems faces ethical challenges primarily related to bias and misidentification errors. These challenges impact society by potentially leading to wrongful identification and subsequent unfair treatment of individuals, raising concerns about privacy and civil rights.
Narrow AI is designed to perform specific tasks with high proficiency but lacks general intelligence, operating within a predefined scope. In contrast, general AI aims to replicate human-like intelligence, capable of performing a wide array of tasks. The distinction is significant as it highlights the current capabilities and limitations of AI, guiding future research and development.
AlphaGo, developed by DeepMind, demonstrated the capabilities of reinforcement learning and strategic planning by defeating Go champion Lee Sedol. This was achieved through learning optimal strategies and making strategic decisions during gameplay, exemplifying advanced AI capabilities in complex tasks.
Learning agents have the advantage of adaptability to dynamic or unknown environments, improving accuracy and decision-making over time as they learn from experience. They are particularly useful in domains such as robotics, autonomous vehicles, and recommendation systems, where adaptability and continuous improvement are crucial.
Explainable AI plays a crucial role in military and medical applications by providing transparency in AI decision-making processes. This transparency is essential for building trust and understanding in critical situations where AI decisions impact safety and outcomes, such as in military operations and medical diagnostics.
The societal impacts of job displacement due to AI advancements include potential economic challenges, as workers in automated roles might face unemployment and require retraining and reskilling programs, and a shift in societal dynamics that widens the gap between technology-adaptive businesses and traditional industries.
Generative AI tools like ChatGPT and DALL-E have significantly influenced the creative industries by enabling the creation of human-like text and images, which are now used in various applications such as customer service and AI-generated art featured in galleries.