
UNIT I

Introduction to Machine
Learning and Supervised
Learning
Code: U18CST7002
Presented by: Nivetha R
Department: CSE
Introduction to Machine Learning
• Artificial Intelligence (AI): The broadest field, encompassing all aspects of creating systems that can perform tasks requiring human intelligence. Examples include natural language processing, robotics, and expert systems.
• Machine Learning (ML): A subset of AI focusing on systems that learn and improve from experience without being explicitly programmed. Techniques include supervised, unsupervised, and reinforcement learning.
• Deep Learning (DL): A subset of ML involving neural networks with many layers to model complex patterns in large datasets. Applications include image recognition, speech processing, and autonomous driving.
Introduction to Machine Learning
• Data Science: An interdisciplinary field combining domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data.
• Involves data manipulation, analysis, visualization, and machine learning.
Introduction to Machine Learning
• Machine Learning is the study of algorithms and statistical models that computer systems use to perform tasks without explicit instructions.
• Basic Concepts (illustrated in the sketch below):
• Data: The core of machine learning.
• Algorithms: Procedures or formulas for solving problems.
• Models: The representation of the algorithm's solution.
• Training Data: Data used to train the model.
• Test Data: Data used to evaluate the model.
• Features: Individual measurable properties of the data.
• Labels: Output or target variable in supervised learning.
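A minimal sketch of how these concepts map to code, assuming scikit-learn is installed; the built-in iris dataset and logistic regression are illustrative choices, not part of the slides:

# Minimal sketch: features, labels, training data, test data, algorithm, model.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)                     # X: features, y: labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)             # training data vs. test data

model = LogisticRegression(max_iter=200)              # the algorithm
model.fit(X_train, y_train)                           # training produces the model
print("Test accuracy:", model.score(X_test, y_test))  # evaluate on the test data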
Machine Learning Applications
• Examples:
• Image Recognition: Identifying objects in images.
• Natural Language Processing (NLP): Language translation, sentiment analysis.
• Autonomous Vehicles: Self-driving cars.
• Healthcare: Predictive diagnostics, personalized treatment plans.
Machine Learning Types
• Supervised Learning: Learn from labeled data to make predictions or classify new data points. Examples: spam detection, image recognition, medical diagnosis; predicting house prices, stock prices, or other continuous values.
• Unsupervised Learning: Find hidden patterns or intrinsic structures in unlabeled data. Examples: customer segmentation, grouping similar items, anomaly detection.
• Reinforcement Learning: Learn by interacting with an environment to maximize cumulative rewards. Examples: chess, Go, video games.
Supervised Learning
• An algorithm that learns from labeled training data to predict outcomes for new, unseen data.
• Regression and classification are supervised learning problems.
• Input: x; Output: y
• Task: learn the mapping function from x to y.
• Classification: y is a class (e.g. 0/1).
• Regression: y is a number (e.g. the price of a car).
• Assume a model defined up to a set of parameters: y = g(x|θ), where g is the model and θ its parameters.
• For classification, g(·) is the discriminant function; for regression, g(·) is the regression function.
Classification
Predict the categories (classes)
1. Banking – predicting the risk associated with a loan
• Input: information about the customer; classes: 2 (low-risk and high-risk)
• The ML model is trained with past data; it is then used to calculate the risk of a new application and decide whether to accept or refuse it.
• Example classification rule: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk, for suitable values of θ1 and θ2. Such a rule is called a discriminant (a function that separates the different classes). (A sketch of this rule follows below.)
• Prediction: 0/1 (low-risk or high-risk).
• Sometimes the output is given as a probability: P(Y=1|X=x) = 0.8 means the customer has an 80 percent probability of being high-risk, or equivalently a 20 percent probability of being low-risk.
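A minimal sketch of the loan-risk discriminant above; the threshold values used for θ1 and θ2 are hypothetical and chosen only for illustration:

# Hypothetical thresholds; in practice θ1 and θ2 would be learned from past loan data.
THETA1 = 30000   # income threshold (θ1), illustrative value
THETA2 = 5000    # savings threshold (θ2), illustrative value

def loan_risk(income, savings):
    """Discriminant: separates low-risk from high-risk applicants."""
    return "low-risk" if income > THETA1 and savings > THETA2 else "high-risk"

print(loan_risk(income=45000, savings=12000))  # -> low-risk
print(loan_risk(income=45000, savings=1000))   # -> high-risk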
Classification
2. Pattern Recognition
• Character recognition
• Input: character image; classes: as many classes as there are characters
• The task is recognizing character codes from images; ML is used to learn sequences and model dependencies.
• Face recognition
• Input: image; classes: the people to be recognized
• The ML algorithm should learn to associate face images with identities.
• Medical diagnosis
• Input: information about the patient; classes: the various illnesses
• The model may identify the type of disease.
• Speech recognition
• Input: acoustic signal; classes: words
• The model should learn the association from an acoustic signal to a word of some language.
• Biometrics
• Input: physiological (face, fingerprint, iris, palm) or behavioural characteristics (voice, gait, keystroke)
• Decision: accept/reject
Regression
Predicting a continuous output
1. Price of a used car
• Input: car attributes (brand, year, mileage, engine, etc.)
• Output: price
• Let x denote the attributes of the car and y its price. Using training data (past transactions), the ML model fits a function to this data to learn y as a function of x, e.g. the linear model y = wx + w0, where w and w0 are the parameters to be learned. (A sketch follows below.)
2. Price of a house
• Input: size in square feet
• Output: price
3. Navigation of a mobile robot
• Input: readings from sensors (GPS, camera, etc.)
• Output: the angle by which the steering wheel should be turned
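A minimal sketch of fitting the linear model y = wx + w0 by least squares with NumPy; the (size, price) pairs are made up purely for illustration:

import numpy as np

# Hypothetical training data: house size (sq. ft.) and price.
x = np.array([600, 800, 1000, 1200, 1500], dtype=float)
y = np.array([150000, 190000, 240000, 280000, 350000], dtype=float)

# Fit y = w*x + w0 by ordinary least squares (degree-1 polynomial fit).
w, w0 = np.polyfit(x, y, deg=1)
print(f"learned model: y = {w:.1f} * x + {w0:.1f}")
print("predicted price for 1100 sq. ft.:", w * 1100 + w0)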
Unsupervised Learning
• Only input data is available (unlabeled)
• Aim is to find the regularities/pattern/structure in the input data
• Problems
• Clustering
• Association

Clustering
Places the data into different clusters
1. Customer dataset analysis
• A company wants to know the distribution of the profiles of its customers.
• A clustering model allocates customers that are similar in their attributes to the same group (see the sketch after this list).
• It is also possible to identify outliers, i.e. customers who are different from the others.
2. Image compression
• Input: image pixels represented as RGB values.
• The clustering program groups pixels with similar colors (colors that occur frequently in the image) into the same group.
3. Document clustering
• Example: news reports; clusters: politics, sports, entertainment, etc.
• Each document is represented as a bag of words over a predefined lexicon of N words.
4. Bioinformatics (DNA is a sequence of bases; a protein is a sequence of amino acids)
• Alignment: matching one sequence to another.
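A minimal customer-segmentation sketch using K-means from scikit-learn; the two-attribute customer data is invented for illustration:

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer attributes: [annual income (k$), annual spend (k$)].
customers = np.array([
    [25, 5], [27, 6], [30, 8],       # low income, low spend
    [70, 40], [75, 45], [80, 42],    # high income, high spend
    [72, 8], [78, 10],               # high income, low spend
], dtype=float)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print("cluster label of each customer:", kmeans.labels_)
print("cluster centers:\n", kmeans.cluster_centers_)

Customers far from every cluster center (or assigned to a very small cluster) can then be flagged as outliers.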
Association
Finding associations between products
• Basket analysis: finding associations between products bought by customers.
• If people who buy X typically also buy Y, and there is a customer who buys X but not Y, then that customer is a potential Y customer.
• Target the potential customer for cross-selling.
• An association rule is a conditional probability P(Y|X).
• Example: P(Jam|Bread) = 0.8 means that 80% of customers who buy bread also buy jam. (A sketch of estimating this from transactions follows below.)
• To make a distinction among customers, use P(Y|X,D), where D is a set of attributes such as gender, age, marital status, etc.
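A minimal sketch of estimating the association rule P(Y|X) from a list of transactions; the basket data is invented for illustration:

# Hypothetical market baskets, each a set of purchased items.
baskets = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "butter"},
    {"bread", "jam", "eggs"},
    {"milk", "eggs"},
]

def conditional_probability(baskets, x, y):
    """Estimate P(y | x): among baskets containing x, the fraction that also contain y."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    return sum(y in b for b in with_x) / len(with_x)

print("P(jam | bread) =", conditional_probability(baskets, "bread", "jam"))  # 3/4 = 0.75 here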
Reinforcement Learning
Closest to human learning
• A type of machine learning in which an intelligent agent (a computer program) interacts with an environment and learns how to act within it.
• It is concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward.
• In some applications, the output of the system is a sequence of actions (a policy) to reach a goal.
1. Game playing
• The output is a sequence of right moves; a single move is good if it is part of a good policy.
2. Robot navigation
• The robot should learn the correct sequence of actions to reach the goal state from an initial state.
Other examples: autonomous cars. (A tiny Q-learning sketch follows below.)
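A tiny Q-learning sketch (Q-learning is listed under reinforcement learning in the comparison that follows); the 5-state corridor environment, rewards, and hyperparameters are all invented for illustration:

import random

N_STATES, ACTIONS = 5, [0, 1]            # states 0..4; action 0 = left, 1 = right
GOAL = N_STATES - 1                      # reaching state 4 yields reward 1
alpha, gamma, epsilon = 0.1, 0.9, 0.3    # learning rate, discount, exploration rate
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Move left/right along the corridor; reward 1 only when the goal is reached."""
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0)

for _ in range(300):                     # episodes of trial-and-error interaction
    s = 0
    while s != GOAL:
        greedy = max(ACTIONS, key=lambda a: Q[s][a])
        a = random.choice(ACTIONS) if random.random() < epsilon else greedy
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print("learned greedy action per state:", [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)])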
Comparison
Feature (Supervised / Unsupervised / Reinforcement):
• Definition: Machine is trained with labelled data / Machine is trained with unlabelled data without any guidance / Agent interacts with the environment through actions and discovers errors or rewards
• Types of problems: Regression, classification / Association, clustering, dimensionality reduction / Trial and error
• Types of data: Labelled / Unlabelled / No predefined data
• Training: External supervision / No supervision / No supervision
• Aim: Forecast outcomes / Discover patterns / Learn a series of actions
• Approach: Map input to output / Understand patterns and discover output / Follow a trial-and-error method
• Output feedback: Direct feedback / No feedback / Reward system
• Popular algorithms: Linear and logistic regression, SVM, KNN, Random Forest / K-means, Apriori / Q-learning, SARSA
• Example applications: Risk evaluation, sales forecasting / Recommendation systems, anomaly detection / Self-driving cars, gaming

Reference: Supervised vs Unsupervised vs Reinforcement Learning | Data Science Certification Training | Edureka - YouTube
Examples of Machine Learning Applications
Learning Associations:
• Retail (Basket Analysis): Identifying
associations between products bought together,
e.g., customers who buy beer also buy chips
(70% association).
• Booksellers: Cross-selling based on purchase
history of books or authors.
• Web Portals: Predicting links a user is likely to
click, preloading pages for faster access.
Classification:
• Predicting loan default risk based on customer
attributes like income, savings, and past financial
history.
• Classifying patient symptoms to diagnose
diseases.
Examples of Machine Learning Applications
Regression:
• Steering angle prediction for navigating
without hitting obstacles.
• Optimizing coffee quality based on various
settings (temperature, time, bean type).
Unsupervised Learning:
• Grouping similar data points without labeled
outcomes.
• Analyzing customer demographics and
transactions to identify common profiles.
Reinforcement Learning:
• Learning optimal strategies for games by trial
and error.
• Developing sequences of actions to achieve a goal.
Use Cases in Smartphones
Did you know that machine learning powers most of the
features on your smartphone?
Voice Assistants
• Based on the concept of speech recognition.
• These voice assistants recognize speech (the words we say) using Natural Language Processing (NLP), convert it into numbers using machine learning, and formulate a response accordingly.
Smartphone Cameras
• Object detection to locate and single out the object(s) (or human) in the image.
• Filling in the missing parts in a picture.
• Using a certain type of neural network to enhance the image, or even extend its boundaries by imagining what the image would look like, etc.
Face Unlock – Smartphones
• Smartphones use a technique called facial recognition to do
this.
• The core idea behind facial recognition is powered by machine learning.
• The example applications of facial recognition are:
• Facebook uses it to identify the people in images
• Governments are using it to identify and catch criminals
• Airports are using it to verify passengers and crew members, and so on
App Store and Play Store Recommendations
Machine Learning Use Cases in Transportation
• The application of machine learning in the transport industry has gone to an entirely different level in the last decade.
• This coincides with the rise of ride-hailing apps like Uber, Lyft, Ola, etc.
• These companies use machine learning throughout their many products, from planning optimal routes to deciding prices for the rides we take.
Google Maps
• Google uses a ton of machine learning
algorithms to produce all these
features.
• Machine learning is deeply embedded
in Google Maps and that’s why the
routes are getting smarter with each
update.
How is ML used in Google Maps?
• Routes: going from point A to point B
• Estimated time to travel a route
• Traffic along the route
• The 'Explore Nearby' feature: restaurants, petrol pumps, ATMs, hotels, shopping centres, etc.
Machine Learning Use Cases in Popular Web Services
Email Filtering: Spam or Not
• Email providers use machine learning to parse through the email's subject line and categorize it accordingly.
• Take Gmail for example.
• The machine learning algorithm Google uses has
been trained on millions of emails so it can work
seamlessly for the end-user (us).
• While Gmail allows us to customize labels, the service
offers default labels:
• Primary
• Social
• Promotions
Google Search
• Google uses machine learning to power its Search engine.
• The enormous amount of data Google has allows it to constantly train and refine its algorithms.
Google Translate
• Google uses machine learning to understand the sentence(s) sent by the user, convert them to the requested language, and show the output.
• Machine learning is deeply embedded in Google's ecosystem, and we are all benefitting from it.
• Concept: NLP
• Google uses machine learning to power its Translate engine.
LinkedIn and Facebook Recommendations and Ads
• Social media platforms are classic use cases of machine
learning. Like Google, these platforms have integrated
machine learning into their very fabric.
• From your home feed to the kind of ads you see, all of
these features work on machine learning.
• A feature we regularly see is ‘People you may know’.
• This is a common feature across all social media
platforms, Twitter, Facebook, LinkedIn, etc.
• These companies use machine learning algorithms to look
at your profile, your interests, your current friends, their
friends, and a whole host of other variables.
Machine Learning Use Cases in Sales and Marketing
Recommendation Engines
• When you log on to a site, it recommends products and services to you based on your taste and previous browsing history.
• Some popular examples of recommendation engines:
• E-commerce sites like Amazon and Flipkart
• Book sites like Goodreads
• Movie services like IMDb and Netflix
• Hospitality sites like MakeMyTrip, Booking.com, etc.
• Retail services like StitchFix
• Food aggregators like Zomato and Uber Eats
Personalized Marketing
• How many calls do you get from credit card or loan companies offering their services "for free"?
• These calls offer the same services without understanding what you want (or don't want). This is traditional marketing, which is now outdated and well behind the digital revolution.
• The meaning of the concept is in the name itself: it is a marketing technique tailored to an individual's needs.
• Recommendation engines are part of an overall umbrella concept called personalized marketing.
Supervised Learning
• Definition: A type of machine learning where
the model is trained on labeled data.
• Labeled Data: The dataset includes input-output pairs.
• Training and Testing: The data is typically split into a training set and a testing set.
• Predictive Modeling: The algorithm predicts the output for new inputs based on the learned mapping.
• Examples: Linear Regression, Logistic Regression, Support Vector Machines (SVM), Decision Trees, Random Forests, Neural Networks
Supervised Learning
Loan prediction example:
• Labels: 0/1 (low-risk/high-risk)
• Classification rule: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
• The prediction can also be given as a probability: P(Y=1|X=x) = 0.8 means the customer has an 80 percent probability of being high-risk, or equivalently a 20 percent probability of being low-risk. (A sketch of such a probabilistic classifier follows below.)
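A minimal sketch of a classifier that outputs P(Y=1|X=x), using logistic regression; the (income, savings) values and labels are invented for illustration, and the resulting probability will generally not be exactly 0.8:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical applicants: [income (k$), savings (k$)]; label 1 = high-risk, 0 = low-risk.
X = np.array([[20, 1], [25, 2], [30, 1.5],
              [60, 15], [70, 20], [80, 30]], dtype=float)
y = np.array([1, 1, 1, 0, 0, 0])

clf = LogisticRegression().fit(X, y)
new_applicant = np.array([[40, 5]])
p_high_risk = clf.predict_proba(new_applicant)[0, 1]   # P(Y=1 | X=x)
print(f"P(high-risk | income=40k, savings=5k) = {p_high_risk:.2f}")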
Learning a Class from Examples
• Learning from examples involves training a model
using a dataset where each example is labeled with
the correct output.
• The model identifies patterns and relationships
between the input features and the output labels.
Once trained, the model can predict the output for
new, unseen inputs.
• Positive and Negative Examples
• Positive Examples: Instances that belong to the
class we want to learn.
• Negative Examples: Instances that do not belong to
the class.
• Example Scenario: Learning the class of "family
cars":
• Show various car models to people and label
them as either family cars (positive examples) or
not (negative examples).
Learning a Class from Examples
Input Representation
• Attributes: Choose relevant features to represent each example, e.g. price (x1) and engine power (x2).
• Data Points: Each car is represented by a pair of values (price, engine power).
• Example plot: the training set is plotted in a 2D space with axes x1 (price) and x2 (engine power).
Hypothesis Class
• A set of possible hypotheses to describe the class. For family cars, the hypothesis class H can be a rectangle in the price-engine power space:
• H: (p1 ≤ price ≤ p2) AND (e1 ≤ engine power ≤ e2)
Learning a Class from Examples
Learning a Class from Examples
Learning Process
• Finding Hypothesis: The goal is to find the
hypothesis h∈H that best approximates the class C.
• Empirical Error: Proportion of training instances
where the predictions of h do not match the true
labels.
Learning a Class from Examples
• Most Specific Hypothesis (S): The tightest rectangle that includes all positive examples and none of the negative examples.
• Most General Hypothesis (G): The largest rectangle that includes all positive examples and none of the negative examples.
• Version Space: All hypotheses that lie between S and G. (A sketch of computing S and the empirical error follows below.)
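A minimal sketch of the rectangle hypothesis class for the family-car example: it computes the most specific hypothesis S (the tightest axis-aligned rectangle around the positive examples) and the empirical error of that hypothesis; the (price, engine power) training points are invented for illustration:

# Hypothetical training set: (price in k$, engine power, label); label 1 = family car.
data = [(18, 120, 1), (22, 150, 1), (25, 140, 1),   # positive examples
        (10,  90, 0), (40, 300, 0), (30,  80, 0)]   # negative examples

# Most specific hypothesis S: the tightest rectangle containing all positive examples.
pos = [(p, e) for p, e, label in data if label == 1]
p1, p2 = min(p for p, _ in pos), max(p for p, _ in pos)
e1, e2 = min(e for _, e in pos), max(e for _, e in pos)

def h(price, engine):
    """Rectangle hypothesis: positive iff (p1 <= price <= p2) AND (e1 <= engine <= e2)."""
    return 1 if (p1 <= price <= p2 and e1 <= engine <= e2) else 0

# Empirical error: fraction of training instances where h disagrees with the true label.
errors = sum(h(p, e) != label for p, e, label in data)
print(f"S: {p1} <= price <= {p2} AND {e1} <= engine power <= {e2}")
print("empirical error on the training set:", errors / len(data))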
Noise in Data
• Noise refers to any unwanted anomaly in the data
that makes the class more difficult to learn and may
make zero error infeasible with a simple hypothesis
class.
Types of Noise:
1. Imprecision in Recording: Errors in recording the input attributes that shift data points in the input space.
2. Labeling Errors (Teacher Noise): Incorrect labeling of data points, resulting in misclassified instances.
3. Hidden Attributes: Unobserved attributes affecting the label of instances, modeled as a random component and included in the noise.
Noise in Data
Handling Noise with Hypothesis Classes
• Simple Hypothesis Class (Rectangle):
• Defined by four parameters.
• Easier to use, train, and explain.
• Less variance, but may have higher bias.
• Better generalization due to simplicity (Occam's Razor).
• Complex Hypothesis Class (Arbitrary Shape):
• Defined by many parameters.
• Can perfectly fit noisy data, but may overfit.
• Higher variance; less robust to new data.
Bias and Variance
• Bias
Bias refers to the error introduced by approximating
a real-world problem, which may be complex, by a
simplified model.
High bias means the model is too simple and does
not capture the underlying patterns in the data well.
This usually leads to underfitting.

• Variance
Variance refers to the model's sensitivity to small
fluctuations in the training data. A model with high
variance pays too much attention to the training
data, including the noise, and might not perform well
on new, unseen data. This usually leads to
overfitting.
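A minimal sketch contrasting high bias (underfitting) and high variance (overfitting) by fitting polynomials of increasing degree to noisy synthetic data; the data, noise level, and degrees are chosen only for illustration:

import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 20)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)  # noisy samples

x_test = np.linspace(0, 1, 200)
y_true = np.sin(2 * np.pi * x_test)   # noise-free target, used to measure generalization

for degree in (1, 3, 15):             # too simple (high bias), reasonable, too complex (high variance)
    coeffs = np.polyfit(x_train, y_train, degree)   # degree 15 may emit a RankWarning
    test_err = np.mean((np.polyval(coeffs, x_test) - y_true) ** 2)
    print(f"degree {degree:2d}: test error = {test_err:.3f}")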
Noise in Data
Aspect (Bias / Variance):
• Definition: Error introduced by a simplified model / Error due to the model's sensitivity to the data
• Effect: Systematic errors in predictions / High variability in predictions
• Characteristics: Misses underlying patterns / Captures noise along with the patterns
• Strategy to Reduce: Use more complex models / Use simpler models
• Techniques: Cross-validation, regularization, ensemble methods / Cross-validation, regularization, ensemble methods
• Visual Representation: Underfitting (high bias, low variance) / Overfitting (low bias, high variance)
References
• 1. Ethem Alpaydin, “Introduction to Machine
Learning”, Second Edition, MIT Press, 2013.
• 2. Tom M. Mitchell, “Machine Learning”, McGraw-Hill
Education, 2013.
• 3. Stephen Marsland, “Machine Learning: An
Algorithmic Perspective”, CRC Press, 2009.
• 4. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T.
Lin, “Learning from Data”, AML Book Publishers,
2012.
• 5. K. P. Murphy, “Machine Learning: A Probabilistic
Perspective”, MIT Press, 2012.
• 6. M. Mohri, A. Rostamizadeh, and A. Talwalkar,
“Foundations of Machine Learning”, MIT Press,
2012.
