Uploaded by

a9ooly
Copyright
© All Rights Reserved

INTELLIGENT SYSTEMS

Lecture 1

Chapter 1 The Machine Learning Landscape


2

What is Machine Learning?


General definition
• Machine Learning is the science (and art) of programming computers so they can learn
from data.
Here is a slightly more general definition:
• [Machine Learning is the] field of study that gives computers the ability to learn without
being explicitly programmed. Arthur Samuel, 1959
And a more engineering-oriented one:
• A computer program is said to learn from experience E with respect to some task T and
some performance measure P, if its performance on T, as measured by P, improves with
experience E. Tom Mitchell, 1997
3

Example of Engineering-Oriented Definition


The spam filter is an ML program that can learn to flag spam given examples of:
• spam emails (e.g., flagged by users)
• regular (nonspam, also called “ham”) emails.
• The examples that the system uses to learn are called the training set.
• Each training example is called a training instance (or sample).
• The task T is to flag spam for new emails.
• The experience E is the training data.
• The performance measure P needs to be defined;
• for example, you can use the ratio of correctly classified emails. This performance measure is called accuracy and it is often used in classification tasks.
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. Tom Mitchell, 1997
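The accuracy measure described above can be sketched in a few lines of Python. The function name `accuracy` and the five example labels are assumptions made for illustration; they do not come from the lecture.

```python
def accuracy(true_labels, predicted_labels):
    """Return the ratio of correctly classified emails (performance measure P)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for five emails: what users flagged vs. what the filter said.
true_labels = ["spam", "ham", "ham", "spam", "ham"]
predicted_labels = ["spam", "ham", "spam", "spam", "ham"]
print(accuracy(true_labels, predicted_labels))  # 4 of 5 emails classified correctly
```

If accuracy on held-out emails rises as more training data (experience E) is supplied, the program is learning in Mitchell's sense.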
4

Why use Machine Learning?

1. Machine Learning is great for problems for which existing solutions require
a lot of hand-tuning or long lists of rules: one Machine Learning algorithm
can often simplify code and perform better.
5

Example of a Problem That Requires Long Lists of Rules

The traditional approach to a spam filter:
1. Look at the spam and identify some words, phrases, or patterns (“4U,” “credit card,” “free,” and “amazing”).
2. Write a detection algorithm for each of the patterns that you noticed, so that your program flags emails as spam if a number of these patterns are detected.
3. Test your program, and repeat steps 1 and 2 until it is good enough.
The result is a long list of complex rules that is pretty hard to maintain.
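A minimal sketch of the traditional approach above, assuming the four patterns from step 1 and an arbitrary threshold of two detected patterns. The names `SPAM_PATTERNS`, `count_spam_patterns`, and `is_spam` are hypothetical.

```python
import re

# Hand-picked patterns from step 1; every new spam trick needs a new entry.
SPAM_PATTERNS = [r"\b4U\b", r"\bcredit card\b", r"\bfree\b", r"\bamazing\b"]


def count_spam_patterns(email_text):
    """Count how many hand-written patterns appear in the email."""
    return sum(
        1 for pattern in SPAM_PATTERNS
        if re.search(pattern, email_text, flags=re.IGNORECASE)
    )


def is_spam(email_text, threshold=2):
    """Flag the email when at least `threshold` patterns are detected (assumed rule)."""
    return count_spam_patterns(email_text) >= threshold


print(is_spam("Amazing offer 4U: a free credit card!"))     # True
print(is_spam("Lunch is free tomorrow, see you at noon."))  # False
```

If spammers switch from “4U” to “For U”, someone has to notice and add a rule by hand, which is why the list keeps growing.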
6

Example of a Problem That Requires Long Lists of Rules

Machine Learning approach: the spam filter automatically learns which words and phrases are good predictors of spam by detecting unusually frequent patterns of words in the spam examples compared to the ham examples.
• The program is much shorter, easier to maintain, and most likely more accurate.
• It also adapts automatically to change.
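One simple way to illustrate the Machine Learning approach above is to compare how often each word occurs in spam and in ham. This is only a sketch, not how a production filter works: the add-one smoothing, the `min_ratio` cut-off, and the tiny training set are all assumptions.

```python
from collections import Counter


def word_document_counts(emails):
    """Count, for each word, how many emails contain it at least once."""
    counts = Counter()
    for text in emails:
        counts.update(set(text.lower().split()))
    return counts


def learn_spam_words(spam_emails, ham_emails, min_ratio=1.5):
    """Return words whose smoothed rate in spam is at least min_ratio times their rate in ham."""
    spam_counts = word_document_counts(spam_emails)
    ham_counts = word_document_counts(ham_emails)
    predictors = {}
    for word, spam_count in spam_counts.items():
        # Add-one smoothing avoids dividing by zero for words never seen in ham.
        spam_rate = (spam_count + 1) / (len(spam_emails) + 2)
        ham_rate = (ham_counts[word] + 1) / (len(ham_emails) + 2)
        ratio = spam_rate / ham_rate
        if ratio >= min_ratio:
            predictors[word] = ratio
    return predictors


# Hypothetical training set, far smaller than a real one.
spam = ["free credit card 4u", "amazing free offer", "free money 4u"]
ham = ["meeting moved to friday", "free for lunch on friday", "project report attached"]

predictors = learn_spam_words(spam, ham)
print(sorted(predictors, key=predictors.get, reverse=True))
```

Retraining on fresh examples updates the word list without anyone editing rules, which is where the automatic adaptation comes from.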
7

Why use Machine Learning?


1. Machine Learning is great for problems for which existing solutions require
a lot of hand-tuning or long lists of rules: one Machine Learning algorithm
can often simplify code and perform better.
2. Complex problems for which there is no good solution at all using a
traditional approach: the best Machine Learning techniques can find a
solution.
8

Example of Complex Problems


• Speech Recognition: write a program capable of distinguishing the words “one” and “two”.
• Traditional approach: because “two” starts with a high-pitch sound (“T”), hardcode an algorithm that measures high-pitch sound intensity and use that to distinguish ones and twos.
• This will not scale to thousands of words spoken by millions of very different people in noisy environments and in dozens of languages.
• The best solution is to write an algorithm that learns by itself, given many example recordings for each word.
9

Why use Machine Learning?


1. Machine Learning is great for problems for which existing solutions require
a lot of hand-tuning or long lists of rules: one Machine Learning algorithm
can often simplify code and perform better.
2. Complex problems for which there is no good solution at all using a
traditional approach: the best Machine Learning techniques can find a
solution.
3. Fluctuating environments: a Machine Learning system can adapt to new
data.
4. Getting insights about complex problems and large amounts of data.
10

Example of Helping Humans Learn


• Once the spam filter has been trained on enough spam, it can easily be
inspected to reveal the list of words and combinations of words that it believes
are the best predictors of spam.
11

Difference Between Data Mining and Machine Learning

• Data Mining is about using Statistics as well as other programming methods to find patterns hidden in the data so that you can explain some phenomenon.
• Machine Learning uses Data Mining techniques and other learning algorithms to build models of what is happening behind some data so that it can predict future outcomes.
12

Machine learning algorithms


• Supervised learning algorithms: learn from input/output pairs. See the next slides for examples.
• Unsupervised learning algorithms: only the input data is known, and no known output data is given to the algorithm. See the next slides for examples.
13

Machine Learning Algorithms (Supervised)


• Identifying the zip code from handwriting on an envelope (data collection is easy and cheap)
  • Input: scan of the handwriting
  • Output: the actual digits of the zip code
  • You collect many envelopes to build a database and read the zip codes yourself
• Determining whether a tumor is benign on a medical image (data collection is expensive and raises privacy concerns)
  • Input: the images
  • Output: whether the tumor is benign
  • You collect medical images and an expert gives you their opinion
• Detecting fraudulent activity in credit card transactions (data collection is much simpler)
  • Input: record of a credit card transaction
  • Output: whether it is likely to be fraudulent or not
  • You store all transactions and record whether a user reports any transaction as fraudulent
14

Machine Learning Algorithms (Unsupervised)

• Identifying topics in a set of blog posts
  • Input: large collection of text data.
  • Output: a summary of the text and the prevalent themes in it.
  • Unknown: what the topics are and how many there are; there is no known output.
• Segmenting customers into groups with similar preferences
  • Input: set of customer records.
  • Output: which customers are similar, and whether there are groups of customers with similar preferences.
  • Unknown: what these groups might be and how many there are; there is no known output.
15

Think of your data as a table


• Each row is a sample (also called a data point or observation).
• Each column is a feature.
• In Machine Learning, an attribute is a data type (e.g., “mileage”), while a feature generally means an attribute plus its value (e.g., “mileage = 15,000”).
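The table view above can be written down directly in Python. The column names and the three cars are invented for illustration.

```python
# Each row is one sample (a car); each column is an attribute.
# The values are made up for illustration.
columns = ["mileage", "age", "brand"]
rows = [
    [15000, 2, "Toyota"],
    [82000, 7, "Ford"],
    [43000, 4, "Honda"],
]


def features_of(row):
    """Pair each attribute with its value to get the sample's features."""
    return dict(zip(columns, row))


first_car = features_of(rows[0])
print(first_car["mileage"])  # the feature "mileage = 15,000" of the first sample
```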
16

Types of Machine Learning Systems


1. Whether or not they are trained with human supervision (supervised,
unsupervised, semisupervised, and Reinforcement Learning).

2. Whether or not they can learn incrementally on the fly (online versus batch
learning).

3. Whether they work by simply comparing new data points to known data
points, or instead detect patterns in the training data and build a predictive
model, much like scientists do (instance-based versus model-based
learning)
Spam Filter may learn on the fly using a deep neural network
model trained using examples of spam and ham; this makes
it an online, model-based, supervised learning system.
Supervised, Unsupervised, Semisupervised, and Reinforcement Learning

17
18

Supervised Learning
• The training data you feed to the algorithm includes the desired solutions,
called labels.
• Classification: The Spam Filter is trained with many example emails along
with their class (spam or ham), and it must learn how to classify new emails.
19

Supervised Learning Cont.


• Regression: For predicting a target numeric value, such as the price of a car,
given a set of features (mileage, age, brand, etc.) called predictors.
• To train the system, you need to give it many examples of cars, including both
their predictors and their labels (i.e., their prices).
20

Supervised Learning Cont.


Most important supervised learning algorithms:
• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks

https://www.mdpi.com/2078-2489/10/4/150/htm
21

Unsupervised Learning
• The training data is unlabeled. The system tries to learn without a teacher.
22

Unsupervised Learning Cont.


Some of the most important unsupervised learning algorithms:
• Clustering
  • k-Means
  • Hierarchical Cluster Analysis (HCA)
  • Expectation Maximization
• Visualization and dimensionality reduction
  • Principal Component Analysis (PCA)
  • Kernel PCA
  • Locally-Linear Embedding (LLE)
  • t-distributed Stochastic Neighbor Embedding (t-SNE)
• Association rule learning
  • Apriori
  • Eclat
23

Example of Clustering: Blog’s Visitors


• Run a clustering algorithm to detect groups of similar visitors:
  • No need to tell the algorithm which group a visitor belongs to.
  • It finds those connections without your help.
• Findings:
  • It might notice that 40% of your visitors are males who love comic books and generally read your blog in the evening,
  • while 20% are young sci-fi lovers who visit during the weekends, and so on.
• If you use a hierarchical clustering algorithm, it may also subdivide each group into smaller groups.
• This may help you target your posts for each group.
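To make the clustering idea above concrete, here is a bare-bones k-Means sketch in plain Python. The visitor data, the choice of two features, and the simplistic initialization (the first k points) are assumptions; a library implementation would be more careful.

```python
import math


def closest(point, centroids):
    """Index of the centroid nearest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=10):
    """Plain k-Means: alternate between assigning points and moving centroids."""
    centroids = list(points[:k])  # simplistic initialization: the first k points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for point in points:
            groups[closest(point, centroids)].append(point)
        # Keep the old centroid if a group ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return [closest(point, centroids) for point in points]


# Hypothetical visitors described by (age, usual visiting hour).
visitors = [(35, 21), (16, 11), (38, 20), (17, 10), (40, 22), (15, 12)]
labels = k_means(visitors, k=2)
print(labels)
```

No group labels are passed in: the algorithm only sees the visitors and the number of groups to look for.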
24

Example of Visualization: Highlighting Semantic Clusters

Notice:
1. how animals are rather well separated from vehicles.
2. how horses are close to deer but far from birds, and so on.
25

Dimensionality Reduction
The goal is to simplify the data without losing too much information.
• Merge several correlated features into one; this is called feature extraction.
• For example, a car’s mileage may be strongly correlated with its age.
• The dimensionality reduction algorithm will merge mileage and age into one feature that represents the car’s wear and tear.
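As a rough illustration of the merge described above, the sketch below projects two standardized features onto their first principal component (PCA restricted to two dimensions, found by power iteration). The car data and all function names are assumptions, and PCA is only one of several possible feature extraction techniques.

```python
import math


def standardize(values):
    """Rescale to zero mean and unit variance so that units do not dominate."""
    m = sum(values) / len(values)
    std = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / std for v in values]


def first_component(xs, ys, steps=100):
    """Direction of largest variance of centered 2-D data, found by power iteration."""
    n = len(xs)
    cxx = sum(x * x for x in xs) / n
    cyy = sum(y * y for y in ys) / n
    cxy = sum(x * y for x, y in zip(xs, ys)) / n
    wx, wy = 1.0, 0.0
    for _ in range(steps):
        # Multiply by the covariance matrix, then renormalize to unit length.
        wx, wy = cxx * wx + cxy * wy, cxy * wx + cyy * wy
        norm = math.hypot(wx, wy)
        wx, wy = wx / norm, wy / norm
    return wx, wy


# Hypothetical cars: mileage and age rise together.
mileage = [15000, 43000, 82000, 120000]
age = [2, 4, 7, 10]

z_mileage, z_age = standardize(mileage), standardize(age)
wx, wy = first_component(z_mileage, z_age)
wear_and_tear = [wx * m + wy * a for m, a in zip(z_mileage, z_age)]
print([round(w, 2) for w in wear_and_tear])
```

Each car is now described by a single number instead of two, and little information is lost because the two original features were almost redundant.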
26

Anomaly Detection
Detecting unusual credit card transactions to prevent fraud, catching
manufacturing defects, or automatically removing outliers from a dataset
before feeding it to another learning algorithm.
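A very simple way to sketch anomaly detection is a z-score rule: flag any value that lies far from the mean in units of standard deviation. This is a stand-in chosen for brevity, not the method used by any particular fraud system; the threshold of 2.0 and the transaction amounts are assumptions.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


# Hypothetical card transaction amounts; one is far larger than the rest.
amounts = [12, 15, 14, 13, 16, 15, 14, 500]
print(find_anomalies(amounts))  # [500]
```

The same function could be used to drop outliers from a dataset before passing it to another learning algorithm.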
27

Association Rule Learning


• The goal is to dig into large amounts of data and discover interesting relations
between attributes.
• For example (supermarket): running an association rule learning algorithm on sales logs may reveal that:
• people who purchase barbecue sauce and potato chips also tend to buy steak.
• Thus, you may want to place these items close to each other.
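The strength of a rule such as the one above is commonly described by its support and confidence. The sketch below computes both for a made-up sales log; it evaluates one given rule and does not search for rules the way Apriori or Eclat do.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


# Hypothetical sales log: each basket is the set of items in one purchase.
baskets = [
    {"barbecue sauce", "potato chips", "steak"},
    {"barbecue sauce", "potato chips", "steak", "soda"},
    {"barbecue sauce", "potato chips"},
    {"bread", "milk"},
    {"steak", "wine"},
]

rule_from = {"barbecue sauce", "potato chips"}
print(support(baskets, rule_from))                # share of baskets with both items
print(confidence(baskets, rule_from, {"steak"}))  # how often steak accompanies them
```

Note that `confidence` assumes the antecedent occurs in at least one basket; otherwise it would divide by zero.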
28

Semi-supervised Learning
• Partially labeled training data, usually a lot of unlabeled data and a little bit of
labeled data.
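• Example: photo-hosting services such as Google Photos. The service groups photos of the same person on its own (the unsupervised part), and one label per person is then enough to name that person in every photo.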
29

Reinforcement Learning
• Many robots implement Reinforcement Learning algorithms to learn how to walk.
Batch and Online Learning

30
31

Batch Learning
• The system is incapable of learning incrementally: it must be trained using all the
available data.
• It takes a lot of time and computing resources.
• It is typically done offline.
• Offline Learning: the system is trained, launched into production and runs without
learning anymore (applies what it has learned).

• Updating batch learning system:


• Get the new data (such as a new type of spam).
• Train a new version of the system from scratch on the full dataset (not just the new data, but also the
old data),
• Then stop the old system and replace it with the new one.

• The whole process of training, evaluating, and launching a Machine Learning system
can be automated fairly easily.
32

Batch Learning Cont.


• The solution is simple and often works fine, but training using the full set of
data can take many hours.
• Train a new system only every 24 hours or even just weekly.
• If your system needs to adapt to rapidly changing data (e.g., to predict stock
prices), then you need a more reactive solution.
• Training on the full set of data requires a lot of computing resources (CPU,
memory space, disk space, disk I/O, network I/O, etc.).
• If you have a lot of data and you automate your system to train from scratch
every day, it will end up costing you a lot of money.
• If the amount of data is huge, it may even be impossible to use a batch
learning algorithm.
33

Online Learning
• The system is trained incrementally by feeding it data instances sequentially,
either individually or by small groups called mini-batches.
• Each learning step is fast and cheap, so the system can learn about new data
on the fly, as it arrives.
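The incremental training loop described above can be sketched with mini-batch gradient descent on a one-feature linear model. The synthetic data stream, the mini-batch size of 5, and the learning rate of 0.1 are all assumptions chosen to keep the example small.

```python
def online_update(weights, mini_batch, learning_rate):
    """One learning step of the linear model y = w0 + w1 * x on a single mini-batch."""
    w0, w1 = weights
    grad0 = grad1 = 0.0
    for x, y in mini_batch:
        error = (w0 + w1 * x) - y
        grad0 += error
        grad1 += error * x
    n = len(mini_batch)
    return (w0 - learning_rate * grad0 / n, w1 - learning_rate * grad1 / n)


def data_stream(total):
    """Hypothetical stream: instances of y = 2x + 1 arriving one at a time."""
    for i in range(total):
        x = (i % 10) / 10
        yield x, 2 * x + 1


weights = (0.0, 0.0)
batch = []
for instance in data_stream(20000):
    batch.append(instance)
    if len(batch) == 5:  # learn as soon as a mini-batch is full, then discard it
        weights = online_update(weights, batch, learning_rate=0.1)
        batch = []
print(weights)
```

Each step touches only five instances, so memory use stays constant no matter how long the stream runs.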
34

Online Learning Cont.


• It is great for systems that receive data as a continuous flow (e.g., stock prices) and need to adapt to change rapidly or autonomously.
• It is also a good option if you have limited computing resources.
• It can be used to train systems on huge datasets that cannot fit in one machine’s main memory (this is called out-of-core learning).
• Out-of-core learning is usually done offline (i.e., not on the live system), so online learning can be a confusing name. Think of it as incremental learning.
35

Online Learning Cont.


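• Learning rate (an important parameter of online learning systems): how fast the system should adapt to changing data.
• A high learning rate: the system adapts rapidly to new data, but it also tends to quickly forget the old data.
• A low learning rate: the system has more inertia; it learns more slowly, but it is less sensitive to noise or outliers in the new data.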
• A big challenge: if bad data is fed to the system, the system’s performance
will gradually decline. If we are talking about a live system, your clients will
notice.
• To reduce this risk, you need to monitor your system closely and promptly
switch learning off (and possibly revert to a previously working state) if you
detect a drop in performance.
Instance-Based Versus Model-Based Learning
How Machine Learning Algorithms Generalize

Perform well on new instances

36
37

Instance-based Learning
• The system learns the examples by heart, then generalizes to new cases
using a similarity measure.
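k-Nearest Neighbors is a common example of instance-based learning, and a minimal version fits in a few lines. The two email features, the training examples, and the use of Euclidean distance as the similarity measure are assumptions made for this sketch.

```python
import math
from collections import Counter


def predict(training_set, new_point, k=3):
    """Label a new point by majority vote among its k most similar stored examples."""
    # Similarity measure (assumed here): a smaller Euclidean distance means more similar.
    nearest = sorted(
        training_set, key=lambda example: math.dist(example[0], new_point)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical emails described by (number of links, number of words in capitals).
training_set = [
    ((8, 12), "spam"), ((7, 15), "spam"), ((9, 10), "spam"),
    ((0, 1), "ham"), ((1, 0), "ham"), ((2, 1), "ham"),
]
print(predict(training_set, (6, 11)))  # spam
print(predict(training_set, (1, 2)))   # ham
```

There is no training step at all: the "model" is the stored examples plus the similarity measure.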
38

Model-based Learning
• Another way to generalize from a set of examples is to build a model of these
examples, then use that model to make predictions.
Model-based Learning
• Suppose you want to know if money makes people happy.
• Question: does money make people happy?
• You decide to model life satisfaction as a linear function of GDP per capita.
• This step is called model selection: you selected a linear model of life satisfaction with just one attribute, GDP per capita.
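The linear model selected above has the form life_satisfaction = θ0 + θ1 × GDP_per_capita. The sketch below fits θ0 and θ1 by ordinary least squares, which is one reasonable training choice and an assumption here. The five data points are made up for illustration and are not real statistics.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = theta0 + theta1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    theta1 = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


# Made-up numbers for illustration only, not real statistics.
gdp_per_capita = [10000, 20000, 30000, 40000, 50000]  # US dollars
life_satisfaction = [5.0, 5.6, 6.1, 6.4, 7.0]         # score out of 10

theta0, theta1 = fit_line(gdp_per_capita, life_satisfaction)
predicted = theta0 + theta1 * 35000  # the model's prediction for a new country
print(round(predicted, 2))
```

Once θ0 and θ1 are known, the training data is no longer needed to make predictions, which is the key difference from instance-based learning.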
40

References
• Pages 3-18 of Hands-On Machine Learning with Scikit-Learn and TensorFlow.
• Pages 1-13 of Introduction to Machine Learning with Python.
• https://learning.oreilly.com/library/view/hands-on-machine-learning/9781492032632/ch01.html
