Large Scale Machine Learning
Large scale machine learning: Learning with large datasets
Machine learning and data
Classify between confusable words.
E.g., {to, two, too}, {then, than}.
For breakfast I ate _____ eggs.
[Figures: accuracy vs. training set size; error vs. training set size (two plots)]
Andrew Ng
Large scale machine learning: Stochastic gradient descent
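Linear regression with gradient descent
$h_\theta(x) = \sum_{j=0}^{n} \theta_j x_j$
$J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$
Repeat {
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
(for every $j = 0, \ldots, n$)
}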
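As a minimal sketch of the Repeat loop above, the following pure-Python code runs batch gradient descent for linear regression. The function names, the toy data, and the values of `alpha` and `num_iters` are illustrative assumptions, not part of the lecture.

```python
def predict(theta, x):
    """Hypothesis h_theta(x) = sum_j theta_j * x_j, with x[0] = 1 as the intercept feature."""
    return sum(t * xj for t, xj in zip(theta, x))


def batch_gradient_descent(X, y, alpha=0.1, num_iters=1000):
    """Batch gradient descent: every update sums over all m training examples."""
    m, num_params = len(X), len(X[0])
    theta = [0.0] * num_params
    for _ in range(num_iters):
        errors = [predict(theta, X[i]) - y[i] for i in range(m)]
        # Simultaneous update: every theta_j uses the same error vector.
        theta = [
            theta[j] - alpha * sum(errors[i] * X[i][j] for i in range(m)) / m
            for j in range(num_params)
        ]
    return theta


# Toy data generated from y = 1 + 2 * x1 (hypothetical, for illustration only).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = batch_gradient_descent(X, y)
```

On this noise-free toy data `theta` should approach `[1.0, 2.0]`. Each iteration touches all m examples, which is the step that becomes expensive when m is very large.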
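Batch gradient descent
$J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$
Repeat {
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
(for every $j = 0, \ldots, n$)
}
Stochastic gradient descent
$\mathrm{cost}(\theta, (x^{(i)}, y^{(i)})) = \frac{1}{2} (h_\theta(x^{(i)}) - y^{(i)})^2$
$J_{train}(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(\theta, (x^{(i)}, y^{(i)}))$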
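Stochastic gradient descent
1. Randomly shuffle (reorder) training examples
2. Repeat {
for $i := 1, \ldots, m$ {
$\theta_j := \theta_j - \alpha (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
(for every $j = 0, \ldots, n$)
}
}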
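The two steps above can be sketched in pure Python as follows. This is a sketch under assumptions: the function name, the fixed `seed`, the toy data, and the values of `alpha` and `num_passes` are hypothetical choices for illustration.

```python
import random


def stochastic_gradient_descent(X, y, alpha=0.05, num_passes=300, seed=0):
    """Stochastic gradient descent: update theta after looking at a single example."""
    m, num_params = len(X), len(X[0])
    theta = [0.0] * num_params

    # Step 1: randomly shuffle (reorder) the training examples once.
    order = list(range(m))
    random.Random(seed).shuffle(order)

    # Step 2: repeated passes over the shuffled examples.
    for _ in range(num_passes):
        for i in order:
            error = sum(t * xj for t, xj in zip(theta, X[i])) - y[i]
            # Update every theta_j using only example i (no sum over the training set).
            theta = [theta[j] - alpha * error * X[i][j] for j in range(num_params)]
    return theta


# Toy data generated from y = 1 + 2 * x1 (hypothetical, for illustration only).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = stochastic_gradient_descent(X, y)
```

Because the toy data is noise-free, `theta` settles close to `[1.0, 2.0]`. On noisy data the parameters would typically wander around the minimum rather than settle exactly on it.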
Large scale machine learning: Mini-batch gradient descent
Mini-batch gradient descent
Batch gradient descent: Use all $m$ examples in each iteration
Stochastic gradient descent: Use 1 example in each iteration
Mini-batch gradient descent: Use $b$ examples in each iteration
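Mini-batch gradient descent
Say $b = 10$, $m = 1000$.
Repeat {
for $i = 1, 11, 21, 31, \ldots, 991$ {
$\theta_j := \theta_j - \alpha \frac{1}{10} \sum_{k=i}^{i+9} (h_\theta(x^{(k)}) - y^{(k)}) x_j^{(k)}$
(for every $j = 0, \ldots, n$)
}
}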
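A minimal sketch of the mini-batch loop, assuming a mini-batch size `b` passed as a parameter. The function name, the toy data (with a deliberately tiny `b = 2`), and the values of `alpha` and `num_passes` are hypothetical.

```python
def minibatch_gradient_descent(X, y, b=2, alpha=0.05, num_passes=1000):
    """Mini-batch gradient descent: each update averages over b consecutive examples."""
    m, num_params = len(X), len(X[0])
    theta = [0.0] * num_params
    for _ in range(num_passes):
        for start in range(0, m, b):
            idx = range(start, min(start + b, m))  # the last batch may be shorter
            errors = [sum(t * xj for t, xj in zip(theta, X[k])) - y[k] for k in idx]
            theta = [
                theta[j] - alpha * sum(e * X[k][j] for e, k in zip(errors, idx)) / len(idx)
                for j in range(num_params)
            ]
    return theta


# Toy data generated from y = 1 + 2 * x1 (hypothetical, for illustration only).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = minibatch_gradient_descent(X, y)
```

Setting `b = 1` gives the stochastic update and `b = len(X)` gives the batch update, so the same loop covers all three variants. In practice the inner sum is usually vectorized, which is where mini-batches tend to pay off.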
Large scale machine learning: Stochastic gradient descent convergence
Checking for convergence
Batch gradient descent:
Plot $J_{train}(\theta)$ as a function of the number of iterations of gradient descent.
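To make the batch convergence check concrete, the sketch below records the training cost before every update so it can be plotted against the iteration number. The helper names, toy data, and hyperparameter values are assumptions made for illustration; plotting is left out to keep the example dependency-free.

```python
def j_train(theta, X, y):
    """J_train(theta) = 1/(2m) * sum of squared prediction errors."""
    m = len(X)
    squared_errors = (
        (sum(t * xj for t, xj in zip(theta, x)) - target) ** 2
        for x, target in zip(X, y)
    )
    return sum(squared_errors) / (2 * m)


def batch_gd_with_history(X, y, alpha=0.1, num_iters=50):
    """Batch gradient descent that records J_train before every update."""
    m, num_params = len(X), len(X[0])
    theta = [0.0] * num_params
    history = []
    for _ in range(num_iters):
        history.append(j_train(theta, X, y))
        errors = [sum(t * xj for t, xj in zip(theta, x)) - target for x, target in zip(X, y)]
        theta = [
            theta[j] - alpha * sum(e * x[j] for e, x in zip(errors, X)) / m
            for j in range(num_params)
        ]
    return theta, history


# Toy data generated from y = 1 + 2 * x1 (hypothetical, for illustration only).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta, history = batch_gd_with_history(X, y)
is_decreasing = all(later < earlier for earlier, later in zip(history, history[1:]))
```

`history[k]` is the cost after `k` updates. A curve that keeps falling, as it does here, suggests the learning rate is workable; a rising curve would suggest `alpha` is too large.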
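Stochastic gradient descent
Repeat {
for $i := 1, \ldots, m$ {
$\theta_j := \theta_j - \alpha (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
(for $j = 0, \ldots, n$)
}
}
Learning rate $\alpha$ is typically held constant. Can slowly decrease $\alpha$ over time if we want $\theta$ to converge. (E.g. $\alpha = \frac{\text{const1}}{\text{iterationNumber} + \text{const2}}$)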
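The decaying schedule in the example can be written as a one-line function. The default values of `const1` and `const2` below are hypothetical; the slide does not suggest specific values.

```python
def decayed_learning_rate(iteration_number, const1=1.0, const2=10.0):
    """alpha = const1 / (iterationNumber + const2); shrinks toward 0 as iterations grow."""
    return const1 / (iteration_number + const2)


# Learning rate at a few iteration counts (const1 and const2 are hypothetical values).
schedule = [decayed_learning_rate(t) for t in (0, 10, 90, 990)]
```

With these values the rate falls from 0.1 at iteration 0 to 0.001 at iteration 990, so later updates take smaller steps. The price is two more constants to tune.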
Large scale machine learning: Online learning
Online learning
Shipping service website where a user comes, specifies origin and
destination, you offer to ship their package for some asking price,
and users sometimes choose to use your shipping service ($y = 1$),
sometimes not ($y = 0$).
Features $x$ capture properties of the user, of the origin/destination, and
the asking price. We want to learn $p(y = 1 \mid x; \theta)$ to optimize price.
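One way to sketch this setting is a logistic regression model that takes a single gradient step per arriving user and then discards the example. The class name, the two-element feature vector `[1, asking_price]`, the learning rate, and the simulated stream are all assumptions made for illustration.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression updated one example at a time (hypothetical helper class)."""

    def __init__(self, num_features, alpha=0.1):
        self.theta = [0.0] * num_features
        self.alpha = alpha

    def predict_proba(self, x):
        """Estimate p(y = 1 | x; theta) with the logistic (sigmoid) function."""
        z = sum(t * xj for t, xj in zip(self.theta, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """Take one gradient step on the single example (x, y); it is not stored."""
        error = self.predict_proba(x) - y
        self.theta = [t - self.alpha * error * xj for t, xj in zip(self.theta, x)]


model = OnlineLogisticRegression(num_features=2)
# Simulated stream: x = [1, asking_price]; users accept the low price and decline the high one.
for _ in range(200):
    model.update([1.0, 0.2], 1)
    model.update([1.0, 0.9], 0)
p_low = model.predict_proba([1.0, 0.2])
p_high = model.predict_proba([1.0, 0.9])
```

After the stream, the estimated purchase probability at the lower price (`p_low`) is higher than at the higher price (`p_high`). Because each example is used once and dropped, the model can keep adapting if user behaviour drifts.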
Other online learning example:
Product search (learning to search)
User searches for “Android phone 1080p camera”
Have 100 phones in store. Will return 10 results.
$x$ = features of phone, how many words in user query match
name of phone, how many words in query match description
of phone, etc.
$y = 1$ if user clicks on link. $y = 0$ otherwise.
Learn $p(y = 1 \mid x; \theta)$.
Use it to show the user the 10 phones they’re most likely to click on.
Other examples: Choosing special offers to show user; customized
selection of news articles; product recommendation; …
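As a rough sketch of the ranking step, the code below scores each phone with a logistic model and returns the 10 with the highest predicted click probability. The feature layout `[1, name_matches, description_matches]`, the parameter values in `theta`, and the synthetic catalogue are hypothetical.

```python
import math


def click_probability(theta, x):
    """Estimate p(y = 1 | x; theta) for one phone's feature vector."""
    z = sum(t * xj for t, xj in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))


def top_k(theta, candidates, k=10):
    """Return the ids of the k candidates with the highest predicted click probability.

    candidates is a list of (phone_id, feature_vector) pairs.
    """
    ranked = sorted(candidates, key=lambda c: click_probability(theta, c[1]), reverse=True)
    return [phone_id for phone_id, _ in ranked[:k]]


# Hypothetical parameters and a synthetic catalogue of 100 phones.
theta = [-2.0, 0.8, 0.3]
candidates = [(i, [1.0, float(i % 5), float(i % 7)]) for i in range(100)]
results = top_k(theta, candidates)
```

Each click (or non-click) on the returned results would then become a new training example for the online update.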
Large scale machine learning: Map-reduce and data parallelism
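Map-reduce
Batch gradient descent: $\theta_j := \theta_j - \alpha \frac{1}{400} \sum_{i=1}^{400} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
Machine 1: Use $(x^{(1)}, y^{(1)}), \ldots, (x^{(100)}, y^{(100)})$.
$temp_j^{(1)} = \sum_{i=1}^{100} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
Machine 2: Use $(x^{(101)}, y^{(101)}), \ldots, (x^{(200)}, y^{(200)})$.
$temp_j^{(2)} = \sum_{i=101}^{200} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
Machine 3: Use $(x^{(201)}, y^{(201)}), \ldots, (x^{(300)}, y^{(300)})$.
$temp_j^{(3)} = \sum_{i=201}^{300} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
Machine 4: Use $(x^{(301)}, y^{(301)}), \ldots, (x^{(400)}, y^{(400)})$.
$temp_j^{(4)} = \sum_{i=301}^{400} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$
Combine: $\theta_j := \theta_j - \alpha \frac{1}{400} (temp_j^{(1)} + temp_j^{(2)} + temp_j^{(3)} + temp_j^{(4)})$ (for $j = 0, \ldots, n$)
[Figure: the training set is split across Computers 1 to 4, and their results are combined. Image credit: http://openclipart.org/detail/17924/computer-by-aj]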
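The split-and-combine idea can be sketched in plain Python. Here the "machines" are only simulated by looping over chunks in one process; the function names and the synthetic 400-example dataset are assumptions, and a real deployment would run each map call on a separate machine.

```python
def partial_gradient(theta, X_chunk, y_chunk):
    """Map step: temp_j = sum over this chunk of (h_theta(x) - y) * x_j."""
    temp = [0.0] * len(theta)
    for x, target in zip(X_chunk, y_chunk):
        error = sum(t * xj for t, xj in zip(theta, x)) - target
        for j, xj in enumerate(x):
            temp[j] += error * xj
    return temp


def map_reduce_step(theta, X, y, alpha, num_machines=4):
    """One batch gradient descent update computed from per-chunk partial sums."""
    m = len(X)
    size = (m + num_machines - 1) // num_machines  # chunk size, rounded up
    # Map: each simulated machine handles one contiguous chunk of the training set.
    partials = [
        partial_gradient(theta, X[s:s + size], y[s:s + size])
        for s in range(0, m, size)
    ]
    # Reduce: add the partial sums, then apply the usual update.
    total = [sum(p[j] for p in partials) for j in range(len(theta))]
    return [theta[j] - alpha * total[j] / m for j in range(len(theta))]


# Synthetic data: 400 examples from y = 1 + 2 * x1 (hypothetical).
X = [[1.0, i / 400.0] for i in range(400)]
y = [1.0 + 2.0 * x[1] for x in X]
theta_one = map_reduce_step([0.0, 0.0], X, y, alpha=0.1, num_machines=1)
theta_four = map_reduce_step([0.0, 0.0], X, y, alpha=0.1, num_machines=4)
```

Both calls produce the same update up to floating-point rounding, since the four partial sums add up to the full sum. Any speed-up in practice depends on network and coordination overhead.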
Map-reduce and summation over the training set
Many learning algorithms can be expressed as computing sums of
functions over the training set.
Multi-core machines
[Figure: the training set is split across Cores 1 to 4, and their results are combined. Image credit: http://openclipart.org/detail/100267/cpu-(central-processing-unit)-by-ivak-100267]