ML-01
(CSO851)
Lecture - 01
Acknowledgement
https://www.cmpe.boun.edu.tr/~ethem/i2ml3e/
https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/
https://nptel.ac.in/courses/106/106/106106202/
Overview of the Course
Introduction to Machine Learning
Foundations of Machine Learning
Machine Learning Algorithms
Regression (Linear, Polynomial, Stepwise)
Logistic Regression
Gradient Descent
Decision Trees
Bayes Theorem and Bayes Classification
Support Vector Machines
Principal Component Analysis
Linear Discriminant Analysis
Nearest Neighbors
Density Estimation
Neural Networks
Evolution of Machine Learning
1950 — Alan Turing creates the “Turing Test” to determine if
a computer has real intelligence. To pass the test, a computer
must be able to fool a human into believing it is also human.
1952 — Arthur Samuel wrote the first computer learning
program: a program that learned to play the game of checkers.
1957 — Frank Rosenblatt designed the first neural network for
computers (the perceptron), which simulates the thought
processes of the human brain.
1967 — The “nearest neighbor” algorithm was written,
allowing computers to begin using very basic pattern
recognition.
1979 — Students at Stanford University invent the “Stanford
Cart” which can navigate obstacles in a room on its own.
https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/?sh=413d1dcc15e7
Evolution of Machine Learning
1981 — Gerald Dejong introduces the concept of Explanation Based Learning
(EBL), in which a computer analyses training data and creates a general rule
it can follow by discarding unimportant data.
1985 — Terry Sejnowski invents NetTalk, which learns to pronounce words
the same way a baby does.
1990s — Work on machine learning shifts from a knowledge-driven approach
to a data-driven approach.
1997 — IBM’s Deep Blue beats the world champion at chess.
2006 — Geoffrey Hinton coins the term “deep learning” to explain new
algorithms that let computers “see” and distinguish objects and text in
images and videos.
2010 — The Microsoft Kinect can track 20 human features at a rate of 30
times per second, allowing people to interact with the computer via
movements and gestures.
Evolution of Machine Learning
2011 — IBM’s Watson beats its human competitors at Jeopardy.
2011 — Google Brain is developed, and its deep neural network can learn to
discover and categorize objects much the way a cat does.
2012 – Google’s X Lab develops a machine learning algorithm that is able to
autonomously browse YouTube videos to identify the videos that contain
cats.
2014 – Facebook develops DeepFace, a software algorithm that is able to
recognize or verify individuals in photos at the same level as humans can.
2015 – Amazon launches its own machine learning platform.
2015 – Microsoft creates the Distributed Machine Learning Toolkit.
2016 – Google’s artificial intelligence algorithm beats a professional player
at the Chinese board game Go, which is considered the world’s most
complex board game and is many times harder than chess.
Introduction to Machine Learning
Machine Learning: According to Arthur Samuel, machine learning algorithms
enable computers to learn from data, and even improve themselves, without
being explicitly programmed.
Tom Mitchell gives a more formal definition: a computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P, if its
performance at tasks in T, as measured by P, improves with experience E.
A checkers learning problem:
Task T: playing checkers
Performance measure P: percent of games won against opponents
Training experience E: playing practice games against itself
A handwriting recognition learning problem:
Task T: recognizing and classifying handwritten words within images
Performance measure P: percent of words correctly classified
Training experience E: a database of handwritten words with given classifications
When is Learning Needed?
P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are
products/services.
Example: P(coffee | snacks) = 0.7
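As a minimal sketch, such a conditional probability can be estimated from purchase
records as the fraction of baskets containing X that also contain Y. The transaction
data below are made up purely for illustration:

```python
# Hypothetical market-basket data; each set is one customer's basket.
transactions = [
    {"snacks", "coffee"},
    {"snacks", "coffee", "milk"},
    {"snacks", "tea"},
    {"coffee"},
    {"snacks", "coffee"},
]

def conditional_probability(transactions, x, y):
    """Estimate P(y | x) as (# baskets with both x and y) / (# baskets with x)."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if y in t) / len(with_x)

print(conditional_probability(transactions, "snacks", "coffee"))  # 0.75 here
```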
Moderna was one of the first companies to introduce a COVID-19 vaccine during the
pandemic. Moderna utilized artificial intelligence to aid in the design of
mRNA sequences.
Practical Uses of Supervised Learning: Classification
Differentiating between low-risk and high-risk
customers based on their income and savings
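A minimal sketch of such a classifier, assuming scikit-learn is available; the
income/savings figures and the two risk labels below are synthetic stand-ins for
real customer records:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Features: [annual income, savings]; labels: 0 = low-risk, 1 = high-risk.
X = np.array([[60, 40], [80, 60], [75, 55],
              [20, 5], [25, 10], [30, 8]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[70, 50], [22, 6]]))   # expected: [0 1]
print(clf.predict_proba([[70, 50]]))      # class probabilities
```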
Practical Uses of Unsupervised Learning
Practical Uses of Reinforcement Learning
A well-known example is DeepMind's use of AI agents
to cool Google's data centers, which led to a roughly
40% reduction in the energy used for cooling. The
centers are controlled by the AI system without the
need for human intervention, although data center
experts still provide supervision.
https://deepmind.com/blog/article/safety-first-ai-autonomous-data-centre-cooling-and-industrial-control
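The DeepMind post above describes the real system; as a toy illustration of the
underlying idea, the tabular Q-learning sketch below (a made-up 5-state corridor,
not DeepMind's controller) learns a control policy from reward alone:

```python
import random

# Toy corridor: states 0..4, rightmost state is the goal (reward 1).
n_states, actions = 5, [-1, +1]          # actions: move left / move right
Q = [[0.0, 0.0] for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.3    # learning rate, discount, exploration

for _ in range(2000):
    s = random.randrange(n_states - 1)   # start anywhere except the goal
    while s != n_states - 1:
        if random.random() < epsilon:    # epsilon-greedy action selection
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[s][i])
        s_next = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Standard Q-learning update.
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

policy = [max((0, 1), key=lambda i: Q[s][i]) for s in range(n_states)]
print(policy)  # expected: action 1 ("move right") in every non-goal state
```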
Practical Uses of Reinforcement Learning
Industry automation
Trading and finance
Natural Language Processing
Healthcare
Engineering
News recommendation
Gaming
Marketing and advertising
Robotics manipulation
https://neptune.ai/blog/reinforcement-learning-applications
Foundations of Machine Learning
The problem of learning
Basic Terminologies
Unknown functions
Training and Selection of Models
Design a learning system
Determine the training experience
Determine the target function
Determine a representation for target function
Determine a function approximation algorithm
The final design
The Problem of Learning & Basic Terminology
There are a known set X and an unknown function f on X. Given data, construct a
good approximation of f. This is called learning f.
Domain: The set X is called the feature space, and an element x ∈ X is called a feature
vector or an input. The coordinates of x are called features. Individual features
may take values in a continuum, a discrete ordered set, or a discrete unordered
set.
Range: The range Y is usually either a finite, unordered set, in which case
learning f is called classification, or a continuum, in which case learning f is called
regression. An element y ∈ Y is called a class in classification and a response in
regression.
Data: In principle, data are random draws from a probability distribution P
on X × Y. Depending on the problem at hand, the observed data may
consist either of domain-range pairs (x₁, y₁), …, (xₙ, yₙ) or of domain values
x₁, …, xₙ alone: learning is called supervised in the former case and unsupervised
in the latter case.
Basic Terminology
In supervised learning, the data are domain-range pairs (x₁, y₁), …, (xₙ, yₙ). In
unsupervised learning, the data are domain values x₁, …, xₙ alone; such data are
called unmarked data. The range is assumed to be finite, in which
case unsupervised learning is called clustering.
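A minimal sketch of clustering, assuming scikit-learn is available; the unmarked
2-D points below are synthetic:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unmarked data x_1, ..., x_n: two loose groups of 2-D feature vectors.
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
              [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # a cluster index (class) for each x_i
print(km.cluster_centers_)  # learned centers approximating the groups
```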
If the range is [0, ∞) and the function f to be learned is the mass or density
function of the marginal distribution of the features, unsupervised learning is
called density estimation.
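A minimal sketch of density estimation, assuming scikit-learn is available; the
1-D sample below is synthetic:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Unmarked samples drawn from a made-up two-bump distribution.
x = np.concatenate([np.random.normal(0, 1, 200),
                    np.random.normal(5, 0.5, 200)]).reshape(-1, 1)

# Fit a Gaussian kernel density estimator to approximate the density f.
kde = KernelDensity(kernel="gaussian", bandwidth=0.4).fit(x)
grid = np.linspace(-3, 8, 5).reshape(-1, 1)
print(np.exp(kde.score_samples(grid)))  # estimated density values on the grid
```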