01 - Introduction To Machine Learning

The document provides an introduction to a Machine Learning course. It outlines the course details including objectives, structure, schedule, and assessment criteria. The course covers supervised, unsupervised, and reinforcement learning techniques over 14 weeks. Lectures will be accompanied by practical tasks to apply the concepts. Students will need prerequisite knowledge in algorithms, programming, statistics, and artificial intelligence.


Machine Learning

First Meet – Introduction


ADF

1 4/24/2018
Today’s Agenda
Introduction

Brief history of Machine Learning



Course Introduction
Course Name: Machine Learning
– Supervised, Unsupervised,
– Reinforcement, Ensembles

Course Code: CSH3L3

Credits: 3

14 weeks, each consisting of:
– 3 x 50’ class
– 3 x 50’ structured tasks
– 3 x 50’ self-study



Introduction
Anditya Arifianto
NIP: 14890028
HP: 085295464439
Email:
[email protected]
[email protected]
– http://anditya.staff.telkomuniversity.ac.id
Can be found at:
– AI Lab (E107)
Rules
Be responsible for your attendance

Any kind of cheating or plagiarism is unacceptable and will not be forgiven

Maintain appropriate conduct in class
– Mind your manners and attitude
– You may bring snacks and drinks,
  but please nothing noisy or strong-smelling
– Drowsy? Just sleep quietly

Language?



Pre-requisites
Proficiency in algorithms and programming
– Basic Algorithm and Programming
– Algorithm Analysis and Design

Calculus

Linear Algebra

Probability and Statistics

Artificial Intelligence



Course Objectives
Students are able to explain the concept of each machine learning method.

Students are able to identify, model, analyze, and solve problems using machine learning methods.

Students are able to implement machine learning methods using programming languages to solve problems.



Syllabus
Introduction
– Motivation, taxonomy

Supervised Learning
– Regression, SVM, ANN, PNN, Naïve Bayes

Unsupervised Learning
– Clustering, SOM

Reinforcement Learning

Ensemble methods



Week  Date          Material                                Task
1     15-Jan        Introduction to Machine Learning
2     22-Jan        Regression
3     29-Jan        Naïve Bayes
4     5-Feb         Artificial Neural Network               Task 1: Regression,
5     12-Feb        Probabilistic Neural Network            SVM, PNN
6     19-Feb        Support Vector Machine
7     26-Feb        Multi-Class SVM
      5-Mar/12-Mar  Midterm Exam (Ujian Tengah Semester)
8     19-Mar        Introduction to Clustering,
                    Proximity Measures                      Task 2: K-Means
9     26-Mar        Partitional-based Clustering,           Clustering
                    Hierarchical Clustering
10    2-Apr         Self-Organizing Maps
11    9-Apr         Reinforcement Learning:
                    Markov Decision, Bellman Equation       Task 3: Reinforcement
12    16-Apr        Reinforcement Learning:                 Learning
                    Value Iteration, Q-Learning
13    23-Apr        Ensemble Methods:
                    Bagging, Boosting                       Task 4: Ensemble
14    30-Apr        Ensemble Methods:                       Methods
                    Random Forest
      7-May/14-May  Final Exam (Ujian Akhir Semester)
Scoring
[35%] CLO1:
– able to explain the concept of each machine learning method

[25%] CLO2:
– able to identify, model, analyze, and solve problems using machine learning methods

[40%] CLO3:
– able to implement machine learning methods using programming languages to solve problems



Scoring – CLO perspective

Task 1 (30%): Regression 5%, SVM 5%, Naïve Bayes 5%, PNN 5% + 10%
Task 2 (30%): Hierarchical 5%, SOM 5%, K-Means 5% + 5% + 10%
Task 3 (20%): Reinforcement: CLO1 5%, CLO2 5%, CLO3 10%
Task 4 (20%): Random Forest: CLO1 5%, CLO2 5%, CLO3 10%

Total: CLO1 (Explain) 35%, CLO2 (Design) 25%, CLO3 (Implement) 40% = 100%

NO MID/FINAL EXAM


Final Points

A   80.00 – 100.00
AB  75.00 – 79.99
B   70.00 – 74.99
BC  60.00 – 69.99
C   50.00 – 59.99
D   40.00 – 49.99
E    0.00 – 39.99
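The boundaries above can be expressed as a small lookup. This is only an illustrative sketch; the function name and structure are ours, not part of the course materials:

```python
# Hypothetical helper mirroring the course's grade boundaries above.
def final_grade(score: float) -> str:
    """Map a 0-100 final score to a letter grade."""
    cutoffs = [(80, "A"), (75, "AB"), (70, "B"), (60, "BC"),
               (50, "C"), (40, "D")]
    for lower, grade in cutoffs:
        if score >= lower:
            return grade
    return "E"  # anything below 40

print(final_grade(82.5))  # A
print(final_grade(64.0))  # BC
```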



References
Peter Flach. Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press, 2012.

Tan, Steinbach, Kumar. Introduction to Data Mining. Addison-Wesley, 2006.

Course slides: Introduction to Machine Learning, University of Helsinki.

Suyanto. Data Mining. INFORMATIKA: Bandung, 2017.

Mohamad Syahrul Mubarok and Suyanto. Pengantar Machine Learning. INFORMATIKA: Bandung, 2018.



Introduction to Machine Learning



What is Machine Learning?
Machine
– computer, computer program (in this course)

Learning
– improving performance on a given task,
based on experience / examples

Machine Learning
– concerned with computer programs that automatically improve
their performance through experience



What is Machine Learning?
Instead of the programmer writing explicit rules for how to solve a given problem, the programmer instructs the computer how to learn from examples.

In many cases the computer program can even become better at the task than the programmer is!


Learn from Data
One definition of machine learning:
A computer program improves its performance on a given
task with experience
(i.e. examples, data).

Thus we need to separate:


– Task: What is the problem that the program is solving?
– Performance measure: How is the performance of the
program (when solving the given task) evaluated?
– Experience: What is the data (examples) that the program is
using to improve its performance?



Availability of data
These days it is very easy to
– collect data (sensors are cheap, much information digital)
– store data (hard drives are big and cheap)
– transmit data (essentially free on the internet).

But how to benefit from it?

Analysis is becoming key!



Related Scientific Disciplines



Related Scientific Disciplines
Artificial Intelligence (AI)
– Machine learning can be seen as `one approach' towards implementing
`intelligent' machines
(or at least machines that behave in a seemingly intelligent way).

Artificial neural networks, computational neuroscience


– Inspired by and trying to mimic the function of biological brains, in order
to make computers that learn from experience.
– Modern machine learning really grew out of the neural networks boom in
the 1980's and early 1990's.

Pattern recognition
– Recognizing objects and identifying people in controlled or uncontrolled
settings, from images, audio, etc. Such tasks typically require machine
learning techniques.



Related Scientific Disciplines
Data mining
– Trying to identify interesting and useful associations and patterns in huge
datasets
– Focus on scalable algorithms
– Example: on the order of 3 million people grocery shopping twice a week in just two main chains in Finland means each chain collects hundreds of thousands of transaction receipts per day!

Statistics
– Traditionally: focus on testing hypotheses based on theory
– Has contributed a lot to data mining and machine learning, and has also
evolved by incorporating ideas derived from these fields



(Diagram: nested fields. Deep learning (example: MLPs) sits inside representation learning (example: shallow autoencoders), which sits inside machine learning (example: logistic regression), which sits inside AI (example: knowledge bases).)
Data Mining, Data Science and
Artificial Intelligence



Machine Learning Taxonomy



Supervised Learning
We have a data set that includes the target values (the values we wish to predict).

We try to learn a function that maps the other features to the target values, which can then be used to make predictions for new examples.

Example: classification, regression
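The supervised setting can be sketched with a toy least-squares regression. The data and the target function f(x) = 2x + 1 are invented purely for illustration; only NumPy is assumed:

```python
import numpy as np

# Toy supervised regression: learn f(x) ≈ 2x + 1 from labeled examples.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))           # features
y = 2 * X[:, 0] + 1 + rng.normal(0, 0.1, 50)   # targets (with a little noise)

# Fit weight and bias by least squares: add a bias column, solve min ||Aw - y||.
A = np.hstack([X, np.ones((50, 1))])
w, b = np.linalg.lstsq(A, y, rcond=None)[0]

print(round(w, 1), round(b, 1))  # ≈ 2.0 and 1.0: the learned function
prediction = w * 5.0 + b         # predict the target for an unseen example
```

The learned (w, b) are then used on examples the program never saw during training, which is exactly the "predictions about other examples" step above.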



Unsupervised Learning
We have a dataset, but there is no target to be predicted.

We try to learn a model that might have generated that dataset.

Example: clustering, density estimation, noise reduction.
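A minimal clustering sketch makes the "no target" point concrete: only the inputs are used, never any labels. The 1-D data, k = 2, and the crude initialization are all illustrative assumptions:

```python
import numpy as np

# Minimal k-means sketch (k = 2) on toy 1-D data; no labels are involved.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 0.5, 30), rng.normal(5, 0.5, 30)])

centers = np.array([data.min(), data.max()])   # crude initialization
for _ in range(10):
    # Assignment step: each point goes to its nearest center.
    labels = np.abs(data[:, None] - centers[None, :]).argmin(axis=1)
    # Update step: each center moves to the mean of its assigned points.
    centers = np.array([data[labels == k].mean() for k in range(2)])

print(np.sort(centers).round(1))  # close to the true group means, 0 and 5
```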

Semi-Supervised Learning



Reinforcement Learning
A setting where we have a sequential decision problem: making a decision now influences what decisions we can make in the future.

A reward and punishment function is provided that tells us how “good” certain states are.
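The sequential flavor can be sketched with tabular Q-learning on a tiny corridor world. The whole setup (five states, reward only at the right end, the hyperparameters) is invented for illustration, not the course's prescribed exercise:

```python
import numpy as np

# Tiny Q-learning sketch: a 5-state corridor, reward +1 only at the right end.
n_states, actions = 5, [-1, +1]        # actions: move left / move right
Q = np.zeros((n_states, 2))            # Q[state, action] value table
alpha, gamma, eps = 0.5, 0.9, 0.2      # learning rate, discount, exploration
rng = np.random.default_rng(0)

for _ in range(500):
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy: mostly exploit the current Q, sometimes explore.
        a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Bellman-style update: nudge Q(s,a) toward r + gamma * max Q(s',·).
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1)[:-1])  # learned greedy policy (1 = move right)
```

Early decisions matter here exactly as the slide says: moving left from state 0 wastes a step, and the discount factor gamma makes the learned values reflect that.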

Ensemble Methods
Use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
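A bagging-style sketch of the ensemble idea: train weak threshold classifiers ("stumps") on bootstrap resamples and combine them by majority vote. The data, the stump learner, and the vote count are all illustrative assumptions:

```python
import numpy as np

# Bagging sketch on a toy 1-D problem: class 1 iff x > 0.5, with label noise.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 200)
y = (X > 0.5).astype(int)                              # true labels
y_noisy = np.where(rng.random(200) < 0.15, 1 - y, y)   # 15% label noise

def fit_stump(xs, ys):
    """Weak learner: pick the threshold with the lowest training error."""
    thresholds = np.linspace(0, 1, 51)
    errors = [np.mean((xs > t).astype(int) != ys) for t in thresholds]
    return thresholds[int(np.argmin(errors))]

# Train each stump on a bootstrap resample of the noisy data.
stumps = []
for _ in range(25):
    idx = rng.integers(0, 200, 200)
    stumps.append(fit_stump(X[idx], y_noisy[idx]))

# Majority vote across the ensemble.
votes = np.mean([(X > t).astype(int) for t in stumps], axis=0)
ensemble_pred = (votes > 0.5).astype(int)
print(np.mean(ensemble_pred == y))  # accuracy against the clean labels
```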



Examples



Tic-Tac-Toe
How to program the computer to play tic-tac-toe?

Option A: The programmer writes explicit rules, e.g. "if the opponent has two in a row, and the third is free, stop it by placing your mark there", etc.
(lots of work, difficult, not at all scalable!)

Option B: Go through the game tree, choose optimally.
(For non-trivial games, this must be combined with heuristics to restrict the tree size.)



Tic-Tac-Toe
How to program the computer to play tic-tac-toe?

Option C:
Let the computer try out various strategies by playing
against itself and others, and noting which strategies lead to
winning and which to losing (=`machine learning')



Checker Game
Arthur Samuel (1950s and 1960s):

Computer program that learns to play checkers

The program plays against itself thousands of times and learns which positions are good and which are bad
(i.e. which lead to winning and which to losing)

The computer program eventually becomes much better than the programmer.



DeepMind’s AlphaGo
the first computer program to defeat a professional human
Go player

1st version: used a dataset of more than 100,000 Go games as a starting point for its own knowledge

AlphaGo Zero:
– a version trained without human data,
  stronger than any previous human-champion-defeating version,
  learning only by playing games against itself



Spam Filtering
Traditional: Programmer writes rules:
– ex: If it contains 'viagra' then it is spam.
– (difficult, not user-adaptive)

Machine Learning:
– The user marks which mails are spam and which are legitimate, and the computer learns for itself which words are predictive
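The "learn which words are predictive" step can be sketched with Naive Bayes word scoring. The messages and words below are invented toy data, and this is nowhere near a production filter:

```python
import math
from collections import Counter

# Toy labeled mail (invented): the "user marks" step has already happened.
spam = ["win money now", "cheap money offer", "win a prize now"]
ham  = ["meeting agenda attached", "lunch tomorrow", "project status meeting"]

spam_counts = Counter(w for m in spam for w in m.split())
ham_counts  = Counter(w for m in ham for w in m.split())
vocab = set(spam_counts) | set(ham_counts)

def spam_score(message):
    """Naive Bayes log-odds with add-one smoothing; > 0 means 'more likely spam'."""
    score = math.log(len(spam) / len(ham))  # class prior odds
    for w in message.split():
        p_w_spam = (spam_counts[w] + 1) / (sum(spam_counts.values()) + len(vocab))
        p_w_ham  = (ham_counts[w] + 1) / (sum(ham_counts.values()) + len(vocab))
        score += math.log(p_w_spam / p_w_ham)
    return score

print(spam_score("win money") > 0)         # True
print(spam_score("meeting tomorrow") > 0)  # False
```

Note how nothing here is a hand-written rule: the word statistics come entirely from the labeled examples, which is the whole contrast the slide draws.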



Face Recognition
Traditional: Programmer writes rules:
– ex: If short dark hair, big nose, then it is Alex
– (impossible! how do we judge the size of the nose?!)

Machine Learning:
– The computer is shown many (image, name) example pairs, and
the computer learns which features of the images are predictive
(difficult, but not impossible)



Prediction of search queries
Traditional:
– The programmer provides a standard dictionary
– (words and expressions change!)

Machine Learning:
– Previous search queries
are used as examples!



Ranking search results
Traditional:
– Various criteria for
ranking results

Machine Learning:
– learn what users are
looking for by collecting
queries and
the resulting clicks.



Detecting credit card fraud
Credit card companies typically end up paying for fraud
(stolen cards, stolen card numbers)

Useful to try to detect fraud, for instance large transactions

Important to be adaptive to the behavior of customers, i.e. learn from existing data how users normally behave, and try to detect "unusual" transactions



Self-driving cars
Sensors (radars, cameras) can be superior to human senses

How to make the computer react appropriately to the sensor data?



Character recognition
Automatically sorting mail (handwritten characters)

Digitizing old books and newspapers into an easily searchable format (printed characters)



Social Media Analysis
Mining chat and discussion forums
– Breaking news
– Detecting outbreaks of infectious disease
– Tracking consumer sentiment about companies / products
– Prediction of friends in Facebook, or prediction of who you’d like
to follow on Twitter.

Machine Translation
Traditional:
– Dictionary and explicit grammar

Machine Learning:
– statistical machine translation based on example data is
increasingly being used



Collaborative Filtering
Recommendation systems
– Amazon: "Customers who bought X also bought Y "...
– Netflix: "Based on your movie ratings, you might enjoy..."



Online Store Website Optimization
What items to present,
what layout?
What colors to use?

Can significantly affect sales volume


– Experiment, and analyze the results!
– (lots of decisions on how exactly to
experiment and how to ensure
meaningful results)



Data



Attributes
Types:
– Binary, Finite Discrete, Infinite Discrete, Continuous

Asymmetric type
– Sparse attributes
– Attributes with ‘special’ states

Measurement scales
– Nominal (categorical), Ordinal, Interval, Ratio



General characteristics of data
Number of data points vs dimensionality
(number of attributes)
– Most traditional data analysis methods assume (many) more
data points than dimensions
– Many interesting datasets have extremely high dimensionality and few data objects



General characteristics of data
Sparsity (relatively few non-zero values)
– Efficient storage and computation, for some methods
– May be important for modeling

Resolution (spatial, temporal, ...)
– How small a detail can the sensors reliably detect?
– How large a dataset can the methods handle?



Data Example
Record Data,
– Matrix, sparse matrix

Graph Data,
– Relation between objects

Natural Ordering
– Sequential data, sequences, time series, spatial data



Data Quality
Measurement error (`noise')
– Continuous data: Gaussian, Student's t distribution, . . .
– Binary data: Bernoulli, . . .

Outliers i.e. `anomalous' objects:


– (Objects that are very different from the others)
– Noise?
– Data collection error?
– Legitimate, interesting objects?



Data Quality
Precision and bias
– Precision: closeness of repeated measurements to each other
– Bias: systematic deviation from the true underlying value
– Variance: inversely related to precision (higher precision means lower variance)
– Left: high precision, high bias. Right: low precision, low bias.



Data Quality
(Diagram: a 2x2 dartboard-style grid contrasting low/high variance with low/high bias; low variance corresponds to high precision, high variance to low precision.)



Data Quality
Missing values:
– (Some attribute values are missing for some data objects)
– Missing at random? Need to model the process?
– Just eliminating such data objects or attributes?
– Estimating and imputing missing values?
– Ignoring or explicitly taking them into account?
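One of the options above, estimating and imputing, can be sketched with column-mean imputation. The toy matrix is invented, and whether mean imputation is appropriate depends on the missingness mechanism discussed above:

```python
import numpy as np

# Mean-imputation sketch: fill each NaN with its column's mean.
data = np.array([[1.0, 10.0],
                 [2.0, np.nan],
                 [np.nan, 30.0],
                 [4.0, 20.0]])

col_means = np.nanmean(data, axis=0)              # per-column means, ignoring NaNs
filled = np.where(np.isnan(data), col_means, data)
print(filled[1, 1], round(filled[2, 0], 2))       # 20.0 and the column-0 mean
```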

Duplicate data, inconsistent data

Timeliness, relevance, application-specific prior knowledge



Data Preprocessing
Aggregation
– fewer, less noisy, data points
– Example: monthly analysis of daily (or hourly) data
– Loss of resolution

Sampling (with/without replacement, stratified or not)


– fewer data points
– Visualization, applying computationally demanding data analysis
procedures
– Loss of precision of data statistics



Data Preprocessing



The Curse of Dimensionality
As the dimensionality grows, the data becomes increasingly sparse in the space

(e.g. n binary attributes have 2^n joint states)
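A quick back-of-the-envelope sketch of that growth; the dataset size of 1000 examples is an arbitrary assumption:

```python
# n binary attributes have 2**n joint states. With a fixed dataset, the
# fraction of states you can even touch collapses as n grows.
n_samples = 1000
coverage = {}
for n in (5, 10, 20, 30):
    cells = 2 ** n                        # number of joint states
    coverage[n] = min(1.0, n_samples / cells)
    print(n, cells, coverage[n])
```

By n = 30 the dataset can cover less than a millionth of the joint states, which is exactly the sparsity the slide describes.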



The Curse of Dimensionality
Less (dimensions) is More

Learning from a high-dimensional feature space requires an enormous amount of training data to ensure that there are several samples for each combination of values.

With a fixed number of training instances, predictive power reduces as the dimensionality increases.

As a counter-measure, many dimensionality reduction techniques have been proposed, and it has been shown that, when done properly, the properties or structures of the objects can be well preserved even in lower dimensions.



Dimension Reduction
Feature subset selection
– (selecting only some of the attributes, manually or
automatically)

Feature extraction
– (computing new `attributes' to replace existing ones, e.g. image
color/structure features)

Dimensionality reduction
– (principal component analysis, other unsupervised methods)
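A minimal principal component analysis sketch via the SVD, assuming only NumPy; the nearly one-dimensional toy data is invented for illustration:

```python
import numpy as np

# PCA sketch: project correlated 2-D data onto its top principal component.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t + rng.normal(0, 0.1, 200)])  # nearly 1-D data

Xc = X - X.mean(axis=0)                # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)        # variance explained per component
Z = Xc @ Vt[0]                         # the 1-D projection onto the first PC

print(round(float(explained[0]), 3))   # close to 1.0: one dimension suffices
```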



Dimension Reduction
Discretization of continuous attributes
(required for some machine learning algorithms)
– e.g. ℝ is divided into (−∞, x₁], (x₁, x₂], (x₂, x₃], (x₃, ∞)
– Unsupervised or supervised
– Ideally: use application-specific knowledge
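A discretization sketch with NumPy; the cut points x₁ = 0, x₂ = 1, x₃ = 2 are arbitrary illustrative choices (note that `np.digitize` uses half-open bins of the form [xᵢ, xᵢ₊₁) by default):

```python
import numpy as np

# Cut the real line at three points, producing four labeled intervals.
cut_points = [0.0, 1.0, 2.0]          # x1, x2, x3 (illustrative values)
values = np.array([-3.2, 0.4, 1.5, 7.0])
bins = np.digitize(values, cut_points)
print(bins)  # one integer interval label per value: [0 1 2 3]
```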



Dimension Reduction
Binarization
– Some algorithms require binary attributes
– How NOT to represent nominal variables: as a single integer code (this imposes a spurious ordering and spurious distances between categories)



Dimension Reduction
Binarization
– Some algorithms require binary attributes
– How better to represent nominal variables: one binary indicator attribute per category (one-hot encoding)

– Useful even when binary variables are not strictly required!
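A minimal one-hot encoding sketch; the categories and values are invented for illustration:

```python
import numpy as np

# Encoding {red, green, blue} as 1/2/3 would impose a false order and false
# distances; one indicator column per category avoids that.
categories = ["red", "green", "blue"]
values = ["blue", "red", "blue"]

index = {c: i for i, c in enumerate(categories)}
onehot = np.zeros((len(values), len(categories)), dtype=int)
for row, v in enumerate(values):
    onehot[row, index[v]] = 1   # exactly one 1 per row

print(onehot)
# [[0 0 1]
#  [1 0 0]
#  [0 0 1]]
```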



Dimension Reduction
Variable transformations
– Simple functions (e.g. log(x), often used for strictly positive variables)
– Ideally: use application-specific knowledge
– Normalization / standardization:

    x′ = (x − x̄) / sₓ

  where x̄ is the mean of the attribute values and sₓ is the standard deviation. This yields x′ with zero mean and a standard deviation of 1.
– Useful to ensure that the scale (units) of attribute values does not affect the results.
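The standardization formula in code, on invented toy numbers:

```python
import numpy as np

# Standardization: x' = (x - mean) / std gives zero mean and unit std,
# regardless of the attribute's original scale or units.
x = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
x_std = (x - x.mean()) / x.std()
print(round(float(x_std.mean()), 10), round(float(x_std.std()), 10))  # 0.0 1.0
```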



The Curse of Dimensionality
Less (dimensions) is More??

Nevertheless, naively applying dimensionality reduction can lead to pathological results.

While dimensionality reduction is an important tool in machine learning / data mining, we must always be aware that it can distort the data in misleading ways.

Next is a two-dimensional projection of an intrinsically three-dimensional world...



The Curse of Dimensionality

(Image: a photograph whose two-dimensional projection misleadingly distorts the three-dimensional scene. Original photographer unknown; see also www.cs.gmu.edu/~jessica/DimReducDanger.htm, (c) Eamonn Keogh)


Less is More?
In the past, the published advice was that high dimensionality is dangerous.

But reducing dimensionality also reduces the amount of information available for prediction.

Today: try going in the opposite direction. Instead of reducing dimensionality, increase it by adding many functions of the predictor variables.

The higher the dimensionality of the feature set, the more likely it is that separation occurs.

Questions?



THANK YOU