ML: Introduction
What Is Machine Learning?
Tom Mitchell's definition: a program is said to learn from experience E with respect to some task T and performance measure P if its performance on T, as measured by P, improves with experience E. For a checkers-playing program, E is the experience of playing many games of checkers, T is the task of playing checkers, and P is the probability that the program will win the next game.
In general, any machine learning problem can be assigned to one of two broad
classifications:
supervised learning, or
unsupervised learning.
Supervised Learning
In supervised learning, we are given a data set and already know what our correct
output should look like, having the idea that there is a relationship between the input
and the output.
Supervised learning problems are categorized into "regression" and "classification"
problems. In a regression problem, we are trying to predict results within
a continuous output, meaning that we are trying to map input variables to
some continuous function. In a classification problem, we are instead trying to
predict results in a discrete output. In other words, we are trying to map input
variables into discrete categories. See the description of continuous and discrete data on Math is Fun.
Example 1:
Given data about the size of houses on the real estate market, try to predict their
price. Price as a function of size is a continuous output, so this is a regression problem.
We could turn this example into a classification problem by instead making our output
about whether the house "sells for more or less than the asking price." Here we are
classifying the houses based on price into two discrete categories.
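As a rough sketch of the two framings (not part of the original notes), the snippet below fits the same house-size data once as regression and once as classification using scikit-learn; all sizes, prices, and labels are invented for illustration.

# Sketch: the same house-size data framed as regression (predict price)
# and as classification (predict above/below asking price).
# All numbers here are made-up illustrative values, not real market data.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

sizes = np.array([[70.0], [85.0], [100.0], [120.0], [150.0]])      # square meters
prices = np.array([200_000, 240_000, 275_000, 330_000, 410_000])   # sale prices
above_asking = np.array([0, 1, 0, 1, 1])                           # 1 = sold above asking

# Regression: map size -> a continuous price.
reg = LinearRegression().fit(sizes, prices)
print("predicted price for 110 m^2:", reg.predict([[110.0]])[0])

# Classification: map size -> one of two discrete categories.
clf = LogisticRegression().fit(sizes, above_asking)
print("predicted class for 110 m^2:", clf.predict([[110.0]])[0])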
Example 2:
(a) Regression: given a picture of a person, predict his or her age on the basis of the picture.
(b) Classification: given a picture of a person, predict whether he or she is of high school, college, or graduate age.
Another example of classification: a bank has to decide whether or not to give a loan to someone on the basis of their credit history.
Unsupervised Learning
Unsupervised learning, on the other hand, allows us to approach problems with little or
no idea what our results should look like. We can derive structure from data where we
don't necessarily know the effect of the variables.
We can derive this structure by clustering the data based on relationships among the
variables in the data.
With unsupervised learning there is no feedback based on the prediction results, i.e.,
there is no teacher to correct you.
Example:
Clustering: Take a collection of 1000 essays written on the US Economy, and find a way
to automatically group these essays into a small number that are somehow similar or
related by different variables, such as word frequency, sentence length, page count,
and so on.
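As a sketch of how such a grouping might be computed in practice (this is not from the notes themselves), each essay could be represented by a few numeric features and clustered with k-means; the feature values below are invented for illustration.

# Sketch: cluster essays by simple numeric features (invented values).
# Each row is one essay: [word frequency of "economy", avg sentence length, page count]
import numpy as np
from sklearn.cluster import KMeans

essays = np.array([
    [0.8, 14.0, 3],
    [0.7, 15.5, 4],
    [0.1, 22.0, 10],
    [0.2, 21.0, 12],
    [0.9, 13.0, 2],
])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(essays)
print(labels)  # e.g. [0 0 1 1 0]: essays grouped by similarity, with no labels given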
Non-clustering: The "Cocktail Party Algorithm", which can find structure in messy data
(such as the identification of individual voices and music from a mesh of sounds at
a cocktail party). Here is an answer on Quora to enhance your understanding. : What
is the difference between supervised and unsupervised learning algorithms?
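One common way to approach this kind of source-separation problem (not necessarily the exact algorithm referenced above) is independent component analysis; below is a minimal sketch using scikit-learn's FastICA on synthetic signals.

# Sketch: separate two mixed signals with FastICA (synthetic data,
# not the course's actual "cocktail party" one-liner).
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 1, 2000)
voice = np.sin(2 * np.pi * 5 * t)            # stand-in for a voice
music = np.sign(np.sin(2 * np.pi * 11 * t))  # stand-in for music

mixing = np.array([[0.60, 0.40],
                   [0.45, 0.55]])             # two "microphones" hear both sources
observed = np.column_stack([voice, music]) @ mixing.T

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(observed)       # estimated independent sources
print(recovered.shape)                        # (2000, 2)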
Model Representation
Recall that in regression problems, we are taking input variables and trying to fit the
output onto a continuous expected result function.
Linear regression with one variable is also known as "univariate linear regression."
Univariate linear regression is used when you want to predict a single output value y from a single input value x. We're doing supervised learning here, so that means we already have an idea about what the input/output cause and effect should be.
\hat{y} = h_\theta(x) = \theta_0 + \theta_1 x
Note that this is like the equation of a straight line. We give hθ(x) values for θ0 and θ1 to get our estimated output ŷ. In other words, we are trying to create a function called hθ that maps our input data (the x's) to our output data (the y's).
Example:
Suppose we have the following set of training data:

input x | output y
   0    |    4
   1    |    7
   2    |    7
   3    |    8

Now we can make a random guess about our hθ function: θ0 = 2 and θ1 = 2. The hypothesis function becomes hθ(x) = 2 + 2x.
So for an input of 1 to our hypothesis, ŷ will be 4, which is off by 3 from the actual output of 7. Note that we will be trying out various values of θ0 and θ1 to find the values that provide the best possible "fit", or the most representative "straight line", through the data points mapped on the x-y plane.
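A minimal sketch of evaluating this hypothesis in Python, using the training pairs and the parameter guess from the example above:

# Sketch: evaluate the hypothesis h(x) = theta0 + theta1 * x on the small
# training set from the example above.
xs = [0, 1, 2, 3]          # inputs
ys = [4, 7, 7, 8]          # actual outputs
theta0, theta1 = 2.0, 2.0  # a (rough) initial guess for the parameters

def h(x):
    """Univariate linear hypothesis."""
    return theta0 + theta1 * x

for x, y in zip(xs, ys):
    print(f"x={x}: predicted {h(x)}, actual {y}, error {h(x) - y}")
# For x=1 the prediction is 4 while the actual value is 7, i.e. off by 3.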
Cost Function
We can measure the accuracy of our hypothesis function by using a cost function.
This takes an average (actually a fancier version of an average) of all the results of the
hypothesis with inputs from x's compared to the actual output y's.
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2 = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2
To break it apart, it is 1/2 times the mean of the squares of hθ(xᵢ) − yᵢ, i.e. of the differences between the predicted values and the actual values. This function is otherwise called the "squared error function" or "mean squared error". The mean is halved (the 1/2m factor) as a convenience for the computation of gradient descent, as the derivative term of the square function will cancel out the 1/2 term.
Now we are able to concretely measure the accuracy of our predictor function against
the correct results we have so that we can predict new results we don't have.
If we try to think of it in visual terms, our training data set is scattered on the x-y plane. We are trying to make a straight line (defined by hθ(x)) that passes through this scattered set of data. Our objective is to get the best possible line. The best possible line will be the one for which the average squared vertical distance of the scattered points from the line is smallest. In the best case, the line passes through all the points of our training data set; in such a case the value of J(θ0, θ1) will be 0.
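A minimal sketch of computing this cost function in Python, using the training set and the θ0 = 2, θ1 = 2 guess from the example above:

# Sketch: compute the squared-error cost J(theta0, theta1) for the
# example data and parameter guess used above.
xs = [0, 1, 2, 3]
ys = [4, 7, 7, 8]

def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = (1 / (2m)) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)

print(cost(2.0, 2.0, xs, ys))  # cost of the rough guess
print(cost(1.0, 2.0, xs, ys))  # a different guess, for comparison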
Q: Why is the cost function based on the sum of squares, rather than on (hθ(x) − y) or abs(hθ(x) − y)?
A: It might be easier to think of this as measuring the distance between two points. In this case, we are measuring the distance between two multi-dimensional values (i.e. the observed output values yᵢ and the estimated output values ŷᵢ). The distance between two points is \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}; that's where the sum of squares comes from. (See also Euclidean distance.)
The sum of squares isn't the only possible cost function, but it has many nice properties. Squaring the error means that an overestimate is "punished" just the same as an underestimate: an error of -1 is treated just like +1, and two equal but opposite errors can't cancel each other. If we cube the error (or just use the difference), we lose this property. Also, in the case of cubing, big errors are punished even more heavily than with squaring: an error of 2 contributes 8 rather than 4.
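A quick numeric illustration of this symmetry property:

# Squared error treats +1 and -1 identically; cubed error does not,
# and equal-but-opposite errors would cancel out.
errors = [-1, 1, -2, 2]
print([e ** 2 for e in errors])     # [1, 1, 4, 4]   -> symmetric, all positive
print([e ** 3 for e in errors])     # [-1, 1, -8, 8] -> sign is kept
print(sum(e ** 3 for e in errors))  # 0: opposite errors wrongly cancel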
The squaring function is smooth (it can be differentiated) and yields linear forms after differentiation, which is nice for optimization. It also has the property of being convex. A convex cost function guarantees that any local minimum is also a global minimum, so gradient-based algorithms will converge to it (given a suitable learning rate).
If you throw in absolute value, then you get a non-differentiable function. If you try to take the derivative of abs(x) and set it equal to zero to find the minimum, you won't get an answer, since the derivative is undefined at 0.
Q: Why can't I use 4th powers in the cost function? Don't they have the nice properties of squares?
A: Imagine that you are throwing darts at a dartboard, or firing arrows at a target. If you use the sum of squares as the error (where the center of the bull's-eye is the origin of the coordinate system), the error is the distance from the center. Now rotate the coordinates by 30 degrees, or 45 degrees, or anything: the distance, and hence the error, remains unchanged. 4th powers lack this property, which is known as rotational invariance.
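A small numeric check of this rotational-invariance claim (a sketch using numpy, with an arbitrary error vector):

# Sketch: the sum of squared errors is unchanged by a rotation of the
# coordinate system, while the sum of 4th powers is not.
import numpy as np

error = np.array([3.0, 1.0])          # an error vector in 2D
angle = np.deg2rad(30)
rot = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
rotated = rot @ error

print(np.sum(error ** 2), np.sum(rotated ** 2))  # equal (up to rounding)
print(np.sum(error ** 4), np.sum(rotated ** 4))  # generally different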
Q: Why does 1/(2 * m) make the math easier?
A: When we differentiate the cost to calculate the gradient, we get a factor of 2 in the numerator, due to the exponent inside the sum. This '2' in the numerator cancels out with the '2' in the denominator, saving us one math operation in the formula.
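Concretely, differentiating the cost with respect to θ1 (using the same notation as the cost function above) shows the cancellation:

\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
  = \frac{\partial}{\partial \theta_1} \, \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x_i - y_i \right)^2
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( \theta_0 + \theta_1 x_i - y_i \right) x_i
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right) x_i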