
035.001 Spring, 2024
Digital Computer Concept and Practice

Supervised Learning (3)

Soohyun Yang
College of Engineering
Department of Civil and Environmental Engineering
Types of ML techniques – All learning is learning!
Our scope : Supervised learning

 Supervised learning – “Presence of labels”
• Classification : Spam classification, Advertisement popularity, Face recognition
• Regression
 Unsupervised learning – “Absence of labels”
• Clustering : Recommender systems (YT), Buying habits (group customers), Grouping user logs
 Reinforcement learning – “Behavior-driven : feedback loop”
• Learning to play games (AlphaGo), Industrial simulation, Resource management

https://towardsdatascience.com/what-are-the-types-of-machine-learning-e2b9e5d1756f
Regression
 A statistical method to determine the relationship between a dependent variable (target) and one or more independent variables (features), predicting a target value on a continuous scale for given new data.

 Algorithms in our scope
• K-nearest neighbors (KNN)
• Linear regression (LR) => Simple, Polynomial, Multiple
• Ridge regression – Regularization
• Lasso regression – Regularization
• Decision trees === // Ensemble // ===> Random forest
Linear Regression (LR) models
 Describe a continuous target variable as a linear combination of one
or more features.
 Aim to find the set of model parameters (coefficients and y-intercept) that minimizes the sum of squared residuals (a.k.a. offsets).
 Example : [Figure: a single feature vs. the target, with the fitted regression line; Raschka & Mirjalili (2019)]
LR models (con’t)
 Simple LR [단순 선형회귀]:
An LR model with a single feature variable x.
 Polynomial LR [다항 선형회귀]:
An LR model with an n-th degree polynomial in one feature x.

https://www.javatpoint.com/machine-learning-polynomial-regression
LR models (con’t)
 Multiple LR [다중 선형회귀]:
An LR model using more than one feature (xn; n > 1) to predict a target y.

y = w0 + w1 x1 + w2 x2 + … + wn xn

https://www.shiksha.com/online-courses/articles/multiple-linear-regression/
LR models (con’t)
 Feature engineering [특성 공학]:
The process of selecting, manipulating, and transforming raw
features into desired features to obtain better performance in
supervised learning.
 Example: 2nd-degree with two raw features (x1 and x2)
y = w0 + w1 x1 + w2 x2 + w3 x1 x2 + w4 x1² + w5 x2²
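For illustration only (not from the lecture code), scikit-learn's PolynomialFeatures class can generate exactly these engineered terms from the two raw features; the sample values below are made up:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two raw features (x1, x2) for three made-up samples
X_raw = np.array([[1.0, 2.0],
                  [2.0, 3.0],
                  [3.0, 5.0]])

# 2nd-degree expansion: x1, x2, x1^2, x1*x2, x2^2 (no bias/intercept column)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_eng = poly.fit_transform(X_raw)

print(poly.get_feature_names_out(["x1", "x2"]))  # ['x1' 'x2' 'x1^2' 'x1 x2' 'x2^2']
print(X_eng)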
Exercise 1: Simple LR approach
 Let's apply the simple LR algorithm to solve a regression problem.
 1. Data preparation & import : InClassData_Traffic_Reg.csv
[Input data table: samples × (Feature 1, Feature 2, Feature 3, Target)]
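A minimal sketch of step 1, assuming pandas is available; the column names are expected to match the table above, but treat the exact names as assumptions until the file is inspected:

import pandas as pd

# Step 1: load the in-class traffic dataset
df = pd.read_csv("InClassData_Traffic_Reg.csv")

print(df.shape)   # (number of samples, number of columns)
print(df.head())  # first rows: Feature 1-3 and the Target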
Exercise 1: Simple LR approach (con’t)
 Let's apply the simple LR algorithm to solve a regression problem.
 1*. Visualize the whole data to understand it easily.
[Figure: Traffic volume distribution]
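A possible sketch of step 1*, assuming matplotlib is available and that the target column holding the traffic volume is named "Target" (an assumption):

import matplotlib.pyplot as plt

# Step 1*: quick look at the target distribution
df["Target"].hist(bins=20)            # column name is an assumption
plt.xlabel("Traffic volume")
plt.ylabel("Count")
plt.title("Traffic volume distribution")
plt.show()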
Exercise 1: Simple LR approach (con’t)
 2. Select a specific feature for the
simple LR analysis (here, “Feature 1”)
[Input data table: samples × (Feature 1, Target)]
Exercise 1: Simple LR approach (con’t)
 3. Data separation into the training and test sets
• random_state [integer] : A parameter (seed) for the random number generator.
• Stratification is NOT needed for a regression problem.
 4. Reshape the 1-D training sets as 2-D arrays.
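A minimal sketch of steps 2-4; the column names, test_size, and random_state below are assumptions, since the slide does not state them:

from sklearn.model_selection import train_test_split

# Step 2: single feature and the target (column names are assumptions)
X = df["Feature 1"].to_numpy()
y = df["Target"].to_numpy()

# Step 3: split into training and test sets; no stratification for regression
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Step 4: reshape the 1-D feature arrays into 2-D (n_samples, 1) arrays
X_train = X_train.reshape(-1, 1)
X_test = X_test.reshape(-1, 1)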
Exercise 1: Simple LR approach (con’t)
 5. Import ‘LinearRegression’ class
and create its instance.
 6. Fit the regression model using
the training set (fit method).
 7. Make predictions on the test
data (predict method).
 8. Evaluate the model's performance (score method => via R², the coefficient of determination [결정계수]).
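A minimal sketch of steps 5-8, continuing from the arrays prepared above:

from sklearn.linear_model import LinearRegression

# Step 5: create the model instance
lr = LinearRegression()

# Step 6: fit on the training set
lr.fit(X_train, y_train)

# Step 7: predict on the test data
y_pred = lr.predict(X_test)

# Step 8: evaluate via R^2 (coefficient of determination)
print("Test R^2:", lr.score(X_test, y_test))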
Exercise 1: Simple LR approach (con’t)
 9. Check the resultant coefficients
and the y-intercept
• coef_ : Estimated coefficients for the
linear regression problem, from the
highest to the 1st orders. => Array type
• intercept_ : Estimated y-intercept.
=> Float or array type
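A brief check of step 9, continuing the sketch above:

# Step 9: fitted parameters
print(lr.coef_)       # array of estimated coefficients
print(lr.intercept_)  # estimated y-intercept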

Y = 191.96 X – 441.7
Does the regression make sense?
Exercise 2: Polynomial LR approach
 1. Let’s introduce a second-degree
polynomial function, defined as:
Target = w0 + w1 F + w2 F²
(where F = Feature 1)
>> Note : We intend to make ‘two’ features, [F, F²]. A dataset should be defined to contain the newly formulated features.
 2. Import ‘LinearRegression’ class
and create its instance.
 3. Execute the Fit-Predict-Score
methods.
 4. Yield the model’s performance.
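A minimal sketch of Exercise 2, reusing X_train/X_test/y_train/y_test from the Exercise 1 sketch and stacking the two engineered features [F, F²] by hand:

import numpy as np
from sklearn.linear_model import LinearRegression

# Step 1: build the two features [F, F^2] from Feature 1
F_train, F_test = X_train[:, 0], X_test[:, 0]
X_train_poly = np.column_stack([F_train, F_train ** 2])
X_test_poly = np.column_stack([F_test, F_test ** 2])

# Steps 2-4: create, fit, predict, and score the polynomial LR model
poly_lr = LinearRegression()
poly_lr.fit(X_train_poly, y_train)
print("Test R^2:", poly_lr.score(X_test_poly, y_test))
print(poly_lr.coef_, poly_lr.intercept_)  # used on the next slide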
Exercise 2: Polynomial LR approach (con’t)
 5. Check the resultant coefficients and the y-intercept
=> T = 25.6 F² – 48.6 F + 43.9

 6. Visual examination
Exercise 3-1: Multiple LR approach (2nd-degree)
 1. Set multiple features (three features) to express all possible combinations.
 2. Data separation into the training and test sets
 3. Import the ‘PolynomialFeatures’ class and create its instance. (2nd-degree is the default option!)
• include_bias = False : The intercept term is not included in the output features.
 4. Create the transformed training & test sets (fit_transform method).
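A minimal sketch of steps 1-4, assuming the three feature column names and the same split settings as before:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Steps 1-2: all three features and the target (column names are assumptions)
X = df[["Feature 1", "Feature 2", "Feature 3"]].to_numpy()
y = df["Target"].to_numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Step 3: 2nd-degree polynomial feature generator (degree=2 is the default),
#         without the intercept column
poly = PolynomialFeatures(include_bias=False)

# Step 4: transformed training & test sets
X_train_poly2 = poly.fit_transform(X_train)
X_test_poly2 = poly.transform(X_test)
print(poly.get_feature_names_out())  # generated feature names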
Exercise 3-1: Multiple LR approach (con’t)
 5. Execute the Fit-Predict-Score
methods.
 6. Yield the model’s performance.

Same order of coefficients!


Exercise 3-2: Multiple LR approach (5th-degree)
 1. Set multiple features to express all possible combinations. => 55 variables!
 2. Data separation into the training and test sets
 3. Implement the ‘PolynomialFeatures’ class with the specific option “degree = 5” and create its instance.
 4. Create the transformed training & test sets (fit_transform method).
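A minimal sketch of Exercise 3-2, continuing from the split above; with degree = 5 the three raw features expand to 55 columns:

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Step 3: 5th-degree expansion of the three raw features
poly5 = PolynomialFeatures(degree=5, include_bias=False)

# Step 4: transformed training & test sets
X_train_p5 = poly5.fit_transform(X_train)
X_test_p5 = poly5.transform(X_test)
print(X_train_p5.shape[1])  # 55 generated variables

lr5 = LinearRegression().fit(X_train_p5, y_train)
print("Train R^2:", lr5.score(X_train_p5, y_train))
print("Test R^2:", lr5.score(X_test_p5, y_test))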
Exercise 3-2: Multiple LR approach (5th-degree)
Summarized R²
 Simple LR model (with one feature, Feature 1)

 Polynomial LR model (2nd-degree with one feature, Feature 1)

 Multiple LR model (2nd-degree with three features, Features 1~3)

 Multiple LR model (5th-degree with three features, Features 1~3)


=> Extremely overfitted model to the training data
Regularization
 A technique used to reduce errors by fitting the function appropriately on the given training set and avoiding overfitting (i.e., it reduces model complexity).
 Is controlled by a ‘Hyperparameter [하이퍼파라미터]’
- A parameter that is not learned by the model, but assigned by the user.
(Our goal : To find the combination of weight coefficients that minimizes the cost function for the training data.)
 Techniques:
• Ridge regression – L2 regularization
• Lasso regression – L1 regularization
• Elastic Net regression – L1 and L2 regularization

[Figure: cost function with regularization; Raschka & Mirjalili (2019)]
Ridge regression
 An L2 penalized model where we simply add the squared sum of the
weights to our least-squares cost function.

Raschka & Mirjalili (2019);


 The greater the value of the hyperparameter λ
=> the stronger the regularization
=> the more the model's weights shrink.
>> Note : The y-intercept term, w0, is not regularized.
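For reference, the L2-penalized cost function described above can be written in its standard form (as in Raschka & Mirjalili, 2019) as:

J(w) = Σ_i ( y_i − ŷ_i )² + λ Σ_j w_j²
(sum over the training samples i and over the weights j = 1 … m, excluding w0)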
Example : Ridge regression

Default : Alpha = 1
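A minimal sketch of the Ridge example, reusing the 5th-degree feature sets from the Exercise 3-2 sketch (the names X_train_p5, X_test_p5 are assumptions from that sketch); scikit-learn's alpha plays the role of λ and defaults to 1:

from sklearn.linear_model import Ridge

ridge = Ridge(alpha=1.0)  # default alpha = 1
ridge.fit(X_train_p5, y_train)
print("Train R^2:", ridge.score(X_train_p5, y_train))
print("Test R^2:", ridge.score(X_test_p5, y_test))
print(ridge.coef_, ridge.intercept_)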
Example : Ridge regression (con’t)
Example : Ridge regression (con’t)
Lasso regression
 Lasso (Least Absolute Shrinkage and Selection Operator)
 Depending on the regularization strength, certain weights can become exactly zero, which makes Lasso useful for feature selection in supervised learning.

Raschka & Mirjalili (2019);
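Analogously to Ridge, the L1-penalized cost function can be written in its standard form as:

J(w) = Σ_i ( y_i − ŷ_i )² + λ Σ_j |w_j|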


Example : Lasso regression
Default : Alpha = 1
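A minimal sketch of the Lasso example, again reusing the 5th-degree feature sets from the Exercise 3-2 sketch; on unscaled polynomial features the solver may need a larger max_iter to converge:

from sklearn.linear_model import Lasso

lasso = Lasso(alpha=1.0)  # default alpha = 1
lasso.fit(X_train_p5, y_train)
print("Train R^2:", lasso.score(X_train_p5, y_train))
print("Test R^2:", lasso.score(X_test_p5, y_test))
print((lasso.coef_ != 0).sum(), "non-zero coefficients")  # Lasso zeroes out weak features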
Example : Lasso regression (con’t)
Example : Lasso regression (con’t)
Summarize the results
Polynomial features (5th-degree), plain LR :
Coefficients (5 per row, in the feature order listed below) :
 2.7E+02   3.2E+02  -1.1E+03   8.0E+02   1.4E+03
-1.0E+03   2.2E+03  -1.5E+03   3.4E+02   1.6E+03
 2.8E+03  -9.1E+02   4.8E+03  -1.1E+03   2.7E+02
 7.8E+03  -2.1E+03   1.7E+02  -1.8E+01  -6.8E+02
-8.6E+02   2.5E+01  -6.7E+02  -2.6E+02   1.1E+02
 7.7E+01   7.7E+01  -6.6E-01  -1.2E+01   9.1E+02
-4.4E+02   7.0E+01  -2.8E+00   3.7E-01   4.8E+03
 2.1E+03  -1.8E+03  -1.7E+03   2.9E+02   1.9E+02
-5.3E+03   2.0E+03  -2.5E+02  -2.9E+00  -5.3E+03
 2.3E+03  -4.0E+02   3.5E+01  -6.8E-01   4.5E+03
-1.8E+03   2.7E+02  -1.9E+01   4.0E-01  -1.1E-03
Intercept : -6.5E+04

Ridge Regression :
Coefficients (same feature order) :
 4.5E+00   1.3E+01  -7.6E+00   1.7E+01   1.6E+01
 6.2E+00   2.3E+01   1.6E+01  -1.9E+00   2.0E+01
 1.4E+01   1.1E+01   1.4E+01   1.3E+01   5.4E+00
 2.3E+01   2.3E+01   1.6E+01   2.1E+00   1.5E+01
 5.2E+00   8.3E+00   7.2E-01   4.6E+00   4.2E+00
 3.5E+00   7.9E+00   8.1E+00   3.0E+00   1.5E+01
 2.0E+01   2.1E+01   1.6E+01   4.5E+00   6.2E+00
-7.3E+00  -6.2E-01  -1.6E+01  -8.4E+00  -3.8E+00
-1.8E+01  -1.0E+01  -5.1E+00  -3.7E+00  -1.2E+01
-4.2E+00   1.0E+00   2.6E+00  -4.2E-01   1.4E+00
 1.0E+01   1.6E+01   1.7E+01   1.4E+01   5.5E+00
Intercept : 4.0E+02
All 55 features are considered..! Too much complicated…

Variables (the 55 polynomial features, 5 per row, in the same order as the coefficients above) :
x0, x1, x2, x0^2, x0 x1
x0 x2, x1^2, x1 x2, x2^2, x0^3
x0^2 x1, x0^2 x2, x0 x1^2, x0 x1 x2, x0 x2^2
x1^3, x1^2 x2, x1 x2^2, x2^3, x0^4
x0^3 x1, x0^3 x2, x0^2 x1^2, x0^2 x1 x2, x0^2 x2^2
x0 x1^3, x0 x1^2 x2, x0 x1 x2^2, x0 x2^3, x1^4
x1^3 x2, x1^2 x2^2, x1 x2^3, x2^4, x0^5
x0^4 x1, x0^4 x2, x0^3 x1^2, x0^3 x1 x2, x0^3 x2^2
x0^2 x1^3, x0^2 x1^2 x2, x0^2 x1 x2^2, x0^2 x2^3, x0 x1^4
x0 x1^3 x2, x0 x1^2 x2^2, x0 x1 x2^3, x0 x2^4, x1^5
x1^4 x2, x1^3 x2^2, x1^2 x2^3, x1 x2^4, x2^5
Summarize the results (con’t)
Lasso Regression :
Coefficients (same feature order as above; 0 = feature dropped by L1 regularization) :
0         0         0         6.0E+01   0
0         5.3E+01   0         0         2.2E+01
0         0         0         0         0
0         1.1E+02   0         0         0
0         0         0         0         0
0         0         0         0         0
0         5.1E+01   2.1E+01   0         0
0         0         0         0         0
0         0         0         0         0
0         0         0         0         0
0         0         0         3.1E+01   0
Intercept : 4.0E+02

The 7 features with non-zero coefficients (x0^2, x1^2, x0^3, x1^2 x2, x1^2 x2^2, x1 x2^3, x1 x2^4) are considered the most influential or informative for the predictive model..!
Take-home points (THPs)
-
-
-
…
