SL LMRG
Soohyun Yang
College of Engineering
Department of Civil and Environmental Engineering
Types of ML techniques – All learning is learning!

Our scope:
• Supervised learning – “Presence of labels”
  – Classification (e.g., spam classification, face recognition)
  – Regression (e.g., advertisement popularity)
• Unsupervised learning – “Absence of labels”
  – Clustering (e.g., grouping customers by buying habits, grouping user logs)
  – Recommender systems (YouTube)
• Reinforcement learning – “Behavior-driven: feedback loop”
  – Learning to play games (AlphaGo), industrial simulation, resource management

https://towardsdatascience.com/what-are-the-types-of-machine-learning-e2b9e5d1756f
Regression

A statistical method to determine the relationship between a dependent variable (target) and one or more independent variables (features), predicting a target value on a continuous scale for a given new data point.

[Figure: regression curve of target vs. feature]
https://www.javatpoint.com/machine-learning-polynomial-regression
LR models (con’t)
Multiple LR [multiple linear regression]:
A LR using more than one feature (xn; n > 1) to predict a target y.
y = w0 + w1x1 + w2x2 + … + wnxn
https://www.shiksha.com/online-courses/articles/multiple-linear-regression/
LR models (con’t)
Feature engineering:
The process of selecting, manipulating, and transforming raw features into desired features to obtain better performance in supervised learning.

Example: 2nd-degree polynomial with two raw features (x1 and x2)
y = w0 + w1x1 + w2x2 + w3x1x2 + w4x1² + w5x2²
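The hand-engineered features in the equation above can be built directly with NumPy — a minimal sketch with illustrative values (the raw data here is made up):

```python
import numpy as np

# Two raw features for three samples (illustrative values).
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 5.0, 6.0])

# 2nd-degree engineered feature matrix matching the equation above:
# columns are x1, x2, x1*x2, x1^2, x2^2 (w0 is handled as the intercept).
X = np.column_stack([x1, x2, x1 * x2, x1**2, x2**2])
print(X[0])  # [ 1.  4.  4.  1. 16.]
```

A linear model fitted on these five columns is still *linear in the weights*, which is why plain least squares can learn a curved relationship.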
Exercise 1: Simple LR approach
Let’s apply the simple LR algorithm
to solve a regression problem.
1. Data preparation & import :
InClassData_Traffic_Reg.csv
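Step 1 can be sketched with pandas. The real InClassData_Traffic_Reg.csv is not shown here, so the column names and values below are stand-in assumptions:

```python
import pandas as pd

# Hypothetical stand-in for InClassData_Traffic_Reg.csv; the real file's
# column names and values are assumptions for illustration.
df = pd.DataFrame({
    "Feature 1": [1.2, 2.5, 3.1, 4.0],
    "Target":    [120, 310, 390, 520],
})
# In class you would instead load the file directly:
# df = pd.read_csv("InClassData_Traffic_Reg.csv")
print(df.head())
print(df.shape)
```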
Exercise 1: Simple LR approach (con’t)
1*. Visualize the whole dataset first to get an overview.

[Plot: traffic volume distribution]
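A quick scatter plot of the data (step 1*) might look like this — the arrays stand in for the CSV columns, whose names are assumed:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, safe for scripts
import matplotlib.pyplot as plt
import numpy as np

# Synthetic traffic-like data standing in for the CSV columns (assumed names).
feature1 = np.array([1.2, 2.5, 3.1, 4.0, 5.2])
target = np.array([120, 310, 390, 520, 640])

fig, ax = plt.subplots()
ax.scatter(feature1, target)
ax.set_xlabel("Feature 1")
ax.set_ylabel("Traffic volume")
ax.set_title("Traffic volume distribution")
fig.savefig("traffic_scatter.png")
```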
Exercise 1: Simple LR approach (con’t)
2. Select a specific feature for the
simple LR analysis (here, “Feature 1”)
[Diagram: input samples with a “Feature 1” column and a Target column]
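One detail worth showing for step 2: scikit-learn expects a 2-D input matrix of shape (n_samples, n_features), so a single selected feature must be reshaped. The values here are placeholders:

```python
import numpy as np

# Assumed arrays standing in for the dataset's "Feature 1" and target columns.
feature1 = np.array([1.2, 2.5, 3.1, 4.0, 5.2])
target = np.array([120, 310, 390, 520, 640])

# scikit-learn wants a 2-D matrix even for a single feature.
X = feature1.reshape(-1, 1)
print(X.shape)  # (5, 1)
```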
Exercise 1: Simple LR approach (con’t)
3. Separate the data into training and test sets.
• random_state [integer] : seeds the random number generator so the split is reproducible.
• Stratification is NOT needed for a regression problem (the target is continuous, not categorical).

Fitted model: Y = 191.96 X – 441.7
Does the regression
make sense?
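Steps 1–3 can be sketched end-to-end. Since the class CSV is not available here, the data is generated synthetically around the slide's fitted line Y = 191.96X – 441.7:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))                    # stand-in for "Feature 1"
y = 191.96 * X[:, 0] - 441.7 + rng.normal(0, 30, 100)    # slide's line + noise

# No stratification needed: the target is continuous.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

lr = LinearRegression().fit(X_train, y_train)
print(lr.coef_[0], lr.intercept_)   # close to 191.96 and -441.7
print(lr.score(X_test, y_test))     # R^2 on the held-out test set
```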
Exercise 2: Polynomial LR approach
1. Let’s introduce a second-degree
polynomial function, defined as:
Target = w0 + w1F + w2F²
(where F = Feature 1)
>> Note : We intend to build ‘two’ features,
[F, F²]. A dataset should be defined to
contain the newly formulated features.
2. Import ‘LinearRegression’ class
and create its instance.
3. Execute the Fit-Predict-Score
methods.
4. Yield the model’s performance.
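Steps 1–4 above can be sketched as follows. The target values are synthetic, generated to echo the quadratic result reported later on the slide (T = 25.6F² – 48.6F + 43.9):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
F = rng.uniform(0, 5, 200)                                 # stand-in feature
T = 25.6 * F**2 - 48.6 * F + 43.9 + rng.normal(0, 2.0, 200)

# Step 1: build the 'two' features [F, F^2] by hand.
X = np.column_stack([F, F**2])

# Steps 2-4: create the instance, fit, and score.
model = LinearRegression().fit(X, T)
print(model.coef_, model.intercept_)   # roughly [-48.6, 25.6] and 43.9
print(model.score(X, T))               # R^2 of the quadratic fit
```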
Exercise 2: Polynomial LR approach (con’t)
5. Check the resulting coefficients and the y-intercept
=> T = 25.6F² – 48.6F + 43.9
6. Visual examination
Exercise 3-1: Multiple LR approach (2nd-degree)
1. Build a set of multiple features (here, three features) to express all possible term combinations.
2. Separate the data into training and test sets.
3. Import the ‘PolynomialFeatures’ class and create its instance (2nd degree is the default option!).
• include_bias=False : the intercept column is not included in the output features.
4. Create the transformed training & test sets (fit_transform on the training set; transform on the test set).
Exercise 3-1: Multiple LR approach (con’t)
5. Execute the Fit-Predict-Score
methods.
6. Yield the model’s performance.
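Steps 1–6 of the multiple LR exercise fit naturally into a scikit-learn Pipeline. The three-feature data below is synthetic with a known quadratic structure, purely for illustration:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))  # three assumed features
# Target with interaction and squared terms, so degree-2 features help.
y = 2 * X[:, 0] + X[:, 1] * X[:, 2] - 3 * X[:, 2]**2 + rng.normal(0, 0.05, 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The pipeline applies fit_transform on train and transform on test automatically.
model = make_pipeline(PolynomialFeatures(include_bias=False), LinearRegression())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(model.score(X_test, y_test))  # R^2 close to 1 for this synthetic data
```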
Techniques:
• Ridge regression – L2 regularization
• Lasso regression – L1 regularization
Cost function: the least-squares cost plus a weight penalty.

Ridge regression
An L2-penalized model where we simply add the squared sum of the weights to our least-squares cost function:
J(w) = Σi (y(i) − ŷ(i))² + α Σj wj²
Default : alpha = 1
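A minimal sketch of the L2 penalty in action, on synthetic data: ridge weights are always shrunk toward zero relative to plain least squares:

```python
import numpy as np
from sklearn.linear_model import Ridge, LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))                       # five synthetic features
true_w = np.array([3.0, -2.0, 0.5, 0.0, 1.0])
y = X @ true_w + rng.normal(0, 0.1, 100)

ridge = Ridge(alpha=1.0).fit(X, y)  # alpha=1 is the default strength
ols = LinearRegression().fit(X, y)

# The L2 penalty guarantees a smaller (or equal) weight norm than plain OLS.
print(np.linalg.norm(ridge.coef_), np.linalg.norm(ols.coef_))
```

Increasing `alpha` strengthens the penalty and shrinks the weights further.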
Lasso regression
Lasso (Least Absolute Shrinkage and Selection Operator):
Depending on the regularization strength, certain weights can become exactly zero, which makes lasso useful for feature selection in supervised learning.
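The sparsity property can be demonstrated on synthetic data where only two of ten features actually matter — lasso drives the irrelevant weights exactly to zero:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 10))
# Only the first two features actually influence the target.
y = 4.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0, 0.1, 200)

lasso = Lasso(alpha=0.5).fit(X, y)
print(lasso.coef_)
# Most weights are exactly zero -> implicit feature selection.
print((lasso.coef_ == 0).sum())
```

With L2 (ridge) the same weights would merely become small; only the L1 penalty sets them to exactly zero.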