
Lecture 4 Intro To ML 27 03 2023 27032023 041559pm

This document discusses techniques for multiple linear regression analysis. It explains what multiple linear regression is and what it can be used for. Several examples are provided to illustrate how to perform multiple linear regression. The document also discusses strategies for selecting independent variables, including all-in, backward elimination, forward selection, bi-directional elimination, and evaluating all possible combinations. Key aspects like the regression coefficients, t-statistic, p-value, and significance levels are defined.

Uploaded by

Gaylethunder007

Artificial Intelligence for Internet of Things

Fall 2021
Lecture#05

Logistic Regression

Logistic Regression (2)

Linear vs Logistic

Example-1

Example-2 (1)

Example-2 (2)

Example-3

Example-3 (2)

Example-3 (3)

Example-3 (4)

Example-3 (5)

Example-5

Multiple Regression Algorithm

What is MLR?

Multiple linear regression is used to estimate the relationship between two or more independent variables and one dependent variable.
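
In symbols, the relationship is usually written as below. The notation is the conventional one and is an assumption here, since the slides do not show the formula: y is the dependent variable, x_1 ... x_n are the independent variables, \beta_0 is the intercept, \beta_1 ... \beta_n are the regression coefficients, and \varepsilon is the error term.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon
```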

What Can MLR Do?

MLR can be used to determine:
• How strong the relationship is between two or more independent variables and one dependent variable (e.g. how rainfall, temperature, and amount of fertilizer added affect crop growth).
• The value of the dependent variable at a certain value of the independent variables (e.g. the expected yield of a crop at certain levels of rainfall, temperature, and fertilizer addition).
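
Both uses can be illustrated with a minimal pure-Python sketch, assuming ordinary least squares solved through the normal equations. The data are made up (the yield is generated from known coefficients so the fit can be checked), and the function names `solve_linear_system`, `fit_mlr`, and `predict` are illustrative, not from the lecture.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations apply to both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return [b0, b1, ..., bn] minimising the sum of squared errors."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Value of the dependent variable at the given independent values."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], row))


# Hypothetical observations: (rainfall, temperature, fertilizer)
rows = [(10, 20, 1), (12, 22, 2), (9, 25, 1), (15, 18, 3), (11, 21, 2), (14, 24, 1)]
# Made-up yield generated as 2 + 0.5*rainfall + 0.1*temperature + 1.5*fertilizer
crop_yield = [2 + 0.5 * r + 0.1 * t + 1.5 * f for r, t, f in rows]

coeffs = fit_mlr(rows, crop_yield)           # strength of each relationship
expected = predict(coeffs, (13, 23, 2))      # expected yield at chosen levels
print([round(c, 3) for c in coeffs], round(expected, 3))
```

The fitted coefficients answer the first question (how strongly each variable affects the yield), and `predict` answers the second. Real data contain noise, so the coefficients would only approximate the underlying relationship.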

MRA

Example

Example

MRA

MRA

Example

Prep Work Required!

Design Requirements

Best-fit line

• To find the best-fit line for each independent variable, multiple linear regression calculates three things:
• The regression coefficients that lead to the smallest overall model error.
• The t-statistic of each regression coefficient.
• The associated p-value (how likely it is that the t-statistic would have occurred by chance if the null hypothesis of no relationship between the independent and dependent variables were true).
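
For a single regression coefficient, the t-statistic under the null hypothesis of no relationship can be written as follows. The symbols are conventional notation assumed here, not taken from the slides: \hat{\beta}_j is the estimated coefficient and SE(\hat{\beta}_j) is its standard error.

```latex
t_j = \frac{\hat{\beta}_j - 0}{\operatorname{SE}(\hat{\beta}_j)}, \qquad H_0 : \beta_j = 0
```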
T-test
• In statistics, the t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error.
• It is used to evaluate whether two sets of data are statistically significantly different from each other.
• Q.1: Find the t-test value for the following two sets of values:
• A = 7, 2, 9, 8 and
• B = 1, 2, 3, 4
• Solution: For the first data set:
• Number of terms in the first set, i.e. n_1 = 4
• Calculate the mean value for the first data set: mean = (7 + 2 + 9 + 8) / 4 = 6.5
Higher values of the t-value, also called t-score, indicate that a large difference exists between the two sample sets. The smaller the t-value, the more similarity exists between the two sample sets.
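
The worked question above can be checked with a short sketch. It assumes the two-sample form t = (mean_A - mean_B) / sqrt(s_A^2/n_A + s_B^2/n_B) with sample variances (n - 1 in the denominator); the slides do not state which variant they use, and the function name `two_sample_t` is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b):
    """t-value for two independent samples, using sample variances."""
    # Standard error of the difference between the two sample means.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


A = [7, 2, 9, 8]
B = [1, 2, 3, 4]

t_value = two_sample_t(A, B)
print(round(t_value, 3))  # 2.376
```

Because both sets have the same number of terms, the pooled-variance form of the test gives the same t-value here.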
P-Value Test

• P-value is the lowest significance level that results in rejecting the null hypothesis.
Example

Example
• Coin toss
• Two possible outcomes
• H0 = This is a fair coin
• H1 = This is not a fair coin
• The P-value test assumes that the H0 hypothesis is true, i.e., the coin is fair
• Let us take our threshold value to be 5%, i.e., 0.05
• Let us assume the outcomes are:
• First toss is Tail (probability = 0.5)
• First toss is Tail and second toss is also Tail (probability = 0.25)
• First two outcomes same as before, third toss is also Tail (probability = 0.125)
• First three outcomes same as before, fourth toss is also Tail (probability = 0.0625)
• First four outcomes same as before, fifth toss is also Tail (probability ≈ 0.03)
• First five outcomes same as before, sixth toss is also Tail (probability ≈ 0.016)
• After the fifth toss the statistical test is significant: the probability (≈ 0.03) is below 5%, whereas after the fourth toss it was still 0.0625. A P-value of less than 5% indicates that hypothesis H0 is rejected and hypothesis H1 is accepted, i.e., the coin is not fair
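
The running probabilities can be checked with a few lines of Python, assuming a fair coin and independent tosses as H0 states. The function name `prob_all_tails` is illustrative.

```python
def prob_all_tails(n_tosses, p_tail=0.5):
    """Probability of n_tosses consecutive tails if H0 (fair coin) is true."""
    return p_tail ** n_tosses


threshold = 0.05  # significance level from the example

# Probability of an unbroken run of tails after 1..6 tosses.
probs = {n: prob_all_tails(n) for n in range(1, 7)}

# Smallest number of consecutive tails whose probability drops below 5%.
first_significant = min(n for n, p in probs.items() if p < threshold)

for n, p in probs.items():
    print(n, p, "significant" if p < threshold else "not significant")
```

Four tails in a row have probability 0.0625, which is still above 0.05, so the first run that falls below the threshold is five tails (0.03125).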
Selecting the independent variables being used

• Five strategies are available for selecting the independent variables


• All in
• Backward Elimination
• Forward Selection
• Bi-directional elimination
• Score Comparison (All possible combinations)
All in

• Use all features

• Prior knowledge (e.g. from a data domain expert) tells you which features to keep and which to discard
Backward Elimination

1. Select a significance level (SL) for the P-value, e.g. 5% (0.05)
2. Fit the model with all predictors
3. Consider the predictor with the highest P-value. If P > SL, go to step 4; otherwise keep all remaining predictors as your feature set and terminate
4. Remove the predictor with P > SL
5. Fit the model without this predictor and go back to step 3, unless all features have been exhausted, in which case terminate
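
The steps above can be sketched as a loop. This is a minimal outline, not a full implementation: `fit_p_values` is a hypothetical caller-supplied function that fits a model on the given features and returns a P-value per feature, and the stub below uses fixed made-up P-values instead of a real fit.

```python
def backward_elimination(features, fit_p_values, sl=0.05):
    """Remove the least significant feature until every P-value is <= sl.

    fit_p_values(selected) is a hypothetical callback that fits the model
    on `selected` and returns a dict {feature: p_value}.
    """
    selected = list(features)
    while selected:
        p_values = fit_p_values(selected)                  # steps 2 and 5: (re)fit
        worst = max(selected, key=lambda f: p_values[f])   # step 3: highest P-value
        if p_values[worst] <= sl:
            break                                          # all remaining are significant
        selected.remove(worst)                             # step 4: drop it
    return selected


# Stub standing in for a real model fit; the P-values are made up.
FAKE_P = {"rainfall": 0.001, "temperature": 0.03, "fertilizer": 0.2, "soil_ph": 0.6}


def fake_fit(selected):
    return {f: FAKE_P[f] for f in selected}


kept = backward_elimination(FAKE_P, fake_fit)
print(kept)  # ['rainfall', 'temperature']
```

With a real model the P-values change after every refit, which is why the loop calls `fit_p_values` again after each removal rather than ranking the features once.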
Forward Selection

1. Select a significance level (SL) for the P-value, e.g. 5% (0.05)
2. Fit all the simple regression models y -> x_n, one predictor at a time, and select the one with the lowest P-value
3. Keep this variable and fit all possible models with one extra predictor, i.e., add one predictor to the variables you already have
4. Consider the predictor with the lowest P-value. If P < SL, go to step 3; otherwise finish (keep the previous model)
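
Forward selection can be outlined in the same style. As a sketch under assumptions: `fit_p_values` is a hypothetical caller-supplied function that fits a model on the given features and returns a P-value per feature, and the stub uses fixed made-up P-values.

```python
def forward_selection(features, fit_p_values, sl=0.05):
    """Add the most significant candidate until none has P-value < sl.

    fit_p_values(selected) is a hypothetical callback that fits the model
    on `selected` and returns a dict {feature: p_value}.
    """
    selected = []
    remaining = list(features)
    while remaining:
        # P-value of each candidate when added to the current model.
        candidate_p = {f: fit_p_values(selected + [f])[f] for f in remaining}
        best = min(candidate_p, key=candidate_p.get)
        if candidate_p[best] >= sl:
            break                      # finish and keep the previous model
        selected.append(best)          # keep this variable and try one more
        remaining.remove(best)
    return selected


# Stub standing in for a real model fit; the P-values are made up.
FAKE_P = {"rainfall": 0.001, "temperature": 0.03, "fertilizer": 0.2, "soil_ph": 0.6}


def fake_fit(selected):
    return {f: FAKE_P[f] for f in selected}


chosen = forward_selection(FAKE_P, fake_fit)
print(chosen)  # ['rainfall', 'temperature']
```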
Bi-directional Elimination

1. Select a significance level to enter (SL_enter) and to stay (SL_stay) in the model
2. Perform the next step of forward selection (new variables must have P < SL_enter to enter)
3. Perform all steps of backward elimination (old variables must have P < SL_stay to stay in the model)
4. Repeat steps 2 and 3 until no new variables can enter and no old variables can exit
5. FIN: model is ready


All possible models

1. Select a criterion of goodness of fit
2. Construct all possible models. If there are N variables, there are 2^N - 1 models
3. Select the model with the best criterion
4. Model is ready
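
A short sketch shows where the 2^N - 1 count comes from: every non-empty subset of the N variables is one candidate model. The `score` argument is a hypothetical goodness-of-fit function supplied by the caller, and `fake_score` is a made-up stand-in for a real criterion.

```python
from itertools import combinations


def all_subsets(features):
    """Yield every non-empty subset of features (2^N - 1 of them)."""
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            yield subset


def best_subset(features, score):
    """Return the subset with the highest value of the chosen criterion."""
    return max(all_subsets(features), key=score)


features = ["rainfall", "temperature", "fertilizer"]
models = list(all_subsets(features))
print(len(models))  # 7 == 2**3 - 1


# Made-up criterion: rewards two "useful" features, penalises model size.
def fake_score(subset):
    useful = {"rainfall", "fertilizer"}
    return len(useful & set(subset)) - 0.1 * len(subset)


best = best_subset(features, fake_score)
print(best)  # ('rainfall', 'fertilizer')
```

The number of candidate models doubles with each extra variable (10 variables already give 1023 models), so every added feature roughly doubles the fitting work.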

• Very computationally intensive!

• We will be using the backward elimination strategy
