Lecture 4 Intro To ML 27 03 2023
Things
Fall 2021
Lecture#05
Logistic Regression
Logistic Regression (2)
Linear vs Logistic
Example-1
Example-2 (1)
Example-2 (2)
Example-3
Example-3 (2)
Example-3 (3)
Example-3 (4)
Example-3 (5)
Example-5
Multiple Regression Algorithm
What is MLR?
What MLR Can Do?
MRA
Example
Example
MRA
MRA
Example
Prep Work Required!
Design Requirements
Best-fit line
• To find the best-fit line for each independent variable, multiple linear regression
calculates three things:
• The regression coefficients that lead to the smallest overall model error.
• The t-statistic of the overall model.
• The associated p-value (how likely it is that the t-statistic would have occurred by chance if
the null hypothesis of no relationship between the independent and dependent variables
were true).
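As a minimal sketch of the first two quantities, the snippet below fits a least-squares line and computes the t-statistic of its slope. It is deliberately simplified to a single predictor (simple rather than multiple linear regression) so that it runs with the standard library alone. The data values and the names `xs`, `ys`, and `fit_line` are illustrative assumptions, not taken from the lecture. The p-value would follow from comparing the t-statistic with a t-distribution on n - 2 degrees of freedom, which is left out here.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x, plus the slope's t-statistic."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # coefficient minimising the squared error
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope              # tests H0: slope = 0
    return slope, intercept, t_stat

# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, t_stat = fit_line(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, t={t_stat:.3f}")
```

With several predictors the same idea applies to each coefficient separately: every coefficient gets its own standard error, t-statistic, and p-value.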
T-test
• In statistics, the t-statistic is the ratio of
the departure of the estimated value of
a parameter from its hypothesized
value to its standard error.
• It is meant for evaluating whether the
two sets of data are statistically
significantly different from each other.
• Q.1: Find the t-test value for the following two sets of values:
• A = 7, 2, 9, 8 and
• B = 1, 2, 3, 4
• Solution: For the first data set:
• Number of terms in the first set, i.e., n_1 = 4
• Calculate the mean of the first data set: x̄_1 = (7 + 2 + 9 + 8) / 4 = 6.5
Higher values of the t-value, also called t-score,
indicate that a large difference exists between the
two sample sets. The smaller the t-value, the more
similarity exists between the two sample sets.
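The calculation for Q.1 can be sketched in a few lines of Python. The slides do not show which t formula is intended, so this sketch assumes the common unpooled form t = (mean_A - mean_B) / sqrt(s_A^2/n_A + s_B^2/n_B) with sample variances (dividing by n - 1). Because both sets have the same size, the pooled-variance formula gives the same t here. The function name `two_sample_t` is illustrative.

```python
import math
import statistics

def two_sample_t(a, b):
    """t-statistic for the difference of two sample means (unpooled variances)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / std_err

A = [7, 2, 9, 8]
B = [1, 2, 3, 4]
t_value = two_sample_t(A, B)
print(f"mean(A)={statistics.mean(A)}, mean(B)={statistics.mean(B)}, t={t_value:.4f}")
```

For these values the means are 6.5 and 2.5, and t comes out at roughly 2.38.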
P-Value Test
• P-value is the lowest significance level that results in rejecting the null hypothesis.
Example
• Coin toss
• Two possible outcomes
• H0 = This is a fair coin
• H1 = This is not a fair coin
• The P-value test will assume that the H0 hypothesis is true i.e., the
coin is fair
• Let us assume our threshold value to be 5% i.e., 0.05
• Let us assume the following sequence of outputs:
• First toss output is Tail (probability = 0.5)
• First toss output is Tail and second toss output is also Tail (probability
= 0.25)
• First two outputs same as before, third toss output is also Tail
(probability = 0.125)
• First three outputs same as before, fourth toss output is also Tail
(probability = 0.0625)
• First four outputs same as before, fifth toss output is also Tail
(probability = 0.03125)
• First five outputs same as before, sixth toss output is also Tail
(probability = 0.015625)
• After the fifth output the statistical test is significant, since a P-value
of less than 5% indicates that the hypothesis H0 is rejected and
hypothesis H1 is accepted, i.e., the coin is not fair
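The coin-toss reasoning can be checked with a short sketch. It assumes, as the example does, that the P-value is taken to be the probability of an unbroken run of tails under H0 (a fair coin). The names `ALPHA`, `prob_all_tails`, and `first_significant` are illustrative.

```python
ALPHA = 0.05  # significance threshold from the example (5%)

def prob_all_tails(n_tosses: int) -> float:
    """Probability of n_tosses tails in a row if the coin is fair (H0 true)."""
    return 0.5 ** n_tosses

first_significant = None
for n in range(1, 7):
    p = prob_all_tails(n)
    significant = p < ALPHA
    print(f"{n} tails in a row: probability = {p:.6f}, reject H0: {significant}")
    if significant and first_significant is None:
        first_significant = n
```

Four tails in a row still has probability 0.0625, which is above the threshold; the probability first drops below 0.05 at the fifth consecutive tail (0.03125).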
Selecting the independent variables being used
Step 1: Select a significance level to enter (SL_Enter) and to stay (SL_stay) in the model.
Step 2: Fit all the predictors y -> x_n one at a time and select the one with the lowest P-value.
Step 3: Keep this variable and fit all possible models with one extra predictor, i.e., add one predictor to the variables you already have.
Step 4: Consider the predictor with the lowest P-value. If P < SL_Enter, go to step 3; otherwise finish (keep the previous model).
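The selection procedure above amounts to a greedy loop, sketched below under stated assumptions. The callback `p_value_of` is hypothetical: in practice it would fit a regression with the already selected predictors plus one candidate and return that candidate's P-value. Here it is replaced by `toy_p_value`, a stand-in with made-up numbers used purely to exercise the loop. Only the entry threshold (`sl_enter`) is used; the stay threshold matters for procedures that also remove variables.

```python
from typing import Callable, List, Sequence

def forward_select(
    candidates: Sequence[str],
    p_value_of: Callable[[List[str], str], float],
    sl_enter: float = 0.05,
) -> List[str]:
    """Greedy forward selection driven by P-values.

    p_value_of(selected, candidate) is a hypothetical callback that fits the
    model with predictors selected + [candidate] and returns the P-value of
    candidate in that model.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining:
        # Fit one model per remaining predictor and score it by its P-value.
        scored = [(p_value_of(selected, c), c) for c in remaining]
        best_p, best = min(scored)
        if best_p >= sl_enter:
            break  # no candidate is significant: keep the previous model
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy stand-in for real model fits: made-up P-values, for illustration only.
def toy_p_value(selected: List[str], candidate: str) -> float:
    base = {"x1": 0.001, "x2": 0.03, "x3": 0.40}
    # Pretend x2 stops being significant once x1 is already in the model.
    if candidate == "x2" and "x1" in selected:
        return 0.20
    return base[candidate]

chosen = forward_select(["x1", "x2", "x3"], toy_p_value, sl_enter=0.05)
print(chosen)
```

Passing the P-value computation in as a function keeps the loop independent of any particular statistics library.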