Subjective Questions

Question 1

What is the optimal value of alpha for ridge and lasso regression? What will be the changes
in the model if you choose double the value of alpha for both ridge and lasso? What will be
the most important predictor variables after the change is implemented?

Answer:

Ridge Regression:
Ridge regression is a technique for analysing multiple regression data that suffer from
multicollinearity. When multicollinearity occurs, least-squares estimates are unbiased, but
their variances are large, so they may be far from the true values.

For ridge regression, the optimal value of alpha is 20.

Lasso Regression:
Lasso regression is a type of linear regression that uses shrinkage: coefficient estimates are
shrunk towards a central value, typically zero. The lasso procedure encourages simple, sparse
models (i.e. models with fewer parameters).

For lasso regression, the optimal value of alpha is 1.


If we double the value of alpha for both ridge and lasso regression, model complexity will
contribute more to the cost function. Because the minimum-cost hypothesis is selected, a
higher alpha biases the selection toward models with lower complexity: ridge coefficients are
shrunk further towards zero, and more lasso coefficients become exactly zero.

After the second model is built, we compare its R-squared on the train and test datasets with
that of the original model, and select the model with the higher R-squared on both. The
important features/variables are then chosen from the selected model, based on the largest
absolute coefficient values.
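A minimal sketch of this comparison (illustrative names, not the assignment's actual code; it
assumes scaled splits X_train/X_test and y_train/y_test already exist, with X_train a pandas
DataFrame):

from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import r2_score
import pandas as pd

for name, model_cls, alpha in [("ridge", Ridge, 20), ("lasso", Lasso, 1)]:
    for a in (alpha, 2 * alpha):  # the optimal alpha and its double
        model = model_cls(alpha=a).fit(X_train, y_train)
        print(f"{name} alpha={a}: "
              f"train R2={r2_score(y_train, model.predict(X_train)):.3f}, "
              f"test R2={r2_score(y_test, model.predict(X_test)):.3f}")
        # Rank predictors by absolute coefficient to see which remain important
        coefs = pd.Series(model.coef_, index=X_train.columns)
        print(coefs.abs().sort_values(ascending=False).head(5))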

Question 2

You have determined the optimal value of lambda for ridge and lasso regression during the
assignment. Now, which one will you choose to apply and why?

Answer:
Lasso regression would be the better option because it helps with feature elimination and
makes the model more robust:

§ From a Bayesian viewpoint, ridge places a normal (Gaussian) prior on the coefficients of
the linear model, while lasso places a Laplace prior. The Laplace prior's sharp peak at zero
makes it much easier for lasso coefficients to be exactly zero, and therefore easier to
eliminate input variables that do not contribute to the output.

§ Ridge regression can't zero out coefficients, so you end up keeping all the predictors in
the model. In contrast, lasso does both parameter shrinkage and variable selection
automatically (see the sketch after this list).

§ Lasso can produce multiple solutions to the same problem, because its optimum is not
always unique.

§ Ridge regression, by contrast, always produces a single, unique solution.
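A small self-contained sketch on synthetic data (using sklearn's make_regression; the alpha
values are arbitrary) that illustrates the bullets above, with lasso driving some coefficients
to exactly zero while ridge only shrinks them:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

# 10 features, only 3 of which actually drive the target
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

print("ridge zero coefficients:", np.sum(ridge.coef_ == 0))  # typically 0
print("lasso zero coefficients:", np.sum(lasso.coef_ == 0))  # typically > 0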

Question 3

After building the model, you realised that the five most important predictor variables in the
lasso model are not available in the incoming data. You will now have to create another
model excluding the five most important predictor variables. Which are the five most
important predictor variables now?

Answer:

Statistical measures can show the relative importance of the different predictor variables.
However, these measures can't determine whether the variables are important in a practical
sense. To determine practical importance, you'll need to use your subject area knowledge.

How you collect and measure your sample can bias the apparent importance of the variables
in your sample compared to their true importance in the population.
If you randomly sample your observations, the variability of the predictor values in your
sample likely reflects the variability in the population. In this case, the standardized
coefficients and the change in R-squared values are likely to reflect their population values.

However, if you select a restricted range of predictor values for your sample, both statistics
tend to underestimate the importance of that predictor. Conversely, if the sample variability
for a predictor is greater than the variability in the population, the statistics tend to
overestimate the importance of that predictor.

Also, consider the accuracy and precision of the measurements for your predictors because
this can affect their apparent importance. For example, lower-quality measurements can
cause a variable to appear less predictive than it truly is.

How you define “most important” often depends on your goals and subject area. While
statistics can help you identify the most important variables in a regression model, applying
subject area expertise to all aspects of statistical analysis is crucial. Real-world issues are
likely to influence which variable you identify as the most important in a regression model.

For example, if your goal is to change predictor values in order to change the response, use
your expertise to determine which variables are the most feasible to change. There may be
variables that are harder, or more expensive, to change. Some variables may be impossible to
change. Sometimes a large change in one variable may be more practical than a small change
in another variable.

“Most important” is a subjective, context-sensitive characteristic. You can use statistics to
help identify candidates for the most important variable in a regression model, but you’ll
likely need to use your subject area expertise as well.
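As a sketch of the refit the question describes (assuming a fitted Lasso model named lasso
and training data X_train/y_train, with X_train a pandas DataFrame; all names are
hypothetical), one could drop the five strongest predictors, refit, and read off the next five
largest absolute coefficients:

import pandas as pd
from sklearn.linear_model import Lasso

# Identify the five predictors with the largest absolute lasso coefficients
coefs = pd.Series(lasso.coef_, index=X_train.columns)
top5 = coefs.abs().sort_values(ascending=False).head(5).index

# Refit without the five predictors that are unavailable in the incoming data
X_reduced = X_train.drop(columns=top5)
lasso2 = Lasso(alpha=1).fit(X_reduced, y_train)

# The five most important predictors of the new model
new_coefs = pd.Series(lasso2.coef_, index=X_reduced.columns)
print(new_coefs.abs().sort_values(ascending=False).head(5))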

Question 4

How can you make sure that a model is robust and generalisable? What are the implications
of the same for the accuracy of the model and why?

Answer:

A model needs to be robust so that it is not unduly influenced by outliers in the training
data, and generalisable so that its test accuracy is not much lower than its training score.
In other words, the model should remain accurate on datasets other than the one used during
training. Outliers should not be given too much weight; otherwise the accuracy the model
appears to achieve will be inflated. To ensure this is not the case, outlier analysis needs to
be done, and only those outliers that are relevant to the dataset should be retained; those
that do not make sense to keep must be removed. This helps improve the accuracy of the
predictions made by the model. Confidence intervals can also be used to quantify the
uncertainty of those predictions. If the model is not robust, it cannot be trusted for
predictive analysis.
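One common robustness check is k-fold cross-validation; a minimal sketch follows (assuming a
prepared feature matrix X and target y; names are illustrative). Stable scores across folds,
and a small gap between training and validation scores, suggest the model generalises:

from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Score the model on 5 different train/validation splits
scores = cross_val_score(Lasso(alpha=1), X, y, cv=5, scoring="r2")
print("per-fold R2:", scores.round(3))
print("mean R2: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))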

The best accuracy is 100%, indicating that all the predictions are correct. For an imbalanced
dataset, however, accuracy is not a valid measure of model performance: for a dataset where
the default rate is 5%, even if all the records are predicted as 0, the model will still have
an accuracy of 95%.
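A tiny worked example of this point:

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1] * 5 + [0] * 95)  # 5% default rate
y_pred = np.zeros_like(y_true)         # predict 0 for every record
print(accuracy_score(y_true, y_pred))  # 0.95 despite catching no defaults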
