
Unit 3

Trendlines and Regression


BUSINESS ANALYTICS
B.Tech(CSE) IV Year - I Semester
Open Elective - III

Prof. S.Adinarayana, Dept of CS&SE, College of Engineering, Andhra University
Modeling Relationships and Trends in Data
• Mathematics and the descriptive properties of different functional
relationships are important in building predictive analytical models.
• Common types of mathematical functions used in predictive analytical
models include linear, logarithmic, polynomial, power, and exponential
functions.

• R² (R-squared) is a measure of the “fit” of the line to the data.
• The value of R² will be between 0 and 1.
• The larger the value of R², the better the fit.
• Trendlines can be used to model relationships between variables and
understand how the dependent variable behaves as the independent
variable changes.
• For example, demand-prediction models are generally developed by
analyzing historical data and fitting a trendline to it.
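The sketch below shows how a linear trendline can be fitted and its R² computed with NumPy's polyfit; the price/demand numbers are made up for illustration, not taken from the slides.

```python
# Minimal sketch: fit a linear trendline and compute R-squared with NumPy.
# The price/demand data below is hypothetical.
import numpy as np

price = np.array([80, 90, 100, 110, 120], dtype=float)   # independent variable
demand = np.array([52, 47, 43, 38, 35], dtype=float)     # dependent variable

# np.polyfit with degree 1 returns the slope and intercept of the trendline.
slope, intercept = np.polyfit(price, demand, 1)
predicted = slope * price + intercept

# R-squared = 1 - SSE / SST
sse = np.sum((demand - predicted) ** 2)
sst = np.sum((demand - demand.mean()) ** 2)
r_squared = 1 - sse / sst

print(f"demand = {intercept:.2f} + {slope:.2f} * price, R^2 = {r_squared:.3f}")
```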
Simple Linear Regression

• Regression analysis is a tool for building mathematical and statistical models that
characterize relationships between a dependent variable (which must be a ratio
variable and not categorical) and one or more independent, or explanatory, variables,
all of which are numerical (but may be either ratio or categorical).
• Two broad categories of regression models are used often in business settings: (1)
regression models of cross-sectional data and (2) regression models of time-series
data, in which the independent variables are time or some function of time and the
focus is on predicting the future.
• Time-series regression is an important tool in forecasting.
Simple Linear Regression

• A regression model that involves a single independent variable is called
simple regression.
• A regression model that involves two or more independent variables is
called multiple regression.
• Simple linear regression involves finding a linear relationship between
one independent variable, X, and one dependent variable, Y.
• The relationship between two variables can assume many forms, as
illustrated in the figure.

Linear Regression

• Linear regression identifies the linear relationship between target
variables and explanatory variables.
• Here, the variables to be predicted are called target variables, and the
variables that help in predicting the target variables are called
explanatory variables.
• With the linear relationship, we can identify the impact of a change in
the explanatory variables on the target variable, as the sketch below shows.
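A minimal sketch of this idea with scikit-learn; the study-hours and test-score numbers are hypothetical.

```python
# Sketch of simple linear regression with scikit-learn; the data is
# hypothetical. X must be 2-D (one column per explanatory variable).
import numpy as np
from sklearn.linear_model import LinearRegression

hours = np.array([[1], [2], [3], [4], [5], [6]])   # explanatory variable
scores = np.array([52, 58, 61, 67, 72, 78])        # target variable

model = LinearRegression().fit(hours, scores)
print("slope (impact of one extra hour):", model.coef_[0])
print("intercept:", model.intercept_)
print("predicted score for 7 hours:", model.predict([[7]])[0])
```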

Least Squares Regression
• The mathematical basis for the best-fitting regression line is called least-
squares regression.
• In regression analysis, we assume that the values of the dependent
variable, Y, in the sample data are drawn from some unknown population
for each value of the independent variable, X.
• Imagine we have a list of people’s study hours and test scores. In the
scatterplot, we can see a positive relationship exists between study time
and test scores. Statistical software can display the least squares regression
line and its equation.
• This line minimizes the sum of the squared distances between the line and
the data points.
Least Squares Regression Line Formula
y = b + mx
Where:
• y is the dependent variable.
• x is the independent variable.
• b is the y-intercept.
• m is the slope of the line.

The slope and intercept are computed from the data as

m = (N·Σxy − Σx·Σy) / (N·Σx² − (Σx)²)
b = (Σy − m·Σx) / N

where N is the number of observations.

The slope represents the mean change in the dependent variable for a
one-unit change in the independent variable.
Example: Least Squares Regression

Let’s take the data from the hours-of-studying example. We’ll use the
least squares regression line formulas to find the slope and intercept
for our model.

Regression and Analysis of Variance

Regression output typically includes an analysis of variance (ANOVA) table,
which splits the total variation in Y into the portion explained by the
regression and the unexplained (residual) portion.


Testing Hypotheses for Regression Coefficients

Hypothesis testing involves confirming whether the estimated coefficients are
statistically significant. Two common approaches are used:
1. Confidence interval approach: determines if the confidence interval for the
coefficient includes zero.
2. t-test approach: calculates a t-statistic by dividing the estimated coefficient
by its standard error, indicating how many standard-error units the
coefficient is away from zero, as shown in the sketch below.
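A sketch of both approaches using statsmodels' OLS output; the data is hypothetical. conf_int gives the confidence-interval check and tvalues/pvalues give the t-test check.

```python
# Sketch: testing regression coefficients with statsmodels OLS.
# Data is hypothetical; statsmodels reports the t-statistic, p-value,
# and confidence interval for each estimated coefficient.
import numpy as np
import statsmodels.api as sm

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([52, 58, 61, 67, 72, 78], dtype=float)

X = sm.add_constant(x)               # adds the intercept column
results = sm.OLS(y, X).fit()

print(results.tvalues)               # coefficient / standard error
print(results.pvalues)               # significance of each coefficient
print(results.conf_int(alpha=0.05))  # 95% confidence intervals
```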

Confidence Interval
• Confidence intervals provide a systematic approach to quantifying
the uncertainty associated with sample statistics, offering a range
within which population parameters are likely to reside.
• A confidence interval is a range that, with a stated level of confidence,
is expected to contain the true value.
• The selection of a confidence level for an interval determines the
probability that the confidence interval will contain the true
parameter value.
• This range of values is generally used with population-based data,
extracting specific, valuable information with a certain amount of
confidence, hence the term ‘confidence interval’.

Types of Confidence Intervals

1. Confidence Interval for the Mean of Normally Distributed Data
A confidence interval for the mean of normally distributed data is often
calculated using the t-distribution.
2. Confidence Interval for Proportions
For proportions, a confidence interval estimates the likely range of values for the
true population proportion. Typically, the normal approximation or the binomial
distribution is used, depending on the sample size.
3. Confidence Interval for Non-Normally Distributed Data
When dealing with non-normally distributed data or unknown distributions,
bootstrap methods offer a flexible approach. Bootstrap confidence intervals
involve resampling from the dataset to create multiple samples, allowing for the
estimation of the parameter distribution.
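Following the bootstrap idea above, a minimal sketch with made-up sample data:

```python
# Sketch of a bootstrap confidence interval for the mean, using the
# resampling idea described above; the sample data is hypothetical.
import numpy as np

rng = np.random.default_rng(42)
sample = np.array([4.1, 5.3, 2.8, 6.0, 3.9, 5.5, 4.7, 7.2, 3.1, 4.9])

# Resample with replacement many times and record each resample's mean.
boot_means = [rng.choice(sample, size=len(sample), replace=True).mean()
              for _ in range(10_000)]

# The 2.5th and 97.5th percentiles bound a 95% bootstrap interval.
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"95% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```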

t-test Approach: Testing Hypotheses for Regression Coefficients

• t-tests are statistical hypothesis tests used to analyze one or two
sample means.
• Depending on the t-test you use, you can compare a sample mean to a
hypothesized value, compare the means of two independent samples, or
test the difference between paired samples.
• t-tests use t-values and t-distributions to calculate probabilities.

• t-values are a type of test statistic. Hypothesis tests use the test
statistic calculated from your sample to compare your sample to the
null hypothesis.
• A single t-test produces a single t-value. Suppose we repeat our study
many times by drawing many random samples of the same size from this
population, perform t-tests on all of the samples, and plot the
distribution of the t-values.
• This distribution is known as a sampling distribution, which is a type
of probability distribution (here, the t-distribution). The simulation
sketch below illustrates the idea.
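A sketch of this repeated-sampling experiment in code; the population mean, spread, and sample counts are arbitrary choices for illustration.

```python
# Sketch: simulating the sampling distribution of the t-statistic by
# repeatedly drawing samples from one population, as described above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pop_mean, n_samples, sample_size = 100, 5000, 15

t_values = []
for _ in range(n_samples):
    sample = rng.normal(loc=pop_mean, scale=10, size=sample_size)
    # One-sample t-test of each sample against the true population mean.
    t_stat, _ = stats.ttest_1samp(sample, popmean=pop_mean)
    t_values.append(t_stat)

# A histogram of t_values approximates a t-distribution with n-1 df.
print("mean of t-values:", np.mean(t_values))   # close to 0
print("std of t-values:", np.std(t_values))     # slightly above 1
```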
Residual Analysis
• It is a statistical technique used to evaluate the performance of a linear
regression model by analyzing residuals.
• As the linear regression model is not always appropriate for the data,
you should assess the appropriateness of the model by defining
residuals and examining residual plots.
• The difference between the observed value of the dependent variable
(y) and the predicted value (ŷ) is called the residual (e). Each data
point has one residual.

Residual Plots
• A residual plot is a graph that shows the residuals on the vertical axis
and the independent variable on the horizontal axis.
• If the points in a residual plot are randomly dispersed around the
horizontal axis, a linear regression model is appropriate for the data;
otherwise, a nonlinear model is more suitable.
• The table below shows inputs and outputs from a simple linear
regression analysis.

• And the chart below displays the residual (e) and independent
variable (X) as a residual plot.

The residual plot shows a fairly random pattern - the first residual is
positive, the next two are negative, the fourth is positive, and the last
residual is negative. This random pattern indicates that a linear model
provides a decent fit to the data.
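A sketch of how such a residual plot can be produced with matplotlib; the data is hypothetical.

```python
# Sketch of a residual plot: residuals (e = y - y_hat) on the vertical
# axis, the independent variable on the horizontal axis.
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([52, 58, 61, 67, 72, 78], dtype=float)

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)       # one residual per data point

plt.scatter(x, residuals)
plt.axhline(0, linestyle="--")                # reference line at e = 0
plt.xlabel("Independent variable (X)")
plt.ylabel("Residual (e)")
plt.title("Residual plot: look for a random scatter around zero")
plt.show()
```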

Regression Assumptions
• Linear regression is a useful statistical method to understand the
relationship between two variables, x and y.
• Before conducting linear regression, we must first make sure that
four assumptions are satisfied:
1. Linear relationship, 2. Independence, 3. Homoscedasticity, and
4. Normality.

1. Linear relationship: There exists a linear relationship between the
independent variable, x, and the dependent variable, y.
2. Independence: The residuals are independent. In particular, there is
no correlation between consecutive residuals in time series data.
3. Homoscedasticity: The residuals have constant variance at every
level of x.
4. Normality: The residuals of the model are normally distributed.
If one or more of these assumptions are violated, then the results of our
linear regression may be unreliable or even misleading.
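One way to check assumptions 2-4 numerically is sketched below; the diagnostic functions are from statsmodels and scipy, the data is hypothetical, and the p-value cutoffs are conventional rules of thumb rather than anything from the slides.

```python
# Sketch: quick numeric checks of the regression assumptions.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([52, 58, 61, 67, 72, 78, 81, 88], dtype=float)

X = sm.add_constant(x)
results = sm.OLS(y, X).fit()

print("Durbin-Watson (independence, ~2 is good):",
      durbin_watson(results.resid))
print("Breusch-Pagan p-value (homoscedasticity, > 0.05 is good):",
      het_breuschpagan(results.resid, X)[1])
print("Shapiro-Wilk p-value (normality, > 0.05 is good):",
      stats.shapiro(results.resid).pvalue)
```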
Multiple Linear Regression
• Multiple linear regression is one of the important regression techniques.
It models the linear relationship between a single dependent continuous
variable Y and two or more independent variables x1, x2, ..., xk:
Y = b0 + b1x1 + b2x2 + ... + bkxk + ε
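A minimal sketch with scikit-learn and two illustrative predictors; the numbers are made up.

```python
# Sketch of multiple linear regression: one continuous target y and
# several explanatory variables; the data is illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row is one observation: [x1, x2]
X = np.array([[60, 22], [62, 25], [67, 24], [70, 20],
              [71, 15], [72, 14], [75, 14], [78, 11]], dtype=float)
y = np.array([140, 155, 159, 179, 192, 200, 212, 215], dtype=float)

model = LinearRegression().fit(X, y)
print("intercept b0:", model.intercept_)
print("coefficients b1, b2:", model.coef_)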

Multiple Regression Analysis: An Example

• Suppose we have the following dataset with one response variable y
and two predictor variables x1 and x2.
• Steps to fit a multiple linear regression model to this dataset:
1. Calculate Σx1², Σx2², Σx1y, Σx2y, and Σx1x2.
2. Calculate the regression sums.
3. Calculate b0, b1, and b2.
4. Place b0, b1, and b2 in the estimated linear regression equation.

1. Calculate ΣX1², ΣX2², ΣX1y, ΣX2y, and ΣX1X2 from the raw data.

2. Calculate the regression sums (the same quantities in deviation form):
Σx1² = ΣX1² − (ΣX1)²/n
Σx2² = ΣX2² − (ΣX2)²/n
Σx1y = ΣX1y − (ΣX1)(Σy)/n
Σx2y = ΣX2y − (ΣX2)(Σy)/n
Σx1x2 = ΣX1X2 − (ΣX1)(ΣX2)/n

3. Calculate b0, b1, and b2:
b1 = [(Σx2²)(Σx1y) − (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]
b2 = [(Σx1²)(Σx2y) − (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]
b0 = ȳ − b1·x̄1 − b2·x̄2

4. Place b0, b1, and b2 in the estimated linear regression equation:
ŷ = b0 + b1x1 + b2x2
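Carrying out these four steps in NumPy is sketched below; the eight-observation dataset is an assumption for illustration, not the original slide data.

```python
# Implementing steps 1-4 above with NumPy (illustrative data); any
# least squares routine would give the same coefficients.
import numpy as np

x1 = np.array([60, 62, 67, 70, 71, 72, 75, 78], dtype=float)
x2 = np.array([22, 25, 24, 20, 15, 14, 14, 11], dtype=float)
y  = np.array([140, 155, 159, 179, 192, 200, 212, 215], dtype=float)
n = len(y)

# Steps 1-2: regression sums in deviation form.
s_x1x1 = np.sum(x1**2) - np.sum(x1)**2 / n
s_x2x2 = np.sum(x2**2) - np.sum(x2)**2 / n
s_x1y  = np.sum(x1*y)  - np.sum(x1)*np.sum(y) / n
s_x2y  = np.sum(x2*y)  - np.sum(x2)*np.sum(y) / n
s_x1x2 = np.sum(x1*x2) - np.sum(x1)*np.sum(x2) / n

# Step 3: coefficients from the two-predictor normal equations.
denom = s_x1x1 * s_x2x2 - s_x1x2**2
b1 = (s_x2x2 * s_x1y - s_x1x2 * s_x2y) / denom
b2 = (s_x1x1 * s_x2y - s_x1x2 * s_x1y) / denom
b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()

# Step 4: the estimated regression equation.
print(f"y_hat = {b0:.3f} + {b1:.3f}*x1 + {b2:.3f}*x2")
```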
Interpret a Multiple Linear Regression Equation

Each coefficient bi represents the expected change in y for a one-unit
increase in xi, holding the other independent variables constant; b0 is the
predicted value of y when all independent variables equal zero.
Evaluation Metrics for Regression Models
The evaluation metrics for a linear regression model are:
1. Coefficient of Determination, or R-Squared (R²)
2. Root Mean Squared Error (RMSE)

R-Squared
• R-squared describes the amount of variation that is captured by the developed model. It always
ranges between 0 and 1. The higher the value of R-squared, the better the model fits the data.

Root Mean Squared Error (RMSE)
• RMSE measures the average magnitude of the errors, or residuals, between the predicted values
generated by a model and the actual observed values in a dataset.
• It ranges between 0 and positive infinity. Lower RMSE values indicate better predictive
performance.
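A sketch of computing both metrics with scikit-learn; the actual/predicted values are hypothetical.

```python
# Sketch: computing R-squared and RMSE with scikit-learn metrics.
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error

y_true = np.array([52, 58, 61, 67, 72, 78], dtype=float)
y_pred = np.array([53, 57, 62, 66, 73, 77], dtype=float)

r2 = r2_score(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root of the MSE

print(f"R-squared: {r2:.3f}")   # closer to 1 is better
print(f"RMSE: {rmse:.3f}")      # lower is better, in units of y
```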
Regression with Categorical Independent Variables (Two or More Levels)

• Categorical variables with two levels may be directly entered as
predictor or predicted variables in a multiple regression model.
• Their use in multiple regression is a straightforward extension of
their use in simple linear regression.
• When entered as predictor variables, the interpretation of regression
weights depends upon how the variable is coded, as the coding sketch
below illustrates.
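One common coding is dummy (0/1) variables; a minimal pandas sketch with made-up data:

```python
# Sketch of dummy (0/1) coding for a categorical predictor with pandas;
# the exercise/weight data below is hypothetical.
import pandas as pd

df = pd.DataFrame({
    "exercise": ["daily", "2-3 times", "once", "daily", "once"],
    "weight":   [150, 160, 172, 148, 168],
})

# drop_first=True keeps one level as the baseline absorbed by the intercept.
coded = pd.get_dummies(df, columns=["exercise"], drop_first=True)
print(coded)
```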

Example: Regression with Categorical Variables

• Consider the effect of (self-reported) exercise on weight in college
students.
• The students were asked the question: how often do you exercise in
a regular week?

• Let’s take a look at how many observations we have at each level of this
variable.
• A boxplot of this data shows the distribution of weight within each
exercise category.

• Thirteen students did not answer this question; they need to be removed
from consideration.
• Notice that only the first three options were reported in this data
set (nobody answered with the 4 or 5 options in the survey).
• To build our regression model we want something of the form:
weight = α + β1·(exercise == 2) + β2·(exercise == 3) + ε
• The “works out daily” group (exercise == 1) covers everyone who doesn’t
work out 2-3 times or once a week and is therefore included in the α
term. A fitting sketch follows below.
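A sketch of fitting this model with statsmodels' formula API; the weights and category codes are made up, and C() performs the dummy coding with the first level as the baseline.

```python
# Sketch: fitting weight ~ exercise category with statsmodels' formulas.
# C(exercise) creates dummy variables; level 1 ("works out daily")
# becomes the baseline absorbed by the intercept.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "exercise": [1, 1, 2, 2, 2, 3, 3, 1, 3, 2],   # 1=daily, 2=2-3x, 3=once
    "weight":   [148, 152, 160, 158, 163, 170, 174, 150, 169, 161],
})

model = smf.ols("weight ~ C(exercise)", data=df).fit()
print(model.params)       # intercept = baseline mean; betas = differences
print(model.conf_int())   # confidence intervals for each coefficient
```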

Confidence Intervals for the Coefficients

• The confidence intervals show that we can’t conclude there is any
difference in the average weight of these three categories, as the
intervals for the category coefficients contain both positive and
negative values.
• The output also gives us a confidence interval for the average weight of
those in category 1 (exercise every day), as this is the intercept.

Regression with Nonlinear Terms

• Non-linear regression is a general description of statistical
techniques used to model the relationship between a dependent
variable and one or more independent variables.
• Unlike linear regression, which assumes a linear relationship
between the independent features and dependent labels, non-linear
regression allows more complex relationships to be modeled, for
example by adding polynomial terms (see the sketch below).
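One common approach is polynomial regression: a linear model fit on polynomial features of x. A minimal scikit-learn sketch with synthetic data:

```python
# Sketch: modeling a nonlinear relationship with polynomial features,
# one common way to add nonlinear terms; the data is synthetic.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

x = np.linspace(0, 10, 30).reshape(-1, 1)
y = 2.0 + 0.5 * x.ravel() ** 2 + np.random.default_rng(1).normal(0, 2, 30)

# Fit y as a quadratic function of x: y = b0 + b1*x + b2*x^2
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)
print("R^2 on training data:", model.score(x, y))
```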

Applications of Nonlinear Regression
• Because many real-world data sets do not follow a linear relationship,
nonlinear regression has many applications.
• These applications include predictive modeling, time series
forecasting, function approximation, and unraveling intricate
relationships between variables.
• Non-linear regression algorithms are machine learning techniques used to
model and predict non-linear relationships between input variables and
target variables.
• These algorithms aim to capture complex patterns and interactions that
cannot be effectively represented by a linear model.
Types of Regression Models
• There are many types of regression models; several of them can capture nonlinear
relationships. They are:
1. Simple Linear Regression: This model involves one independent variable used to predict the
dependent variable. It’s a basic yet powerful tool in understanding relationships between
variables.
2. Multiple Linear Regression: Unlike simple linear regression, this model incorporates
multiple independent variables to predict the dependent variable. It provides a fuller analysis
by considering various factors simultaneously.
3. Polynomial Regression: This model fits a curve to the data points. It’s useful when the
relationship between the independent and dependent variables is non-linear.
4. Logistic Regression: Primarily used for binary classification problems, logistic regression
predicts the probability of occurrence of an event by fitting data to a logistic curve.
5. Ridge Regression and Lasso Regression: These are regularization techniques used to prevent
overfitting in predictive models by adding a penalty term to the loss function.
6. Time Series Regression: This model is ideal for looking at data points collected over time to
identify trends, seasonality, and other patterns.
7. Ordinal Regression: It’s used when the dependent variable is ordinal, i.e., it has ordered
categories.
Advanced Techniques in Regression Analysis
When it comes to regression analysis, there are advanced techniques that can
take your models to the next level. They are:
1. Regularization: helps prevent overfitting by adding a penalty for complex
models.
2. Gradient Boosting: a powerful ensemble technique that builds models
sequentially to correct errors made by previous models.
3. Neural Networks: a complex modeling technique that can capture complex
patterns in data, though it requires a large amount of data.
4. Time Series Analysis: useful for modeling and forecasting time-dependent
data.
5. Support Vector Machines (SVM): effective in high-dimensional spaces and
ideal for cases where the data is not linearly separable.

