8th MLlab
Linear regression is one of the most commonly used predictive modelling techniques. Its aim
is to find a mathematical equation for a continuous response variable Y as a function of one
or more predictor variables X, so that the fitted model can be used to predict Y when only X
is known. For a single predictor, the model is given in equation (1).
$Y = \beta_1 + \beta_2 X + \epsilon$ (1)
where $\beta_1$ is the intercept, $\beta_2$ is the slope, and $\epsilon$ is the error term.
Problem Specification
In the given problem, 'Air velocity' and 'Evaporation Coefficient' are the variables, with 10
observations each.
The goal is to establish a mathematical equation for 'Evaporation Coefficient' as a function of
'Air velocity', so that it can be used to predict 'Evaporation Coefficient' when only the
'Air velocity' is known. It is therefore desirable to build a linear regression model with
'Evaporation Coefficient' as the response variable and 'Air velocity' as the predictor. Before
building the regression model, it is good practice to analyse and understand the variables.
>airvelocity<-c(20,60,100,140,180,220,260,300,340,380)
>evaporationcoefficient<-c(0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17,1.65)
>airvelocity
[1] 20 60 100 140 180 220 260 300 340 380
> evaporationcoefficient
[1] 0.18 0.37 0.35 0.78 0.56 0.75 1.18 1.36 1.17 1.65
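As a rough sketch of what estimating equation (1) involves for these data, the block below summarises both variables and then computes the intercept and slope by hand. It assumes ordinary least squares as the estimation method, which the lab sheet does not name explicitly, and the helper names (x_bar, y_bar, sxy, sxx, slope_hat, intercept_hat, fit) are illustrative only. The hand-computed values are cross-checked against R's built-in lm().

```r
# Observations from the lab sheet
airvelocity <- c(20, 60, 100, 140, 180, 220, 260, 300, 340, 380)
evaporationcoefficient <- c(0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17, 1.65)

# Quick numeric look at both variables
summary(airvelocity)
summary(evaporationcoefficient)

# Ordinary least squares estimates (assumed estimation method):
#   slope     = Sxy / Sxx
#   intercept = mean(y) - slope * mean(x)
x_bar <- mean(airvelocity)
y_bar <- mean(evaporationcoefficient)
sxy <- sum((airvelocity - x_bar) * (evaporationcoefficient - y_bar))
sxx <- sum((airvelocity - x_bar)^2)
slope_hat <- sxy / sxx
intercept_hat <- y_bar - slope_hat * x_bar

# Cross-check against R's built-in linear model fit
fit <- lm(evaporationcoefficient ~ airvelocity)
print(c(intercept = intercept_hat, slope = slope_hat))
print(coef(fit))
```

The two printed pairs should agree, since lm() solves the same least-squares problem for a single predictor.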
Graphical analysis
The aim of this exercise is to build a simple regression model that you can use to predict
'Evaporation Coefficient'. But before jumping into the syntax, let's try to understand these
variables graphically.
Typically, for each of the predictors, the following plots help visualize the patterns:
Using Scatter Plot to Visualize the Relationship
Scatter plots can help visualize linear relationships between the response and predictor
variables. Ideally, if you have many predictor variables, a scatter plot is drawn for each one
of them against the response, along with the line of best fit as seen below.
>scatter.smooth(airvelocity,evaporationcoefficient,main="Airvelocity ~ Evaporation Coefficient")
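[Figure: scatter plot of evaporationcoefficient against airvelocity with a smoothing line, titled "Airvelocity ~ Evaporation Coefficient"]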
The scatter plot along with the smoothing line above suggests a linear and positive
relationship between 'Air velocity' and 'Evaporation Coefficient'.
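The visual impression of a linear, positive relationship can also be checked numerically. The following is a minimal sketch, not part of the original lab steps: it computes the Pearson correlation and redraws the scatter plot with a straight least-squares line. Note that scatter.smooth() draws a loess smoothing curve rather than a straight line of best fit, so the abline() overlay is added here as an assumption about what a "line of best fit" plot would look like. The variable name r and the axis labels are illustrative.

```r
# Observations from the lab sheet
airvelocity <- c(20, 60, 100, 140, 180, 220, 260, 300, 340, 380)
evaporationcoefficient <- c(0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17, 1.65)

# Pearson correlation: values near +1 indicate a strong positive linear association
r <- cor(airvelocity, evaporationcoefficient)
print(r)

# Scatter plot with a straight least-squares line instead of a loess curve
plot(airvelocity, evaporationcoefficient,
     main = "Airvelocity ~ Evaporation Coefficient",
     xlab = "Air velocity", ylab = "Evaporation coefficient")
abline(lm(evaporationcoefficient ~ airvelocity), col = "blue")
```

If the points scatter closely around the straight line and the correlation is close to 1, a simple linear model is a reasonable starting point for these data.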
[Figure: two panels titled "Airvelocity" and "Distance"]