
Intro To Regression

The document discusses energy demand modeling using the Simple-E modeling tool. It outlines the types of questions that energy demand modeling can help answer, such as forecasting future energy demand levels by sector and fuel and determining future power generation needs. The document also provides an overview of regression analysis techniques that can be used in energy demand modeling, including describing the regression line and determination coefficient.

Uploaded by

Shewakena Girma

Energy Modelling Using Simple-E

Joint SPC-APEC Regional Workshop on Energy Statistics and Modeling for the SDG7 and the COP21 INDC Energy Targets
14-18 March 2016
Nuku’alofa, Kingdom of Tonga

MICHAEL SINOCRUZ
Asia Pacific Energy Research Centre
What do we want to know?

Energy Demand

• How much energy will be demanded in 20xx, or over the outlook period?
• Which sector consumes the most energy?
• How much oil, gas, or coal will be needed to meet the energy requirements in 20xx or over the outlook period?
• How much electricity will be demanded over the coming years?

Power generation

• How many new power plants, and of what type, will be needed to meet the electricity demand?
• Is the indigenous supply enough, or how much will have to be imported?
Regression Analysis

Regression analysis is concerned with the study of the relationship between one
variable, called the explained (or dependent) variable, and one or more other
variables, called explanatory (or independent) variables.
Reference: Damodar Gujarati, Essentials of Econometrics (second edition), McGraw-Hill

Regression analysis enables us to describe the straight line that best fits a series of
ordered pairs (x, y). The equation for a straight line, known as a linear equation, is
expressed as:
y = a + bx

where y = the dependent variable
x = the independent variable
a = the y-intercept
b = the slope of the regression line
Regression Analysis

Regression Line or the Least Squares Regression Line
= the line found by the least squares method, a mathematical procedure that identifies
the linear equation that best fits a set of ordered pairs by finding the values of a, the
y-intercept, and b, the slope. The goal of the least squares method is to minimize the
total squared error between the values of y (actual value) and ŷ (predicted value).

a or the y-intercept
= the point where the line crosses the y-axis

b or the slope of the straight line
= the ratio of the rise of the line over the run of the line.
If b = 0, there is no linear relationship between the independent and
dependent variables.
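The least squares idea can be illustrated with a minimal Python sketch. The data points and the helper names `fit_line` and `total_squared_error` are hypothetical, chosen only for illustration. The sketch fits a and b, then checks that nudging either value away from the fit increases the total squared error.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    return a, b


def total_squared_error(a, b, xs, ys):
    """Sum of squared differences between actual y and predicted a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical ordered pairs (not taken from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best = total_squared_error(a, b, xs, ys)

# Any small change to a or b should give a larger total squared error.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
is_minimum = all(
    total_squared_error(a + da, b + db, xs, ys) > best for da, db in nudges
)
print(f"a = {a:.2f}, b = {b:.2f}, minimum confirmed: {is_minimum}")
```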
Regression Analysis - Determination Coefficient

[Figure: two scatter plots of Y against X, each with a fitted line Y = a*X + b.
Left panel: determination coefficient = 0.6-0.7, "Not fitting".
Right panel: determination coefficient = 0.8-0.9, "Fitting".]

Determination coefficient = (Correlation coefficient)²
Regression Analysis - Ordinary Least Squares Method

• Involves the use of statistical procedures to estimate mathematically the average
relationships between the dependent and independent variables.
• The Ordinary Least Squares (OLS) Method minimizes the difference between the observed
and the estimated value. It requires that the sum of the squares of the errors for the
best-fit function be at a minimum.

[Figure: points (x, y) scattered around the fitted line Y = a + bX, with the error
$e_t = Y_t - \hat{Y}_t$ marked.]

Source: Prof. Wali Del Mundo
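Forecasting Model Validity and Accuracy Testing
Regression Analysis - Ordinary Least Squares Method

[Figure: plot of Y against X showing the regression equation, an actual value $Y_i$,
its estimated value $\hat{Y}_i$, and the average value $\bar{Y}$, with the distances
$Y_i - \hat{Y}_i$, $Y_i - \bar{Y}$, and $\hat{Y}_i - \bar{Y}$ marked.]

$Y_i$ = actual value
$\hat{Y}_i$ = estimated value
$\bar{Y}$ = average value

Error: $\varepsilon_i = Y_i - \hat{Y}_i$

Error Sum of Squares: $ESS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$

Total Sum of Squares: $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$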
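As a hedged illustration of these two sums, the sketch below computes ESS and TSS for a small hypothetical data set. It also checks that, for a one-variable least squares fit, 1 - ESS/TSS equals the squared correlation coefficient. Forming the determination coefficient as 1 - ESS/TSS is the usual textbook relation and is an assumption here, since the slides do not write it out. All names and data values are illustrative.

```python
import math

# Hypothetical observations (not from the slides).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

sxx = sum((x - x_mean) ** 2 for x in xs)
syy = sum((y - y_mean) ** 2 for y in ys)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

# Least squares line y_hat = intercept + slope * x.
slope = sxy / sxx
intercept = y_mean - slope * x_mean
estimates = [intercept + slope * x for x in xs]

# Error Sum of Squares: actual value minus estimated value, squared.
ess = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, estimates))
# Total Sum of Squares: actual value minus average value, squared.
tss = sum((y - y_mean) ** 2 for y in ys)

determination = 1 - ess / tss            # assumed textbook relation
correlation = sxy / math.sqrt(sxx * syy)

print(f"ESS = {ess:.3f}, TSS = {tss:.3f}")
print(f"1 - ESS/TSS = {determination:.4f}, r^2 = {correlation ** 2:.4f}")
```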
Regression Analysis

Age of vehicle (x)   Repair bill in pesos (y)   xy        x²        y²
5.5                  1,200                      6,600     30.25     1,440,000
10.1                 900                        9,090     102.01    810,000
3.2                  450                        1,440     10.24     202,500
4.5                  750                        3,375     20.25     562,500
2.5                  200                        500       6.25      40,000
-------------------------------------------------------------------------------
∑ 25.8               3,500                      21,005    169.00    3,055,000

y = a + bx
Regression Analysis

$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5(21{,}005) - (25.8)(3{,}500)}{5(169) - (25.8)^2} = 82.10$

$a = \frac{\sum y - b\sum x}{n} = \frac{3{,}500 - 82.10(25.8)}{5} = 276.36$

For a vehicle of age x = 8:
$\hat{y} = a + bx = 276.36 + 82.10x = 276.36 + 82.10(8) = \text{PhP } 933.16$
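The slope and intercept formulas can be checked with a short Python sketch using the vehicle data from the table. The variable names are illustrative only.

```python
# Vehicle age (x) and repair bill in pesos (y), as tabulated in the slides.
ages = [5.5, 10.1, 3.2, 4.5, 2.5]
bills = [1200, 900, 450, 750, 200]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(bills)
sum_xy = sum(x * y for x, y in zip(ages, bills))
sum_x2 = sum(x * x for x in ages)

# Slope and intercept from the summation formulas.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Predicted repair bill for a vehicle of age 8.
predicted_bill = a + b * 8

print(f"b = {b:.2f}, a = {a:.2f}, predicted bill = {predicted_bill:.2f}")
```

Because the sketch carries the unrounded slope forward, its intercept comes out near 276.38 rather than the 276.36 obtained with b rounded to 82.10. The prediction still rounds to about 933.16.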
Regression Analysis

Another way of measuring the strength of a relationship is with the Coefficient
of Determination, r². This represents the percentage of the variation in y that is
explained by x. It is the square of the correlation coefficient r:

$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{14{,}725}{\sqrt{(179.36)(3{,}025{,}000)}} \approx 0.632$

$r^2 \approx 0.400$, or about 40.0 percent

Regression Analysis

In other words, about 40.0 percent of the variation in the repair bill is explained by the
age of the vehicle. If r² = 1, all the variation in y is explained by the variable x.
If r² = 0, none of the variation in y is explained by the variable x.
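A minimal Python sketch of the correlation formula, evaluated on the vehicle age and repair bill data from the earlier table, is given here as a cross-check. The variable names are illustrative.

```python
import math

# Vehicle age (x) and repair bill in pesos (y), as tabulated in the slides.
ages = [5.5, 10.1, 3.2, 4.5, 2.5]
bills = [1200, 900, 450, 750, 200]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(bills)
sum_xy = sum(x * y for x, y in zip(ages, bills))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in bills)

# Correlation coefficient r from the summation formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

# Coefficient of determination: share of the variation in y explained by x.
r_squared = r ** 2

print(f"r = {r:.3f}, r^2 = {r_squared:.3f} ({100 * r_squared:.1f} percent)")
```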

NOTE:

Just because a relationship between two variables is statistically significant
doesn’t necessarily mean that a causal relationship truly exists. The
mathematical relationship could be due to pure coincidence. Always use your
best judgment when making these decisions.
END USE DEMAND 101 (A)
Indicators should be calculated at the most disaggregated end-use level possible in order to represent
each activity level.

[Diagram: disaggregation of the Industry Sector, using the International Standard Industrial Classification (ISIC)
Industry sub-sector: Manufacturing; Construction; Mining and Quarrying
Specific energy industry activity: Food and Beverage; Basic Metals (Iron and Steel); Non-metallic Minerals (Cement)
End use: Industrial cooking; Industrial space cooling; Space heating; Industrial lighting
Energy: Fuel (Oil, Electricity)]
END USE DEMAND 101 (B)

[Diagram: disaggregation of the Transport Sector and the Residential/Commercial Sectors
Transport Sector
  Transport mode: Road; Rail; Air; Water
  Passenger; Freight
  End use / Energy: Vehicle type (light vehicles, i.e. cars, SUVs), by fuel
Residential/Commercial Sectors
  End use by activity (space heating and cooling, appliance use, lighting)
  End use by equipment type or energy source]
Regression Analysis using Simple-E

• SEE (Simple Econometric Simulation System) is an Add-In application for Microsoft
Excel. It exploits all the advantages of the native spreadsheet functions as well as the
open interfaces with other Windows applications.

• Simple-E involves three processes, running from data input (worksheet) to simulation
(worksheet): 1) Model Check; 2) Model Solve; and 3) Simulation.

• The following diagram shows the basic concept and the relationship between these
processes within the three worksheets:

Regression Analysis using Simple-E

[Diagram: the Model Check, Model Solve, and Simulation processes across the three worksheets]
Introduction to SEE – Main Menu

[Screenshot of the SEE Main Menu, with these callouts:]
• Button to start the Main Menu
• Button to start the simulation of the whole model
• Button to create SEE working sheets in a new file
• Button to create SEE working sheets in the current file
• SEE working sheets

Source: Sichao Kan


Introduction to SEE – Data Sheet
Define the Code Name for all the variables and input data.

[Screenshot of the “data” sheet, with these callouts:]
• Free area
• Input the data for the variables here.
• The comments on each Code Name usually go here: for example, its meaning, units,
and the source of the data.
• Input the time series here. It ends with the year (or month, day, etc.) up to which
you want to forecast.
• Put the Code Name of the dependent variable y here. Make sure that the code name is
exactly the same as the one you input in the “data” sheet.

Source: Sichao Kan
Introduction to SEE – Model Sheet
Build your model on the left half of the “model” sheet.

[Screenshot of the “model” sheet, with these callouts:]
• Free area
• The comments on the Code Names usually go here: for example, their meanings,
units, and the sources of the data.
• Input the Code Names of the independent variables here. Make sure that the code
names are exactly the same as those input in the “data” sheet.
• Put the Code Name of the dependent variable y here. Make sure that the code name is
exactly the same as the one input in the “data” sheet.
• “Option Type” includes ① the form of the relationship between Y and X1, X2, …
(equal, linear (OLS), double-log, semi-log, etc.), and ② how you want Y to change
with time (linear trend, growth trend, etc.)
Introduction to SEE – Typical Function Forms

Y (Internal)   Option Type   X1     X2          Function form
RGDP           $TG                              Increase with historical growth rate
TFED           (blank)       RGDP               Linear (OLS)
TFED           $DL           RGDP               Double-log
TFED           $SL           RGDP               Semi-log
TFED           (blank)       RGDP   lag1.TFED   Linear (OLS) with a lagged variable

lag1.Xj: the value of Xj one year earlier.

Dummy values: used to neglect the abnormal value of designated years.
In SEE, the dummy value of a certain year is denoted by “dum.year”.
For example: dum.1997

Source: Sichao Kan
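To make the lag and dummy notation concrete, here is a minimal Python sketch of how such series are conventionally built by hand. It is not a description of SEE's internals, which the slides do not show. The TFED values are hypothetical, and the 0/1 coding of the dummy is the usual econometric convention, assumed here.

```python
# Hypothetical annual series for illustration only.
years = [1995, 1996, 1997, 1998, 1999]
tfed = [100.0, 104.0, 90.0, 108.0, 112.0]   # 1997 is the "abnormal" year

# lag1.TFED: the value of TFED one year earlier.
# The first year has no earlier value, so it is left as None.
lag1_tfed = [None] + tfed[:-1]

# dum.1997: assumed convention of 1 in the designated year and 0 elsewhere,
# so that a regression can absorb that year's abnormal value.
dum_1997 = [1 if year == 1997 else 0 for year in years]

for year, value, lagged, dummy in zip(years, tfed, lag1_tfed, dum_1997):
    print(year, value, lagged, dummy)
```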


Growth Rate and Elasticity
Growth Rate

Elasticity

For the double-log function ln(Y) = a + b*ln(X), the slope coefficient ‘b’ is
equal to the average elasticity coefficient. This is because
b = d ln(Y) / d ln(X) = (dY/Y) / (dX/X), the percentage change in Y per one percent
change in X.

Source: Sichao Kan
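The slide's own growth rate and elasticity formulas did not survive in this text, so the following Python sketch uses common definitions as an assumption: the average annual (compound) growth rate, and elasticity as the ratio of the growth rate of Y to the growth rate of X. The function names and all numbers are hypothetical.

```python
def average_annual_growth(first_value, last_value, years):
    """Compound average annual growth rate between two values."""
    return (last_value / first_value) ** (1 / years) - 1


def elasticity(growth_of_y, growth_of_x):
    """Percentage change in Y per one percent change in X."""
    return growth_of_y / growth_of_x


# Hypothetical example: energy demand grows from 100 to 121 over two years
# while GDP grows from 100 to 110.25 over the same period.
energy_growth = average_annual_growth(100.0, 121.0, years=2)
gdp_growth = average_annual_growth(100.0, 110.25, years=2)
energy_gdp_elasticity = elasticity(energy_growth, gdp_growth)

print(f"energy growth = {energy_growth:.1%}, GDP growth = {gdp_growth:.1%}")
print(f"elasticity = {energy_gdp_elasticity:.2f}")
```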


Introduction to SEE – Simulation Sheet

Once you click the “All through” button in the Main Menu, and if there are no bugs in your
model, the simulation results (the model outputs) will be displayed in the “simulation”
sheet automatically.

<Summary> [Variables Total 5; Internal 5 (Regression 4; Definition 0; Direct 1); External 0]

No.  Variable  2023      2024      2025      2026      2027      2028      2029      2030
1    RGDP      8651.374  9094.028  9558.309  10045.27  10556.03  11091.73  11653.62  12242.95
2    TFED      165244.9  174214.9  183623.2  193491.1  203841.1  214696.8  226082.9  238025.2
3    TFED      177443.8  188393.9  199994    212282.5  225300    239089.4  253696.4  269169
4    TFED      561739.4  683902.2  840675.5  1043859   1309932   1662160   2133782   2772854
5    TFED      161484.8  170220    179387.5  189006.5  199097.8  209683.8  220788.1  232435.6

Equation in original form:
1. RGDP; G%(5.42/5.16); [Growth Trend, Constant Adjusted]
2. TFED; G%(6.33/5.41); [=-10068.3+20.2642*RGDP] & [Linear Trend, Constant Adjusted]
3. TFED; G%(6.33/6.13); [=EXP(1.20766)*(RGDP)^1.20002] & [Linear Trend, Constant Adjusted]
4. TFED; G%(6.33/20.81); [=EXP(9.39292+0.000444539*RGDP)] & [Linear Trend, Constant Adjusted]
5. TFED; G%(6.33/5.27); [=-2983.3+7.2932*RGDP+0.66185*LAG1.TFED] & [Linear Trend, Constant Adjusted]

Source: Sichao Kan
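As a rough cross-check, the four TFED equations shown in the simulation sheet can be evaluated directly in Python. The function names below are illustrative. The coefficients are the rounded values printed in the sheet, so small differences from the displayed results are to be expected.

```python
import math

# RGDP values for 2023 and 2024 as shown in the simulation sheet.
rgdp_2023 = 8651.374
rgdp_2024 = 9094.028


def tfed_linear(rgdp):
    """Equation 2: linear (OLS) form."""
    return -10068.3 + 20.2642 * rgdp


def tfed_double_log(rgdp):
    """Equation 3: double-log form, written back in original units."""
    return math.exp(1.20766) * rgdp ** 1.20002


def tfed_semi_log(rgdp):
    """Equation 4: semi-log form, written back in original units."""
    return math.exp(9.39292 + 0.000444539 * rgdp)


def tfed_with_lag(rgdp, previous_tfed):
    """Equation 5: linear form with last year's TFED as a second variable."""
    return -2983.3 + 7.2932 * rgdp + 0.66185 * previous_tfed


linear_2023 = tfed_linear(rgdp_2023)          # sheet shows 165244.9
double_log_2023 = tfed_double_log(rgdp_2023)  # sheet shows 177443.8
semi_log_2023 = tfed_semi_log(rgdp_2023)      # sheet shows 561739.4
# Equation 5 for 2024 uses its own 2023 result (161484.8) as LAG1.TFED.
lagged_2024 = tfed_with_lag(rgdp_2024, 161484.8)  # sheet shows 170220

for label, value in [
    ("linear 2023", linear_2023),
    ("double-log 2023", double_log_2023),
    ("semi-log 2023", semi_log_2023),
    ("lagged 2024", lagged_2024),
]:
    print(f"{label}: {value:.1f}")
```

The semi-log form grows far faster than the other three over the forecast period, which is one reason to compare several function forms before choosing one.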


Introduction to SEE – How good is your estimation?
Basic check
• Sign (+ or -) of the coefficients of the independent variables

Check by parameters
• R-squared
• T-value
• DW-value
• etc.

Others
• Elasticity
• Trend
• etc.
Source: Sichao Kan
Introduction to SEE – Model Sheet
Check the fitness of your model on the right half of the “model” sheet.

[Screenshot of the “model” sheet, with these callouts:]
• Model equation
• Parameters for testing the fitness of the model: ① R-squared ② T-value
③ Durbin-Watson value

Notice: After building the model, go to the “main menu” and click the “All through” button. The
equation of the model and the parameters for testing the fitness of the model will be displayed on
the right half of the “model” sheet.
Source: Sichao Kan
Parameters for testing the fitness of your model (estimation)
(1) R: R-Squared, 0 ≤ Explained variance / Total variance ≤ 1 (The larger the better)
(2) AR: Adjusted R-Squared, AR ≤ 1 (The larger the better)
(3) SD: SD = (∑e² / (n − k))^(1/2),
    e = Residual, n = Sample size, k = No. of independent variables
(4) t-value: |t| ≥ 2 : Significant
    2 > |t| ≥ 1 : Admissible to use
    |t| < 1 : Insignificant
(5) DW: Durbin-Watson Statistics, 1 < DW < 3
    DW = 2 : No serial correlation
    DW → 0 : Positive correlation
    DW → 4 : Negative correlation
(6) Dh: Durbin h Statistics with lag, |Dh| ≤ 2
(7) Rho: Coefficient of serial correlation, |Rho| ≤ 1
(8) DF: Degree of Freedom, DF > 1 (The larger the better)
(9) F: F-Statistics, F > 0 (The larger the better)
(10) RSS: Residual Sum of Squares, RSS > 0 (The smaller the better)
(11) YX: Correlation Coefficient between Y and X’s, |YX| ≤ 1
(12) XX: Correlation Coefficient between X’s, |XX| ≤ 0.95
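Several of these parameters can be computed by hand for a one-variable regression. The Python sketch below uses standard textbook formulas, not SEE's own code, and the data are hypothetical. As an assumption, it counts degrees of freedom as n minus the number of estimated coefficients (intercept and slope), which may differ from the n − k convention listed above.

```python
import math


def ols_diagnostics(xs, ys):
    """Fit y = intercept + slope*x and return common fit statistics."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    rss = sum(e * e for e in residuals)           # residual sum of squares
    tss = sum((y - y_mean) ** 2 for y in ys)      # total sum of squares

    dof = n - 2  # assumption: two estimated coefficients
    r_squared = 1 - rss / tss
    adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / dof
    sd = math.sqrt(rss / dof)                     # standard error of regression
    t_slope = slope / (sd / math.sqrt(sxx))       # t-value of the slope
    # Durbin-Watson: squared changes in successive residuals over RSS.
    dw = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)
    ) / rss

    return {
        "R": r_squared, "AR": adjusted_r_squared, "SD": sd,
        "t": t_slope, "DW": dw, "DF": dof, "RSS": rss,
    }


# Hypothetical data, ordered in time so that DW is meaningful.
stats = ols_diagnostics([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in stats.items():
    print(f"{name} = {value:.4f}")
```

With so few observations the statistics are only illustrative. In practice each value would be compared with the thresholds listed above.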
Introduction to SEE – How to start model building with SEE?

• Before starting SEE, you need to
  • Formulate the question of interest
  • Specify variables
  • Collect data
• Then, with SEE
  • Input the data in the “data” sheet
  • Build your model in the “model” sheet
  • Test the fitness of your estimation by checking the
    parameters on the far right side of the “model” sheet
  • Read the prediction results from the “simulation” sheet
• After that
  • You can do any analysis you like with the output data
Introduction to SEE – Functions

Option Type (Useful)

“$LS” or blank cell – Simple-E executes a regression based on
Ordinary Least Squares (Regression Analysis).

Y    Type   X1
YY   $LS    XX

YY = a * XX + b
Introduction to SEE – Functions

“=” or “$EQ” – Direct Equation: The variable in “Y” is
defined directly by the formula in “X”.

Y    Type   X1
YY   =      XX

YY = XX
Introduction to SEE – Functions
“$DL” – Double Log: Simple-E executes the regression after
transforming the variables on both sides to logarithms.

Y    Type   X1
YY   $DL    XX

Log(YY) = a * log(XX) + b
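The double-log idea can be sketched in plain Python: take logarithms of both variables, fit a straight line, and read the slope as the elasticity. This is a generic illustration, not SEE's implementation. The data are generated from a hypothetical relationship YY = 2 * XX^1.5 so that the expected slope is known.

```python
import math


def fit_line(xs, ys):
    """Least squares fit of y = slope*x + constant; returns (slope, constant)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Hypothetical data following YY = 2 * XX**1.5 exactly.
xx = [1.0, 2.0, 4.0, 8.0]
yy = [2.0 * x ** 1.5 for x in xx]

# Transform both sides to logarithms, then run an ordinary linear fit.
log_xx = [math.log(x) for x in xx]
log_yy = [math.log(y) for y in yy]
elasticity, constant = fit_line(log_xx, log_yy)

# Back in original units: YY = exp(constant) * XX**elasticity.
scale = math.exp(constant)
print(f"elasticity = {elasticity:.3f}, scale = {scale:.3f}")
```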
Introduction to SEE – Functions
$CA – Constant Adjustment: Simple-E adjusts for the gap between the
regression equation and the latest actual value.

[Figure: Y against X, showing the regression line Y = a*X + b and the adjusted line
Y = a*X + b + c]
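One plausible reading of the constant adjustment is sketched below in Python. It assumes that c is the gap between the latest actual value and the regression estimate for the same period, and that this gap is then added to every forecast. The slides do not spell out how SEE computes c, so the function name, coefficients, and data are all hypothetical.

```python
def constant_adjustment(a, b, latest_x, latest_actual_y):
    """Gap between the latest actual value and the regression estimate."""
    return latest_actual_y - (a * latest_x + b)


# Hypothetical regression Y = a*X + b.
a, b = 2.0, 10.0

# In the latest actual period, X = 50 and the observed Y = 115,
# while the regression alone gives 2*50 + 10 = 110.
c = constant_adjustment(a, b, latest_x=50.0, latest_actual_y=115.0)

# Forecast for X = 60 without and with the adjustment.
unadjusted_forecast = a * 60.0 + b
adjusted_forecast = a * 60.0 + b + c

print(f"c = {c}, unadjusted = {unadjusted_forecast}, adjusted = {adjusted_forecast}")
```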
Introduction to SEE – Functions

$TL – Linear Trend, estimated by serial number

$TG – Growth Trend, estimated by average actual growth rate

[Figure: Y against the serial number X = 1, 2, …, 12, showing the $TL line Y = a*X + b
and the $TG curve, with growth of 10.0% marked]
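The two trend options can be approximated with a short Python sketch. It assumes that $TL regresses the series on the serial numbers 1, 2, 3, … and extends the line, and that $TG extends the series at its compound average historical growth rate. Both are interpretations of the slide text rather than documented SEE behavior, and the function names and data are hypothetical.

```python
def linear_trend_forecast(history, periods_ahead):
    """Fit Y = a*X + b on serial numbers X = 1..n and extend the line."""
    n = len(history)
    serial = list(range(1, n + 1))
    x_mean = sum(serial) / n
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in serial)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(serial, history))
    a = sxy / sxx
    b = y_mean - a * x_mean
    return [a * (n + step) + b for step in range(1, periods_ahead + 1)]


def growth_trend_forecast(history, periods_ahead):
    """Extend the series at its compound average historical growth rate."""
    periods = len(history) - 1
    growth = (history[-1] / history[0]) ** (1 / periods) - 1
    return [
        history[-1] * (1 + growth) ** step
        for step in range(1, periods_ahead + 1)
    ]


# Hypothetical histories.
linear_next = linear_trend_forecast([10.0, 12.0, 14.0, 16.0], periods_ahead=2)
growth_next = growth_trend_forecast([100.0, 110.0, 121.0], periods_ahead=2)

print("linear trend:", [round(v, 2) for v in linear_next])
print("growth trend:", [round(v, 2) for v in growth_next])
```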
