Linear Regression Analysis in Excel
The tutorial explains the basics of regression analysis and shows a few different ways to do linear regression in Excel.
Imagine this: you are provided with a whole lot of different data and are asked to predict next year's sales
numbers for your company. You have discovered dozens, perhaps even hundreds, of factors that can possibly
affect the numbers. But how do you know which ones are really important? Run regression analysis in Excel. It
will give you an answer to this and many more questions: Which factors matter and which can be ignored? How
closely are these factors related to each other? And how certain can you be about the predictions?
Dependent variable (aka criterion variable) is the main factor you are trying to understand and predict.
Independent variables (aka explanatory variables, or predictors) are the factors that might influence the
dependent variable.
Regression analysis helps you understand how the dependent variable changes when one of the independent
variables varies and allows you to mathematically determine which of those variables really has an impact.
Technically, a regression analysis model is based on the sum of squares, which is a mathematical way to find the
dispersion of data points. The goal of a model is to get the smallest possible sum of squares and draw a line that
comes closest to the data.
Statisticians distinguish between simple and multiple linear regression. Simple linear regression models
the relationship between a dependent variable and one independent variable using a linear function. If you use
two or more explanatory variables to predict the dependent variable, you deal with multiple linear regression.
If the dependent variable is modeled as a non-linear function because the data relationships do not follow a
straight line, use nonlinear regression instead. The focus of this tutorial will be on a simple linear regression.
As an example, let's take sales numbers for umbrellas for the last 24 months and find out the average monthly
rainfall for the same period. Plot this information on a chart, and the regression line will demonstrate the
https://www.ablebits.com/office-addins-blog/2018/08/01/linear-regression-analysis-excel/ 1/17
5/3/2021 Linear regression analysis in Excel
relationship between the independent variable (rainfall) and dependent variable (umbrella sales):
y = bx + a + ε
Where:
x is an independent variable.
y is a dependent variable.
a is the Y-intercept, which is the expected mean value of y when all x variables are equal to 0. On a regression
graph, it's the point where the line crosses the Y axis.
b is the slope of a regression line, which is the rate of change for y as x changes.
ε is the random error term, which is the difference between the actual value of a dependent variable and its
predicted value.
The linear regression equation always has an error term because, in real life, predictors are never perfectly
precise. However, some programs, including Excel, do the error term calculation behind the scenes. So, in Excel,
you do linear regression using the least squares method and seek coefficients a and b such that:
y = bx + a
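To make the least squares idea more concrete, here is a minimal sketch of the arithmetic that happens behind the scenes. It is written in Python rather than Excel, the function names fit_line and sum_of_squared_errors are hypothetical, and the five data points are made up purely for illustration (they are not the umbrella dataset from this tutorial).

```python
# Least squares fit of y = b*x + a on made-up data (NOT the tutorial's dataset).

def fit_line(xs, ys):
    """Return (b, a): the slope and intercept that minimize the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return b, a


def sum_of_squared_errors(xs, ys, b, a):
    """Sum of squared differences between each actual y and the line's prediction."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]  # hypothetical x values
ys = [2, 4, 5, 4, 5]  # hypothetical y values
b, a = fit_line(xs, ys)
print(f"b = {b:.3f}, a = {a:.3f}")
print(f"sum of squared errors = {sum_of_squared_errors(xs, ys, b, a):.3f}")
```

Any other pair of b and a gives a larger sum of squared errors on the same data, which is what "comes closest to the data" means here.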
For our example, the linear regression equation takes the following shape:
There exist a handful of different ways to find a and b. The three main methods to perform linear regression
analysis in Excel are:
2. In the Excel Options dialog box, select Add-ins on the left sidebar, make sure Excel Add-ins is selected in the
Manage box, and click Go.
3. In the Add-ins dialog box, tick off Analysis Toolpak, and click OK:
This will add the Data Analysis tools to the Data tab of your Excel ribbon.
With Analysis Toolpak enabled, carry out these steps to perform regression analysis in Excel:
1. On the Data tab, in the Analysis group, click the Data Analysis button.
Select the Input Y Range, which is your dependent variable. In our case, it's umbrella sales (C1:C25).
Select the Input X Range, i.e. your independent variable. In this example, it's the average monthly rainfall
(B1:B25).
If you are building a multiple regression model, select two or more adjacent columns with different
independent variables.
Check the Labels box if there are headers at the top of your X and Y ranges.
Optionally, select the Residuals checkbox to get the difference between the predicted and actual values.
This part tells you how well the calculated linear regression equation fits your source data.
Multiple R. It is the Correlation Coefficient that measures the strength of a linear relationship between two
variables. The correlation coefficient can be any value between -1 and 1, and its absolute value indicates the
relationship strength. The larger the absolute value, the stronger the relationship:
R Square. It is the Coefficient of Determination, which is used as an indicator of the goodness of fit. It shows what
share of the variation in the dependent variable the regression line accounts for. The R2 value is calculated by
comparing the residual sum of squares with the total sum of squares, where the latter is the sum of the squared
deviations of the original data from the mean.
In our example, R2 is 0.91 (rounded to 2 digits), which is fairly good. It means that 91% of the variation in the
dependent variable (y-values) is explained by the independent variable (x-values). Generally, R Squared of 95%
or more is considered a good fit.
Adjusted R Square. It is the R square adjusted for the number of independent variables in the model. You will
want to use this value instead of R square for multiple regression analysis.
Standard Error. It is another goodness-of-fit measure that shows the precision of your regression analysis - the
smaller the number, the more certain you can be about your regression equation. While R2 represents the
percentage of the dependent variable's variance that is explained by the model, Standard Error is an absolute
measure that shows the average distance that the data points fall from the regression line.
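If you'd like to see where these four numbers come from, the sketch below reproduces them for a one-predictor model. It is Python rather than Excel, the function name goodness_of_fit is hypothetical, and the data points are invented for illustration only. The formulas are the standard textbook ones and are assumed, not confirmed by this tutorial, to match what the Regression tool reports.

```python
import math


def goodness_of_fit(xs, ys):
    """Summary statistics for a least squares fit with one independent variable."""
    n = len(xs)
    k = 1  # number of independent variables in a simple regression
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x

    # Total variation of y around its mean, and the part the line leaves unexplained.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

    r_square = 1 - ss_residual / ss_total
    return {
        # Square root of R Square; with one predictor this is |correlation|.
        "multiple_r": math.sqrt(r_square),
        "r_square": r_square,
        # Penalizes R Square for the number of predictors k.
        "adjusted_r_square": 1 - (1 - r_square) * (n - 1) / (n - k - 1),
        # Typical distance of the data points from the line, in units of y.
        "standard_error": math.sqrt(ss_residual / (n - k - 1)),
    }


# Hypothetical data, not the tutorial's umbrella sales.
stats = goodness_of_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Note that R Square is a ratio with no units, while Standard Error is expressed in the same units as the dependent variable.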
Basically, it splits the sum of squares into individual components that give information about the levels of
variability within your regression model:
df is the number of the degrees of freedom associated with the sources of variance.
SS is the sum of squares. The smaller the Residual SS compared with the Total SS, the better your model fits the
data.
F is the F statistic, or F-test for the null hypothesis. It is used to test the overall significance of the model.
The ANOVA part is rarely used for a simple linear regression analysis in Excel, but you should definitely have a
close look at the last component. The Significance F value gives an idea of how reliable (statistically significant)
your results are. If Significance F is less than 0.05 (5%), your model is OK. If it is greater than 0.05, you'd probably
better choose another independent variable.
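The way the sum of squares is split can be sketched in a few lines. The example below is Python rather than Excel, uses a hypothetical function name (anova) and invented data, and stops at the F statistic: Significance F is the right-tail probability of that F value under the F distribution, which the Python standard library does not provide, so it is deliberately left out.

```python
def anova(xs, ys):
    """Split the total sum of squares for a one-predictor least squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    predicted = [b * x + a for x in xs]

    # Variation explained by the line vs. variation left in the residuals.
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))

    df_regression = 1        # one independent variable
    df_residual = n - 2      # observations minus the two fitted coefficients

    # F compares explained variation per df with unexplained variation per df.
    f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)
    return {
        "df_regression": df_regression,
        "df_residual": df_residual,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "ss_total": ss_regression + ss_residual,
        "f": f_statistic,
    }


# Hypothetical data, not the tutorial's umbrella sales.
table = anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The larger the F value, the smaller the Significance F, and the stronger the evidence that the independent variable matters.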
The most useful component in this section is Coefficients. It enables you to build a linear regression equation in
Excel:
y = bx + a
For our data set, where y is the number of umbrellas sold and x is the average monthly rainfall, our linear
regression formula goes as follows:
Equipped with a and b values rounded to three decimal places, it turns into:
Y=0.45*x-19.074
For example, with the average monthly rainfall equal to 82 mm, the umbrella sales would be approximately 17.8:
0.45*82-19.074=17.8
In a similar manner, you can find out how many umbrellas are going to be sold with any other monthly rainfall (x
variable) you specify.
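The same calculation can be wrapped in a small helper. The sketch below is Python rather than an Excel formula; the coefficients are the rounded values quoted above, while the function and constant names are hypothetical.

```python
# Rounded coefficients quoted in the tutorial for the umbrella example.
SLOPE_B = 0.45         # extra umbrellas sold per additional mm of rainfall
INTERCEPT_A = -19.074  # value of the line at 0 mm of rainfall


def predict_umbrella_sales(rainfall_mm):
    """Apply y = b*x + a for a given average monthly rainfall in mm."""
    return SLOPE_B * rainfall_mm + INTERCEPT_A


print(round(predict_umbrella_sales(82), 1))  # 17.8, matching the worked example
```

Keep in mind that the equation was fitted on the observed rainfall values, so predictions far outside that range are less trustworthy.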
Why the difference? Because independent variables are never perfect predictors of the dependent variables.
And the residuals can help you understand how far away the actual values are from the predicted values:
For the first data point (rainfall of 82 mm), the residual is approximately -2.8. So, we add this number to the
predicted value, and get the actual value: 17.8 - 2.8 = 15.
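Put another way, a residual is simply the actual value minus the predicted value. A minimal Python sketch of that arithmetic for the first data point, using the rounded equation and the actual value of 15 mentioned above (the function name residual is hypothetical):

```python
def residual(actual, predicted):
    """Residual = actual value minus the value predicted by the regression line."""
    return actual - predicted


predicted_sales = 0.45 * 82 - 19.074  # the tutorial's rounded equation at 82 mm
actual_sales = 15                     # actual value for that month, per the text

first_residual = residual(actual_sales, predicted_sales)
print(round(first_residual, 1))

# Adding the residual back to the prediction recovers the actual value.
print(round(predicted_sales + first_residual, 1))
```

A negative residual means the line overestimated the actual value; a positive one means it underestimated it.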
2. On the Insert tab, in the Charts group, click the Scatter chart icon, and select the Scatter thumbnail (the first one):
This will insert a scatter plot in your worksheet, which will resemble this one:
3. Now, we need to draw the least squares regression line. To have it done, right click on any point and choose
Add Trendline… from the context menu.
4. On the right pane, select the Linear trendline shape and, optionally, check Display Equation on Chart to get
your regression formula:
As you may notice, the regression equation Excel has created for us is the same as the linear regression
formula we built based on the Coefficients output.
5. Switch to the Fill & Line tab and customize the line to your liking. For example, you can choose a different line
color and use a solid line instead of a dashed line (select Solid line in the Dash type box):
At this point, your chart already looks like a decent regression graph:
If your data points start in the middle of the horizontal and/or vertical axis like in this example, you may want
to get rid of the excessive white space. The following tip explains how to do this: Scale the chart axes to reduce
white space.
Important note! In the regression graph, the independent variable should always be on the X axis and the
dependent variable on the Y axis. If your graph is plotted in the reverse order, swap the columns in your
worksheet, and then draw the chart anew. If you are not allowed to rearrange the source data, then you can
switch the X and Y axes directly in a chart.
The LINEST function uses the least squares regression method to calculate a straight line that best explains the
relationship between your variables and returns an array describing that line. You can find the detailed
explanation of the function's syntax in this tutorial. For now, let's just make a formula for our sample dataset:
=LINEST(C2:C25, B2:B25)
Because the LINEST function returns an array of values, you must enter it as an array formula. Select two
adjacent cells in the same row, E2:F2 in our case, type the formula, and press Ctrl + Shift + Enter to complete it.
The formula returns the b coefficient (E2) and the a constant (F2) for the already familiar linear regression
equation:
y = bx + a
If you avoid using array formulas in your worksheets, you can calculate a and b individually with regular formulas:
=INTERCEPT(C2:C25, B2:B25)
=SLOPE(C2:C25, B2:B25)
Additionally, you can find the correlation coefficient (Multiple R in the regression analysis summary output) that
indicates how strongly the two variables are related to each other:
=CORREL(B2:B25,C2:C25)
The following screenshot shows all these Excel regression formulas in action:
Tip. If you'd like to get additional statistics for your regression analysis, use the LINEST function with the stats
parameter set to TRUE as shown in this example.
That's how you do linear regression in Excel. That said, please keep in mind that Microsoft Excel is not a statistical
program. If you need to perform regression analysis at the professional level, you may want to use targeted
software such as XLSTAT, RegressIt, etc.
Available downloads:
To have a closer look at our linear regression formulas and other techniques discussed in this tutorial, you are
welcome to download our sample Regression Analysis in Excel workbook.
Copyright © 2003 - 2021 4Bits Ltd. All rights reserved.
Microsoft and the Office logos are trademarks or registered trademarks of Microsoft Corporation. Google Chrome is a trademark of
Google LLC.