Business Analytics Practical Problems
age salary
16 7.5
20 8.5
22 20
25 21
28 22
35 25
38 25.5
41 27
46 29
49 30
55 31
59 33
60 34
62 35
65 37
67 37.5
Tasks:
(i) Make a scatter plot of age and salary
(ii) Find the linear correlation coefficient
(iii) Plot the regression line
(iv) Find the equation of the regression line
AIM:
To create a scatter plot, find the linear correlation coefficient, plot the regression line, find
the regression equation, and conclude that this is a linear relationship with 99% confidence for
the data, using Microsoft Excel.
PROCEDURE
1. MAKE A SCATTER PLOT OF AGE VS SALARY
1: Open a Microsoft Excel sheet.
2: Enter the values.
3: Highlight the age and salary data, then select Insert -> Graph -> Scatter plot. The graph
appears.
4: Click on the chart and select the chart layout that has an x-axis label, a y-axis label and a
title. The chart changes accordingly.
2. FIND THE LINEAR CORRELATION COEFFICIENT
1: Enter the formula =CORREL(A2:A17,B2:B17). The formula =RSQ(B2:B17,A2:A17) returns the square of this value (R²).
2: The linear correlation coefficient value appears.
[Figure: scatter plot of salary (y-axis, 0 to 40) against age (x-axis, 10 to 70)]
The value returned by RSQ is R² = 0.89425, so the correlation coefficient is r = √0.89425 ≈ 0.9456.
The correlation coefficient lies between -1 and +1. The closer the value is to +1, the stronger
the positive correlation between the variables.
3. PLOT THE REGRESSION LINE AND EQUATION OF THE REGRESSION LINE
[Figure: scatter plot of salary against age with the fitted regression line]
f(x) = 0.492993772241993 x + 5.23876779359431
R² = 0.894252526151605
RESULT
Thus the scatter plot, linear correlation coefficient, regression line and regression equation
for the data are obtained using Microsoft Excel.
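The Excel results above can be cross-checked outside the spreadsheet. The following is a minimal sketch in plain Python, assuming the age and salary values are typed in exactly as in the table; the helper name `linear_fit` is introduced here for illustration and is not part of the exercise.

```python
# Age and salary pairs copied from the table in this exercise.
ages = [16, 20, 22, 25, 28, 35, 38, 41, 46, 49, 55, 59, 60, 62, 65, 67]
salaries = [7.5, 8.5, 20, 21, 22, 25, 25.5, 27, 29, 30, 31, 33, 34, 35, 37, 37.5]


def linear_fit(xs, ys):
    """Return (slope, intercept, r) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    return slope, intercept, r


slope, intercept, r = linear_fit(ages, salaries)
r_squared = r ** 2

print(f"salary = {slope:.4f} * age + {intercept:.4f}")
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```

The slope, intercept and R² should agree with the trendline equation shown on the Excel chart, while `r` is the correlation coefficient itself (the square root of R² for a positive slope).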
The sales of an item are observed for 11 years (2010 to 2020) and recorded below. Forecast the sales for the year 2021.
S No Year Sales
1 2010 1000
2 2011 1500
3 2012 2000
4 2013 2200
5 2014 2250
6 2015 2800
7 2016 2900
8 2017 3200
9 2018 3500
10 2019 4100
11 2020 4150
12 2021 4507.27273 (forecast)
The forecasted sales value for the year 2021 is approximately 4507.
RESULT:
Thus, the forecast for the given data is calculated using MS Excel.
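The procedure for this exercise is not shown, but the reported value is consistent with a least-squares straight-line forecast, which is what Excel's FORECAST function computes. The sketch below reproduces that calculation in plain Python under that assumption; the function name `linear_forecast` is hypothetical.

```python
# Yearly sales copied from the table above.
years = list(range(2010, 2021))
sales = [1000, 1500, 2000, 2200, 2250, 2800, 2900, 3200, 3500, 4100, 4150]


def linear_forecast(xs, ys, x_new):
    """Fit y = a + b*x by least squares and evaluate the line at x_new."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return a + b * x_new


forecast_2021 = linear_forecast(years, sales, 2021)
print(f"Forecast for 2021: {forecast_2021:.5f}")
```

If a different forecasting method was used in the spreadsheet, the result may differ; the straight-line assumption is only inferred from the reported value.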
The following table gives the period, demand and forecast. Calculate the mean absolute
deviation (MAD), mean absolute error (MAE) and tracking signal (TS).
Year Month Period (p) Demand (Dt) Forecast (Ft)
2001 JAN 1 37
2001 FEB 2 40 37.00
2001 MAR 3 41 37.90
2001 APRIL 4 37 38.83
2001 MAY 5 45 38.28
2001 JUN 6 50 40.3
2001 JUL 7 43 43.21
2001 AUG 8 47 43.15
2001 SEP 9 56 44.3
2001 OCT 10 52 47.81
2001 NOV 11 55 49.07
2001 DEC 12 54 50.85
2002 JAN 13 51.79
AIM:
To calculate the mean absolute deviation (MAD), mean absolute error (MAE) and
tracking signal (TS) using Microsoft Excel.
PROCEDURE
METHOD 1
1: Open a Microsoft Excel sheet.
2: Enter the given values.
3: To find the error: Et = Forecast - Demand.
4: To find the absolute value: =ABS(G7)
5: To find the squared error: =G7^2
6: To compute the MSE (mean square error): =AVERAGE(I7:I17)
7: To compute the MAD (mean absolute deviation): =AVERAGE(H7:H17)
8: To compute the TS (tracking signal): =SUM(G7:G17)/H2
Error (G) Absolute Value (H) Squared Error (I)
-37 37 1369
-3 3 9
-3.1 3.1 9.61
1.83 1.83 3.3489
-6.72 6.72 45.1584
-9.7 9.7 94.09
0.21 0.21 0.0441
-3.85 3.85 14.8225
-11.7 11.7 136.89
-4.19 4.19 17.5561
-5.93 5.93 35.1649
-3.15 3.15 9.9225
51.79 51.79 2682.2041
MSE 340.5239615
MAD 10.93615385
TS -11.50333333
[Figure: chart "Absolute Value" showing the absolute error for periods 1 to 13]
RESULT
Thus the mean absolute deviation (MAD), mean absolute error (MAE) and tracking signal
(TS) are calculated using Microsoft Excel.
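The error measures can also be sketched in plain Python. This version is restricted to periods 2 to 12, where both a demand and a forecast are available, so its values differ from the spreadsheet figures above, which also include the rows for periods 1 and 13. It keeps the sign convention of the procedure (error = forecast - demand) and takes the tracking signal as the cumulative error divided by the MAD; both choices are assumptions of this sketch.

```python
# Demand and forecast for periods 2..12, copied from the table above.
demand = [40, 41, 37, 45, 50, 43, 47, 56, 52, 55, 54]
forecast = [37.00, 37.90, 38.83, 38.28, 40.3, 43.21, 43.15, 44.3, 47.81, 49.07, 50.85]

# Error uses the document's convention: forecast minus demand.
errors = [f - d for f, d in zip(forecast, demand)]
abs_errors = [abs(e) for e in errors]
squared_errors = [e ** 2 for e in errors]

n = len(errors)
mad = sum(abs_errors) / n        # mean absolute deviation
mse = sum(squared_errors) / n    # mean square error
tracking_signal = sum(errors) / mad  # cumulative error divided by MAD

print(f"MAD = {mad:.4f}")
print(f"MSE = {mse:.4f}")
print(f"TS  = {tracking_signal:.4f}")
```

A strongly negative tracking signal under this sign convention indicates that the forecast is consistently below the actual demand.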
The demand for an item is observed for 24 months and recorded below.
Calculate the 3-monthly simple moving average. What is the forecast for month 25? Give a
graphical representation.
Period Month Demand
1 Jan 120
2 Feb 103
3 Mar 105
4 Apr 84
5 May 114
6 Jun 90
7 Jul 100
8 Aug 113
9 Sep 99
10 Oct 108
11 Nov 109
12 Dec 88
13 Jan 91
14 Feb 96
15 Mar 113
16 Apr 84
17 May 98
18 Jun 87
19 Jul 91
20 Aug 119
21 Sep 99
22 Oct 106
23 Nov 89
24 Dec 107
AIM:
To calculate the forecast using a 3-monthly simple moving average in Microsoft Excel.
PROCEDURE
1: Select the Tools menu.
2: Select the Data Analysis option.
3: When the Data Analysis Tools dialogue box appears, choose Moving Average.
4: When the Moving Average dialogue box appears, enter the data in the Input Range, Interval
and Output Range boxes and click OK.
OUTPUT & INTERPRETATION
Period Month Demand Forecast (3-month moving average)
1 Jan 120 -
2 Feb 103 #N/A
3 Mar 105 #N/A
4 Apr 84 109.3333
5 May 114 97.33333
6 Jun 90 101
7 Jul 100 96
8 Aug 113 101.3333
9 Sep 99 101
10 Oct 108 104
11 Nov 109 106.6667
12 Dec 88 105.3333
13 Jan 91 101.6667
14 Feb 96 96
15 Mar 113 91.66667
16 Apr 84 100
17 May 98 97.66667
18 Jun 87 98.33333
19 Jul 91 89.66667
20 Aug 119 92
21 Sep 99 99
22 Oct 106 103
23 Nov 89 108
24 Dec 107 98
25 - - 100.6667
[Figure: "Moving Average" chart of the Actual and Forecast series (y-axis: Value, 0 to 140) against Data Point (x-axis)]
RESULT
Thus, the three-monthly moving average for the given data is calculated and the graph
is plotted.
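The moving-average forecasts in the output table can be reproduced with a few lines of plain Python. This is a minimal sketch, assuming the forecast for a period is the mean of the three preceding months, which matches the values in the table; the function name `moving_average_forecasts` is introduced here for illustration.

```python
# Monthly demand for periods 1..24, copied from the table above.
demand = [120, 103, 105, 84, 114, 90, 100, 113, 99, 108, 109, 88,
          91, 96, 113, 84, 98, 87, 91, 119, 99, 106, 89, 107]


def moving_average_forecasts(values, window=3):
    """Map each period (1-based) to the mean of the `window` periods before it."""
    return {
        period: sum(values[period - 1 - window:period - 1]) / window
        for period in range(window + 1, len(values) + 2)
    }


forecasts = moving_average_forecasts(demand)
print(f"Forecast for period 4:  {forecasts[4]:.4f}")
print(f"Forecast for period 25: {forecasts[25]:.4f}")
```

The first forecast is available for period 4, because three earlier observations are needed, and the last entry is the forecast for month 25.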
The following table gives the actual demand for 14 weeks.
Calculate the forecast demand for week 15 using the exponential smoothing method and give a
graphical representation.
AIM:
To calculate the forecast demand for week 15 using the exponential smoothing method and
give a graphical representation using Microsoft Excel.
PROCEDURE
1: Select the Tools menu.
2: Select the Data Analysis option and choose Exponential Smoothing.
3: When the Exponential Smoothing dialogue box appears, enter the data in the Input Range,
Damping Factor and Output Range boxes.
[Figure: exponential smoothing chart with the Forecast series plotted against Data Point]
Week Demand Forecast
7 72 67.635
8 75 69.9905
9 77 71.39715
10 79 73.91915
11 81 76.07574
12 83 78.12272
13 86 80.13682
14 88 82.14105
15 - 84.84231
RESULT
Thus, the exponential smoothing for the given data is calculated.
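The recursion behind exponential smoothing can be sketched as follows. Only weeks 7 to 14 of the demand survive in the table above, and the smoothing constant used in the spreadsheet is not stated, so `alpha = 0.3` is purely an illustrative assumption and the output is not expected to match the table. The starting forecast 67.635 is the week 7 value from the table.

```python
# Actual demand for weeks 7..14, copied from the output table above.
demand = [72, 75, 77, 79, 81, 83, 86, 88]


def exponential_smoothing(values, alpha, initial_forecast):
    """Return forecasts for each period in `values` plus one period ahead.

    Each new forecast is alpha * actual + (1 - alpha) * previous forecast.
    """
    forecasts = [initial_forecast]
    for actual in values:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


alpha = 0.3  # assumed smoothing constant, not taken from the source
forecasts = exponential_smoothing(demand, alpha, initial_forecast=67.635)
week_15_forecast = forecasts[-1]

for week, value in enumerate(forecasts, start=7):
    print(f"Week {week}: forecast {value:.4f}")
```

Note that Excel's dialogue asks for a damping factor, which is 1 minus the smoothing constant, so `alpha = 0.3` corresponds to a damping factor of 0.7.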
RESULT
Thus a least-squares graph is computed using Microsoft Excel.
For instance, if the stock price of a particular company has been dropping consistently over
the last 10 days, we can assume that the price will drop tomorrow too. Or if it has been
raining every day for the past week, we can guess that it would rain today as well, and hence
it’s a good idea to carry an umbrella.
To understand the exponential smoothing models and how they forecast future values, we
must be familiar with the different time series components. A time series has the following
three components:
1. Trend Component
The trend describes the general tendency of the data which could be increasing or
decreasing or stable. For instance, at the time of demonetization, we observed a
decreasing trend for stock prices. Or over the years, we have seen an increase in the
number of sales of smartphones.
A trend often depicts the long term movement of the series. Have a look at the
following examples – can you identify the trend in these series?
Even though there is some noise in the data, you can observe that there is an
increasing trend in the above series.
2. Seasonal Component
The next important component is the seasonal component of the time series. For
instance, there could be a higher sale in clothing items and sweets around New Year’s
or Diwali every year. Similarly, there could be an increase in flight bookings around
the holiday season(s). And this pattern could be observed throughout the year.
If you look closely at the images below, you would notice that there is a certain
pattern that keeps repeating. This repeating pattern observed in the series is the
seasonal component of the time series. It depicts the short term movement of the
series.
Because of the noise in the series, it is slightly difficult to identify the seasonality
in the first series, but the seasonality in the second series is evident. A particular
pattern repeats every year, which shows that the second series has a yearly
seasonality. If you take a closer look at the first series, you will find that it has a
weekly seasonality.
3. Residual Component
Let’s say we identify the trend and seasonal component from a time series and remove
these two. What remains after removing these two is the residual component. It does
not have any pattern or trend. As the name suggests, the residual component is
irregular.
So now that we understand the different components of a time series, let’s understand how
exponential smoothing algorithms use these to make predictions.
TREND PROJECTION
Consider the time series of bicycle sales for a particular manufacturer over the past 10 years,
shown in the table below.
Year (t) Sales (1000s)
1 21.6
2 22.9
3 25.5
4 21.9
5 23.9
6 27.5
7 31.5
8 29.7
9 28.6
10 31.4
[Figure: chart of sales (y-axis, 0 to 35) against year (x-axis, 1 to 11)]
Year (t) Sales (1000s)
11 32.5 (forecast)
RESULT
Thus the forecast for year 11, in this case 32.5, appears in the selected cell, and
the forecast for the given data is found using trend projection.
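Trend projection fits a straight line to the series and extends it one period ahead. The sketch below does this by least squares in plain Python for the bicycle sales data; the helper name `fit_trend` is introduced here for illustration.

```python
# Bicycle sales (in 1000s) for years 1..10, copied from the table above.
years = list(range(1, 11))
sales = [21.6, 22.9, 25.5, 21.9, 23.9, 27.5, 31.5, 29.7, 28.6, 31.4]


def fit_trend(ts, ys):
    """Return (intercept, slope) of the least-squares trend line y = b0 + b1*t."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    b1 = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / sum(
        (t - mean_t) ** 2 for t in ts
    )
    b0 = mean_y - b1 * mean_t
    return b0, b1


b0, b1 = fit_trend(years, sales)
forecast_year_11 = b0 + b1 * 11

print(f"Trend line: sales = {b0:.2f} + {b1:.2f} * t")
print(f"Forecast for year 11: {forecast_year_11:.2f}")
```

The fitted line rises by about 1.1 thousand units per year, which gives the year 11 forecast reported in the result.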
Year Quarter Period (t) Demand (Dt)
1 2 1 8000
1 3 2 13000
1 4 3 23000
2 1 4 34000
2 2 5 10000
2 3 6 18000
2 4 7 23000
3 1 8 38000
3 2 9 12000
3 3 10 13000
3 4 11 32000
4 1 12 41000
AIM:
To estimate the seasonal factors and give a graphical representation using Microsoft Excel.
PROCEDURE
1: Open a Microsoft Excel sheet.
2: Enter the given values.
3: Add the formula =18439+A2*524 for the deseasonalised demand and press Enter.
4: Then add the formula =B2/C2 for the seasonal factor and press Enter.
5: Select the deseasonalised demand and seasonal factor cells and drag them down to fill the remaining rows.
Quarter Demand (Dt) Deseasonalised Demand Seasonal Factor
2 8000 19487 0.410530097
3 13000 20011 0.649642697
4 23000 20535 1.120038958
1 34000 18963 1.792965248
2 10000 19487 0.513162621
3 18000 20011 0.899505272
4 23000 20535 1.120038958
1 38000 18963 2.003902336
2 12000 19487 0.615795145
3 13000 20011 0.649642697
4 32000 20535 1.558315072
1 41000 18963 2.162105152
[Figure: Excel chart over periods 1 to 12, y-axis from 0 to 25000]
RESULT
Thus the seasonal factors and deseasonalised demand are obtained using Microsoft Excel.
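The two spreadsheet formulas can be mirrored in plain Python. This sketch reproduces the table as computed above, where the trend formula 18439 + 524 * A2 is applied to the quarter number in column A. A common alternative is to evaluate the trend line at the period index t (1 to 12) instead, which would give different figures; the function name `deseasonalised_demand` is hypothetical.

```python
# Quarter numbers and demand, copied from the table above.
quarters = [2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1]
demand = [8000, 13000, 23000, 34000, 10000, 18000,
          23000, 38000, 12000, 13000, 32000, 41000]


def deseasonalised_demand(index):
    """Trend value from the spreadsheet formula =18439+A2*524."""
    return 18439 + 524 * index


rows = []
for quarter, actual in zip(quarters, demand):
    trend = deseasonalised_demand(quarter)
    seasonal_factor = actual / trend  # mirrors =B2/C2
    rows.append((quarter, actual, trend, seasonal_factor))

for quarter, actual, trend, factor in rows:
    print(f"Q{quarter}: demand {actual}, deseasonalised {trend}, factor {factor:.4f}")
```

A seasonal factor above 1 marks a quarter whose demand exceeds the trend value, and a factor below 1 marks a quarter that falls short of it.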
Consider the table below. It shows three performance measures for 10 students.
Using data from the table, we are going to complete the following tasks:
Develop a least-squares regression equation to predict test score, based on (1) IQ and
(2) the number of hours that the student studied.
Assess how well the regression equation predicts test score, the dependent variable.
Assess the contribution of each independent variable (i.e., IQ and study hours) to the
prediction.
1. Open Excel
2. Click on Data Tab
3. If the Data Analysis button appears in the upper right corner, the Analysis ToolPak is enabled
and you are ready to go.
7. From the Manage drop-down box, choose Excel Add-Ins and click the Go button.
This opens the Add-Ins dialog box.
8. From the Add-Ins dialog, check the box beside Analysis ToolPak and click OK.
9. Click the Data Analysis button.
10. This will open the Data Analysis dialog box. From the drop-down list, select "Regression"
and click OK.
11. Excel will display the Regression dialog box. This is where you identify data fields for the
independent and dependent variables. In the Input Y Range, enter coordinates for the
dependent variable. In the Input X Range, enter coordinates for the independent variable(s). If
you include column labels in these input ranges, check the Labels box. In the example below,
we have included labels, so the Labels box is checked.
OUTPUT & INTERPRETATION
The least-squares regression equation has the form:
ŷ = b0 + b1x1 + b2x2
In this equation, ŷ is the predicted test score. The independent variables are IQ and study hours, which
are denoted by x1 and x2, respectively. The regression coefficients are b0, b1, and b2. On the right side
of the equation, the only unknowns are the regression coefficients; so to specify the equation, we need
to assign values to the coefficients.
Here, we see that the regression intercept (b0) is 23.156, the regression coefficient for IQ (b1)
is 0.509, and the regression coefficient for study hours (b2) is 0.467. So the least-squares
regression equation can be re-written as:
ŷ = 23.156 + 0.509x1 + 0.467x2
A quick glance at the output suggests that the regression equation fits the data pretty well. The
coefficient of multiple determination is 0.905. For our sample problem, this means 90.5% of test score
variation can be explained by IQ and by hours spent in study.
Here x1 and x2 are the independent variables (IQ and study hours) and ŷ is the dependent variable (test
score).
The regression coefficients table shows the following information for each coefficient: its value, its
standard error, a t-statistic, and the significance of the t-statistic. In this example, the t-statistics for IQ
and study hours are both statistically significant at the 0.05 level. This means that IQ contributes
significantly to the regression after the effects of study hours are taken into account, and that study
hours contribute significantly after the effects of IQ are taken into account.
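Once the coefficients are known, the fitted equation can be used to predict a test score. The sketch below hard-codes the coefficients reported above (23.156, 0.509 and 0.467); the student values passed in are hypothetical and are not taken from the data table.

```python
# Regression coefficients reported in the Excel output above.
B0 = 23.156  # intercept
B1 = 0.509   # coefficient for IQ
B2 = 0.467   # coefficient for study hours


def predict_test_score(iq, study_hours):
    """Evaluate the fitted equation y_hat = b0 + b1*x1 + b2*x2."""
    return B0 + B1 * iq + B2 * study_hours


# Hypothetical student, used only to illustrate the calculation.
example_score = predict_test_score(iq=110, study_hours=20)
print(f"Predicted test score: {example_score:.3f}")
```

Predictions of this kind are only as reliable as the fit itself, and they are best kept within the range of IQ and study-hour values present in the original data.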
RESULT:
The multiple regression analysis has been done using Microsoft Excel.
SIGNATURE OF FACULTY