Indian Institute of Technology: MS 5031: Data Analysis Applications in Class Assignment-1 October 24, 2016

This document contains solutions to an assignment on data analysis applications. It analyzes a linear regression model of house prices based on size. It finds predictive intervals for prices given house sizes. It also analyzes shock absorber measurement data to determine if a "before filling" measurement can predict an "after filling" measurement. A linear regression is performed and hypotheses about the intercept and slope are tested. It is determined that the "before" measurement is predictive of the "after" measurement.

Uploaded by

Vikas Singh

Indian Institute of Technology

Department of Management Studies

MS 5031: Data Analysis Applications


In class Assignment-1
October 24, 2016
SOLUTIONS

The Excel or other software output should be in a single file with clear labelling and headings. The assignment
solutions should be a single PDF file.
1. Suppose we are modeling house price as depending on house size. Price is measured in thousands of
dollars and size is measured in thousands of square feet. Suppose our model is:
\( P = 20 + 50s + \varepsilon, \quad \varepsilon \sim N(0, 15^2) \)
(a) Given you know that a house has size s = 1.6, give a 95% predictive interval for the price of the
house.
Point prediction: \( P = 20 + 50 \times 1.6 = 100 \)
Prediction interval: \( 100 \pm 2 \times 15 = [70, 130] \)
(b) Given you know that a house has size s = 2.2, give a 95% predictive interval for the price.
Point prediction: \( P = 20 + 50 \times 2.2 = 130 \)
Prediction interval: \( 130 \pm 2 \times 15 = [100, 160] \)
(c) In our model the slope is 50. What are the units of this number?
$1000 / 1000 sq. ft = $/sq. ft, that is, dollars per square foot.
(d) What are the units of the intercept 20?
Thousands of dollars ($1000), the same as P.
(e) What are the units of the error standard deviation 15?
Thousands of dollars ($1000), the same as P.
(f) Suppose we change the units of price to dollars and size to square feet. What would be the values
and units of the intercept, slope, and error standard deviation?
Intercept: $20,000
Slope: $50 per sq. ft
Error standard deviation: $15,000
(g) If we plug s = 1.6 into our model equation, P is a constant plus the normal random variable \( \varepsilon \).
Given s = 1.6, what is the distribution of P?
When s = 1.6, the mean of the house prices is \( 20 + 50 \times 1.6 = 100 \). The error standard deviation
is the same, 15. Therefore \( P \mid s = 1.6 \sim N(100, 15^2) \).
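
The arithmetic in parts (a), (b), and (g) can be checked with a short Python sketch. It takes the coefficients stated in the model and, like the solutions above, uses plus or minus 2 standard deviations as a stand-in for the exact 95% multiplier of roughly 1.96. The helper name `predictive_interval` is hypothetical and not part of the assignment.

```python
from statistics import NormalDist

INTERCEPT = 20.0   # thousands of dollars
SLOPE = 50.0       # thousands of dollars per thousand square feet
SIGMA = 15.0       # error standard deviation, thousands of dollars


def predictive_interval(size, k=2.0):
    """Return (low, high) for price given size, using mean +/- k * sigma."""
    mean = INTERCEPT + SLOPE * size
    return mean - k * SIGMA, mean + k * SIGMA


for s in (1.6, 2.2):
    low, high = predictive_interval(s)
    print(f"s = {s}: [{low:.0f}, {high:.0f}]")

# Part (g): P | s = 1.6 is N(100, 15^2), so the coverage of [70, 130]
# is the probability of landing within 2 standard deviations of the mean.
price_given_size = NormalDist(mu=INTERCEPT + SLOPE * 1.6, sigma=SIGMA)
coverage = price_given_size.cdf(130) - price_given_size.cdf(70)
print(f"coverage of [70, 130]: {coverage:.4f}")
```

The coverage comes out slightly above 95%, which is why the rule of thumb of 2 standard deviations is a reasonable approximation here.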

2. The Shock Absorber Data: The data comes from a company which supplies a major automobile
manufacturer with shock absorbers. An important characteristic is the force transferred through the
shock absorber when the shank is forced out of the cylinder. What we do need to understand is that
the manufacturer only considers the shock to be an acceptable part if the force measurement is between
485 and 585.
The shock manufacturer and the auto manufacturer are arguing over the following issue. Before the
shock is finally shipped, it is filled with gas. After it is filled with gas, it becomes very difficult to
measure the force characteristic we are interested in. The shock manufacturers would like to make
the measurement before the shock is filled with gas. The auto maker is concerned that there may be
a difference in the force before and after the shock is filled with gas and so would like to make the
measurement after it is filled. The shock maker claims that there is little difference between the before
and after measurement so that the before measurement can be used. To investigate this we have the
before (column 1, reboundb) and the after (column 2, rebounda) measurements on 35 shocks. The
data for this problem is in shock.xls.
(a) Plot the before measurement vs. the after measurement. Does this look like the kind of data we
can use the simple linear regression model to think about? Why does it make sense to choose the
after measurement as Y and the before measurement as X?
We are trying to determine whether or not the before measurement is predictive of the after
measurement. Therefore the dependent variable ( Y ) should be the after measurement and the
explanatory variable ( X ) the before measurement.
(b) What are 95% confidence intervals for both the slope and the intercept?
From the output above, we see that the 95% confidence interval for the slope is [0.8603;1.0386].
(c) Test the null hypothesis \( H_0: \beta_0 = 0 \).
The 95% confidence interval for the intercept contains zero, so we fail to reject the null hypothesis.
We reach the same conclusion from the t-statistic of 0.7631, which has a p-value of 0.4508.
(d) From the shock maker's point of view, what hypotheses would be of interest to test for the slope
and intercept? That is, what would the shock maker like the true intercept and slope to be?
What line would represent equality between the before measurement and the after measurement? That would be a line with intercept equal to zero and slope equal to one.
1. Test whether the intercept is equal to the value proposed by the shock maker.
Test the null hypothesis \( H_0: \beta_0 = 0 \): already answered in part (c).
2. Test whether the slope is equal to the value proposed by the shock maker.
Test the null hypothesis \( H_0: \beta_1 = 1 \): again, the 95% confidence interval [0.8603, 1.0386] contains 1, so we fail to reject the null hypothesis.
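
As a rough check on parts (c) and (d), the sketch below recomputes the two-sided p-value for the intercept from the reported t-statistic and confirms that 1 lies inside the reported slope interval. It assumes 33 residual degrees of freedom (35 shocks minus the two fitted coefficients), which the solutions do not state explicitly. The Student's t tail probability is obtained by integrating the density with Simpson's rule so that only the standard library is needed; the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # P(-|t| < T < |t|) by symmetry
    return 1.0 - central_mass


N_SHOCKS = 35
DF = N_SHOCKS - 2  # assumption: residual df for a simple linear regression

# Part (c): intercept test, using the t-statistic reported in the solutions.
p_intercept = two_sided_p_value(0.7631, DF)
print(f"p-value for H0: beta_0 = 0 -> {p_intercept:.4f}")

# Part (d): slope test, using the reported 95% confidence interval.
slope_ci = (0.8603, 1.0386)
slope_one_in_ci = slope_ci[0] <= 1.0 <= slope_ci[1]
print(f"1 inside slope CI {slope_ci}: {slope_one_in_ci}")
```

In practice a statistics library would normally replace the hand-rolled integration; it is written out here only to keep the example self-contained.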

(e) Suppose the before measurement is 550.


1. What is the plug-in predictive interval given x-before = 550?
The plug-in predictive interval is \( 18.22 + 0.949 \times 550 \pm 2 \times 7.67 = [524.83, 555.51] \).
2. What does this interval suggest about the acceptability of the shock absorber?
It looks like the shock maker is correct, i.e., with x-before = 550 we can predict with 95%
probability that the after measurement will be within the acceptable bounds of 485 to 585.
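
The interval in part (e) can be reproduced with a few lines of Python. This is a minimal sketch that takes the fitted intercept, slope, and the value 7.67 (read here as the estimated error standard deviation, an assumption) directly from the solution above. The helper name `plug_in_interval` is hypothetical.

```python
INTERCEPT = 18.22   # fitted intercept used in part (e)
SLOPE = 0.949       # fitted slope used in part (e)
RESID_SD = 7.67     # assumed to be the estimated error standard deviation
SPEC_LOW, SPEC_HIGH = 485.0, 585.0  # acceptable force range from the problem


def plug_in_interval(x_before, k=2.0):
    """Return (low, high) = fitted value +/- k * residual standard deviation."""
    center = INTERCEPT + SLOPE * x_before
    return center - k * RESID_SD, center + k * RESID_SD


low, high = plug_in_interval(550.0)
within_spec = SPEC_LOW <= low and high <= SPEC_HIGH
print(f"interval: [{low:.2f}, {high:.2f}], within spec: {within_spec}")
```

A plug-in interval treats the fitted coefficients as if they were the true values, so it does not account for estimation uncertainty in the intercept and slope.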
