
Introduction and Objectives: There are 3 parts to this tutorial exercise.

All 3 parts use the data in the sheet named "Auction".

In parts 1 and 2 you will investigate how results can change when multiple regression is used rather than simple regression. In part 3 you will explore the meaning of R2, the relationship between R2 and the correlation coefficient, and some of the drawbacks of R2 as a model comparison tool. You will also perform a hypothesis test for the slope in part 3.

Instructions:

As indicated above, there are 3 parts to this exercise, each on separate worksheets (as labelled Part 1, Part 2 and Part 3). Click on each worksheet tab and complete the tasks outlined within them. Be sure to paste the appropriate Excel output in the spaces provided. You can use additional worksheets for any working if needed.
Once you have completed all three parts, save your work for future reference. Feedback on the exercise will be provided on Blackboard next week; this week's exercises are not for submission.

An antique collector believes that the auction price received for a particular item increases with its age and with the number of bidders. The worksheet labelled Auction contains data on these three variables for 32 recently auctioned comparable items.

1. Use Excel's Data Analysis, Regression to estimate the following simple linear regression models:
Model 1: Y = Auction_Price, X = Age_of_Item
Model 2: Y = Auction_Price, X = Number_Bidders
Paste your outputs alongside each other where indicated below (you do not need to tick the residuals/plot checkboxes).

(exercises continue below)

Paste output for Model 1 below:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.730233212
R Square             0.533240544
Adjusted R Square    0.517681895
Standard Error       273.0283995
Observations         32

ANOVA
              df    SS             MS             F              Significance F
Regression    1     2554859.011    2554859.011    34.27293462    2.0965E-06
Residual      30    2236335.207    74544.50691
Total         31    4791194.219

                Coefficients    Standard Error    t Stat          P-value        Lower 95%       Upper 95%
Intercept       -191.6575698    263.8865984       -0.726287621    0.473291422    -730.5858994    347.2707598
X Variable 1    10.47909492     1.789979792       5.854309064     2.0965E-06     6.823468506     14.13472133
(The Lower 95.0% and Upper 95.0% columns repeat the Lower 95% and Upper 95% values.)

RESIDUAL OUTPUT (Model 1, observations 20 to 32)
Observation    Predicted Y    Residuals
20             940.0846814    -211.0846814
21             1306.853004    -452.8530036
22             1767.93318     -174.93318
23             971.5219662    203.4780338
24             1243.978434    469.0215659
25             1841.286844    -485.2868445
26             1443.081238    378.9187625
27             1505.955807    378.0441929
28             1034.396536    -10.3965357
29             1589.788566    541.2114336
30             971.5219662    -186.5219662
31             1411.643953    -319.6439528
32             1736.495895    304.5041047

Paste output for Model 2 below:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.394640355
R Square             0.15574101
Adjusted R Square    0.127599044
Standard Error       367.1969858
Observations         32

ANOVA
              df    SS             MS             F              Significance F
Regression    1     746185.4264    746185.4264    5.534119687    0.025403899
Residual      30    4045008.792    134833.6264
Total         31    4791194.219

                Coefficients    Standard Error    t Stat         P-value        Lower 95%      Upper 95%
Intercept       806.4049256     230.684572        3.49570376     0.001493746    335.2841798    1277.525671
X Variable 1    54.63620453     23.22502811       2.352470975    0.025403899    7.204369473    102.0680396
(The Lower 95.0% and Upper 95.0% columns repeat the Lower 95% and Upper 95% values.)

(a) According to your regression outputs, fill in the following blanks.

Model 1: Predicted Auction_Price_i = -191.66 + 10.48*Age_of_Item_i
Model 2: Predicted Auction_Price_i = 806.40 + 54.64*Number_Bidders_i

(b) Interpret the estimated slopes of Age_of_Item and Number_Bidders:
Age_of_Item: When the age of the item increases by 1 year, the predicted auction price increases by $10.48.
Number_Bidders: When the number of bidders increases by 1, the predicted auction price increases by $54.64.

(end of Part 1)
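The simple regressions in Part 1 can also be reproduced outside Excel as a cross-check. The following is a minimal sketch, assuming the 32 observations listed on the Auction sheet; the function name simple_ols is our own and is not part of the exercise. It uses the textbook formulas slope = Sxy/Sxx and intercept = mean(Y) - slope*mean(X).

```python
# Data copied from the Auction sheet (32 recently auctioned items).
age = [113, 126, 115, 182, 150, 127, 159, 117, 175, 168, 127, 108, 132, 137, 137, 115,
       182, 156, 179, 108, 143, 187, 111, 137, 194, 156, 162, 117, 170, 111, 153, 184]
bidders = [9, 10, 7, 11, 9, 13, 9, 13, 8, 7, 7, 14, 10, 9, 8, 12,
           8, 6, 9, 6, 6, 8, 15, 15, 5, 12, 11, 11, 14, 7, 6, 10]
price = [946, 1336, 744, 1979, 1522, 1235, 1483, 1152, 1545, 1262, 845, 1055,
         1253, 1297, 1147, 1080, 1550, 1047, 1792, 729, 854, 1593, 1175, 1713,
         1356, 1822, 1884, 1024, 2131, 785, 1092, 2041]


def simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = s_xy / s_xx                 # slope
    b0 = y_bar - b1 * x_bar          # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, 1 - sse / s_yy    # R^2 = 1 - SSE/SST


model1 = simple_ols(age, price)      # Y = Auction_Price, X = Age_of_Item
model2 = simple_ols(bidders, price)  # Y = Auction_Price, X = Number_Bidders

for name, (b0, b1, r2) in (("Model 1", model1), ("Model 2", model2)):
    print(f"{name}: intercept={b0:.3f} slope={b1:.3f} R2={r2:.4f}")
```

The printed intercepts, slopes and R2 values can be compared with the Coefficients and R Square entries in the two Excel outputs.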

2. Estimate the following multiple regression model:
Model 3: Y = Auction_Price, X1 = Age_of_Item, X2 = Number_Bidders
Tick the "Line Fit Plots" checkbox and paste your output below. (exercises continue to the right)

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.944834723
R Square             0.892712653
Adjusted R Square    0.885313526
Standard Error       133.1365018
Observations         32

ANOVA
              df    SS          MS          F           Significance F
Regression    2     4277160     2138580     120.6511    8.77E-15
Residual      29    514034.5    17725.33
Total         31    4791194

                Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept       -1336.722052    173.3561          -7.71084    1.67E-08    -1691.28     -982.169
X Variable 1    12.73619884     0.90238           14.114      1.6E-14     10.89062     14.58177
X Variable 2    85.8151326      8.705757          9.857286    9.14E-11    68.00986     103.6204
(The Lower 95.0% and Upper 95.0% columns repeat the Lower 95% and Upper 95% values.)

RESIDUAL OUTPUT
Observation    Predicted Y    Residuals
1              874.8046103    71.19539
2              1126.190328    209.8097
3              728.6467428    15.35326
4              1925.232596    53.7674
5              1346.043967    175.956
6              1396.371925    -161.372
7              1460.669757    22.33024
8              1269.009936    -117.01
9              1578.633806    -33.6338
10             1403.665281    -141.665
11             881.4811289    -36.4811
12             1240.199279    -185.199
13             1202.607521    50.39248
14             1180.473383    116.5266
15             1094.65825     52.34175
16             1157.722406    -77.7224
17             1667.787198    -117.787
18             1165.015763    -118.016
19             1715.393734    76.60627
20             553.6782183    175.3218
21             999.4451777    -145.445
22             1731.468192    -138.468
23             1364.223008    -189.223
24             1695.364178    17.63582
25             1563.176186    -207.176
26             1679.906558    142.0934
27             1670.508619    213.4914
28             1097.379671    -73.3797
29             2029.843607    101.1564
30             677.7019474    107.2981
31             1126.807166    -34.8072
32             1864.889861    176.1101

[Figure: X Variable 1 Line Fit Plot. Y and Predicted Y ($0 to $3,000) against X Variable 1 (0 to 250).]
[Figure: X Variable 2 Line Fit Plot. Y and Predicted Y ($0 to $3,000) against X Variable 2 (0 to 20).]

(a) Complete the blanks:

Model 3: Predicted Auction_Price_i = -1336.72 + 12.74*Age_of_Item_i + 85.82*Number_Bidders_i

(b) Interpret the estimated regression intercept and slopes.

Intercept: b0 = -1336.72. When the age of the item is 0 and there are no bidders, the predicted auction price is -$1,336.72. This has no practical meaning, because no item in the data has an age or a number of bidders near 0.

Slope for X1: b1 = 12.74

Slope for X2: b2 = 85.82

(c) Why are the estimated slopes of Age_of_Item and Number_Bidders different from those obtained in the simple regression?

Because a simple linear regression includes only one explanatory variable, while multiple linear regression takes all of the variables into account, so each slope is estimated holding the other variable constant.

(d) Of the three models formulated, which model is more plausible? Why? (Provide both an intuitive and statistical explanation.)

Model 1: R^2 = 0.53
Model 2: R^2 = 0.16
Model 3: R^2 = 0.89

This clearly shows that using both variables provides a far stronger relation than using a single variable.

(end of Part 2)
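To see why the slopes change between the simple and multiple regressions, the two-variable least-squares solution can be written in closed form from sums of cross-products of deviations. The sketch below is illustrative only: it assumes the Auction sheet data, and the helper name centered_sum is our own. It also checks the identity "simple slope of age = b1 + b2 * (slope of Number_Bidders regressed on Age_of_Item)", which follows directly from the normal equations.

```python
# Data copied from the Auction sheet (32 recently auctioned items).
age = [113, 126, 115, 182, 150, 127, 159, 117, 175, 168, 127, 108, 132, 137, 137, 115,
       182, 156, 179, 108, 143, 187, 111, 137, 194, 156, 162, 117, 170, 111, 153, 184]
bidders = [9, 10, 7, 11, 9, 13, 9, 13, 8, 7, 7, 14, 10, 9, 8, 12,
           8, 6, 9, 6, 6, 8, 15, 15, 5, 12, 11, 11, 14, 7, 6, 10]
price = [946, 1336, 744, 1979, 1522, 1235, 1483, 1152, 1545, 1262, 845, 1055,
         1253, 1297, 1147, 1080, 1550, 1047, 1792, 729, 854, 1593, 1175, 1713,
         1356, 1822, 1884, 1024, 2131, 785, 1092, 2041]


def centered_sum(u, v):
    """Sum of cross-products of deviations from the means (S_uv)."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


s11 = centered_sum(age, age)
s22 = centered_sum(bidders, bidders)
s12 = centered_sum(age, bidders)      # non-zero: the two X variables are correlated
s1y = centered_sum(age, price)
s2y = centered_sum(bidders, price)

# Closed-form solution of the normal equations for two predictors.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det    # slope of Age_of_Item in Model 3
b2 = (s11 * s2y - s12 * s1y) / det    # slope of Number_Bidders in Model 3
n = len(price)
b0 = sum(price) / n - b1 * sum(age) / n - b2 * sum(bidders) / n

# Simple-regression slope of age (Model 1) and the value implied by Model 3.
simple_slope_age = s1y / s11
implied_slope_age = b1 + b2 * (s12 / s11)

print(f"Model 3: b0={b0:.3f} b1={b1:.4f} b2={b2:.4f}")
print(f"Model 1 slope={simple_slope_age:.4f}, implied by Model 3={implied_slope_age:.4f}")
```

If s12 were exactly zero (uncorrelated X variables), the simple and multiple regression slopes would coincide; here they differ because Age_of_Item and Number_Bidders are correlated in the sample.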

3. (a) Interpret the R2 for Model 3 in Part 2.

R^2 = 0.8927. This means 89.27% of the variation in the auction prices is explained by the variation in the age of the item and the variation in the number of bidders.

(b) R2 is the square of the correlation between the actual and fitted Y values (Y denotes the dependent variable, Auction_Price). Calculate this correlation using a formula, and verify that this is indeed equal to the R2 of the model. N.B. The fitted values of Y have been provided for you in the "Predicted Auction_Price" column under the "RESIDUAL OUTPUT" section of your Model 3 output.

R^2 = (correlation between actual and predicted Y)^2
Formula: =CORREL(predicted Y, actual Y), using the "Predicted Y" column of the residual output.
Correlation: 0.944835
Correlation squared: 0.892713

(c) Based on (b), why is it desirable to have an R2 as high as possible? Why can't the values of R2 exceed 1?

The higher the R^2, the stronger the linear relationship between the actual and fitted values, which makes accurate prediction of Y values more feasible. R^2 cannot exceed 1 because the correlation R always lies between -1 and 1, so its square never exceeds 1.

(d) Although the R2 value is one of the most frequently quoted values from a regression analysis, it does have one serious drawback: R2 can only increase when additional explanatory variables are added to the model. Let's demonstrate this issue by adding a third X variable to the model which has NO relationship to the dependent variable, yet despite this, R2 will increase, suggesting that more of Y has been explained by adding this extra variable.

(d) (i) First, create the irrelevant variable as 32 random numbers. Go to Data Analysis, Random Number Generation. Enter the following:

Random Variables: 1
Random Numbers: 32
Distribution: Uniform
Random Seed: 1234
Output Range: D4 (of the Auction sheet, so the random numbers are pasted under the "X_Random" heading)

Now use Excel to calculate the coefficient of correlation between Auction_Price and X_Random:

Correlation: -0.08548

Your correlation figure should be close to 0, meaning there is no linear relationship between Y and this random X variable.

(d) (ii) Next, regress Auction_Price against Age_of_Item, Number_Bidders and X_Random. Paste your output below (you do not need to tick the Residuals/plot checkboxes). Call this Model 4. (exercises continue to the right)

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.947285587
R Square             0.897349983
Adjusted R Square    0.886351767
Standard Error       132.5324981
Observations         32

ANOVA
              df    SS             MS          F          Significance F
Regression    3     4299378.053    1433126     81.5905    5.94E-14
Residual      28    491816.1656    17564.86
Total         31    4791194.219

                Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept       -1292.436463    177.0049059       -7.3017     5.98E-08    -1655.01     -929.858
X Variable 1    12.67697262     0.899828844       14.0882     3.1E-14     10.83376     14.52019
X Variable 2    86.44666957     8.684433648       9.954209    1.07E-10    68.65741     104.2359
X Variable 3    -100.6129794    89.45826985       -1.12469    0.270269    -283.86      82.63398
(The Lower 95.0% and Upper 95.0% columns repeat the Lower 95% and Upper 95% values.)

Test whether there is a relationship between X_Random and Auction_Price, and hence whether X_Random should be included in the regression model to explain Auction_Price. Perform the test by completing the gaps in the steps outlined below.

Model 4: Auction_Price = b0 + b1*Age_of_Item + b2*Number_Bidders + b3*X_Random

Step 1: Define H0 and H1
H0: b3 = 0. There is no relationship between X_Random and Auction_Price.
H1: b3 does not equal 0. There is a relationship between X_Random and Auction_Price.

Step 2: Significance level
a = 0.05

Step 3: p-value
p-value = 0.27

Step 4: Conclusion
The decision rule is to reject H0 if the p-value < a. In this case, a is smaller than the p-value (0.05 < 0.27), so we do not reject H0 and conclude that there is insufficient evidence of a relationship between X_Random and Auction_Price.

Is your result as expected? Why / why not?

(d) (iii) Interpret the R2 for Model 4. How does it compare to that for Model 3? Why does this result demonstrate that we should be careful in comparing R2 across regression models with different dependent variables?

R^2 = 0.8973 for Model 4, which is slightly higher than the 0.8927 for Model 3, even though X_Random is not statistically significant.

(d) (iv) Another drawback of R2 is that we cannot compare R2 across models with different dependent variables, e.g. a model for Y=Income with a model for Y=Income per capita, even if they have the same independent variables. Why do you think this is the case? Hint: What does R2 capture?
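The figures reported for Models 3 and 4 can be checked with a few lines of arithmetic. This is a minimal sketch using only numbers from the two Excel outputs; the function name adjusted_r2 is our own, and the critical value 2.048 is the usual two-sided 5% table value for a t distribution with 28 degrees of freedom (an assumption brought in from standard tables, not from the worksheet). It shows that R2 rises when X_Random is added, that Adjusted R Square penalises the extra variable, and that the t statistic for b3 is well inside the non-rejection region.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k explanatory variables and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n = 32
r2_model3 = 0.892712653   # R Square, Model 3 (2 explanatory variables)
r2_model4 = 0.897349983   # R Square, Model 4 (3 explanatory variables)

adj_model3 = adjusted_r2(r2_model3, n, 2)
adj_model4 = adjusted_r2(r2_model4, n, 3)

# t statistic for H0: b3 = 0, from the Model 4 coefficient table.
b3 = -100.6129794
se_b3 = 89.45826985
t_stat = b3 / se_b3

# Two-sided 5% critical value for t with n - 3 - 1 = 28 degrees of freedom
# (table value, assumed here rather than computed).
T_CRIT_28 = 2.048
reject_h0 = abs(t_stat) > T_CRIT_28

print(f"R2:          Model 3 = {r2_model3:.4f}, Model 4 = {r2_model4:.4f}")
print(f"Adjusted R2: Model 3 = {adj_model3:.6f}, Model 4 = {adj_model4:.6f}")
print(f"t statistic for b3 = {t_stat:.5f}, reject H0: {reject_h0}")
```

The critical-value comparison leads to the same decision as comparing the p-value of 0.27 with a = 0.05.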

Auction Price and Potentially Relevant Data

Age_of_Item (in Years)    Auction_Price    Number_Bidders    X_Random
113                       $946             9                 0.124149297
126                       $1,336           10                0.006500443
115                       $744             7                 0.389446699
182                       $1,979           11                0.267281106
150                       $1,522           9                 0.703634754
127                       $1,235           13                0.235511338
159                       $1,483           9                 0.466139714
117                       $1,152           13                0.74794763
175                       $1,545           8                 0.123783074
168                       $1,262           7                 0.40601825
127                       $845             7                 0.608691671
108                       $1,055           14                0.516312143
132                       $1,253           10                0.755973998
137                       $1,297           9                 0.121555223
137                       $1,147           8                 0.354564043
115                       $1,080           12                0.415692618
182                       $1,550           8                 0.203497421
156                       $1,047           6                 0.67198706
179                       $1,792           9                 0.033936583
108                       $729             6                 0.356303598
143                       $854             6                 0.533127842
187                       $1,593           8                 0.965117344
111                       $1,175           15                0.099612415
137                       $1,713           15                0.951170385
194                       $1,356           5                 0.492172002
156                       $1,822           12                0.499588
162                       $1,884           11                0.333414716
117                       $1,024           11                0.521103549
170                       $2,131           14                0.618701743
111                       $785             7                 0.659962767
153                       $1,092           6                 0.086306345
184                       $2,041           10                0.000122074
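The correlations used in the exercise can be recomputed from the data above. The sketch below assumes the four columns exactly as listed; the function name pearson_r is our own and plays the role of Excel's CORREL. For a simple regression the fitted values are a linear function of X, so the squared correlation between Y and X should match the R Square of Models 1 and 2, while the correlation with X_Random should be close to 0.

```python
from math import sqrt

age = [113, 126, 115, 182, 150, 127, 159, 117, 175, 168, 127, 108, 132, 137, 137, 115,
       182, 156, 179, 108, 143, 187, 111, 137, 194, 156, 162, 117, 170, 111, 153, 184]
bidders = [9, 10, 7, 11, 9, 13, 9, 13, 8, 7, 7, 14, 10, 9, 8, 12,
           8, 6, 9, 6, 6, 8, 15, 15, 5, 12, 11, 11, 14, 7, 6, 10]
price = [946, 1336, 744, 1979, 1522, 1235, 1483, 1152, 1545, 1262, 845, 1055,
         1253, 1297, 1147, 1080, 1550, 1047, 1792, 729, 854, 1593, 1175, 1713,
         1356, 1822, 1884, 1024, 2131, 785, 1092, 2041]
x_random = [0.124149297, 0.006500443, 0.389446699, 0.267281106, 0.703634754,
            0.235511338, 0.466139714, 0.74794763, 0.123783074, 0.40601825,
            0.608691671, 0.516312143, 0.755973998, 0.121555223, 0.354564043,
            0.415692618, 0.203497421, 0.67198706, 0.033936583, 0.356303598,
            0.533127842, 0.965117344, 0.099612415, 0.951170385, 0.492172002,
            0.499588, 0.333414716, 0.521103549, 0.618701743, 0.659962767,
            0.086306345, 0.000122074]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


r_age = pearson_r(age, price)
r_bidders = pearson_r(bidders, price)
r_random = pearson_r(x_random, price)

print(f"corr(price, age)      = {r_age:.6f}, squared = {r_age ** 2:.6f}")
print(f"corr(price, bidders)  = {r_bidders:.6f}, squared = {r_bidders ** 2:.6f}")
print(f"corr(price, X_Random) = {r_random:.5f}")
```

The same function applied to Auction_Price and the "Predicted Y" column of the Model 3 residual output gives the correlation asked for in 3(b).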
