A chain of stores invests varying amounts in security from store to store. Losses due to theft are recorded in each of these stores for one year.

store   amt spent on security ($)   theft from the store ($)
1       5000                        4500
2       6500                        2500
3       3000                        6500
4       7000                        2000
5       2500                        7500
6       4500                        5000
7       6000                        3000

(A) Plot a scatter diagram of the data, display the scatter diagram on this sheet only.

(B) Assuming that the two variables are related in a linear way, do the regression analysis using Excel Data Analysis.
Show the regression estimate on a new sheet, name the sheet as Output-SR-SS.
Write a commentary on your analysis by analysing the p-value and R square of the regression output. This commentary should be on the sheet "Output-SR-SS".
(C) Based on the regression analysis, estimate expected losses if the amount spent in security in one store is $4100

[Scatter chart: theft from the store ($) against amount spent on security ($)]

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9973306667
R Square            0.9946684588
Adjusted R Square   0.9936021505
Standard Error      165.5879066801
Observations        7

ANOVA
             df   SS                MS         F          Significance F
Regression   1    25577188.94009    25577189   932.8151   7.06056602E-07
Residual     5    137096.7741935    27419.35
Total        6    25714285.71429

                        Coefficients     Standard Error   t Stat     P-value
Intercept               10350.80645161   203.7547519922   50.80032   5.59E-08
amt spent on security   -1.2016129032    0.0393429384     -30.542    7.06E-07

                        Lower 95%       Upper 95%        Lower 95.0%     Upper 95.0%
Intercept               9827.03818722   10874.57471601   9827.03818722   10874.57471601
amt spent on security   -1.3027471462   -1.1004786603    -1.3027471462   -1.1004786603

Commentary
On the basis of R Square, we can say that the amount spent on security explains 99.47% of the variation in theft from the store.
Based on Significance F (7.06E-07, well below 0.05), we can say that the regression, and hence the R Square value, is significant.
Based on the p-values (both below 0.05), both the intercept and the coefficient are significant.

Estimate
SPENT   4100
THEFT   5424.193548387

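As a cross-check outside Excel, the least-squares fit and the estimate asked for in part (C) can be reproduced in a few lines of plain Python. This is a minimal sketch and not part of the original workbook; the helper name `fit_line` is hypothetical.

```python
# Store data from the table above: security spend ($) and theft losses ($).
spent = [5000, 6500, 3000, 7000, 2500, 4500, 6000]
theft = [4500, 2500, 6500, 2000, 7500, 5000, 3000]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_square = sxy ** 2 / (sxx * syy)  # share of variation in y explained by x
    return intercept, slope, r_square


intercept, slope, r_square = fit_line(spent, theft)
expected_theft = intercept + slope * 4100  # part (C): spend of $4100

print(f"intercept = {intercept:.4f}, slope = {slope:.4f}, R Square = {r_square:.4f}")
print(f"expected theft at $4100 spent = {expected_theft:.2f}")
```

The printed intercept, slope and R Square should agree with the Coefficients and R Square rows of the Excel output.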
A personnel officer in a company wants to examine whether the average monthly absentee rate is somehow related to the number of rainy days for the month in question. To investigate this, the following data is obtained.
rainy days   absentees
23           11
15           18
24           13
12           15
19           16
20           14
22           22

(A) Plot a scatter diagram of the data, display the scatter diagram on this sheet only.

(B) Assuming that the two variables are related in a linear way, do the regression analysis using Excel Data Analysis.
Show the regression estimate on a new sheet, name the sheet as Output-SR-abs.
Write a commentary on your analysis by analysing the p-value and R square of the regression output. This commentary should be on the sheet "Output-SR-abs".
(C) Based on the regression analysis, estimate expected average absentees if the number of rainy days in a month is 12

[Scatter chart: absentees against rainy days]

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.2021155222
R Square            0.0408506843
Adjusted R Square   -0.1509791788
Standard Error      3.8610776724
Observations        7

ANOVA
             df   SS              MS              F              Significance F
Regression   1    3.1746817539    3.1746817539    0.2129526846   0.6638432602
Residual     5    74.5396039604   14.9079207921
Total        6    77.7142857143

             Coefficients    Standard Error   t Stat         P-value
Intercept    18.7698019802   7.0828404964     2.6500387789   0.0454243227
rainy days   -0.1658415842   0.3593783231     -0.461467967   0.6638432602

             Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept    0.5627808555    36.9768231049   0.5627808555    36.9768231049
rainy days   -1.0896529737   0.7579698054    -1.0896529737   0.7579698054

Commentary
On the basis of R Square, we can say that rainy days explain only about 4.09% of the variation in absentees.
On the basis of Significance F (0.6638, greater than 0.05), we can say that the regression, and hence the R Square value, is not significant.
The p-value of the rainy days coefficient (0.6638) is greater than 0.05, so the coefficient is not significant. Only the intercept (p-value 0.0454) is significant at the 5% level.

Estimate
rainy days   12
absentees    16.7797029703

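The significance of the slope can also be judged without a p-value, by comparing its t statistic with a critical value. The sketch below recomputes the slope, its standard error and the t statistic from the rainy-day data. The two-sided 5% critical value for 5 degrees of freedom (about 2.571) comes from a standard t table and is an assumption brought in from outside the workbook.

```python
import math

rainy_days = [23, 15, 24, 12, 19, 20, 22]
absentees = [11, 18, 13, 15, 16, 14, 22]

n = len(rainy_days)
mean_x = sum(rainy_days) / n
mean_y = sum(absentees) / n
sxx = sum((x - mean_x) ** 2 for x in rainy_days)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rainy_days, absentees))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares, then the standard error of the slope
# (the residual variance uses n - 2 degrees of freedom).
ss_residual = sum(
    (y - (intercept + slope * x)) ** 2 for x, y in zip(rainy_days, absentees)
)
slope_se = math.sqrt(ss_residual / (n - 2) / sxx)
t_stat = slope / slope_se

T_CRITICAL_5PCT_DF5 = 2.571  # assumption: two-sided 5% value from a t table
slope_is_significant = abs(t_stat) > T_CRITICAL_5PCT_DF5

print(f"slope = {slope:.4f}, SE = {slope_se:.4f}, t = {t_stat:.4f}")
print("slope significant at 5%:", slope_is_significant)
```

Because the absolute t statistic is far below the critical value, the slope is not distinguishable from zero at the 5% level, which agrees with the P-value of 0.6638 in the output.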
(A) Plot a scatter diagram of the data, display the scatter diagram on this sheet only.

(B) Assuming that the two variables are related in a linear way, do the regression analysis using Excel Data Analysis.
Show the regression estimate on a new sheet, name the sheet as Output-SR-hw.
Write a commentary on your analysis by analysing the p-value and R square of the regression output. This commentary should be on the sheet "Output-SR-HW".
(C) Based on the regression analysis, estimate expected losses if the amount spent in security in one store is $4100

Husband Wife
186 175
180 168
160 154
186 166
163 162
172 152
192 179
170 163
174 172
191 170
182 170
178 147
181 165
168 162
162 154
188 166
168 167
183 174
188 173
166 164
180 163
176 163
185 171
169 161
182 167
162 160
169 165
176 167
180 175
157 157
170 172
186 181
180 166
188 181
153 148
179 169
175 170
165 157
156 162
185 174
172 168
166 162
179 159
181 155
176 171
170 159
165 164
183 175
162 156
192 180
185 167
163 157
185 167
170 157
176 168
176 167
160 145
167 156
157 153
180 162
172 156
184 174
185 160
165 152
181 175
170 169
161 149
188 176
181 165
156 143
161 158
152 141
179 160
170 149
170 160
165 148
165 154
169 171
171 165
192 175
176 161
168 162
169 162
184 176
171 160
161 158
185 175
184 174
179 168
184 177
175 158
173 161
164 146
181 168
187 178
181 170

[Scatter chart titled "Wife": Wife plotted against Husband]

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.763386397
R Square            0.5827587911
Adjusted R Square   0.5783200548
Standard Error      5.9280089458
Observations        96

ANOVA
             df   SS                 MS                 F                  Significance F
Regression   1    4613.6770675494    4613.6770675494    131.2893482132     1.53635933215E-19
Residual     94   3303.2812657839    35.1412900615
Total        95   7916.9583333333

            Coefficients    Standard Error   t Stat          P-value
Intercept   41.9301535479   10.661622663     3.9328116247    0.0001605824
Husband     0.6996537352    0.0610616325     11.4581564055   1.536359332145E-19

            Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept   20.7612518221   63.0990552737   20.7612518221   63.0990552737
Husband     0.5784144326    0.8208930379    0.5784144326    0.8208930379

Commentary
On the basis of R Square, we can say that 58.28% of the variation in the dependent variable (Wife) is explained by the independent variable (Husband).
On the basis of Significance F (1.54E-19, well below 0.05), we can say that the regression, and hence the R Square value, is significant.
As the p-value of the Husband coefficient is smaller than 0.05, we can say that it is significant.

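Several figures in this output are tied together by simple identities, which allows a quick sanity check of a regression table. R Square is the regression SS divided by the total SS, each t Stat is the coefficient divided by its standard error, and with a single predictor F equals the square of the slope's t Stat. The sketch below uses only numbers printed in the output; the variable names are illustrative.

```python
# Values copied from the SUMMARY OUTPUT for the Husband/Wife regression.
ss_regression = 4613.6770675494
ss_total = 7916.9583333333
f_stat = 131.2893482132
husband_coef = 0.6996537352
husband_se = 0.0610616325
husband_lower95 = 0.5784144326
husband_upper95 = 0.8208930379

r_square = ss_regression / ss_total   # share of variation explained
t_stat = husband_coef / husband_se    # t Stat of the Husband coefficient
f_from_t = t_stat ** 2                # equals F when there is one predictor

# The 95% interval is coefficient +/- t_critical * standard error,
# so the critical value can be backed out of the printed bounds.
t_critical = (husband_upper95 - husband_lower95) / (2 * husband_se)

print(f"R Square = {r_square:.6f}")
print(f"t Stat = {t_stat:.4f}, t^2 = {f_from_t:.4f}, F = {f_stat:.4f}")
print(f"implied t critical (94 df) = {t_critical:.4f}")
```

If any of these identities failed, it would point to a copying error in the table rather than to a property of the data.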
Variable   Description
Age        Age of the clock (years)
Bidders    Number of individuals participating in the bidding
Price      Selling price (pounds sterling)

Question
The data give the selling price at auction of 32 antique grandfather clocks. Also recorded is the age of the clock and the number of people who made a bid.
Identify the independent and dependent variables.
Do the regression using Excel's built-in tool and create the output on a new sheet - name that sheet as "Output-MR-Clock" and write commentary, that should cover the following
Q1> Is price related to Age & Bidders? (Hint: This analysis can be done using the R square of regression and the p values of the coefficients)
Q2> Which Independent variables are significant at 95% confidence Level
Q3> Which Independent variables are significant at 99% confidence Level
Q4> Write the equation of the price of clock in terms of Age of clock and number of bidders. This should be in the sheet "Output-MR-Clock"
Q5> As per your regression analysis, How much should be the price of a clock, when the number of bidders is 10 and age is 150

Age Bidders Price
127 13 1235
115 12 1080
127 7 845
150 9 1522
156 6 1047
182 11 1979
156 12 1822
132 10 1253
137 9 1297
113 9 946
137 15 1713
117 11 1024
137 8 1147
153 6 1092
117 13 1152
126 10 1336
170 14 2131
182 8 1550
162 11 1884
184 10 2041
143 6 854
159 9 1483
108 14 1055
175 8 1545
108 6 729
179 9 1792
111 15 1175
187 8 1593
111 7 785
115 7 744
194 5 1356
168 7 1262

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9448347227
R Square            0.8927126533
Adjusted R Square   0.8853135259
Standard Error      133.1365018143
Observations        32

ANOVA
             df   SS                 MS                F                Significance F
Regression   2    4277159.70340504   2138579.851703    120.6510727354   8.76906568825E-15
Residual     29   514034.515344964   17725.32811534
Total        31   4791194.21875

            Coefficients      Standard Error   t Stat          P-value
Intercept   -1336.722052143   173.3561260683   -7.7108440438   1.674205397E-08
Age         12.7361988409     0.9023804868     14.1140007209   1.597895546E-14
Bidders     85.8151326045     8.7057568147     9.8572857514    9.135309451E-11

            Lower 95%          Upper 95%         Lower 95.0%        Upper 95.0%
Intercept   -1691.2751398231   -982.1689644628   -1691.2751398231   -982.1689644628
Age         10.8906235209      14.581774161      10.8906235209      14.581774161
Bidders     68.0098607098      103.6204044991    68.0098607098      103.6204044991

Commentary
Both Age and Bidders are significant at the 95% confidence level.
Both Age and Bidders are significant at the 99% confidence level.

Price = -1336.7221 + 12.7362*Age + 85.8151*Bidders

Bidders     10
Clock age   150
Price       1431.8591000405

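The Q5 estimate and the Adjusted R Square figure both follow directly from numbers in the output. The sketch below evaluates the fitted equation for a 150-year-old clock with 10 bidders and recomputes Adjusted R Square from R Square, the number of observations and the number of predictors. The function names are hypothetical.

```python
# Coefficients copied from the clock regression output.
INTERCEPT = -1336.722052143
AGE_COEF = 12.7361988409
BIDDERS_COEF = 85.8151326045


def predict_price(age, bidders):
    """Estimated selling price (pounds sterling) from the fitted equation."""
    return INTERCEPT + AGE_COEF * age + BIDDERS_COEF * bidders


def adjusted_r_square(r_square, n_obs, n_predictors):
    """Adjusted R Square: R Square penalised for the number of predictors."""
    return 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_predictors - 1)


price = predict_price(age=150, bidders=10)
adj_r2 = adjusted_r_square(0.8927126533, n_obs=32, n_predictors=2)

print(f"estimated price = {price:.2f}")
print(f"adjusted R Square = {adj_r2:.6f}")
```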
The data give the volume (cubic feet), height (feet) and diameter (inches) (at 54 inches above ground) for a sample of 31 black cherry trees in the Allegheny National Forest, Pennsylvania.
The data were collected in order to find an estimate for the volume of a tree (and therefore the timber yield), given its height and diameter.
Identify the independent and dependent variables.
Do the regression using Excel's built-in tool and create the output on a new sheet - name that sheet as "Output-MR-Tree" and write commentary, that should cover the following
Q1> Is Volume related to Diameter and Height of the tree (Hint: This analysis can be done using the R square of regression and the p values of the coefficients)
Q2> Which Independent variables are significant at 95% confidence Level
Q3> Which Independent variables are significant at 99% confidence Level
Q4> Write the equation of the volume of tree in terms of Diameter and Height of tree. This should be in the sheet "Output-MR-Tree"
Q5> As per your regression analysis, How much should be the volume of a tree, when the diameter is 10 and height is 80

Diameter Height Volume
8.3 70 10.3
8.6 65 10.3
8.8 63 10.2
10.5 72 16.4
10.7 81 18.8
10.8 83 19.7
11 66 15.6
11 75 18.2
11.1 80 22.6
11.2 75 19.9
11.3 79 24.2
11.4 76 21
11.4 76 21.4
11.7 69 21.3
12 75 19.1
12.9 74 22.2
12.9 85 33.8
13.3 86 27.4
13.7 71 25.7
13.8 64 24.9
14 78 34.5
14.2 80 31.7
14.5 74 36.3
16 72 38.3
16.3 77 42.6
17.3 81 55.4
17.5 82 55.7
17.9 80 58.3
18 80 51.5
18 80 51
20.6 87 77

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9736272581
R Square            0.9479500378
Adjusted R Square   0.9442321833
Standard Error      3.8818320381
Observations        31

ANOVA
             df   SS               MS                F                Significance F
Regression   2    7684.162511745   3842.0812558727   254.9723374107   1.07123772981E-18
Residual     28   421.9213592224   15.0686199722
Total        30   8106.083870968

            Coefficients     Standard Error   t Stat          P-value
Intercept   -57.9876589184   8.6382258653     -6.7129130243   0.000000275
Diameter    4.708160503      0.2642646094     17.8160840883   8.223303688648E-17
Height      0.3392512342     0.1301511807     2.6065935969    0.0144909745

            Lower 95%        Upper 95%        Lower 95.0%      Upper 95.0%
Intercept   -75.6822624733   -40.2930553635   -75.6822624733   -40.2930553635
Diameter    4.1668389898     5.2494820163     4.1668389898     5.2494820163
Height      0.0726486262     0.6058538423     0.0726486262     0.6058538423

Commentary
Both independent variables, Diameter and Height, are significant at the 95% confidence level.
Only one independent variable, Diameter, is significant at the 99% confidence level; Height is not, because its p-value (0.0145) is greater than 0.01.

Volume = -57.9877 + 4.7082*Diameter + 0.3393*Height

Diameter   10
Height     80
Volume     16.2340448514

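Questions Q2, Q3 and Q5 reduce to two mechanical steps: compare each p-value with alpha = 1 - confidence level, and plug the inputs into the fitted equation. A minimal sketch using the coefficients and p-values from the output; the function names are hypothetical.

```python
# Coefficients and p-values copied from the tree regression output.
coefficients = {
    "Intercept": -57.9876589184,
    "Diameter": 4.708160503,
    "Height": 0.3392512342,
}
p_values = {"Diameter": 8.223303688648e-17, "Height": 0.0144909745}


def significant_at(confidence_level):
    """Predictors whose p-value is below alpha = 1 - confidence level."""
    alpha = 1 - confidence_level
    return [name for name, p in p_values.items() if p < alpha]


def predict_volume(diameter, height):
    """Estimated volume (cubic feet) from the fitted equation."""
    return (
        coefficients["Intercept"]
        + coefficients["Diameter"] * diameter
        + coefficients["Height"] * height
    )


volume = predict_volume(diameter=10, height=80)

print("significant at 95%:", significant_at(0.95))
print("significant at 99%:", significant_at(0.99))
print(f"estimated volume = {volume:.4f}")
```

Height passes the 95% test but not the 99% test because its p-value lies between 0.01 and 0.05.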
Date Stock A Stock B Stock C A's return B's return C's return
1-Jan-97 186 193 112
1-Jan-98 188 193 109 1.08% 0.00% -2.68%
1-Jan-99 187 194 111 -0.53% 0.52% 1.83%
1-Jan-00 183 198 116 -2.14% 2.06% 4.50%
31-Dec-00 183 196 114 0.00% -1.01% -1.72%
31-Dec-01 185 194 117 1.09% -1.02% 2.63%
31-Dec-02 183 193 114 -1.08% -0.52% -2.56%
31-Dec-03 184 195 116 0.55% 1.04% 1.75%
30-Dec-04 186 193 115 1.09% -1.03% -0.86%
30-Dec-05 185 190 116 -0.54% -1.55% 0.87%
30-Dec-06 186 191 118 0.54% 0.53% 1.72%
30-Dec-07 190 192 117 2.15% 0.52% -0.85%
29-Dec-08 188 194 115 -1.05% 1.04% -1.71%
29-Dec-09 185 194 118 -1.60% 0.00% 2.61%
29-Dec-10 181 194 118 -2.16% 0.00% 0.00%
29-Dec-11 178 195 113 -1.66% 0.52% -4.24%
28-Dec-12 175 198 109 -1.69% 1.54% -3.54%
28-Dec-13 174 198 112 -0.57% 0.00% 2.75%
28-Dec-14 174 197 112 0.00% -0.51% 0.00%
28-Dec-15 178 199 108 2.30% 1.02% -3.57%
Use the data in the sheet named "Data for Corr & Covar". Perform correlation analysis for Stock A, B and C.
Show the correlation matrix below. This matrix is the output of correlation analysis.

             A's return   B's return   C's return
A's return   1            -0.21511     -0.14453
B's return   -0.21511     1            -0.0023
C's return   -0.14453     -0.0023      1

Use the data in the sheet named "Data for Corr & Covar". Perform covariance analysis for Stock A, B and C.
Show the covariance matrix below. This matrix is the output of covariance analysis.

             A's return   B's return   C's return
A's return   0.000177
B's return   -2.66E-05    8.64E-05
C's return   -4.75E-05    -5.27E-07    0.000609

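The returns, covariances and correlations can be recomputed from the price table as a cross-check. The sketch below assumes simple period-over-period returns (price divided by previous price, minus 1) and a population covariance that divides by n; dividing by n reproduces the 0.000177 variance shown for Stock A, which suggests that this is what the Excel tool reported. The function names are hypothetical.

```python
import math

# Prices from the "Data for Corr & Covar" table, in date order.
prices = {
    "A": [186, 188, 187, 183, 183, 185, 183, 184, 186, 185,
          186, 190, 188, 185, 181, 178, 175, 174, 174, 178],
    "B": [193, 193, 194, 198, 196, 194, 193, 195, 193, 190,
          191, 192, 194, 194, 194, 195, 198, 198, 197, 199],
    "C": [112, 109, 111, 116, 114, 117, 114, 116, 115, 116,
          118, 117, 115, 118, 118, 113, 109, 112, 112, 108],
}


def simple_returns(series):
    """Return of each period relative to the previous price."""
    return [curr / prev - 1 for prev, curr in zip(series, series[1:])]


def covariance(x, y):
    """Population covariance (divides by n, an assumption about the Excel tool)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def correlation(x, y):
    """Pearson correlation; the divisor used for covariance cancels out."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


returns = {name: simple_returns(series) for name, series in prices.items()}
names = list(returns)

print("correlation matrix")
for row in names:
    print(row, [round(correlation(returns[row], returns[col]), 5) for col in names])

print("covariance matrix")
for row in names:
    print(row, [f"{covariance(returns[row], returns[col]):.3g}" for col in names])
```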
Show scatter plot for X & Y below.
1. Do the curve fitting
2. Show the equation of curve
3. Show the R square value on the curve

[Scatter chart of Y against X with fitted curve: f(x) = 0.9812240503 ln(x) + 0.7521906667, R² = 0.9828307139]

X Y
1 1.2781522025
2 1.064710737
3 2.0162354658
4 1.8578592709
5 2.1400661635
6 2.3360198691
7 2.7479117345
8 2.7067159781
9 2.9470671016
10 2.9193910403
11 3.151025158
12 3.1072736483
13 3.1875915348
14 3.39316521
15 3.4483989831
16 3.421000009
17 3.579343567
18 3.6333667442
19 3.6704608686
20 3.734808386
21 3.6958550679
22 3.7502097093
23 3.8592549399
24 3.8420296336
25 3.9489333594
26 3.977810746
27 4.0167433162
28 3.9930498441
29 4.0853038176
30 4.1239033645
31 4.0961762171
32 4.1756172824
33 4.2149384585
34 4.2433391149
35 4.2681578035
36 4.2587284183
37 4.321081649
38 4.354526961
39 4.342375829
40 4.3642444649
41 4.3942022106
42 4.4442969604
43 4.4413563007
44 4.4552770964
45 4.5112989704
46 4.5046866474
47 4.5275329163
48 4.5518748889
49 4.5740922304
50 4.6220273031
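
A logarithmic trendline of the form y = a ln(x) + b is an ordinary straight-line fit of y against ln(x). The sketch below applies that idea to the 50 points listed above, assuming this is how the chart's curve was fitted; under that assumption the results should land close to the equation and R² shown on the chart.

```python
import math

# Y values for X = 1..50, copied from the table above.
y = [
    1.2781522025, 1.064710737, 2.0162354658, 1.8578592709, 2.1400661635,
    2.3360198691, 2.7479117345, 2.7067159781, 2.9470671016, 2.9193910403,
    3.151025158, 3.1072736483, 3.1875915348, 3.39316521, 3.4483989831,
    3.421000009, 3.579343567, 3.6333667442, 3.6704608686, 3.734808386,
    3.6958550679, 3.7502097093, 3.8592549399, 3.8420296336, 3.9489333594,
    3.977810746, 4.0167433162, 3.9930498441, 4.0853038176, 4.1239033645,
    4.0961762171, 4.1756172824, 4.2149384585, 4.2433391149, 4.2681578035,
    4.2587284183, 4.321081649, 4.354526961, 4.342375829, 4.3642444649,
    4.3942022106, 4.4442969604, 4.4413563007, 4.4552770964, 4.5112989704,
    4.5046866474, 4.5275329163, 4.5518748889, 4.5740922304, 4.6220273031,
]
x = list(range(1, len(y) + 1))

# Transform x, then do ordinary least squares of y on ln(x).
log_x = [math.log(value) for value in x]
n = len(y)
mean_lx = sum(log_x) / n
mean_y = sum(y) / n
s_ll = sum((lx - mean_lx) ** 2 for lx in log_x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_ly = sum((lx - mean_lx) * (yi - mean_y) for lx, yi in zip(log_x, y))

a = s_ly / s_ll                       # coefficient of ln(x)
b = mean_y - a * mean_lx              # constant term
r_square = s_ly ** 2 / (s_ll * s_yy)  # R square of the transformed fit

print(f"f(x) = {a:.4f} ln(x) + {b:.4f}, R^2 = {r_square:.4f}")
```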

Show scatter plot for X & Y below.
1. Do the curve fitting
2. Show the equation of curve
3. Show the R square value on the curve

[Scatter chart of Y against X with fitted line: f(x) = 10.2722568017x + 9.1440494061, R² = 0.9969799582]

X Y
79 799.33
7 76.58
97 996.24
99 1021.06
36 375
12 127.64
78 859.03
20 218.97
6 63.7
20 212.58
40 418.89
64 641.9
38 411.71
24 256.9
78 781.52
20 212.48
86 892.29
34 358.74
100 1040.05
51 557.43
10 104.15
63 689.33
17 179.66
58 584.59
89 896.79
54 542.7
9 100
39 405.51
33 361.37
81 821.42
91 912.47
1 11.67
43 456.32
69 720.44
36 386.62
79 822.01
32 323.24
66 712.09
41 421.73
64 683.8
57 603.51
84 903.05
50 537.3
47 517.7
11 122.68
68 715.36
17 176.37
68 710.87
82 850.62
19 203.45