Readings
superannuation data

Years in    Gender    Salary    Superannuation
Workforce             $'000     Balance $'000
25          1          50.6     117.9
31          1          75.2     417.1
37          1          48.3     156.2
37          1          52.3     202.9
40          1         106.2     506.2
30          1          61.3     255
32          1          52.6     179.8
26          1          48.9      82.6
29          1          42.6      47.3
36          1          89.5     488.5
28          1          33.1      70.5
29          1          35.6     120.1
10          0          31.2      15.6
15          0          33.9       8.9
30          0          49.7     124.7
28          0          69.3     223.4
35          0          86.4     301.6
17          0          28.1      52.8
25          0          46.2      67.9
22          0          50.7      89.5
Regression equation
\[ Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + b_3 X_{3i} \]
\(X_{1i}\) = years in workforce
\(X_{2i}\) = gender
\(X_{3i}\) = salary $'000
We can no longer draw this as a line because it is in four-dimensional space. From the Excel output below we see that the equation should be
\[ \hat{Y}_i = -196.074 - 0.605 X_{1i} + 59.587 X_{2i} + 6.481 X_{3i} \]

Excel output
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.951399308
R Square             0.905160643
Adjusted R Square    0.887378264
Standard Error       50.11767261
Observations         20

ANOVA
              df    SS             MS          F           Significance F
Regression     3    383564.8798    127855      50.90211    2.09087E-08
Residual      16    40188.49773    2511.781
Total         19    423753.3775

                        Coefficients     Standard Error    t Stat       P-value
Intercept               -196.0738799     45.52498098       -4.306951    0.000543
Years in Workforce      -0.604845409     2.633748738       -0.229652    0.821272
Gender                  59.58681943      29.84990611       1.996215     0.06322
Current Salary $'000    6.480588884      0.799454349       8.106265     4.67E-07

Estimate 1
So a male employee with a current salary of $50,000 who has worked for 30 years is estimated to have a superannuation balance of 169.397, or $169,397.

Estimate 2
or only $109,810
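The two estimates can be checked by substituting into the fitted equation. The sketch below is only an illustration: the function name `predict_super` is made up here, and the coding gender = 1 for a male employee is inferred from Estimate 1 rather than stated on the slides.

```python
# Coefficients copied from the Excel output (salary and balance in $'000).
INTERCEPT = -196.0738799
B_YEARS = -0.604845409
B_GENDER = 59.58681943
B_SALARY = 6.480588884


def predict_super(years, gender, salary_k):
    """Estimated superannuation balance in $'000 (hypothetical helper)."""
    return INTERCEPT + B_YEARS * years + B_GENDER * gender + B_SALARY * salary_k


# 30 years in the workforce, current salary $50,000.
estimate_1 = predict_super(30, 1, 50.0)  # gender = 1 (assumed to mean male)
estimate_2 = predict_super(30, 0, 50.0)  # gender = 0, everything else equal

print(round(estimate_1, 3), round(estimate_2, 3))
```

The two results differ by exactly the gender coefficient, about 59.587, which is the gap between $169,397 and $109,810.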
Interpretation of coefficients
residual plots
We should check the residuals against Yi
plus (separately) against each of the
independent variables. These plots are
shown below and on the next slide.
residuals
[Plot: Residuals against predicted super. Check either side of the 0 level.]

Observation    predicted super    Residuals
 1             176.3096018        -58.40960183
 2             332.1030159        84.99698407
 3             154.1461025        2.053897502
 4             180.068458         22.83154197
 5             527.5576627        -21.35766266
 6             242.6276759        12.37232415
 7             185.0368617        -5.236861742
 8             164.6877553        -82.08775532
 9             122.0455091        -74.74550913
10             421.7512099        66.74879007
11             61.08476014        9.415239862
12             76.68138694        43.41861306
13             0.072039187        15.52796081
14             14.54540213        -5.645402131
15             107.8660254        16.83397463
16             236.0952583        -12.69525831
17             342.6794104        -41.07941037
18             -24.25170421       77.05170421
19             88.20819132        -20.30819132
20             119.1853775        -29.68537752

[Plots: Residuals against each of the independent variables, including Years in Workforce and Gender.]

Residuals normal?
[Histogram: Frequency of the residuals.]
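Each residual is the observed balance minus the balance predicted by the fitted equation. The sketch below recomputes them from the data and the Excel coefficients; the variable names are illustrative only.

```python
# (years in workforce, gender, salary $'000, superannuation balance $'000)
DATA = [
    (25, 1, 50.6, 117.9), (31, 1, 75.2, 417.1), (37, 1, 48.3, 156.2),
    (37, 1, 52.3, 202.9), (40, 1, 106.2, 506.2), (30, 1, 61.3, 255.0),
    (32, 1, 52.6, 179.8), (26, 1, 48.9, 82.6), (29, 1, 42.6, 47.3),
    (36, 1, 89.5, 488.5), (28, 1, 33.1, 70.5), (29, 1, 35.6, 120.1),
    (10, 0, 31.2, 15.6), (15, 0, 33.9, 8.9), (30, 0, 49.7, 124.7),
    (28, 0, 69.3, 223.4), (35, 0, 86.4, 301.6), (17, 0, 28.1, 52.8),
    (25, 0, 46.2, 67.9), (22, 0, 50.7, 89.5),
]

# Coefficients from the Excel output: intercept, years, gender, salary.
B0, B1, B2, B3 = -196.0738799, -0.604845409, 59.58681943, 6.480588884

predicted = [B0 + B1 * yrs + B2 * g + B3 * sal for yrs, g, sal, _ in DATA]
residuals = [row[3] - fit for row, fit in zip(DATA, predicted)]

for obs, (fit, res) in enumerate(zip(predicted, residuals), start=1):
    print(f"{obs:2d}  {fit:12.4f}  {res:12.4f}")
```

Plotting `residuals` against `predicted`, and against each column of `DATA`, gives the checks described on the slide.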
                        Coefficients     Standard Error    t Stat      P-value
Intercept               -196.0738799     45.52498098       -4.30695    0.000543
Years in Workforce      -0.604845409     2.633748738       -0.22965    0.821272
Gender                  59.58681943      29.84990611       1.996215    0.06322
Current Salary $'000    6.480588884      0.799454349       8.106265    4.67E-07
More
Residual
2. R Square
5. Analysis of Variance
However, we can only reject the hypothesis that the gender coefficient \(\beta_2 = 0\) using a two-tail test at the 10% level, since its P-value is 0.06322.
We are not able to reject the hypothesis that the years in the workforce coefficient \(\beta_1 = 0\) even at the 10% level.
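The same conclusions follow from the t statistics, each of which is the coefficient divided by its standard error. The sketch below compares them with two-tail critical values for 16 degrees of freedom; those critical values are taken from standard t tables and are not part of the Excel output, and the helper name is made up.

```python
# (coefficient, standard error) pairs from the Excel output.
COEFFICIENTS = {
    "Intercept": (-196.0738799, 45.52498098),
    "Years in Workforce": (-0.604845409, 2.633748738),
    "Gender": (59.58681943, 29.84990611),
    "Current Salary $'000": (6.480588884, 0.799454349),
}

# Two-tail critical t values for n - k - 1 = 16 degrees of freedom
# (assumption: read from standard t tables, not from the slides).
T_CRITICAL = {0.01: 2.921, 0.05: 2.120, 0.10: 1.746}


def strictest_level(t_stat):
    """Smallest tabulated significance level at which H0: beta = 0 is rejected."""
    for alpha in sorted(T_CRITICAL):
        if abs(t_stat) > T_CRITICAL[alpha]:
            return alpha
    return None  # not rejected even at the 10% level


t_stats = {name: b / se for name, (b, se) in COEFFICIENTS.items()}
levels = {name: strictest_level(t) for name, t in t_stats.items()}

for name in COEFFICIENTS:
    print(f"{name:22s} t = {t_stats[name]:9.4f}  rejects at: {levels[name]}")
```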
F test
For a regression with only one independent variable the significance level of F is the same as the p-value for the slope's t test. In a multiple regression the F value is used to perform a joint test of the regression coefficients, i.e. that
\[ H_0: \beta_1 = \beta_2 = \dots = \beta_K = 0 \]
against the alternative that at least one coefficient is non-zero, where the test statistic is
\[ F = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{MSR}{MSE} \]
F tables
F test conclusion
Instead of using tables you can simply read off
the Significance F relating to F=50.90211 and
compare it with the desired level of significance.
The given value is 2.09087E-08, or \(2.09087 \times 10^{-8} < 0.01\).
At the 1% level we would reject
\[ H_0: \beta_1 = \beta_2 = \dots = \beta_K = 0 \]
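The F statistic and several of the regression statistics can be reproduced from the ANOVA sums of squares alone. This is a minimal sketch using the figures in the Excel output; the adjusted R Square line uses the usual textbook formula, which the slides do not show.

```python
import math

n, k = 20, 3          # observations, independent variables
ssr = 383564.8798     # regression sum of squares
sse = 40188.49773     # residual sum of squares
sst = ssr + sse       # total sum of squares

msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse

r_square = ssr / sst
# Standard adjustment for the number of regressors (assumed formula).
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
std_error = math.sqrt(mse)

print(f"F = {f_stat:.5f}")
print(f"R Square = {r_square:.6f}, Adjusted R Square = {adj_r_square:.6f}")
print(f"Standard Error = {std_error:.5f}")
```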
additive model
multiplicative model
additive example
Petrol example
Sun     Mon     Tue     Wed     Thu     Fri     Sat
1.20    1.18    1.16    1.18    1.27    1.25    1.24
1.23    1.22    1.21    1.21    1.30    1.28    1.26
1.25    1.22    1.20    1.32    1.32    1.30    1.29
1.28    1.26    1.25    1.26    1.35    1.34    1.30
Step 1
Day    Price
 1     1.20
 2     1.18
 3     1.16
 4     1.18
 5     1.27
 6     1.25
 7     1.24
 8     1.23
 9     1.22
10     1.21
11     1.21
12     1.30
13     1.28
14     1.26
15     1.25
16     1.22
17     1.20
18     1.32
19     1.32
20     1.30
21     1.29
22     1.28
23     1.26
24     1.25
25     1.26
26     1.35
27     1.34
28     1.30

\[ \hat{Y} = 1.1917 + 0.0043t \]
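The trend line is an ordinary least squares fit of price on day number. A minimal sketch of that fit in plain Python follows; the slides obtained the same line from Excel, so the code is an illustration rather than the method used there.

```python
prices = [
    1.20, 1.18, 1.16, 1.18, 1.27, 1.25, 1.24,
    1.23, 1.22, 1.21, 1.21, 1.30, 1.28, 1.26,
    1.25, 1.22, 1.20, 1.32, 1.32, 1.30, 1.29,
    1.28, 1.26, 1.25, 1.26, 1.35, 1.34, 1.30,
]
days = list(range(1, len(prices) + 1))

n = len(prices)
mean_t = sum(days) / n
mean_y = sum(prices) / n

# Least squares slope and intercept for Y = intercept + slope * t.
s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, prices))
s_tt = sum((t - mean_t) ** 2 for t in days)
slope = s_ty / s_tt
intercept = mean_y - slope * mean_t

print(f"Y = {intercept:.4f} + {slope:.4f} t")
```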
\[ \frac{Y_t}{T_t} = C_t \times I_t \]
Trend line
The plot below shows how the daily price
fluctuates in a fairly regular pattern around
this trend line.
[Figure: Day Line Fit Plot, showing Price and Predicted Price against Day.]
The adjusted index is then found by dividing each daily price by the index for the corresponding day.
For example, to find the adjusted price for Day 1 (a Sunday), as we see on the next slide,
\[ \text{adjusted price} = \frac{1.20}{0.998808} = 1.201 \]

days 13-28

Day    Price    Predicted Price    Price/predicted price
13     1.28     1.248128079        1.025535778
14     1.26     1.252471264        1.006011104
15     1.25     1.25681445         0.994577998
16     1.22     1.261157635        0.967365193
17     1.20     1.265500821        0.948241186
18     1.32     1.269844007        1.03949776
19     1.32     1.274187192        1.035954535
20     1.30     1.278530378        1.016792423
21     1.29     1.282873563        1.005555058
22     1.28     1.287216749        0.994393525
23     1.26     1.291559934        0.975564483
24     1.25     1.29590312         0.964578278
25     1.26     1.300246305        0.969047168
26     1.35     1.304589491        1.034808274
27     1.34     1.308932677        1.023734852
28     1.30     1.313275862        0.989891033
Day    Price    Predicted Price    Price/predicted price
 1     1.20     1.196009852        1.003336216
 2     1.18     1.200353038        0.983044124
 3     1.16     1.204696223        0.962898345
 4     1.18     1.209039409        0.975981421
 5     1.27     1.213382594        1.046660802
 6     1.25     1.21772578         1.026503685
 7     1.24     1.222068966        1.014672686
 8     1.23     1.226412151        1.002925484
 9     1.22     1.230755337        0.99126119
10     1.21     1.235098522        0.979678931
11     1.21     1.239441708        0.976245992
12     1.30     1.243784893        1.045196808
Day      Average (price/pred price)
Sun      0.998808306
Mon      0.979308748
Tue      0.963849185
Wed      0.990193085
Thu      1.040655105
Fri      1.023141684
Sat      1.00403247
Total    6.999988584

Day    Adjusted series
 1     1.201432
 2     1.204932
 3     1.203508
 4     1.191687
 5     1.220385
 6     1.221727
 7     1.23502
 8     1.22591
 9     1.241043
10     1.256623
11     1.239778
12     1.242045
13     1.246951
14     1.24178
15     1.245844
16     1.241043
17     1.246237
18     1.352485
19     1.261154
20     1.266435
21     1.271346
22     1.275744
23     1.281733
24     1.298164
25     1.291008
26     1.289816
27     1.305402
28     1.281201
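The multiplicative adjustment can be sketched end to end: fit the trend, divide each price by its predicted price, average those ratios by day of the week to get the index, then divide each price by the index for its day. The helper name `fit_trend` is illustrative. For days 1 to 7 the output agrees with the adjusted series on the slides. From day 8 onward the slide values look as though each price was divided by the first week's individual price/predicted ratio rather than by the day-of-week average, so they differ from what the rule stated in the text produces here.

```python
DAY_NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
prices = [
    1.20, 1.18, 1.16, 1.18, 1.27, 1.25, 1.24,
    1.23, 1.22, 1.21, 1.21, 1.30, 1.28, 1.26,
    1.25, 1.22, 1.20, 1.32, 1.32, 1.30, 1.29,
    1.28, 1.26, 1.25, 1.26, 1.35, 1.34, 1.30,
]


def fit_trend(y):
    """Least squares line through (1, y[0]), (2, y[1]), ... (hypothetical helper)."""
    n = len(y)
    t = range(1, n + 1)
    mean_t, mean_y = sum(t) / n, sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    return mean_y - slope * mean_t, slope


intercept, slope = fit_trend(prices)
predicted = [intercept + slope * t for t in range(1, len(prices) + 1)]

# Price / predicted price for every day, then the average for each weekday.
ratios = [p / fit for p, fit in zip(prices, predicted)]
index = [sum(ratios[d::7]) / len(ratios[d::7]) for d in range(7)]

# Adjusted series: each price divided by the index for its day of the week.
adjusted = [p / index[i % 7] for i, p in enumerate(prices)]

for name, value in zip(DAY_NAMES, index):
    print(f"{name}  {value:.9f}")
print([round(a, 6) for a in adjusted[:7]])
```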
forecasting
\[ Y_t = T_t \times C_t \times I_t \]
Day    Price    Predicted Price    Price - predicted price    Adjusted series
 1     1.20     1.196009852        0.003990148                1.201613
 2     1.18     1.200353038        -0.020353038               1.205956
 3     1.16     1.204696223        -0.044696223               1.2053
 4     1.18     1.209039409        -0.029039409               1.192143
 5     1.27     1.213382594        0.056617406                1.218986
 6     1.25     1.21772578         0.03227422                 1.220829
 7     1.24     1.222068966        0.017931034                1.235172
 8     1.23     1.226412151        0.003587849                1.231613
 9     1.22     1.230755337        -0.010755337               1.245956
10     1.21     1.235098522        -0.025098522               1.2553
11     1.21     1.239441708        -0.029441708               1.222143
12     1.30     1.243784893        0.056215107                1.248986
13     1.28     1.248128079        0.031871921                1.250829
14     1.26     1.252471264        0.007528736                1.255172
15     1.25     1.25681445         -0.00681445                1.251613
16     1.22     1.261157635        -0.041157635               1.245956
17     1.20     1.265500821        -0.065500821               1.2453
18     1.32     1.269844007        0.050155993                1.332143

Day      Average (price - pred. price)
Sun      -0.0016133
Mon      -0.025956486
Tue      -0.045299672
Wed      -0.012142857
Thu      0.051013957
Fri      0.029170772
Sat      0.004827586
total    -3.88578E-16

7. Price indices
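The additive version replaces ratios with differences: subtract the predicted price from each price, average those differences by day of the week, then subtract the day's average from each price. A minimal sketch under the same assumptions as the trend fit (plain least squares; the helper name `fit_trend` is made up):

```python
prices = [
    1.20, 1.18, 1.16, 1.18, 1.27, 1.25, 1.24,
    1.23, 1.22, 1.21, 1.21, 1.30, 1.28, 1.26,
    1.25, 1.22, 1.20, 1.32, 1.32, 1.30, 1.29,
    1.28, 1.26, 1.25, 1.26, 1.35, 1.34, 1.30,
]


def fit_trend(y):
    """Least squares line through (1, y[0]), (2, y[1]), ... (hypothetical helper)."""
    n = len(y)
    t = range(1, n + 1)
    mean_t, mean_y = sum(t) / n, sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    return mean_y - slope * mean_t, slope


intercept, slope = fit_trend(prices)
predicted = [intercept + slope * t for t in range(1, len(prices) + 1)]

# Price - predicted price for every day, then the average for each weekday.
differences = [p - fit for p, fit in zip(prices, predicted)]
day_effect = [sum(differences[d::7]) / len(differences[d::7]) for d in range(7)]

# Adjusted series: each price minus the average difference for its weekday.
adjusted = [p - day_effect[i % 7] for i, p in enumerate(prices)]

print([round(e, 7) for e in day_effect])
print([round(a, 6) for a in adjusted[:7]])
```

Because the differences are least squares residuals, the seven day effects sum to zero apart from rounding, which is why the total row on the slide shows a value of order 1E-16.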
deficiencies
\[ \frac{p_n}{p_0} \times 100 \]
Example 1

                       2000    2012
Zucchini/kg            3.99    5.99
Mushrooms/kg           6.50    7.99
Pink Lady Apples/kg    3.99    5.99
Navel Oranges/kg       1.60    2.99

Weighted indices
Weighted indices allow greater importance to be given to items for which greater quantities are sold or consumed.
The Laspeyres index uses base period quantities (\(q_0\)) as weights.
It can be used to compare prices between other periods.

Laspeyres index
\[ \frac{\sum p_n q_0}{\sum p_0 q_0} \times 100 \]
Notices
Calculate the

                       2000 p    2012 p    2000 q    2012 q
Zucchini/kg            3.99      5.99      3.2       4.3
Mushrooms/kg           6.50      7.99      1.2       1.5
Pink Lady Apples/kg    3.99      5.99      5.2       5.6
Navel Oranges/kg       1.60      2.99      6.2       7.0
Stuvac consultations
The Paasche index uses the current period quantities but has some practical problems, such as obtaining quantity data for every period.
\[ \text{Paasche index} = \frac{\sum p_n q_n}{\sum p_0 q_n} \times 100 \]
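Both weighted indices can be worked through on the fruit and vegetable table, taking 2000 as the base period and 2012 as the current period. The sketch below assumes the standard definitions (Laspeyres weights by base period quantities, Paasche by current period quantities); the variable names are illustrative.

```python
# item: (2000 price, 2012 price, 2000 quantity, 2012 quantity)
BASKET = {
    "Zucchini/kg": (3.99, 5.99, 3.2, 4.3),
    "Mushrooms/kg": (6.50, 7.99, 1.2, 1.5),
    "Pink Lady Apples/kg": (3.99, 5.99, 5.2, 5.6),
    "Navel Oranges/kg": (1.60, 2.99, 6.2, 7.0),
}

rows = list(BASKET.values())

# Laspeyres: current and base prices, both weighted by base quantities q0.
laspeyres = (sum(pn * q0 for p0, pn, q0, qn in rows)
             / sum(p0 * q0 for p0, pn, q0, qn in rows) * 100)

# Paasche: current and base prices, both weighted by current quantities qn.
paasche = (sum(pn * qn for p0, pn, q0, qn in rows)
           / sum(p0 * qn for p0, pn, q0, qn in rows) * 100)

print(f"Laspeyres index = {laspeyres:.2f}")
print(f"Paasche index   = {paasche:.2f}")
```

On this basket both indices come out a little above 150, so prices rose by roughly half between the two years under either weighting.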
Exam
The exam will consist of two parts:
Part A: 16 multiple choice questions on both
maths and statistics, each worth 1 mark. Use a
pencil to mark answers and your personal
details.
Part B: 3 written problems with two of the three
based on the statistics section of the course.
They are not all of equal marks so plan your time
carefully.
Total marks for the exam: 50
Please bring an approved calculator, textbook, notes, pencil, pen, ruler, eraser. No tables will be supplied (you can use the textbook ones).