Business research
Prof. Herbert Hamers
Assignment
Submission deadline: 22.10.2009 Study Group BO9D: Student : 09027165
Table of Contents
1)
   a) b) c) d)
2)
   a)
      i) Weight
      ii) Education level
      iii) Wage
      iv) Food expenses
   b) Sample mean, standard deviation and median
   c) 2s, 4s and 6s intervals
   d) 95% confidence interval
   e) Scatter plots
   f) Equation of the regression line
      i) Food and housing income/family income
      ii) Clothing and recreation/family income
   g) Interpretation
      i) Questions a to c: distributions
      ii) Question d: 95% confidence interval
      iii) Questions e and f: linear relationship
3)
   a)
4) Exercise 4: Markowitz portfolio theorem
5) Exercise 5: Confidence interval and tests
   a) 95% confidence interval for
   b) Opinion on the announced
   c) Formal hypothesis test
   d) Minimal size of the sample
6)
   a) Excesses of the company and the world stock exchange
   b) Scatter plot
   c) Regression
   d) Reliability of the slope
   e) Constant term
   f) Performance
   g) Percentage explained by the model
   h) Prediction interval
a) Descriptive statistics
Using the Data Analysis function of Excel provides the following results:

Descriptive statistics: Crop
Mean                  1502,941176
Standard Error        15,06687062
Median                1500
Mode                  1500
Standard Deviation    87,8541978
Sample Variance       7718,360071
Kurtosis              0,16548812
Skewness              -0,379772489
Range                 350
Minimum               1300
Maximum               1650
Sum                   51100
Count                 34
Table 1
[Figure 1: Histogram of the crop yield distribution. Frequency (0 to 12) per bin, bins from 1350 to 1650.]
The crop yield, in kilograms per hectare, seems symmetrically distributed around the sample mean (1502). The mode and the median, both exactly 1500, seem to confirm this hypothesis. The histogram is not clearly bell-shaped, so we can only apply Chebyshev's rule:
At least 75% of the measurements fall into the interval (1326;1678). Actually 94,11%.
At least 89% of the measurements fall into the interval (1238;1766). Actually 100%.
That the distribution is normal cannot be concluded from the graphical analysis alone; it is nevertheless the hypothesis adopted for the next questions.
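The two Chebyshev statements can be reproduced with a few lines of Python. This is only a sketch: it assumes the rounded values 1502 and 88 for the mean and the standard deviation, which is what reproduces the intervals quoted above, and the function names are illustrative.

```python
# Chebyshev's rule: at least 1 - 1/k^2 of the data lies within k standard deviations.
mean = 1502  # assumed rounding of the sample mean (1502,94 in Table 1)
std = 88     # assumed rounding of the standard deviation (87,85 in Table 1)


def chebyshev_bound(k):
    """Minimum share of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


def interval(k):
    """Interval (mean - k*s; mean + k*s)."""
    return (mean - k * std, mean + k * std)


for k in (2, 3):
    low, high = interval(k)
    print(f"k={k}: at least {chebyshev_bound(k):.0%} in ({low};{high})")
```

The bound depends only on k, which is why it holds for any distribution shape.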
This area under the curve is the complement and can be computed with the Excel function NORMDIST.
[Figure 2: curve with the values 1500 and 1600 marked on the horizontal axis]
This is now the area we are looking for. It can be computed with the Excel function NORMDIST.
[Figure 3: curve with the values 1500 and 1600 marked on the horizontal axis]
Figure 4
With the help of fertiliser we try to move to the left so that the area detailed in Figure 2 (now approximately 0,3) becomes 0,4.
[Figure 5: curve with the value 1600 marked on the horizontal axis]
Business research Assignment Student 09027165 November 2009 5
[Figure 6: curve with x1 marked on the horizontal axis]
Saying that $P(X > x_1) = 0,4$ is the same as saying $P(X < x_1) = 0,6$. We can then use Excel to compute $x_1$ with the function NORMINV:
x1 = NORMINV(0,6;1500;200) = 1550,669
If we have understood from Figure 4 that the fertiliser amounts to a translation of the curve to the right, we recognise that the distance between $\mu$ and $x_1$ is the same as the distance between $\mu_1$ and 1600, that is, $x_1 - \mu = 1600 - \mu_1$. Hence $\mu_1 = \mu - x_1 + 1600 = 1549,330579$.
Verification: we can compute back the probability for the farms to be profitable with the fertiliser giving $\mu_1 = 1549,331$:
P(X_fertiliser > 1600) = 1 - P(X_fertiliser < 1600) = 1 - NORMDIST(1600;1549,330579;200;1) = 0,4
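The same computation can be cross-checked outside Excel. The sketch below uses Python's standard `statistics.NormalDist` as a stand-in for NORMDIST and NORMINV; the variable names are illustrative, and the parameters (mean 1500, standard deviation 200, threshold 1600) are the ones used in the text.

```python
from statistics import NormalDist

mu, sigma = 1500, 200   # current mean and standard deviation used in the text
threshold = 1600        # yield above which a farm is profitable
crop = NormalDist(mu, sigma)

# Current probability of exceeding the threshold: 1 - NORMDIST(1600;1500;200;1)
p_now = 1 - crop.cdf(threshold)

# x1 such that P(X < x1) = 0,6: NORMINV(0,6;1500;200)
x1 = crop.inv_cdf(0.6)

# Shift the mean so that x1 - mu = 1600 - mu1
mu1 = mu - x1 + threshold

# Verification with the shifted distribution
p_fertiliser = 1 - NormalDist(mu1, sigma).cdf(threshold)

print(f"P(X > 1600) now: {p_now:.4f}")
print(f"x1 = {x1:.3f}, mu1 = {mu1:.6f}")
print(f"P(X > 1600) with fertiliser: {p_fertiliser:.4f}")
```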
Figure 7
[Figure 8 and Figure 9: charts of the level of education. Recoverable data labels (level; count; share): 1; 69; 23% / 2; 65; 22% / 4; 41; 14% / 5; 14; 5%. One further category shows a count of 111.]
iii) Wage
Classes considered: see explanation in 2)a)i).

n      k     w      min    max
300    10    2,7    0      26,39

Class    lower bound    upper bound
1        0              2,7
2        2,7            5,4
3        5,4            8,1
4        8,1            10,8
5        10,8           13,5
6        13,5           16,2
7        16,2           18,9
8        18,9           21,6
9        21,6           24,3
10       24,3           27
Table 3
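The class bounds of Table 3 can be generated programmatically. The sketch below assumes that the width w is the range divided by k, rounded up to one decimal; that rule is an assumption which happens to reproduce w = 2,7, since the explanation referred to in 2)a)i) is not shown here.

```python
import math

k = 10                      # number of classes from Table 3
lowest, highest = 0, 26.39  # min and max of the wage variable

# Assumption: class width = range / k, rounded up to one decimal place.
width = math.ceil((highest - lowest) / k * 10) / 10

# Bounds 0; 2,7; 5,4; ... rounded to avoid floating-point noise.
bounds = [round(lowest + i * width, 1) for i in range(k + 1)]
classes = list(zip(bounds[:-1], bounds[1:]))

for number, (low, high) in enumerate(classes, start=1):
    print(number, low, high)
```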
[Figure 10: histogram of Wage. Frequency against the upper bound of each class (2,7 to 24,3).]
[Following histogram, caption not recoverable: Frequency against "Upper bound of the interval (width 1800)", axis marks from 4,8 to 21.]
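[Table of the intervals ($\bar{x}-2s$; $\bar{x}+2s$) and ($\bar{x}-3s$; $\bar{x}+3s$), each with the rows "Empirical rule", "Chebyshev" and "Actual data in this interval"; the values are not recoverable.]
Chebyshev is not presented for the interval ($\bar{x}-s$; $\bar{x}+s$) because it is applicable only from the interval ($\bar{x}-2s$; $\bar{x}+2s$) onwards.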
e) Scatter plots
The scatter diagrams are produced with the basic chart wizard of Excel, asking it to display the equation of the regression line and the R² coefficient. Note that R² is not the correlation coefficient but its square. We consequently used the Excel function CORREL to compute the coefficient of correlation. Both scatter diagrams show a clear positive linear relation, hence the coefficient of correlation is positive and $R = \sqrt{R^2}$. In Table 8 we compute the square of $r_{xy}$ to check our result.
Figure 11
[Figure 12: Total expenses for clothing and recreation on income. Scatter plot of total non-vital expenses (0 to 16) against family income (0 to 90), with trend line y = 0,1677x - 0,013 and R² = 0,9905.]
Coefficient of correlation:

                               rxy            rxy²
Correl Vital exp/income        0,979879344    0,960163529
Correl non-Vital exp/income    0,995243075    0,990508779
Table 8
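The consistency between CORREL and the R² displayed on the charts can be checked with a short computation. This sketch simply squares the two coefficients of Table 8 and compares the non-vital one with the R² = 0,9905 printed on the trend line of Figure 12; the variable names are illustrative.

```python
import math

# CORREL results from Table 8
r_vital = 0.979879344
r_non_vital = 0.995243075

# R² displayed by Excel on the trend line of Figure 12 (non-vital expenses)
r2_chart_non_vital = 0.9905

r2_vital = r_vital**2
r2_non_vital = r_non_vital**2

print(f"vital:     r² = {r2_vital:.9f}")
print(f"non-vital: r² = {r2_non_vital:.9f} (chart shows {r2_chart_non_vital})")

# With a positive slope, r is the positive square root of R².
print(f"sqrt(R²) for non-vital: {math.sqrt(r2_non_vital):.9f}")
```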
We use the Excel function Data Analysis/Regression, which gives us $b_0$ and $b_1$ such that $y = b_0 + b_1 x$.
[Regression output tables for i) and ii): rows Intercept ($b_0$) and FINC ($b_1$); the coefficient values are not recoverable.]
g) Interpretation
i) Questions a to c: distributions
When considering the graphs produced in question a, we note that only the variable Weight has a bell-shaped distribution; the others have mostly skewed distributions. Consequently it is no surprise that Weight respects the empirical rule in the table presented in question c. For the 3 other variables only Chebyshev's rule is applicable, as they are not bell-shaped.
These are the values we use as the basis for our calculations.
i) H5N1
The table prepared with Excel below applies the following formulas:
\[ E(X) = \mu = \sum_{\text{all } x} x\,P(X = x), \qquad V(X) = \sum_{\text{all } x} (x - \mu)^2\,P(X = x) \qquad \text{and} \qquad \sigma = \sqrt{V(X)} \]
H      P(H)    H*P(H)    H-E(H)    (H-E(H))²    (H-E(H))²*P(H)
0      0,1     0         -1        1            0,1
1      0,5     0,5       0         0            0
1,1    0,1     0,11      0,1       0,01         0,001
1,3    0,3     0,39      0,3       0,09         0,027
E(H) = 1; V(H) = 0,128; standard deviation = 0,357771
Table 12
ii) Thunderbirds
The same philosophy is applied a second time for T.
T      P(T)    T*P(T)    T-E(T)    (T-E(T))²    (T-E(T))²*P(T)
1,9    0,35    0,665     -0,1      0,01         0,0035
2      0,3     0,6       0         0            0
2,1    0,35    0,735     0,1       0,01         0,0035
E(T) = 2; V(T) = 0,007; standard deviation = 0,083666
Table 13
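Tables 12 and 13 apply the same two formulas to two discrete distributions, so they can be checked with one small helper. The function name `moments` and the dictionary layout are illustrative choices, not part of the original spreadsheet.

```python
import math


def moments(distribution):
    """Return (expectation, variance, standard deviation) of a discrete distribution.

    `distribution` maps each value x to its probability P(X = x).
    """
    mean = sum(x * p for x, p in distribution.items())
    variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
    return mean, variance, math.sqrt(variance)


# Values and probabilities read from Tables 12 and 13
h5n1 = {0: 0.1, 1: 0.5, 1.1: 0.1, 1.3: 0.3}
thunderbirds = {1.9: 0.35, 2: 0.3, 2.1: 0.35}

e_h, v_h, s_h = moments(h5n1)
e_t, v_t, s_t = moments(thunderbirds)

print(f"H: E = {e_h:.3f}, V = {v_h:.3f}, sigma = {s_h:.6f}")
print(f"T: E = {e_t:.3f}, V = {v_t:.3f}, sigma = {s_t:.6f}")
```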
iii) Correlation
We first calculate the covariance using the formula:
\[ \sigma_{xy} = cov(x, y) = \sum (a - \mu_x)(b - \mu_y)\,P(X = a, Y = b) \]
The calculation was done in Excel and is not presented, as the detail would not be possible to follow.
\[ \sigma_{xy} = -0,002 \]
The coefficient of correlation is then
\[ \rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = -0,06682 \]
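As a quick check of the last step, the coefficient of correlation follows from the covariance and the two variances obtained earlier. The sketch takes the covariance -0,002 as given, since the joint distribution behind it is not shown in the text.

```python
import math

cov_ht = -0.002              # covariance reported in the text
var_h, var_t = 0.128, 0.007  # V(H) and V(T) from Tables 12 and 13

# rho = cov / (sigma_H * sigma_T)
rho = cov_ht / (math.sqrt(var_h) * math.sqrt(var_t))
print(f"rho = {rho:.5f}")
```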
b) All on H5N1
Instead of building a separate spreadsheet for each of the exercises below, I decided to build a few formulas in Excel using one parameter, called w, which is the weight of H in the portfolio considered. If the investor has 1000 Euros to invest, the portfolio is represented by:
\[ M = w \cdot 1000 \cdot H + (1-w) \cdot \frac{1000}{2} \cdot T = w \cdot 1000 \cdot H + (1-w) \cdot 500 \cdot T \]
Note that the share price of T, 2 Euros, is what gives the 500 in front of T. We then use the basic formulas for the expectation and the variance of the portfolio:
\[ E(M) = w \cdot 1000 \cdot E(H) + (1-w) \cdot 500 \cdot E(T) \]
\[ V(M) = w^2 \cdot 1000^2 \cdot V(H) + (1-w)^2 \cdot 500^2 \cdot V(T) + 2 w (1-w) \cdot 1000 \cdot 500 \cdot COV(H,T) \]
\[ \sigma_M = \sqrt{V(M)} \]
All parameters were calculated in the previous exercise. The formulas are entered in Excel, and we can then solve the current question and the next two by simply changing w.
Let's come back to the case where we invest everything in H5N1, which means w = 1. Excel delivers the following results:

H: w    T: (1-w)    E(M)    V(M)      Sigma M
1       0           1000    128000    357,7708764
Table 14
c) All on Thunderbirds
The same approach as in the previous question is applied, using w = 0.

H: w    T: (1-w)    E(M)    V(M)    Sigma M
0       1           1000    1750    41,83300133
Table 15
d) Half-Half scenario
The same approach as in the previous question is applied, using w = 1/2.

H: w    T: (1-w)    E(M)    V(M)       Sigma M
0,5     0,5         1000    31937,5    178,71066
Table 16
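The three scenarios of Tables 14 to 16 differ only in w, which makes them easy to reproduce with one function. This is a sketch of the spreadsheet logic under the parameters found in a); the function name `portfolio` is illustrative.

```python
import math

# Parameters from a)
E_H, V_H = 1.0, 0.128
E_T, V_T = 2.0, 0.007
COV_HT = -0.002


def portfolio(w):
    """Expectation, variance and standard deviation of M for a weight w in H."""
    shares_h = w * 1000        # H costs 1 Euro per share
    shares_t = (1 - w) * 500   # T costs 2 Euros per share
    mean = shares_h * E_H + shares_t * E_T
    variance = (shares_h**2 * V_H + shares_t**2 * V_T
                + 2 * shares_h * shares_t * COV_HT)
    return mean, variance, math.sqrt(variance)


for w in (1, 0, 0.5):
    mean, variance, sigma = portfolio(w)
    print(f"w={w}: E(M)={mean:.0f}, V(M)={variance:.1f}, sigma={sigma:.5f}")
```

The expectation stays at 1000 for every w because both shares return their price on average; only the risk changes.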
\[ f(w) = \sigma_M = \sqrt{w^2 \cdot 1000^2 \cdot V(H) + (1-w)^2 \cdot 500^2 \cdot V(T) + 2 w (1-w) \cdot 1000 \cdot 500 \cdot COV(H,T)} \]
We then replace the values we know from a):
\[ f(w) = \sqrt{128000\,w^2 + 1750\,(1-w)^2 - 2000\,w\,(1-w)} \]
\[ f(w) = \sqrt{128000\,w^2 + 1750\,(1 + w^2 - 2w) - 2000\,(w - w^2)} \]
\[ f(w) = \sqrt{(128000 + 1750 + 2000)\,w^2 + (-3500 - 2000)\,w + 1750} \]
Let's consider what is under the square root, which we will call h(w).
\[ h(w) = 131750\,w^2 - 5500\,w + 1750 \]
The function h is always positive on the range considered, [0;1]; we checked this only graphically in Figure 13.
Notes: This demonstration was not really necessary, as we are dealing with a variance, which by definition is positive, but I wanted to take no mathematical risk. The graph also shows a minimum between 0 and 0,1.
[Figure 13: graph of h(w), vertical axis from 0 to 250000, horizontal axis from -0,3 to 1,3]
Now that we have shown that h(w) > 0 whatever the value of w, we see that minimizing h and minimizing f is the same thing, because the square root function is increasing: if $x_1 < x_2$ then $\sqrt{x_1} < \sqrt{x_2}$ on the range [0;+∞[. We then search for the value of w at which h is minimal. For that, we take its derivative and look for the value where it is 0: $h'(w) = 263500\,w - 5500 = 0$, which gives $w \approx 0,0209$.
The ideal portfolio for risk-averse traders would then be composed of approximately 20 shares of H5N1 and 490 of Thunderbirds.
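The minimisation can be verified numerically from the coefficients of h(w) alone. The sketch below uses the vertex formula of a parabola, which is equivalent to setting the derivative to zero, and converts w into share counts using the prices implied by the portfolio equation (1 Euro for H5N1, 2 Euros for the other share); the variable names are illustrative.

```python
import math

# h(w) = a*w^2 + b*w + c, coefficients from the derivation above
a, b, c = 131750, -5500, 1750

# The derivative 2*a*w + b vanishes at the vertex of the parabola.
w_min = -b / (2 * a)
h_min = a * w_min**2 + b * w_min + c
sigma_min = math.sqrt(h_min)

# Share counts for a budget of 1000 Euros
shares_h = 1000 * w_min
shares_t = 500 * (1 - w_min)

print(f"w = {w_min:.6f}, h(w) = {h_min:.2f}, sigma = {sigma_min:.4f}")
print(f"shares of H5N1: {shares_h:.2f}, shares of the other stock: {shares_t:.2f}")
```

Rounding to 20 and 490 shares keeps the total spend at exactly 1000 Euros (20 × 1 + 490 × 2).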
If we then put p back into the first equations we find: E(M) = 0,221871 and $\sigma_M$ = 0,0849272. This is the minimum variance point.
Verification: we checked a few values around this supposed minimum.

X=p         M           E(M)          SIGMA(M)
0,395709    0,604291    0,22187127    0,084927178076
0,39571     0,60429     0,2218713     0,084927178074
0,395711    0,604289    0,22187133    0,084927178072
0,395712    0,604288    0,22187136    0,084927178071
0,395713    0,604287    0,22187139    0,084927178072
0,395714    0,604286    0,22187142    0,084927178073
0,395715    0,604285    0,22187145    0,084927178075
Table 17
The curve is obtained by using Excel and the above-mentioned formulas. We get Table 18 for several values of the parameter p, which is then depicted as a scatter diagram in Figure 14.
X       M       E(M)      Sigma M
0       1       0,21      0,14
0,05    0,95    0,2115    0,1291022
0,1     0,9     0,213     0,1188709
0,15    0,85    0,2145    0,1094931
0,2     0,8     0,216     0,1012063
0,25    0,75    0,2175    0,0942987
0,3     0,7     0,219     0,0890916
0,35    0,65    0,2205    0,0858949
0,4     0,6     0,222     0,0849357
0,45    0,55    0,2235    0,0862889
0,5     0,5     0,225     0,0898499
0,55    0,45    0,2265    0,0953717
0,6     0,4     0,228     0,1025382
0,65    0,35    0,2295    0,1110312
0,7     0,3     0,231     0,1205708
0,75    0,25    0,2325    0,1309284
0,8     0,2     0,234     0,1419251
0,85    0,15    0,2355    0,1534234
0,9     0,1     0,237     0,1653187
0,95    0,05    0,2385    0,1775313
1       0       0,24      0,19
Table 18
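The minimum variance point can be cross-checked from Table 18 alone. The sketch below is an assumption-laden reconstruction: it reads the two individual standard deviations and expectations from the rows p = 0 and p = 1, back-calculates the covariance from the row p = 0,5 (the covariance itself is not stated in this part of the text), and then applies the closed-form minimum variance weight for two assets.

```python
import math

# Rows p = 1 and p = 0 of Table 18: each row is a single-asset portfolio
sigma_x, e_x = 0.19, 0.24          # everything in X
sigma_other, e_other = 0.14, 0.21  # nothing in X

# Assumption: covariance inferred from the row p = 0,5 (Sigma M = 0,0898499)
sigma_half = 0.0898499
cov = (sigma_half**2 - 0.25 * sigma_x**2 - 0.25 * sigma_other**2) / 0.5


def expected(p):
    """Expected return of the portfolio with weight p in X."""
    return p * e_x + (1 - p) * e_other


def sigma(p):
    """Standard deviation of the portfolio with weight p in X."""
    variance = (p**2 * sigma_x**2 + (1 - p)**2 * sigma_other**2
                + 2 * p * (1 - p) * cov)
    return math.sqrt(variance)


# Closed-form minimum variance weight for two assets
p_min = (sigma_other**2 - cov) / (sigma_x**2 + sigma_other**2 - 2 * cov)

print(f"p = {p_min:.6f}, E(M) = {expected(p_min):.6f}, sigma = {sigma(p_min):.7f}")
```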
[Figure 14: Efficiency Curve. E(M) on the vertical axis (0,195 to 0,24) against sigma(M) on the horizontal axis (0 to 0,2). Annotations: "Minimum variance point x=0,0849272 y=0,221871, at the intersection of the tangent and the curve; it is where the efficiency curve starts" and "Tangent at minimum sigma x=0,0849272".]
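\[ \left(\bar{x} + z_{0,025}\,\frac{\sigma}{\sqrt{n_0}}\right) - \left(\bar{x} - z_{0,025}\,\frac{\sigma}{\sqrt{n_0}}\right) = 0,5 \]
\[ 2\,z_{0,025}\,\frac{\sigma}{\sqrt{n_0}} = 0,5 \quad\Rightarrow\quad \sqrt{n_0} = \frac{2\,z_{0,025}\,\sigma}{0,5} \]
\[ n_0 = \left(\frac{2\,z_{0,025}\,\sigma}{0,5}\right)^2 \approx 2213 \]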
The minimum sample size for which the width of the confidence interval is 0,5 is 2213.
Verification: with Excel we computed back the confidence interval for the values n = 2212 and n = 2213.

x_bar    Z0,975    Sigma    n       lb          ub          ub-lb
90       1,96      6        2212    89,74996    90,25004    0,50008
90       1,96      6        2213    89,75002    90,24998    0,49996
Table 20
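The sample size and the verification in Table 20 can be reproduced as follows. The sketch uses the exact normal quantile from Python's `statistics.NormalDist` rather than the rounded 1,96, which does not change the result; the variable and function names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, sigma = 90, 6   # values from Table 20
target_width = 0.5     # required width of the 95% confidence interval

z = NormalDist().inv_cdf(0.975)  # about 1,96

# Smallest n such that 2 * z * sigma / sqrt(n) <= target_width
n0 = math.ceil((2 * z * sigma / target_width) ** 2)


def interval(n):
    """Lower bound, upper bound and width of the confidence interval for size n."""
    half = z * sigma / math.sqrt(n)
    return x_bar - half, x_bar + half, 2 * half


print("n0 =", n0)
for n in (n0 - 1, n0):
    lb, ub, width = interval(n)
    print(f"n={n}: [{lb:.5f}; {ub:.5f}], width {width:.5f}")
```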
b) Scatter plot
[Figure 15: scatter plot of "Excess company 1" (vertical axis, -0,05 to 0,15); horizontal axis from -0,08 to 0,12]
c) Regression
We use the Excel function Data Analysis/Regression to produce the following results:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0,769632194
R Square             0,592333715
Adjusted R Square    0,587039347
Standard Error       0,025262736
Observations         79

ANOVA
              df    SS             Significance F
Regression    1     0,071402455    1,15886E-16
Residual      77    0,049141849
Total         78    0,120544304
(The MS and F columns are not recoverable.)

Coefficient table (columns Coefficients, t Stat, P-value, Lower 95%, Upper 95%); only the 95% bounds of the constant term are recoverable:
             Lower 95%      Upper 95%
Intercept    -0,00667695    0,004755109
Table 22
Coefficients of the linear regression equation
According to this table the linear regression equation between the company excess and the world stock excess is:
\[ y = 0,819299331\,x - 0,000960921 \]
Note: as for 2)f), before trusting this equation we should verify the 4 assumptions necessary to apply a linear model. This is not rigorously done here, and the results given by Excel are trusted as they are.
Verification: we ask Excel to draw the regression line on the scatter diagram and to display the equation.
[Figure 16: scatter plot of "Excess company 1" (vertical axis, -0,05 to 0,15; horizontal axis, -0,08 to 0,12) with the regression line y = 0,8193x - 0,001]
e) Constant term
We base our reasoning on Table 22. It indicates that the 95% confidence interval (5% significance level) for the constant term is [-0,00667695 ; 0,004755109], so the constant term can be positive or negative. We therefore cannot conclude that the company offers a guaranteed advantage or disadvantage compared to the world stock.
f) Performance
If the market is performing well, the company performs somewhat worse than the world stock exchange. Explanation: according to the 95% and 96% intervals, the slope is definitely less than 1, and the value given by the equation is approximately 0,82. This means that if the excess of the world stock increases by 1, the company excess increases by only about 0,82. Conversely, if the market is falling, the company falls less. The constant term being around 0, we can conclude that:
An investment in the company will not bring the investor a real advantage over the average market, but it will lower his risk.
59,23% of the variability of the return of the company is explained by the linear regression model.
Note: this question could also have been answered by displaying on the Excel scatter plot the R² value associated with the trend line.
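The percentage explained can also be recomputed from the sums of squares in the ANOVA part of Table 22. The sketch below derives R², Multiple R, the adjusted R² and the standard error from those three sums and the number of observations; it assumes the usual simple-regression definitions with one explanatory variable.

```python
import math

# Sums of squares and sample size from the ANOVA part of Table 22
ss_regression = 0.071402455
ss_residual = 0.049141849
ss_total = 0.120544304
n = 79  # observations

r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)

# One explanatory variable: residual degrees of freedom are n - 2
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
standard_error = math.sqrt(ss_residual / (n - 2))

print(f"R Square          = {r_square:.9f}")
print(f"Multiple R        = {multiple_r:.9f}")
print(f"Adjusted R Square = {adjusted_r_square:.9f}")
print(f"Standard Error    = {standard_error:.9f}")
```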
h) Prediction interval
The prediction interval could be computed manually with the following formula:
\[ \hat{y} \pm t_{0,04;\,n-2}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}} \qquad \text{where } x_g = \frac{y - b_0}{b_1} \text{ and } \hat{y} = b_1 x + b_0 \]
But this manual calculation is quite laborious, as we need to calculate the predicted y for each value of x, then the error and its standard deviation. Given the high probability of errors, I preferred to use Data Analysis Plus/Prediction Interval in Excel (in the tools provided on the CD with the book Managerial Statistics by Gerald Keller). The result is displayed in Table 23; we then corrected the values because the question referred to the return and not to the excess (we added back the 0,01 of the risk-free investment).
Prediction interval, excess
Predicted value                                   0,015425066
Prediction Interval, lower limit                  -0,037738959
Prediction Interval, upper limit                  0,068589091
Interval Estimate of Expected Value, lower limit  0,009021793
Interval Estimate of Expected Value, upper limit  0,021828339

Prediction interval, return of the company
Predicted value                                   0,025425066
Lower limit                                       -0,027738959
Upper limit                                       0,078589091
Table 23
The prediction interval is pretty wide: based on a 96% confidence level, the return can take values between -2,77% and 7,86%. This is not a narrow prediction.
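The correction from excess to return described above is a simple shift by the risk-free rate. As a sketch, with the excess values of Table 23 and the 0,01 rate mentioned in the text (the dictionary keys are illustrative names):

```python
RISK_FREE_RATE = 0.01  # rate of the no-risk investment mentioned in the text

# Prediction interval for the excess, from Table 23
excess = {
    "predicted": 0.015425066,
    "lower": -0.037738959,
    "upper": 0.068589091,
}

# Return = excess + risk-free rate, applied to the point prediction and both limits
company_return = {name: value + RISK_FREE_RATE for name, value in excess.items()}

for name, value in company_return.items():
    print(f"{name}: {value:.9f} ({value:.2%})")
```

Adding a constant moves the interval without changing its width, so the prediction is exactly as wide for the return as for the excess.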