
Academic year 2009 / 2010


Business research
Prof. Herbert Hamers

Assignment
Submission deadline: 22.10.2009 Study Group BO9D: Student : 09027165

Table of Content

1) Exercise 1: normal distribution
   a) Descriptive statistics
   b) Probability that the farm will be profitable next summer
   c) Probability that the farm will not be profitable next summer
   d) Mean with fertilizer
2) Exercise 2: descriptive statistics and data patterns
   a) Graphical way presentation
      i) Weight
      ii) Education level
      iii) Wage
      iv) Food expenses
   b) Sample mean, standard deviation and median
   c) 2s, 4s and 6s intervals
   d) 95% confidence interval
   e) Scatter plots
   f) Equation of the regression line
      i) Food and housing income/family income
      ii) Clothing and recreation/family income
   g) Interpretation
      i) Questions a to c: distributions
      ii) Question d: 95% confidence interval
      iii) Questions e and f: linear relationship
3) Exercise 3: Portfolio expectation, standard deviation and co-variance
   a) Individual shares
      i) H5N1
      ii) Thunderbird
      iii) Correlation
   b) All on H5N1
   c) All on Thunderbirds
   d) Half-Half scenario
   e) Risk lover portfolio
   f) Risk averse scenario
4) Exercise 4: Markowitz portfolio theorem
5) Exercise 5: Confidence interval and tests
   a) 95% confidence interval for $\mu$
   b) Opinion on the announced $\mu$
   c) Formal hypothesis test
   d) Minimal size of the sample
6) Exercise 6: Linear regression and confidence intervals
   a) Excesses of the company and the world stock exchange
   b) Scatter plot
   c) Regression
   d) Reliability of the slope
   e) Constant term
   f) Performance
   g) Percentage explained by the model
   h) Prediction interval

Business research Assignment Student 09027165 November 2009

1) Exercise 1: normal distribution


Basis: file crop.xls

a) Descriptive statistics
Using the data analysis function of Excel provides the following results:

Descriptive statistics: Crop
Mean                  1502,941176
Standard Error        15,06687062
Median                1500
Mode                  1500
Standard Deviation    87,8541978
Sample Variance       7718,360071
Kurtosis              0,16548812
Skewness              -0,379772489
Range                 350
Minimum               1300
Maximum               1650
Sum                   51100
Count                 34
Table 1

[Figure 1: histogram "Crop yield distribution"; x-axis: bin (1350 to 1650 in steps of 50), y-axis: frequency]

The crop in kilograms produced per hectare in the sample seems symmetrically distributed around the mean (1502). The mode and the median, which are both exactly 1500, seem to confirm this hypothesis. The histogram is not clearly bell shaped, so we can only apply Chebyshev's rule.
At least 75% of the measurements fall into the interval (1326;1678). Actually 94,11%.
At least 89% of the measurements fall into the interval (1238;1766). Actually 100%.
That the distribution is normal cannot be concluded from the graphical analysis alone; however, this is the hypothesis taken for the next questions.
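The two Chebyshev intervals quoted above can be recomputed with a short sketch. It assumes the rounded values that appear to have been used in the text (mean 1502, standard deviation 88); the function name is only illustrative.

```python
def chebyshev_interval(mean, std, k):
    """Return (lower, upper, minimum share) for the interval mean +/- k*std."""
    min_share = 1 - 1 / k**2          # Chebyshev: at least 1 - 1/k^2 of the data
    return mean - k * std, mean + k * std, min_share

# Rounded sample mean and standard deviation (assumption: rounding as in the text)
mean, std = 1502, 88
for k in (2, 3):
    lo, hi, share = chebyshev_interval(mean, std, k)
    print(f"k={k}: ({lo}; {hi}) holds at least {share:.0%} of the data")
```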


b) Probability that the farm will be profitable next summer


Hypothesis given: the crop yield distribution is normal with $\mu = 1500$ and $\sigma = 200$. We are looking for the probability P(X>1600)=1-P(X<1600).

This area under the curve is the complement of P(X<1600), which can be computed with the Excel function NORMDIST.

[Figure 2: normal curve centred on 1500; the area to the right of 1600 is the probability we are looking for]

We compute the value using Excel. P(X>1600)=1-P(X<1600)=1-NORMDIST(1600;1500;200;1) = 0,308537539

c) Probability that the farm will not be profitable next summer


We are looking for the probability P(X<1600), which we in fact already calculated in the previous question.

[Figure 3: normal curve centred on 1500; the area to the left of 1600 is now the area we are looking for]

This area can be computed with the Excel function NORMDIST. P(X<1600)= NORMDIST(1600;1500;200;1) = 0,691462461
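As a cross-check outside Excel, the two probabilities of questions b) and c) can be reproduced with Python's standard library. This is only a sketch under the stated hypothesis (normal distribution, mean 1500, standard deviation 200); the variable names are illustrative.

```python
from statistics import NormalDist

crop = NormalDist(mu=1500, sigma=200)      # hypothesis of the exercise
p_not_profitable = crop.cdf(1600)          # P(X < 1600), like NORMDIST(1600;1500;200;1)
p_profitable = 1 - p_not_profitable        # P(X > 1600), the complement
print(round(p_not_profitable, 9), round(p_profitable, 9))
```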


d) Mean with fertilizer


To determine the mean such that the probability that the farm will be profitable next summer equals 0,4, we need to understand the effect of the fertiliser. The fertiliser increases the mean and leaves the standard deviation unchanged. This effect is shown in Figure 4.

[Figure 4: normal curves with shifted means; when $\mu$ decreases ($\mu_2 < \mu_0$) the curve moves to the left, when $\mu$ increases ($\mu_1 > \mu_0$) it moves to the right]

With the help of the fertiliser we try to move the curve to the right so that the area detailed in Figure 2 (now approximately 0,3) becomes 0,4.

[Figure 5: shifted normal curve; we are looking for a $\mu_1$ such that the area to the right of 1600 is 0,4]

First we look for the point x1 such that P(X>x1)=0,4 without fertiliser.

[Figure 6: original normal curve; the area to the right of x1 is 0,4]

To say that P(X>x1)=0,4 is the same as saying P(X<x1)=0,6. We can then use Excel to compute x1 with the function NORMINV:
x1 = NORMINV(0,6;1500;200) = 1550,669
Since, as Figure 4 shows, the fertiliser translates the curve to the right, the distance between $\mu$ and x1 is the same as the distance between $\mu_1$ and 1600. This means
$x_1 - \mu = 1600 - \mu_1$, hence $\mu_1 = \mu - x_1 + 1600$ = 1549,330579
Verification: we can compute back the probability for the farm to be profitable with the fertiliser giving $\mu_1$ = 1549,331:
P(Xfertiliser>1600) = 1-P(Xfertiliser<1600) = 1-NORMDIST(1600;1549,330579;200;1) = 0,4
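The shift argument can also be replayed step by step in code. The sketch below assumes the same normal model and uses `NormalDist.inv_cdf` as the counterpart of NORMINV; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, threshold = 1500, 200, 1600
x1 = NormalDist(mu, sigma).inv_cdf(0.6)               # point with P(X < x1) = 0,6
mu1 = mu - x1 + threshold                             # same distance on both curves
p_check = 1 - NormalDist(mu1, sigma).cdf(threshold)   # should come back to 0,4
print(round(x1, 3), round(mu1, 6), round(p_check, 6))
```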


2) Exercise 2: descriptive statistics and data patterns


My home address is 135F Luruper Haupstrasse, 22547 Hamburg. Come and visit me sometimes! I consequently chose the file family5.xls

a) Graphical way presentation


i) Weight
The value to present is an interval value, so the presentation as a histogram has been chosen. The number of categories (bins) has been calculated using the formula
k = 1 + 3,3*log(n), where n is the number of measures.
k = 1 + 3,3*log(300) = 9,1745
For the weight and the other variables we rounded this value up to k=10 categories. The width of the classes is then calculated using the formula
w = (largest observation - smallest observation)/number of classes
w = (109,2-40,4)/10 = 6,88
For the width we rounded the value to w=7. We consequently considered the following classes:

n      k     w     min     max
300    10    7     40,4    109,2

Class    lower bound    upper bound
1        40             47
2        47             54
3        54             61
4        61             68
5        68             75
6        75             82
7        82             89
8        89             96
9        96             103
10       103            110
Table 2
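The construction of the classes for the weight can be sketched as follows. The rounding choices (number of classes and width rounded up to whole numbers, first lower bound rounded down) are assumptions that reproduce Table 2; for the other variables the width was rounded differently.

```python
import math

def sturges_classes(n, smallest, largest):
    """Class bounds from k = 1 + 3.3*log10(n), rounding k and the width up."""
    k = math.ceil(1 + 3.3 * math.log10(n))
    width = math.ceil((largest - smallest) / k)
    start = math.floor(smallest)
    bounds = [(start + i * width, start + (i + 1) * width) for i in range(k)]
    return k, width, bounds

k, width, bounds = sturges_classes(300, 40.4, 109.2)
print(k, width, bounds[0], bounds[-1])
```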


Using the Excel histogram function we get the following result.

[Figure 7: histogram "Weight (in kg)"; x-axis: upper bound of the interval (width 7 kg), from 47 to 110; y-axis: frequency]

ii) Education level


For this variable the number of possible values is more limited (nominal variable), so the previous reasoning on the classes was ignored and all possible categories were displayed. In this case a pie chart also seems a suitable way to present the distribution of the variable.

[Figure 8: bar chart "EDU"; x-axis: level of education (1 to 5), y-axis: frequency]
[Figure 9: pie chart "Level of education"]

Values shown on the charts (level; frequency; share):
1; 69; 23%
2; 65; 22%
3; 111; 36%
4; 41; 14%
5; 14; 5%


iii) Wage
Classes considered: see explanation in 2)a)i).

n      k     w      min    max
300    10    2,7    0      26,39

Class    lower bound    upper bound
1        0              2,7
2        2,7            5,4
3        5,4            8,1
4        8,1            10,8
5        10,8           13,5
6        13,5           16,2
7        16,2           18,9
8        18,9           21,6
9        21,6           24,3
10       24,3           27
Table 3

[Figure 10: histogram "Wage (hourly rate in €)"; x-axis: upper bound of the intervals (width 2,7 €), from 2,7 to 27; y-axis: frequency]

iv) Food expenses

Classes considered: see explanation in 2)a)i).

n      k     w      min     max
300    10    1,8    3,05    20,88

Class    lower bound    upper bound
1        3              4,8
2        4,8            6,6
3        6,6            8,4
4        8,4            10,2
5        10,2           12
6        12             13,8
7        13,8           15,6
8        15,6           17,4
9        17,4           19,2
10       19,2           21
Table 4

[Figure: histogram "Food expenses (in 1000 €)"; x-axis: upper bound of the interval (width 1800 €), from 4,8 to 21; y-axis: frequency]


b) Sample mean, standard deviation and median


The data analysis/descriptive statistics function of excel is used repeatedly and summarized in this table using only the values requested.
                      WEIGHT    WAGE    FOODEXP
Mean                  74,59     6,24    8,49
Standard deviation    11,96     4,83    3,48
Median                74,85     5,95    7,85
Table 5

c) 2s, 4s and 6s intervals


The table is computed from the one presented in 2)b), with $\bar{x}$ being the sample mean and s the standard deviation. The share of data actually in each interval is computed using a formula based on the Excel functions COUNT and COUNTIF.

                                WEIGHT     WAGE       FOODEXP
$\bar{x}-s$                     62,62      1,41       5,01
$\bar{x}+s$                     86,55      11,08      11,97
Empirical rule                  68,00%     68,00%     68,00%
Actual data in this interval    68,33%     65,00%     71,00%

$\bar{x}-2s$                    50,66      -3,43      1,52
$\bar{x}+2s$                    98,51      15,91      15,45
Empirical rule                  95,00%     95,00%     95,00%
Chebyshev                       75,00%     75,00%     75,00%
Actual data in this interval    95,67%     96,33%     95,00%

$\bar{x}-3s$                    38,70      -8,26      -1,96
$\bar{x}+3s$                    110,48     20,75      18,93
Empirical rule                  99,70%     99,70%     99,70%
Chebyshev                       88,89%     88,89%     88,89%
Actual data in this interval    100,00%    98,67%     98,67%
Table 6

Note: For the interval ($\bar{x}-s$; $\bar{x}+s$) Chebyshev is not presented because it gives a useful bound only from the ($\bar{x}-2s$; $\bar{x}+2s$) interval onwards.
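The "actual data in this interval" rows were obtained with COUNT and COUNTIF. A minimal equivalent in Python is sketched below; the sample is hypothetical (it is not the family5.xls data) and the function name is illustrative.

```python
from statistics import mean, stdev

def share_within(data, k):
    """Share of observations strictly inside (mean - k*s ; mean + k*s)."""
    m, s = mean(data), stdev(data)                  # sample mean and sample std
    inside = sum(1 for x in data if m - k * s < x < m + k * s)
    return inside / len(data)

sample = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical values, for illustration only
for k in (1, 2, 3):
    print(k, share_within(sample, k))
```

The shares obtained this way are then compared with 68%, 95% and 99,7% (empirical rule) or with 75% and 88,89% (Chebyshev).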


d) 95% confidence interval


The data analysis/descriptive statistics function of Excel is used repeatedly on the 3 variables considered. We then have the mean and the standard deviation of each sample, which we need to compute our confidence intervals. We use the Student distribution and the standard deviation of the sample. The value of the Student distribution $t_{299;0,025}$ is needed to compute the interval. It is calculated using Excel and the formula TINV(0,05;299). Note the value 0,05: Excel uses the two-tailed calculation. The boundaries of the confidence interval are then computed using the formula
$\left(\bar{x} - t_{299;0,025}\frac{s}{\sqrt{n}},\ \bar{x} + t_{299;0,025}\frac{s}{\sqrt{n}}\right)$
Result:

                              FINC     TOTEXP1    TOTEXP2
Mean ($\bar{x}$)              30,43    19,71      5,09
Standard deviation (s)        13,94    8,63       2,35
$t_{299;0,025}$               1,97     1,97       1,97
n                             300      300        300
lower bound 95% interval      28,84    18,72      4,82
upper bound 95% interval      32,01    20,69      5,36
Table 7
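The FINC column of Table 7 can be recomputed with the sketch below. It takes the critical value as an input (1,97, the rounded TINV(0,05;299) shown in the table) instead of computing it, so small differences in the last decimal are expected; the function name is illustrative.

```python
import math

def t_confidence_interval(mean, std, n, t_crit):
    """Bounds mean -/+ t_crit * std / sqrt(n)."""
    half_width = t_crit * std / math.sqrt(n)
    return mean - half_width, mean + half_width

# FINC: mean, standard deviation, n and rounded t value taken from Table 7
lower, upper = t_confidence_interval(30.43, 13.94, 300, 1.97)
print(round(lower, 2), round(upper, 2))
```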

e) Scatter plots
The scatter diagrams are produced using the basic chart wizard of Excel, asking for the display of the equation of the regression line and the $R^2$ coefficient. Note that $R^2$ is not the correlation coefficient but its square. We consequently used the Excel function CORREL to compute the coefficient of correlation. However, both scatter diagrams show a clear positive linear relation, hence the coefficient of correlation is positive and $R = \sqrt{R^2}$. We compute in Table 8 the square of $r_{xy}$ to check our result.


[Figure 11: scatter plot "Total expenses for food and housing on income"; x-axis: family income, y-axis: total expenses vital; trend line y = 0,6067x + 1,2469, $R^2$ = 0,9602]

[Figure 12: scatter plot "Total expenses for clothing and recreation on income"; x-axis: family income, y-axis: total expenses non vital; trend line y = 0,1677x - 0,013, $R^2$ = 0,9905]

Coefficient of correlation:
                               $r_{xy}$       $r_{xy}^2$
Correl Vital exp/income        0,979879344    0,960163529
Correl non-Vital exp/income    0,995243075    0,990508779
Table 8

f) Equation of the regression line


Note: Excel has done all the work and the equations are given in Figure 11 and Figure 12 above. However, we provide an alternative calculation as a check. There are normally some assumptions we should check before applying a linear regression. From the scatter diagrams we can only believe that homoscedasticity is respected; the other 3 assumptions are not checked and should be checked if we want to draw real conclusions.

We use the function data analysis/regression, which gives us b0 and b1 such that: y = b0 + b1 x
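For readers who want to see what the regression tool computes, here is a minimal least-squares sketch. The data are hypothetical (not family5.xls) and were chosen to lie exactly on a line, so the expected coefficients are known in advance; the function name is illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares b0, b1 for y = b0 + b1*x, plus the correlation r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    b1 = sxy / sxx                 # slope
    b0 = my - b1 * mx              # intercept
    r = sxy / (sxx * syy) ** 0.5   # coefficient of correlation
    return b0, b1, r

income = [10, 20, 30, 40]              # hypothetical family incomes
expenses = [7.3, 13.4, 19.5, 25.6]     # hypothetical, exactly 1.2 + 0.61 * income
b0, b1, r = fit_line(income, expenses)
print(round(b0, 4), round(b1, 4), round(r ** 2, 4))
```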


i) Food and housing income/family income


SUMMARY OUTPUT

Regression Statistics
Multiple R           0,979879344
R Square             0,960163529
Adjusted R Square    0,96002985
Standard Error       1,725643349
Observations         300

ANOVA
              df
Regression    1
Residual      298
Total         299

                  Coefficients
Intercept (b0)    1,246850796
FINC (b1)         0,606667955
Table 9

The equation in Figure 11 is confirmed. y = 1,246850796 + 0,606667955 x

ii) Clothing and recreation/family income


SUMMARY OUTPUT

Regression Statistics
Multiple R           0,995243075
R Square             0,990508779
Adjusted R Square    0,99047693
Standard Error       0,229240545
Observations         300

ANOVA
              df
Regression    1
Residual      298
Total         299

             Coefficients
Intercept    -0,01295752
FINC         0,167697813
Table 10

The equation in Figure 12 is confirmed. y = -0,01295752 + 0,167697813 x


g) Interpretation
i) Questions a to c: distributions
When considering the graphs produced in question a, we note that only the variable Weight has a bell-shaped distribution; the others have mostly skewed distributions. Consequently it is no surprise that this variable respects the empirical rule in the table presented in question c. For the 3 other variables only Chebyshev's rule is applicable, as they are not bell shaped.

ii) Question d: 95% confidence interval


There is not much to conclude from this table except that, for each of these variables, we can state with 95% confidence that the population mean lies between the lower bound and the upper bound of the interval presented. For example, I can state with 95% confidence that the average net family income is between 28840 and 32010 Euros.

iii) Questions e and f: linear relationship


We can conclude that there are strong linear relationships between the variables FINC and TOTEXP1 and between FINC and TOTEXP2. The strength of these relationships is shown by the scatter plots, in which the points are closely aligned and concentrated around the trend line. This is also confirmed by the coefficients of determination, which show that respectively 96% and 99% of the variability of TOTEXP1 and TOTEXP2 can be explained by FINC and the linear model.
Note: as explained in f), we have not fully demonstrated the validity of our model, as we have not checked the 4 assumptions necessary to conclude that the model is applicable. The conclusion above is therefore true only on condition that the 4 assumptions are verified.


3) Exercise 3: Portfolio expectation, standard deviation and co-variance


a) Individual shares
In this first step we are interested in the probability distribution of each share, regardless of the value of the other one. To obtain these probability distributions we sum the joint probabilities along the rows (for H) and along the columns (for T), which gives the probability associated with each possible value.

P(H,T)     T=1,9    T=2     T=2,1    P(H)
H=0        0        0,1     0        0,1
H=1        0,15     0,2     0,15     0,5
H=1,1      0        0       0,1      0,1
H=1,3      0,2      0       0,1      0,3
P(T)       0,35     0,3     0,35
Table 11

The marginal probabilities P(H) and P(T) are the values we use as the basis for our calculations.

i) H5N1
The table prepared with Excel below applies the following formulas:
$E(X) = \sum_{\text{all } x} x \, P(X = x)$, $V(X) = \sum_{\text{all } x} (x - \mu)^2 P(X = x)$ and $\sigma = \sqrt{V(X)}$

H      P(H)    H*P(H)    H-E(H)    (H-E(H))^2    (H-E(H))^2*P(H)
0      0,1     0         -1        1             0,1
1      0,5     0,5       0         0             0
1,1    0,1     0,11      0,1       0,01          0,001
1,3    0,3     0,39      0,3       0,09          0,027

E(H) = 1    V(H) = 0,128    $\sigma_H$ = 0,357771
Table 12

ii) Thunderbird
The same philosophy is applied a second time for T.

T      P(T)    T*P(T)    T-E(T)    (T-E(T))^2    (T-E(T))^2*P(T)
1,9    0,35    0,665     -0,1      0,01          0,0035
2      0,3     0,6       0         0             0
2,1    0,35    0,735     0,1       0,01          0,0035

E(T) = 2    V(T) = 0,007    $\sigma_T$ = 0,083666
Table 13


iii) Correlation
We first calculate the covariance using the formula:
$\sigma_{xy} = cov(x, y) = \sum (a - \mu_x)(b - \mu_y) P(X = a, Y = b)$
The calculation was done in Excel and is not presented, as the detail would not be possible to follow.

$\sigma_{xy}$ = -0,002
The coefficient of correlation is then

$\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$ = -0,06682
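Since the detail of the covariance calculation is not shown, the sketch below recomputes the moments directly from the joint probabilities of Table 11. The dictionary layout and variable names are illustrative choices, not the layout of the original spreadsheet.

```python
# Joint probabilities P(H=h, T=t) as read from Table 11
joint = {
    (0, 1.9): 0.0,   (0, 2): 0.1,   (0, 2.1): 0.0,
    (1, 1.9): 0.15,  (1, 2): 0.2,   (1, 2.1): 0.15,
    (1.1, 1.9): 0.0, (1.1, 2): 0.0, (1.1, 2.1): 0.1,
    (1.3, 1.9): 0.2, (1.3, 2): 0.0, (1.3, 2.1): 0.1,
}

e_h = sum(h * p for (h, t), p in joint.items())              # E(H)
e_t = sum(t * p for (h, t), p in joint.items())              # E(T)
v_h = sum((h - e_h) ** 2 * p for (h, t), p in joint.items())  # V(H)
v_t = sum((t - e_t) ** 2 * p for (h, t), p in joint.items())  # V(T)
cov = sum((h - e_h) * (t - e_t) * p for (h, t), p in joint.items())
rho = cov / (v_h ** 0.5 * v_t ** 0.5)                        # coefficient of correlation
print(round(e_h, 6), round(e_t, 6), round(v_h, 6), round(v_t, 6),
      round(cov, 6), round(rho, 5))
```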

b) All on H5N1
Instead of building a separate spreadsheet for each of the exercises below, I decided to build a few formulas in Excel using one parameter that I call w, which is the weight of H in the portfolio considered. If the investor has 1000 Euros to invest, the portfolio is then represented by:
M = w*1000*H + (1-w)*1000/2*T = w*1000*H + (1-w)*500*T
Note that the share price of T of 2 Euros used in the equation gives the 500 in front of T. We then use the basic formulas for the expectation and the variance of the portfolio.
$E(M) = w \cdot 1000 \cdot E(H) + (1-w) \cdot 500 \cdot E(T)$
$V(M) = w^2 \cdot 1000^2 \cdot V(H) + (1-w)^2 \cdot 500^2 \cdot V(T) + 2 \cdot w \cdot (1-w) \cdot 1000 \cdot 500 \cdot COV(H,T)$
and $\sigma_M = \sqrt{V(M)}$
All parameters were calculated in the previous exercise. The formulas are entered in Excel, and we can then solve the current question and the next two by simply changing w.


Let's come back to the case where we invest everything in H5N1. This means w=1. Excel delivers the following results:

H (w)    T (1-w)    E(M)    V(M)      Sigma M
1        0          1000    128000    357,7708764
Table 14

c) All on Thunderbirds
The same is applied as in the previous question, using w=0.

H (w)    T (1-w)    E(M)    V(M)    Sigma M
0        1          1000    1750    41,83300133
Table 15

d) Half-Half scenario
The same is applied as in the previous question, using w=1/2.

H (w)    T (1-w)    E(M)    V(M)       Sigma M
0,5      0,5        1000    31937,5    178,71066
Table 16
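The three scenarios differ only in w, which the sketch below makes explicit. It assumes the moments found in a) (V(H)=0,128, V(T)=0,007, COV(H,T)=-0,002, E(H)=1, E(T)=2) as default arguments; the function name is illustrative.

```python
def portfolio(w, v_h=0.128, v_t=0.007, cov=-0.002, e_h=1.0, e_t=2.0):
    """Expectation, variance and sigma of M = w*1000*H + (1-w)*500*T."""
    a, b = w * 1000, (1 - w) * 500            # number of H shares and T shares
    e_m = a * e_h + b * e_t
    v_m = a**2 * v_h + b**2 * v_t + 2 * a * b * cov
    return e_m, v_m, v_m ** 0.5

for w in (1, 0, 0.5):                          # questions b), c) and d)
    print(w, portfolio(w))
```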

e) Risk lover portfolio


Let's consider again the formula in b) and replace E(H) and E(T) by their actual values.
$E(M) = w \cdot 1000 \cdot E(H) + (1-w) \cdot 500 \cdot E(T) = w \cdot 1000 + (1-w) \cdot 500 \cdot 2 = 1000$
We discover that the mean of the mix does not depend on w; consequently it does not matter which scenario is chosen, the expectation will be the same. In reality the investor will certainly take the risk into account, since the expectation does not distinguish between the portfolios; this question is treated in the next paragraph.


f) Risk averse scenario.


We need to find the minimum of the function:

$f(w) = \sigma_M = \sqrt{w^2 \cdot 1000^2 \cdot V(H) + (1-w)^2 \cdot 500^2 \cdot V(T) + 2 \cdot w \cdot (1-w) \cdot 1000 \cdot 500 \cdot COV(H,T)}$
We then replace the values we know from a):
$f(w) = \sqrt{128000 w^2 + 1750 (1-w)^2 - 2000 w (1-w)}$
$f(w) = \sqrt{128000 w^2 + 1750 (1 + w^2 - 2w) - 2000 (w - w^2)}$
$f(w) = \sqrt{(128000 + 1750 + 2000) w^2 + (-3500 - 2000) w + 1750}$

$f(w) = \sqrt{131750 w^2 - 5500 w + 1750}$

Let's consider what is under the square root, which we will call h(w).
$h(w) = 131750 w^2 - 5500 w + 1750$

[Figure 13: graph of h(w) for w from -0,3 to 1,3; y-axis from 0 to 250000]

The function h is always positive on the range considered, [0;1]; we checked it only graphically in Figure 13.
Notes: This demonstration was not really necessary, as we are dealing with a variance, which by definition is positive, but I wanted to take no mathematical risk. The graph also shows a minimum between 0 and 0,1.

Now that we have shown that h(w)>0 whatever the value of w, we see that minimizing h and minimizing f is the same thing, as the square root function is such that if $x_1 < x_2$ then $\sqrt{x_1} < \sqrt{x_2}$ on the range $[0;+\infty[$. We then search for the value of w such that h is minimal. For that, we use its derivative and search for the value where it is 0.

$h'(w) = 2 \cdot 131750 \cdot w - 5500 = 0 \iff w = \frac{5500}{131750 \cdot 2}$ = 0,020873

The ideal portfolio for risk averse traders would then be composed approximately of 20 shares of H5N1 and 490 of Thunderbird.
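The minimum can be cross-checked by building the coefficients of the quadratic h(w) from the moments of a) and taking its vertex. This is a sketch; the function and parameter names are illustrative.

```python
def min_variance_weight(v_h=0.128, v_t=0.007, cov=-0.002, n_h=1000, n_t=500):
    """Vertex of h(w) = A*w^2 + B*w + C, the variance of the portfolio."""
    a_coef = n_h**2 * v_h + n_t**2 * v_t - 2 * n_h * n_t * cov   # 131750
    b_coef = -2 * n_t**2 * v_t + 2 * n_h * n_t * cov             # -5500
    return -b_coef / (2 * a_coef)        # where the derivative 2*A*w + B is zero

w_star = min_variance_weight()
print(round(w_star, 6))
```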


4) Exercise 4: Markowitz portfolio theorem


This exercise uses the same formulas as the previous one. If we call p the weight of the asset X, then the weight of the asset Y is (1-p), which gives us for M = pX + (1-p)Y:
$E(M) = p \cdot E(X) + (1-p) \cdot E(Y)$
$V(M) = p^2 \cdot V(X) + (1-p)^2 \cdot V(Y) + 2 \cdot p \cdot (1-p) \cdot COV(X,Y)$
and $\sigma_M = \sqrt{V(M)}$
We first search for the answer to question b and then present all results together graphically. Here we assume that V(M) is always positive, so minimizing V and minimizing $\sigma$ is the same thing. We consider the function
$f(p) = V(M) = p^2 \cdot V(X) + (1-p)^2 \cdot V(Y) + 2 \cdot p \cdot (1-p) \cdot COV(X,Y)$
$f(p) = p^2 \cdot (V(X) + V(Y) - 2 \cdot COV(X,Y)) + p \cdot (-2V(Y) + 2 \cdot COV(X,Y)) + V(Y)$
We look for the minimum point of this function, which we find by differentiation.
$f'(p) = 2 \cdot p \cdot (V(X) + V(Y) - 2 \cdot COV(X,Y)) + (-2V(Y) + 2 \cdot COV(X,Y)) = 0$
$\iff p = \frac{2V(Y) - 2 \cdot COV(X,Y)}{2 \cdot (V(X) + V(Y) - 2 \cdot COV(X,Y))} \approx$ 0,395712

If we then put p back into the first equations we find: E(M) = 0,221871 and $\sigma_M$ = 0,0849272. This is the minimum variance point.
Verification: we just checked a few values around this supposed minimum.

X=p         Y=1-p       E(M)          SIGMA(M)
0,395709    0,604291    0,22187127    0,084927178076
0,39571     0,60429     0,2218713     0,084927178074
0,395711    0,604289    0,22187133    0,084927178072
0,395712    0,604288    0,22187136    0,084927178071
0,395713    0,604287    0,22187139    0,084927178072
0,395714    0,604286    0,22187142    0,084927178073
0,395715    0,604285    0,22187145    0,084927178075
Table 17


The curve is obtained by using Excel and the above-mentioned formulas. We then get Table 18, with several values of the parameter p, which is depicted with a scatter diagram in Figure 14.

X       Y       E(M)      $\sigma_M$
0       1       0,21      0,14
0,05    0,95    0,2115    0,1291022
0,1     0,9     0,213     0,1188709
0,15    0,85    0,2145    0,1094931
0,2     0,8     0,216     0,1012063
0,25    0,75    0,2175    0,0942987
0,3     0,7     0,219     0,0890916
0,35    0,65    0,2205    0,0858949
0,4     0,6     0,222     0,0849357
0,45    0,55    0,2235    0,0862889
0,5     0,5     0,225     0,0898499
0,55    0,45    0,2265    0,0953717
0,6     0,4     0,228     0,1025382
0,65    0,35    0,2295    0,1110312
0,7     0,3     0,231     0,1205708
0,75    0,25    0,2325    0,1309284
0,8     0,2     0,234     0,1419251
0,85    0,15    0,2355    0,1534234
0,9     0,1     0,237     0,1653187
0,95    0,05    0,2385    0,1775313
1       0       0,24      0,19

Table 18
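The minimum variance point can be cross-checked from Table 18 alone. The sketch below reads the moments of X and Y from the rows p=1 and p=0, and recovers the covariance from the p=0,5 row; that recovered covariance is an inference from the rounded table values, not a figure given in the exercise, and the names are illustrative.

```python
# Moments read from Table 18 (p = 1 gives X alone, p = 0 gives Y alone)
e_x, e_y = 0.24, 0.21
v_x, v_y = 0.19**2, 0.14**2
# Covariance inferred from the p = 0.5 row (sigma = 0.0898499): an assumption
cov = (0.0898499**2 - 0.25 * v_x - 0.25 * v_y) / 0.5

def mix(p):
    """E(M) and sigma(M) for M = p*X + (1-p)*Y."""
    e_m = p * e_x + (1 - p) * e_y
    v_m = p**2 * v_x + (1 - p)**2 * v_y + 2 * p * (1 - p) * cov
    return e_m, v_m ** 0.5

p_min = (v_y - cov) / (v_x + v_y - 2 * cov)    # root of f'(p) = 0
print(round(p_min, 4), [round(v, 4) for v in mix(p_min)])
```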

[Figure 14: "Efficient curve of the Mix"; x-axis: $\sigma(M)$ from 0 to 0,2; y-axis: E(M) from 0,195 to 0,245. Annotations on the chart: efficiency curve; market line y=0,2793549x+0,2 (approximation only); tangent at minimum sigma x=0,0849272; minimum variance point x=0,0849272, y=0,221871, at the intersection of the tangent and the curve, which is where the efficiency curve starts]


5) Exercise 5: Confidence interval and tests


a) 95% confidence interval for $\mu$
Hypothesis: throughout this exercise we trust the value of 6 given by the company for the standard deviation $\sigma$.
We use the following formulas to draw the 95% interval around $\bar{x}$:
$lb = \bar{x} - z_{0,025} \frac{\sigma}{\sqrt{n}}$ and $ub = \bar{x} + z_{0,025} \frac{\sigma}{\sqrt{n}}$

Excel helps us to get the results presented in the table below.

$\bar{x}$    $z_{0,025}$    $\sigma$    n    lb       ub
90           1,96           6           9    86,08    93,92
Table 19

b) Opinion on the announced $\mu$


The announced population mean $\mu = 100$ is outside the 95% confidence interval. Based on this interval we can conclude that this value is not accurate and that the mean should be lower. This is what we formally test in the next question.

c) Formal hypothesis test


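We consider the following hypotheses: H0: $\mu = 100$ and H1: $\mu < 100$. We build our test statistic the following way:
$z_{test} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{90 - 100}{6 / \sqrt{9}} = -5$

Now we compare it to $-z_{\alpha/2} = -z_{0,005}$ = -2,58.

As $z_{test} < -z_{0,005}$, we reject the H0 hypothesis: $\mu$ is less than 100.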
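The test can be replayed numerically as follows. The sketch uses the sample values of Table 19 and the same critical value as the text (lower tail probability 0,005); the p-value line is an addition for illustration and was not part of the original answer.

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n = 90, 100, 6, 9
z_test = (x_bar - mu0) / (sigma / math.sqrt(n))   # test statistic
z_crit = NormalDist().inv_cdf(0.005)              # lower critical value, about -2.58
p_value = NormalDist().cdf(z_test)                # P(Z < z_test) under H0 (illustrative)
reject_h0 = z_test < z_crit
print(z_test, round(z_crit, 2), reject_h0)
```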


d) Minimal size of the sample


We are looking for $n_0$ such that ub-lb=0,5:

$\left(\bar{x} + z_{0,025} \frac{\sigma}{\sqrt{n_0}}\right) - \left(\bar{x} - z_{0,025} \frac{\sigma}{\sqrt{n_0}}\right) = 0,5$
$\iff 2 \cdot z_{0,025} \frac{\sigma}{\sqrt{n_0}} = 0,5 \iff \sqrt{n_0} = \frac{2 \cdot z_{0,025} \cdot \sigma}{0,5}$
$\iff n_0 = \left(\frac{2 \cdot z_{0,025} \cdot \sigma}{0,5}\right)^2 \approx 2213$

The minimum sample size so that the width of the confidence interval is 0,5 is 2213. Verification: with Excel we computed back the confidence interval for the values n=2212 and n=2213.

x_bar    Z0,975    Sigma    n       lb          ub          ub-lb
90       1,96      6        2212    89,74996    90,25004    0,50008
90       1,96      6        2213    89,75002    90,24998    0,49996
Table 20
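The same computation, including the check on both sides of the threshold, can be sketched in a few lines. The function name is illustrative.

```python
import math

def min_sample_size(z, sigma, width):
    """Smallest n such that the interval width 2*z*sigma/sqrt(n) is at most `width`."""
    return math.ceil((2 * z * sigma / width) ** 2)

n0 = min_sample_size(1.96, 6, 0.5)
width_before = 2 * 1.96 * 6 / math.sqrt(n0 - 1)   # width with one observation less
width_at = 2 * 1.96 * 6 / math.sqrt(n0)           # width at the minimal size
print(n0, round(width_before, 5), round(width_at, 5))
```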


6) Exercise 6: Linear regression and confidence intervals


a) Excesses of the company and the world stock exchange
The excess is calculated by subtracting the risk free interest rate (0,01) from the return of the company (respectively the world stock). As the table is quite long, we give only its first lines as an example.

company    world stock    Risk_free    Excess_company    Excess_world_stock
0,02       -0,01          0,01         0,01              -0,02
0,01       -0,01          0,01         0,00              -0,02
0,12       0,08           0,01         0,11              0,07
0,08       0,06           0,01         0,07              0,05
0,04       0,02           0,01         0,03              0,01
-0,01      0,02           0,01         -0,02             0,01
0,00       -0,03          0,01         -0,01             -0,04
Table 21

b) Scatter plot
[Figure 15: scatter plot; x-axis: excess world stock (from -0,08 to 0,12), y-axis: excess company 1 (from -0,10 to 0,15)]


c) Regression
We use the function data analysis/linear regression from excel to produce the following results
SUMMARY OUTPUT

Regression Statistics
Multiple R           0,769632194
R Square             0,592333715
Adjusted R Square    0,587039347
Standard Error       0,025262736
Observations         79

ANOVA
              df    SS             MS             F              Significance F
Regression    1     0,071402455    0,071402455    111,8799805    1,15886E-16
Residual      77    0,049141849    0,000638206
Total         78    0,120544304

                      Intercept       Excess_world_stock
Coefficients          -0,000960921    0,819299331
Standard Error        0,002870567     0,077458023
t Stat                -0,334749365    10,57733333
P-value               0,738724329     1,15886E-16
Lower 95%             -0,00667695     0,665060704
Upper 95%             0,004755109     0,973537958
Lower 96,0%           -0,006957893    0,657479881
Upper 96,0%           0,005036052     0,981118782
Table 22

The Coefficients row gives the coefficients of the linear regression equation; the Lower/Upper 95% and Lower/Upper 96,0% rows give the 95% and 96% confidence intervals.

According to this table the linear regression equation between the company excess and the world stock excess is:

y = 0,819299331 x - 0,000960921
Note: as for 2)f), before trusting this equation we should verify the 4 assumptions necessary to apply a linear model. This is not rigorously done here and the results given by Excel are trusted as they are.


Verification: we ask Excel to draw the regression line on the scatter diagram and to display the equation.

[Figure 16: scatter plot with regression line y = 0,8193x - 0,001; x-axis: excess world stock (from -0,08 to 0,12), y-axis: excess company 1 (from -0,10 to 0,15)]

d) Reliability of the slope.


We base our reasoning on Table 22. It indicates that the 96% confidence interval for the slope is [0,657479881 ; 0,981118782]. If we rely on this interval, the relation between the two variables is positive and significantly different from zero at this confidence level, because 0 is not in the interval.

e) Constant term
We base our reasoning on Table 22. It indicates that the 95% confidence interval (5% significance level) for the constant term is [-0,00667695 ; 0,004755109]. The constant term can therefore be positive or negative. We then cannot conclude that the company offers a guaranteed advantage or disadvantage compared to the world stock.

f) Performance
If the market is performing well, the company performs somewhat worse than the world stock exchange. Explanation: according to the 95% and 96% intervals the slope is definitely less than 1, and the value given by the equation is approximately 0,82. This means that if the excess of the world stock increases by 1, the company excess increases by only about 0,82. Conversely, if the market is falling, the company falls less. The constant term being around 0, we can conclude that:

An investment in the company will not bring the investor a real advantage over the average market but will lower his risk.


g) Percentage explained by the model


In Table 22, $R^2$ is given as 0,592333715. This means:

59,23% of the variability of the return of the company is explained by the linear regression model.
Note: this question could also have been answered by displaying on the Excel scatter plot the $R^2$ value associated with the trend line.
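The $R^2$ value can also be recovered from the ANOVA part of Table 22, as the ratio of the regression sum of squares to the total sum of squares. A minimal check, with illustrative variable names:

```python
ss_regression = 0.071402455    # ANOVA, Regression SS in Table 22
ss_total = 0.120544304         # ANOVA, Total SS in Table 22
r_squared = ss_regression / ss_total   # share of variability explained by the model
print(f"{r_squared:.4%}")
```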

h) Prediction interval
The prediction interval could be manually computed with the following formula:
$\hat{y} \pm t_{0,04;\,n-2} \cdot s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1) s_x^2}}$ where $x_g = \frac{y - b_0}{b_1}$ and $\hat{y} = b_1 x + b_0$

But this manual calculation is quite laborious, as we need to calculate $\hat{y}$ for each value of x, then the error and its standard deviation. Due to the high probability of errors I preferred to use the data analysis plus/prediction interval tool in Excel (one of the tools provided on the CD with the book Managerial Statistics by Gerald Keller). The result is displayed in Table 23; we then corrected the values because the question referred to the return and not to the excess (we added back the 0,01 of the risk free investment).

Prediction interval, excess
Predicted value                                        0,015425066
Prediction interval, lower limit                       -0,037738959
Prediction interval, upper limit                       0,068589091
Interval estimate of expected value, lower limit       0,009021793
Interval estimate of expected value, upper limit       0,021828339

Prediction interval, return of the company
Predicted value                                        0,025425066
Lower limit                                            -0,027738959
Upper limit                                            0,078589091

Table 23

Table 23

The prediction interval is pretty wide: based on a 96% confidence level, the return can take values between -2,77% and 7,86%, which is not a narrow prediction.
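For completeness, the manual computation behind such a prediction interval can be sketched as below. The data and the critical value are hypothetical placeholders (the real series is in the Excel file and the t value would come from TINV); the function name is illustrative.

```python
import math
from statistics import mean

def prediction_interval(x_g, xs, ys, t_crit):
    """y_hat -/+ t * s_eps * sqrt(1 + 1/n + (x_g - x_bar)^2 / ((n-1) * s_x^2))."""
    n, x_bar, y_bar = len(xs), mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)            # equals (n-1) * s_x^2
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_eps = math.sqrt(sse / (n - 2))                   # standard error of estimate
    y_hat = b0 + b1 * x_g
    half = t_crit * s_eps * math.sqrt(1 + 1 / n + (x_g - x_bar) ** 2 / sxx)
    return y_hat - half, y_hat, y_hat + half

xs = [-0.02, 0.00, 0.01, 0.03, 0.05]    # hypothetical excess of the world stock
ys = [-0.01, 0.00, 0.02, 0.02, 0.05]    # hypothetical excess of the company
print(prediction_interval(0.02, xs, ys, t_crit=3.0))   # t_crit is a placeholder
```

Adding the risk free rate (0,01) to the three values returned converts the interval on the excess into an interval on the return.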
