
David Acquaye BSc. MBA, PhD-St.

(2010/11)

How to do a regression in Excel 2007+


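The following procedure shows how to run a regression in Excel 2007 or Excel 2010.

Step 1 : Input your data.

Enter your data into the worksheet, for example with each variable in its own column.

Step 2 : Click on the Data tab, then click on Data Analysis. If Data Analysis does not appear on the Data tab, activate the Analysis ToolPak add-in as follows: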


i. Click on the Office button and select Excel Options.

ii. In Excel Options, select Add-ins.


iii. Select Analysis ToolPak, or the first add-in that appears in your version. The Add-ins window may list Analysis ToolPak or Analysis ToolPak-VBA.

iv. Click on Go and tick Analysis ToolPak to activate it.


Step 3 : Start your regression by clicking on Data Analysis on the Data tab and selecting Regression from the menu that appears.

Step 4 : Enter your dependent variable in the Input Y Range and your independent variable in the Input X Range. In this example the dependent variable is Sales and the independent variable is Advertising.


Tick Labels, Confidence Level, Line Fit Plots and any other options you need.

Step 5 : Click on OK, interpret your results, and add a trend line by clicking on the scatter plot.


SUMMARY OUTPUT

Regression Statistics
Multiple R           0.640163
R Square             0.409808
Adjusted R Square    0.26226
Standard Error       0.666976
Observations         6

ANOVA
              df    SS          MS          F              Significance F
Regression    1     1.235571    1.235571    2.777456647    0.170928
Residual      4     1.779429    0.444857
Total         5     3.015

                     Coefficients   Standard Error   t Stat      P-value        Lower 95%   Upper 95%
Intercept            6.890714       1.265289         5.445961    0.005521213    3.377709    10.40372
Advert Exp ('000)    0.006643       0.003986         1.66657     0.170928062    -0.00442    0.01771

The regression output has three components:

1. Regression Statistics Table
2. ANOVA Table
3. Regression Coefficient Table

1. Regression Statistics Table

Regression Statistics
Multiple R           0.64016256
R Square             0.4098081
Adjusted R Square    0.26226013
Standard Error       0.66697612
Observations         6

Explanation : The above gives the goodness-of-fit measures : R² = 40.98%


The correlation between the dependent variable (Sales) and the independent variable (Advertising Expenditure) is 0.64 (Multiple R). Squaring it gives R², which always lies between 0 and 1 (0 ≤ R² ≤ 1). The closer R² is to 1, the more of the variation is explained. Here R Square = 0.4098, so 40.98% of the variation in sales is explained by advertising. When there is more than one independent variable it is better to use the Adjusted R Square. The standard error of the regression is 0.66697.

2. Analysis of Variance : ANOVA

ANOVA
              df    SS          MS          F              Significance F
Regression    1     1.235571    1.235571    2.777456647    0.170928
Residual      4     1.779429    0.444857
Total         5     3.015

The ANOVA table splits the total sum of squares into its regression and residual components. SS and MS will be discussed later.
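As a cross-check, the goodness-of-fit figures and the F statistic can be recomputed from the two sums of squares in the ANOVA table. The following is a minimal sketch in Python rather than Excel. The variable names are illustrative, and the inputs are simply copied from the output shown in this handout.

```python
import math

# Figures copied from the ANOVA table in the Excel output.
n = 6                      # observations
k = 1                      # independent variables (Advert Exp)
ss_regression = 1.235571   # explained sum of squares
ss_residual = 1.779429     # unexplained sum of squares

# The total sum of squares is the sum of its two components.
ss_total = ss_regression + ss_residual

# Degrees of freedom, then mean squares (MS = SS / df).
df_regression = k
df_residual = n - k - 1
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Goodness-of-fit measures reported in the Regression Statistics table.
r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
standard_error = math.sqrt(ms_residual)

# F statistic for the regression as a whole.
f_statistic = ms_regression / ms_residual

print(f"SS total          = {ss_total:.3f}")
print(f"Multiple R        = {multiple_r:.6f}")
print(f"R Square          = {r_square:.6f}")
print(f"Adjusted R Square = {adjusted_r_square:.6f}")
print(f"Standard Error    = {standard_error:.6f}")
print(f"F                 = {f_statistic:.6f}")
```

The printed values should agree with the Excel output to the precision shown there. Small differences in the last digit come from the rounding of the SS figures.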


Statistical significance of the regression as a whole : Here we use the F statistic. From the table above, the F statistic is 2.777 with a probability (Significance F) of 0.1709. When only one independent variable is used, Significance F is the same as the p-value of that variable's coefficient. Since 0.1709 > 0.05, the regression as a whole is not statistically significant at the 95% confidence level (5% level of significance).

3. Regression Coefficient Table

                     Coefficients    Standard Error   t Stat        P-value      Lower 95%     Upper 95%
Intercept            6.890714286     1.265288948      5.445961016   0.0055212    3.377709      10.40372
Advert Exp ('000)    0.006642857     0.003985945      1.666570325   0.1709281    -0.0044239    0.0177096

a. Regression Equation [using the Coefficients column] : Sales = 6.8907 + 0.0066 Advert

This should be explained under the following headings : Sign and Size.


Sign : From the outset we expect sales to be related to advertising. The equation shows that sales is positively related to advertising: higher advertising expenditure is associated with higher sales. This is also supported by the Multiple R, which in a regression with one independent variable equals the absolute value of the correlation between the two variables.

Size : If there is no advertising (i.e. Advert = 0), predicted sales will be 6.8907 (m). Sales rise by 0.0066 (m), i.e. about 6,600, for each unit increase in advertising (i.e. 1,000).

b. Statistical Significance of Coefficients : This can be assessed using the p-values or the critical t values. If we test at the 95% confidence level (the 5% level of significance), the following applies : if p < 0.05 the coefficient is statistically significant; if p > 0.05 it is not statistically significant.

In the regression above the p-value for advertising is 0.1709, which is greater than 0.05, so the relationship found between advertising and sales is not statistically significant.
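The coefficient table can be checked in the same way. The sketch below recomputes the t statistics, two-sided p-values and 95% confidence limits from the coefficients and standard errors, then uses the equation for one prediction. It is an illustration under stated assumptions, not what Excel does internally. The helper `t_cdf_4df` uses a closed form of the Student's t distribution that is valid only for 4 degrees of freedom, which happens to be the residual df here. `T_CRIT_95` is the tabulated two-tailed 5% critical value for 4 degrees of freedom. The advertising figure of 200 ('000) is a made-up input.

```python
import math

# Coefficients and standard errors copied from the coefficient table.
coefficients = {
    "Intercept": (6.890714286, 1.265288948),
    "Advert Exp": (0.006642857, 0.003985945),
}

T_CRIT_95 = 2.776445  # tabulated t value: 4 df, 5% two-tailed


def t_cdf_4df(t):
    """Student's t CDF; this closed form holds for 4 degrees of freedom only."""
    scale = 1 + t * t / 4
    return 0.5 + 0.375 * (t / math.sqrt(scale)) * (1 - (t * t / scale) / 12)


def two_sided_p(t):
    """Two-sided p-value for a t statistic with 4 degrees of freedom."""
    return 2 * (1 - t_cdf_4df(abs(t)))


results = {}
for name, (coef, se) in coefficients.items():
    t_stat = coef / se                      # t Stat = coefficient / standard error
    p_value = two_sided_p(t_stat)
    lower = coef - T_CRIT_95 * se           # Lower 95%
    upper = coef + T_CRIT_95 * se           # Upper 95%
    results[name] = {"t": t_stat, "p": p_value, "lower": lower, "upper": upper}
    verdict = "significant" if p_value < 0.05 else "not significant"
    print(f"{name}: t = {t_stat:.4f}, p = {p_value:.4f}, "
          f"95% CI = [{lower:.6f}, {upper:.6f}] -> {verdict} at the 5% level")


def predict_sales(advert_thousands):
    """Predicted sales (m) from the fitted equation, advertising in '000."""
    intercept = coefficients["Intercept"][0]
    slope = coefficients["Advert Exp"][0]
    return intercept + slope * advert_thousands


# Hypothetical advertising spend of 200 ('000), chosen only for illustration.
print(f"Predicted sales at Advert = 200: {predict_sales(200):.4f} (m)")
```

Note that the 95% interval for the advertising coefficient runs from a negative to a positive value, so it includes zero. That is another way of seeing that the coefficient is not statistically significant at the 5% level.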

Example 2 : Multiple Regression

A car dealer believes the number of cars sold each month is related to the number of years of sales experience and the age of the salesperson. The data for a random sample of 10 salespersons are as follows:

Y     X1    X2
17    2     23
23    6     33
20    8     30
18    11    35
19    4     24
22    7     49
21    7     36
28    14    40
26    12    46
12    3     51

Y = number of cars sold per month, X1 = number of years of sales experience and X2 = age of the salesperson (years).

Required : Carry out a multiple regression analysis and use your output to answer the following :


i. What is the estimated regression equation?

ii. Interpret the coefficients on X1 and X2.

iii. State whether or not there is a significant relationship between sales and both explanatory variables, taken together, at the 5% level of significance.

iv. State whether or not the explanatory variables X1 and X2 are statistically significant at the 5% level. Be sure to state the null and alternative hypotheses and on what your conclusions are based (i.e. the test procedure used).
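To cross-check the Excel output for this exercise, the coefficients can also be estimated outside Excel. The sketch below fits the two-variable model by ordinary least squares, solving the normal equations with a small Gaussian elimination routine. The helper name `solve` and the other variable names are illustrative. It prints the coefficients, R Square and the F statistic, and leaves the significance tests and their interpretation to the Excel output and statistical tables.

```python
# Data from Example 2.
y = [17, 23, 20, 18, 19, 22, 21, 28, 26, 12]
x1 = [2, 6, 8, 11, 4, 7, 7, 14, 12, 3]
x2 = [23, 33, 30, 35, 24, 49, 36, 40, 46, 51]


def solve(a, b):
    """Solve the linear system a * z = b by Gaussian elimination with pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * size
    for r in reversed(range(size)):                  # back substitution
        tail = sum(m[r][c] * z[c] for c in range(r + 1, size))
        z[r] = (m[r][size] - tail) / m[r][r]
    return z


# Design matrix with a leading column of ones for the intercept.
rows = [[1.0, a, b] for a, b in zip(x1, x2)]

# Normal equations: (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)

# Sums of squares, R Square and the F statistic for the whole regression.
n, k = len(y), 2
fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
mean_y = sum(y) / n
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_residual = sum(e * e for e in residuals)
ss_regression = ss_total - ss_residual
r_square = ss_regression / ss_total
f_statistic = (ss_regression / k) / (ss_residual / (n - k - 1))

print(f"Y = {b0:.4f} + {b1:.4f} X1 + {b2:.4f} X2")
print(f"R Square = {r_square:.4f}")
print(f"F = {f_statistic:.4f} with {k} and {n - k - 1} degrees of freedom")
```

The printed coefficients should match the Coefficients column that Excel produces when X1 and X2 are selected together as the Input X Range.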

Example 3

Months       Sales (000)   Price ()   Advert Exp ()   Mean Daily Hours
January      75            6.8        2               2.4
February     90            6.5        5               4
March        148           6          6               5.2
April        183           3.5        7               6.8
May          242           3          22              8
June         263           2.9        25              8.4
July         278           2.6        28              10.4
August       318           2.1        30              11.5
September    256           3.1        22              9.6
October      200           3.6        18              6.1
November     140           4.2        10              3.4
December     80            5.2        2               2

Perform a regression analysis and discuss it under the following headings:

a. Explanatory power of the regression
b. The regression equation
c. The statistical significance of the independent variables


When the trend line is added, this is what we get:

[Figure: Regression Model for Sales on Advertising. Sales ('m) plotted against Advert ('000), showing the Sales and Predicted Sales series with a linear trend line through Predicted Sales: y = 0.0066x + 6.8907]

