
REVIEW OF SELECTED STATISTICAL TOOLS IN EXCEL

1. Summary of Excel Statistical Tools

(a) Insert Function (under Formulas): to calculate values of statistical functions
(e.g., sample average, sample standard deviation, quartiles, cumulative areas for the t
distribution)
(b) Scatter, Line Chart (under Insert)
(c) Data Analysis (under Data): Descriptive Statistics, Correlation, Regression,
Histogram, Tests

2. Example:

Suppose a manufacturer of steel rods is experimenting with a new technology that is
supposed to increase the mean strength of rods. A random sample of 30 rods was
obtained with the old technology, and another sample of 30, with the new technology.
The breaking strength of each rod in the two groups was recorded. The data file
LabReview.xls is available in the Lecture Data folder on eClass.

(a) Use Excel to calculate the mean, median, standard deviation, variance and quartiles
for the new-technology rods.

(b) Obtain a 95% confidence interval for the mean strength of new technology rods.

(c) Is there any evidence that new technology rods are stronger on average than old
technology rods? Use the appropriate test in Excel to answer the question. Do not
pool variances. Report the value of the test statistic and the p-value of the test.

(d) The new technology rods were subjected to high pressure and the strength of
each of them was determined again. The data are provided in the third column in
the Excel worksheet. Is there any evidence that high pressure increased the
strength of steel rods? What is the value of the test statistic to answer the
question?

(e) By how much, on average, did the breaking strength increase due to the
high pressure treatment?

(f) Obtain the correlation between the breaking strength of the new technology rods
after the high pressure treatment and that of the same rods before the treatment.

(g) Run the regression of breaking strength of new technology rods after high
pressure treatment as the dependent variable (y) vs. breaking strength of the rods
before the high pressure treatment (x). What is the sum of squares of residuals?
What is the estimate of the model standard deviation? What is the equation of
the least-squares regression line for the data? Is there evidence of a positive linear
relationship between y and x? What is the p-value of the corresponding test?

SOLUTIONS:

(a) The mean, median, standard deviation and variance of breaking strength for new
technology rods are displayed in the Descriptive Statistics output below:

new tech

Mean                        59.4
Standard Error              1.613591696
Median                      58.5
Mode                        56
Standard Deviation          8.838005704
Sample Variance             78.11034483
Kurtosis                    -1.070167218
Skewness                    0.126080576
Range                       29
Minimum                     46
Maximum                     75
Sum                         1782
Count                       30
Confidence Level (95.0%)    3.300165567

The quartiles can be obtained with the QUARTILE.INC function in Insert Function
(Q1=53.25, Q2=58.5, Q3=65.75).
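
For readers who want to reproduce these summary statistics outside Excel, here is a minimal Python sketch. The data list is hypothetical (the values in LabReview.xls are not reproduced in this handout), and the mapping of each call to an Excel function is given in the comments. The "inclusive" quantile method interpolates at position (n - 1) * p in the sorted data, which is the convention QUARTILE.INC uses.

```python
import statistics

# Hypothetical breaking strengths; NOT the values from LabReview.xls.
strengths = [46, 52, 55, 56, 58, 61, 66, 75]

mean = statistics.mean(strengths)        # Excel: AVERAGE
median = statistics.median(strengths)    # Excel: MEDIAN
sd = statistics.stdev(strengths)         # Excel: STDEV.S (divides by n - 1)
var = statistics.variance(strengths)     # Excel: VAR.S (divides by n - 1)

# method="inclusive" interpolates at position (n - 1) * p in the sorted data,
# the same convention as QUARTILE.INC.
q1, q2, q3 = statistics.quantiles(strengths, n=4, method="inclusive")

print(f"mean={mean}, median={median}, sd={sd:.4f}, var={var:.4f}")
print(f"Q1={q1}, Q2={q2}, Q3={q3}")
```

Note that Q2 always coincides with the median under this method.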

(b) From the above output, a 95% confidence interval is 59.4 ± 3.3002, i.e. (56.10, 62.70).
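
The "Confidence Level (95.0%)" entry is the margin of error, t* times the standard error. As a cross-check, the sketch below rebuilds it from the summary statistics in the output. The critical value t* for 29 degrees of freedom is copied from the "t Critical two-tail" line of the paired test output in part (d); in Excel it could also be obtained with T.INV.2T(0.05, 29). The variable names are illustrative only.

```python
import math

# Summary statistics copied from the Descriptive Statistics output.
n = 30
mean = 59.4
sd = 8.838005704

# Two-tail critical value for df = n - 1 = 29, copied from the Excel output
# in part (d); it is not computed here.
t_crit = 2.045229642

se = sd / math.sqrt(n)       # "Standard Error" in the output
margin = t_crit * se         # "Confidence Level (95.0%)" in the output
lower, upper = mean - margin, mean + margin

print(f"SE = {se:.6f}, margin = {margin:.6f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```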

(c) The output for t-Test: Two-Sample Assuming Unequal Variances (Data Analysis)
is shown below:

t-Test: Two-Sample Assuming Unequal Variances

                                old tech       new tech
Mean                            56             59.4
Variance                        70.34482759    78.11034483
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          -1.528417063
P(T<=t) one-tail                0.065922395
t Critical one-tail             1.671552762
P(T<=t) two-tail                0.131844789
t Critical two-tail             2.001717484

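The value of the test statistic is t = -1.5284. The sign is negative because the old tech
column was entered first; +1.5284 (columns in the opposite order) is equally correct.
The one-sided p-value of the test is 0.0659, so there is only weak evidence that new
technology rods are stronger on average (not significant at the 5% level).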
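
The t Stat and df entries of the unequal-variances output can be checked by hand. The sketch below applies the usual unpooled (Welch) formulas to the means and variances copied from the output; the variable names are illustrative. The degrees of freedom formula gives a non-integer value, which appears in the Excel output rounded to 58.

```python
import math

# Summary statistics copied from the Excel output.
n1, mean1, var1 = 30, 56.0, 70.34482759   # old tech
n2, mean2, var2 = 30, 59.4, 78.11034483   # new tech

# Squared standard errors of the two sample means.
v1 = var1 / n1
v2 = var2 / n2

# Unpooled test statistic for H0: mu1 - mu2 = 0.
t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(f"t = {t_stat:.6f}, df = {df:.2f} (rounded: {round(df)})")
```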

(d) The observations in the new tech +hpress and new tech columns are paired (the
same rods were measured twice), so the paired data test (Data Analysis) should be
used in this case:

t-Test: Paired Two Sample for Means

                                new tech +hpress    new tech
Mean                            60.6                59.4
Variance                        58.52413793         78.1103448
Observations                    30                  30
Pearson Correlation             0.994420979
Hypothesized Mean Difference    0
df                              29
t Stat                          4.466435333
P(T<=t) one-tail                5.56997E-05
t Critical one-tail             1.699127027
P(T<=t) two-tail                0.000111399
t Critical two-tail             2.045229642

There is overwhelming evidence that high pressure treatment increased the
strength of steel rods (one-sided p-value = 5.57*10^-5). The value of the
corresponding test statistic is t = 4.4664.

(e) Based on the output above, the mean effect of high pressure treatment on breaking
strength is 60.6 - 59.4 = 1.2.
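
The paired t statistic can also be recovered from the summary lines of the paired output without the raw data, because the variance of the differences equals var(after) + var(before) - 2 * r * sd(after) * sd(before). The sketch below is a cross-check under that identity, with all numbers copied from the Excel output; the variable names are illustrative.

```python
import math

# Values copied from the "t-Test: Paired Two Sample for Means" output.
n = 30
mean_after, var_after = 60.6, 58.52413793     # new tech +hpress
mean_before, var_before = 59.4, 78.1103448    # new tech
r = 0.994420979                               # Pearson Correlation

# Mean of the paired differences (after - before).
mean_diff = mean_after - mean_before

# Sample variance of the differences via the variance-of-a-difference identity.
var_diff = var_after + var_before - 2 * r * math.sqrt(var_after * var_before)

# Paired t statistic with n - 1 = 29 degrees of freedom.
t_stat = mean_diff / math.sqrt(var_diff / n)

print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.4f}")
```

The strong correlation between the two columns is what makes the variance of the differences small, and hence the paired t statistic large even though the mean difference is only 1.2.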

(f) The Correlation output (Data Analysis) is shown below:

                    new tech    new tech +hpress
new tech            1
new tech +hpress    0.994421    1

The correlation is 0.9944.
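
For reference, the quantity reported by the Correlation tool is the Pearson correlation coefficient. A minimal sketch of its definition follows; the helper name pearson_r and the two short data lists are hypothetical and are not the values from LabReview.xls.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements, for illustration only.
before = [46, 52, 58, 63, 75]
after = [49, 54, 60, 64, 74]

r = pearson_r(before, after)
print(f"r = {r:.4f}")
```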

(g) The Regression output (Data Analysis) is shown below:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.994420979
R Square             0.988873084
Adjusted R Square    0.988475694
Standard Error       0.821249088
Observations         30

ANOVA
              df    SS             MS             F              Significance F
Regression    1     1678.315398    1678.315398    2488.420547    6.69882E-29
Residual      28    18.8846018     0.674450064
Total         29    1697.2

             Coefficients    Standard Error    t Stat         P-value        Lower 95%      Upper 95%
Intercept    9.470686915     1.035871731       9.142721666    6.6986E-10     7.348799864    11.592574
new tech     0.860762847     0.017255265       49.88407108    6.69882E-29    0.825417039    0.89610865

The sum of squares of residuals is 18.88. The estimate of the model standard deviation is
0.82125 (Standard Error). 98.88% of the variation in breaking strength of new rods after
high pressure treatment is explained by their strength before the treatment. The equation
of the least-squares regression line for the data is

y-hat = 9.47+0.8608*x

The one-sided p-value of the test for the slope (alternative: the slope is positive, i.e.
there is a positive linear relationship between y and x) is 6.699E-29 divided by 2, or
about 3.35E-29, because the p-value in the output is for the two-sided alternative that
the slope is different from zero. Thus there is overwhelming evidence of a positive
linear relationship between y and x.
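
In simple linear regression, the key entries of the Regression output are tied to the summary statistics from parts (a), (d) and (f): slope = r * s_y / s_x, intercept = ybar - slope * xbar, SS Total = (n - 1) * s_y^2, SS Residual = (1 - r^2) * SS Total, and the model standard deviation estimate is sqrt(SS Residual / (n - 2)). The sketch below is a cross-check using numbers copied from the earlier outputs; the variable names are illustrative.

```python
import math

# Summary statistics copied from the earlier Excel outputs.
n = 30
mean_x, var_x = 59.4, 78.1103448      # x: new tech (before treatment)
mean_y, var_y = 60.6, 58.52413793     # y: new tech +hpress (after treatment)
r = 0.994420979                       # correlation between x and y

# Least-squares coefficients from the summary statistics.
slope = r * math.sqrt(var_y / var_x)
intercept = mean_y - slope * mean_x

# Sums of squares and the estimate of the model standard deviation.
ss_total = (n - 1) * var_y
ss_resid = (1 - r ** 2) * ss_total
s = math.sqrt(ss_resid / (n - 2))

# Excel reports the two-sided p-value for the slope; halving it gives the
# one-sided p-value here because the estimated slope is positive.
p_two_sided = 6.69882e-29
p_one_sided = p_two_sided / 2

print(f"y-hat = {intercept:.4f} + {slope:.4f}*x")
print(f"SS Residual = {ss_resid:.4f}, s = {s:.5f}, one-sided p = {p_one_sided:.3e}")
```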
