
MAE244 ANALYSIS c.

STATISTICAL ANALYSIS

Mean and Standard Deviations

Statistical analysis is often used to explain variations in experimental data. It is the basis on
which predictions can be made from measurements (as in extrapolation). The statistical measures
most familiar to students are probably the mean (or average), which describes the center or
location of a sample, and the standard deviation, which measures the spread of the sample.
The mean is defined as

$$ \mu = \frac{\sum_{i=1}^{n} y_i}{n} $$

where $y_i$ is the value of the variable of interest for the ith member of the sample and n is the
number of observations in the sample. The standard deviation is the square root of the variance:

$$ \text{Standard Deviation} = s = \sqrt{s^2} $$

where

$$ s^2 = \frac{\sum_i y_i^2 - \left(\sum_i y_i\right)^2 / n}{n} $$
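
As a brief check, the computational form of the variance can be compared with the more familiar definition as the mean squared deviation from µ. This sketch assumes division by n, which is what the worked example in this section uses; many textbooks divide by n − 1 instead when estimating the variance from a sample.

```latex
\frac{1}{n}\sum_i (y_i - \mu)^2
  = \frac{1}{n}\left(\sum_i y_i^2 - 2\mu\sum_i y_i + n\mu^2\right)
  = \frac{\sum_i y_i^2 - \left(\sum_i y_i\right)^2 / n}{n},
\qquad \text{since } \mu = \frac{\sum_i y_i}{n}.
```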

Example: A total of 58 AISI 1018 cold-drawn steel bars were tested to determine the 0.2 percent
offset yield strength Sy in kpsi. The results were:

Sy (kpsi)    m
64           2
68           6
72           6
76           9
80           19
84           10
88           4
92           2

m is the number of measurements at each value of Sy.

[Figure: histogram of frequency versus yield strength Sy, kpsi]

Employ the previous equations:

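$$ \mu = \frac{\sum S_y \times m}{\sum m} = 78.41 $$

$$ s^2 = \frac{\sum S_y^2 \times m - \left(\sum S_y \times m\right)^2 / \left(\sum m\right)}{\sum m} = 42.45 $$

$$ s = \sqrt{s^2} = 6.52 $$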

Therefore, the yield strength of the steel is 78.41 ± 6.52 kpsi.
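
The arithmetic in this example can be reproduced with a short script. The following is a minimal Python sketch, not part of the original handout; the function name `grouped_mean_std` and the variable names are hypothetical, and the variance divides by n to match the worked example.

```python
import math

# Yield-strength values (kpsi) and the number of measurements m at each value,
# copied from the example table.
strengths = [64, 68, 72, 76, 80, 84, 88, 92]
counts = [2, 6, 6, 9, 19, 10, 4, 2]


def grouped_mean_std(values, freqs):
    """Return (mean, variance, std) for grouped data, dividing by n."""
    n = sum(freqs)                                        # total number of measurements
    total = sum(v * m for v, m in zip(values, freqs))     # sum of S_y * m
    total_sq = sum(v * v * m for v, m in zip(values, freqs))  # sum of S_y^2 * m
    mean = total / n
    variance = (total_sq - total ** 2 / n) / n
    return mean, variance, math.sqrt(variance)


mean, variance, std = grouped_mean_std(strengths, counts)
print(f"mean = {mean:.2f} kpsi, s^2 = {variance:.2f}, s = {std:.2f} kpsi")
# prints: mean = 78.41 kpsi, s^2 = 42.45, s = 6.52 kpsi
```

Dividing by n − 1 instead of n would give a slightly larger standard deviation for the same data.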


Often in laboratory experiments, students collect data (e.g. strain) in response to some known
stimulus (e.g. load) and are asked to determine the relationship between x (strain) and y
(load). As an example, Young's modulus (or the elastic modulus), E, is the slope of the linear
relationship between stress (y) and strain (x). To find Young's modulus, a student would plot
stress against strain and then draw a line that best fits the data. The slope could then be
determined by finding the change in y over the change in x (y = mx + b from algebra). The problem
with this method is that everyone would probably draw this line differently, so there would be no
unique value of E for the experimental data, only estimates. To overcome this shortcoming, the
method of least squares will be used.

Linear Regression (Least Squares Fit)

The least squares method can be used to fit a polynomial of nth degree. Because our
experiments will be conducted in the linear range of linear elastic materials, only the fit of a
straight line needs to be considered. Thus, the least squares fit determines the slope, m, and
the intercept, b, of the straight line

y = mx + b

that best represents the experimental data (x1, y1), (x2, y2), ..., (xn, yn).
The least squares fit tells how changes in x affect changes in y, where x is the independent
variable and y is the dependent variable.

[Figure: data points, y = mx + b + ε, scattered about the fitted line y = mx + b; horizontal axis: x variable (independent); vertical axis: y variable (dependent)]

The term ε is added to account for the actual location of the points (i.e. ε is an error term). For
n data points (x, y), the slope, m, and the intercept, b, are calculated using the following equations:

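$$ \bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n} $$

$$ S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} $$

$$ m = \frac{S_{xy}}{S_{xx}}, \qquad b = \bar{y} - m\bar{x} $$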
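
The slope and intercept equations translate almost line for line into code. The following is a minimal Python sketch under stated assumptions: the function name `least_squares_line` is hypothetical, and the four data points are made up for illustration so that they lie exactly on y = 2x + 1.

```python
def least_squares_line(xs, ys):
    """Return (m, b) of the least squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n                 # Sxx
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n  # Sxy
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = s_xy / s_xx
    b = sum_y / n - m * (sum_x / n)   # b = y_bar - m * x_bar
    return m, b


# Hypothetical points lying exactly on y = 2x + 1 (illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = least_squares_line(xs, ys)
print(m, b)  # prints: 2.0 1.0
```

Because the points were chosen to lie on a line, the fit recovers that line exactly; with real measurements the error term ε is nonzero and the line only approximates the data.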

Correlation

The main use of regression is prediction. The sample correlation coefficient, r, is the
statistic used to measure the strength of the correlation (or prediction). It is found using

$$ r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} $$

where $S_{yy} = \sum y^2 - (\sum y)^2 / n$ is defined in the same way as $S_{xx}$, r = 1 is a perfect
positive fit, and r = -1 is a perfect negative fit. $r^2$, the coefficient of determination, is often
used to indicate the proportion of the variability in y explained by the linear bivariate
association with x.

Example. r = 0.89, therefore $r^2$ = 0.79. Then 79% of the variability in y is explained by the
linear relationship with x.
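
A short script can illustrate how r and r² follow from the sums Sxx, Syy, and Sxy. This is a minimal Python sketch; the function name `correlation_coefficient` is hypothetical, and the five data points are made up for illustration (they are not measurements from this course).

```python
import math


def correlation_coefficient(xs, ys):
    """Return the sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data with some scatter about an increasing trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = correlation_coefficient(xs, ys)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")  # prints: r = 0.80, r^2 = 0.64
```

For these made-up points, 64% of the variability in y is explained by the linear relationship with x.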

Regression is for prediction!


Correlation is the strength of the prediction!

Statistical analysis can be performed using Excel or Lotus 1-2-3, so it is not necessary to perform
hand calculations using the above equations.

The regression analysis tool performs linear regression analysis by using the "least squares" method
to fit a line through a set of observations. Students can analyze how a single dependent variable is
affected by the values of one or more independent variables; for example, how an athlete's
performance is affected by such factors as age, height, and weight. Students can apportion shares in
the performance measure to each of these three factors, based on a set of performance data, and then
use the results to predict the performance of a new, untested athlete.

For the following set of experimental data, regression analysis was performed using Excel.

Linear Regression Example

Strain (µmm/mm)    Stress (MPa)
0                  0
180                5
570                10
700                15
1075               20
1300               25
1600               30
1690               35

Regression result: y = 0.0193x + 0.3153, R² = 0.9889

[Figure: stress (MPa) versus strain (µmm/mm), showing the experimental data and the linear regression of the data]
