Linear Regression
What is Regression?
Given $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = f(x)$ to the data. The best fit is generally based on minimizing the sum of the squares of the residuals, $S_r$.

The residual at a point is
$$\epsilon_i = y_i - f(x_i)$$

and the sum of the squares of the residuals is
$$S_r = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2$$

Figure. Basic model for regression: the curve $y = f(x)$ fitted through the data points $(x_1, y_1), \ldots, (x_n, y_n)$.
Linear Regression-Criterion#1
Given $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = a_0 + a_1 x$ to the data. The residual at each point is
$$\epsilon_i = y_i - a_0 - a_1 x_i$$

Does minimizing $\sum_{i=1}^{n} \epsilon_i$ work as a criterion for the best fit? Consider the example data below.

Table. Example data.
x      y
2.0    4.0
3.0    6.0
2.0    6.0
3.0    8.0

Figure. Linear regression of y vs. x data showing the residual at a typical point, $(x_i, y_i)$.
Linear Regression-Criterion#1

Using y = 4x - 4 as the regression curve:

Table. Residuals at each point for regression model y = 4x - 4.
x      y      y_predicted   ε = y - y_predicted
2.0    4.0    4.0           0.0
3.0    6.0    8.0           -2.0
2.0    6.0    4.0           2.0
3.0    8.0    8.0           0.0

$$\sum_{i=1}^{4} \epsilon_i = 0$$

Figure. Regression curve y = 4x - 4 and y vs. x data.
Linear Regression-Criterion#1

Using y = 6 as the regression curve:

Table. Residuals at each point for regression model y = 6.
x      y      y_predicted   ε = y - y_predicted
2.0    4.0    6.0           -2.0
3.0    6.0    6.0           0.0
2.0    6.0    6.0           0.0
3.0    8.0    6.0           2.0

$$\sum_{i=1}^{4} \epsilon_i = 0$$

Figure. Regression curve for y = 6, y vs. x data.

For both regression models, the sum of the residuals is zero, yet the two lines are very different. Minimizing $\sum \epsilon_i$ does not give a unique line, so it is a bad criterion.
Linear Regression-Criterion#2

Will minimizing $\sum_{i=1}^{n} |\epsilon_i|$ work any better, where
$$\epsilon_i = y_i - a_0 - a_1 x_i \, ?$$

Figure. Linear regression of y vs. x data showing the residual at a typical point, $(x_i, y_i)$.
Linear Regression-Criterion#2

Using y = 4x - 4 as the regression curve:

Table. Absolute residuals at each point for regression model y = 4x - 4.
x      y      y_predicted   |ε| = |y - y_predicted|
2.0    4.0    4.0           0.0
3.0    6.0    8.0           2.0
2.0    6.0    4.0           2.0
3.0    8.0    8.0           0.0

$$\sum_{i=1}^{4} |\epsilon_i| = 4$$

Figure. Regression curve y = 4x - 4 and y vs. x data.
Linear Regression-Criterion#2

Using y = 6 as the regression curve:

Table. Absolute residuals at each point for regression model y = 6.
x      y      y_predicted   |ε| = |y - y_predicted|
2.0    4.0    6.0           2.0
3.0    6.0    6.0           0.0
2.0    6.0    6.0           0.0
3.0    8.0    6.0           2.0

$$\sum_{i=1}^{4} |\epsilon_i| = 4$$

Figure. Regression curve for y = 6, y vs. x data.
Linear Regression-Criterion#2

For both regression models, $\sum_{i=1}^{4} |\epsilon_i| = 4$. The sum of the absolute errors has been made as small as possible, that is 4, but the regression model is not unique. Hence the above criterion of minimizing the sum of the absolute values of the residuals is also a bad criterion.
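These two failed criteria are easy to check numerically. A minimal sketch (the data points and the two candidate lines are from the tables above; the helper name `residuals` is mine):

```python
# Check Criterion #1 (sum of residuals) and Criterion #2 (sum of absolute
# residuals) for the two candidate lines fitted to the four data points.
x = [2.0, 3.0, 2.0, 3.0]
y = [4.0, 6.0, 6.0, 8.0]

def residuals(f):
    """Residuals y_i - f(x_i) for a candidate model f."""
    return [yi - f(xi) for xi, yi in zip(x, y)]

for name, f in [("y = 4x - 4", lambda t: 4.0 * t - 4.0),
                ("y = 6", lambda t: 6.0)]:
    r = residuals(f)
    print(f"{name}: sum = {sum(r)}, sum of abs = {sum(abs(e) for e in r)}")
# Both models achieve sum = 0 and sum of abs = 4, so neither
# criterion selects a unique line.
```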
Linear Regression-Least Squares Criterion

The least squares criterion minimizes the sum of the squares of the residuals, and it also produces a unique line:
$$S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2$$

Figure. Linear regression of y vs. x data showing the residual at a typical point, $(x_i, y_i)$.

Finding Constants of Linear Model

To find $a_0$ and $a_1$, we minimize $S_r$ with respect to $a_0$ and $a_1$:
$$\frac{\partial S_r}{\partial a_0} = -2\sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right) = 0$$
$$\frac{\partial S_r}{\partial a_1} = -2\sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right) x_i = 0$$

giving the normal equations
$$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$
$$a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$
Solving the normal equations for $a_1$ directly yields
$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$

and
$$a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} = \bar{y} - a_1 \bar{x}$$
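The closed-form formulas above translate directly into code. A minimal sketch (the function name `fit_line` and the sample data are mine, not from the slides):

```python
def fit_line(x, y):
    """Least-squares fit of y = a0 + a1*x using the closed-form
    normal-equation solution derived above; returns (a0, a1)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)                 # sum of x_i^2
    sxy = sum(xi * yi for xi, yi in zip(x, y))     # sum of x_i*y_i
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a0 = sy / n - a1 * sx / n                      # a0 = ybar - a1*xbar
    return a0, a1

# Hypothetical data lying near y = 1 + 2x:
a0, a1 = fit_line([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.1, 8.9])
print(a0, a1)  # a0 ≈ 1.1, a1 ≈ 1.96
```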
Example 1

The torque, T, needed to turn the torsion spring of a mousetrap through an angle, θ, is given below. Find the constants of the model
$$T = k_1 + k_2 \theta$$

Table. Torque vs. angle for a torsional spring.
Angle, θ (radians)   Torque, T (N-m)
0.698132             0.188224
0.959931             0.209138
1.134464             0.230052
1.570796             0.250965
1.919862             0.313707

Figure. Torque (N-m) vs. angle, θ (radians).
Example 1 cont.

The following table shows the summations needed for the calculation of the constants in the regression model.

Table. Tabulation of data for calculation of important summations.
θ (radians)   T (N-m)     θ² (radians²)   Tθ (N-m-radians)
0.698132      0.188224    0.487388        0.131405
0.959931      0.209138    0.921468        0.200758
1.134464      0.230052    1.2870          0.260986
1.570796      0.250965    2.4674          0.394215
1.919862      0.313707    3.6859          0.602274
Σ: 6.2831     1.1921      8.8491          1.5896

With n = 5,
$$k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{n \sum_{i=1}^{5} \theta_i^2 - \left( \sum_{i=1}^{5} \theta_i \right)^2} = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = 9.6091 \times 10^{-2} \ \text{N-m/rad}$$
Example 1 cont.

Use the average torque and average angle to calculate $k_1$:
$$\bar{T} = \frac{\sum_{i=1}^{5} T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1} \ \text{N-m}$$
$$\bar{\theta} = \frac{\sum_{i=1}^{5} \theta_i}{n} = \frac{6.2831}{5} = 1.2566 \ \text{radians}$$

Using
$$k_1 = \bar{T} - k_2 \bar{\theta} = 2.3842 \times 10^{-1} - (9.6091 \times 10^{-2})(1.2566) = 1.1767 \times 10^{-1} \ \text{N-m}$$
Example 1 Results

Using linear regression, a trend line is found from the data:
$$T = 1.1767 \times 10^{-1} + 9.6091 \times 10^{-2} \, \theta$$

Can you find the energy in the spring if it is twisted from 0 to 180 degrees?
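As a sanity check on the hand calculation, the same summation formulas can be evaluated in code (variable names are mine; the data is from the table in Example 1):

```python
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]  # angle (rad)
T = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]      # torque (N-m)

n = len(theta)
s_th = sum(theta)                                  # ~6.2831
s_T = sum(T)                                       # ~1.1921
s_thth = sum(t * t for t in theta)                 # ~8.8491
s_thT = sum(t * Ti for t, Ti in zip(theta, T))     # ~1.5896

k2 = (n * s_thT - s_th * s_T) / (n * s_thth - s_th ** 2)
k1 = s_T / n - k2 * s_th / n                       # k1 = Tbar - k2*thetabar
print(f"k2 = {k2:.4e} N-m/rad")   # ~9.61e-02, matching the hand calculation
print(f"k1 = {k1:.4e} N-m")       # ~1.18e-01
```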
Example 2

To find the longitudinal modulus of a composite, the following data is collected. Find the longitudinal modulus, E, using the regression model
$$\sigma = E \epsilon$$
and find the sum of the squares of the residuals.

Table. Stress vs. strain data.
Strain (%)   Stress (MPa)
0.183        306
0.36         612
0.5324       917
0.702        1223
0.867        1529
1.0244       1835
1.1774       2140
1.329        2446
1.479        2752
1.5          2767
1.56         2896

Figure. Stress, σ (Pa) vs. strain, ε (m/m).
Example 2 cont.

The residual at each point is given by
$$\epsilon_{r,i} = \sigma_i - E \epsilon_i$$

The sum of the squares of the residuals then is
$$S_r = \sum_{i=1}^{n} \epsilon_{r,i}^2 = \sum_{i=1}^{n} \left( \sigma_i - E \epsilon_i \right)^2$$

Differentiating with respect to E and setting the result to zero,
$$\frac{dS_r}{dE} = \sum_{i=1}^{n} 2\left( \sigma_i - E \epsilon_i \right)\left( -\epsilon_i \right) = 0$$

Therefore
$$E = \frac{\sum_{i=1}^{n} \sigma_i \epsilon_i}{\sum_{i=1}^{n} \epsilon_i^2}$$
Example 2 cont.

Table. Summation data for the regression model (SI units).
i     ε             σ (Pa)        ε²             σε (Pa)
1     0.0000        0.0000        0.0000         0.0000
2     1.8300×10⁻³   3.0600×10⁸    3.3489×10⁻⁶    5.5998×10⁵
3     3.6000×10⁻³   6.1200×10⁸    1.2960×10⁻⁵    2.2032×10⁶
4     5.3240×10⁻³   9.1700×10⁸    2.8345×10⁻⁵    4.8821×10⁶
5     7.0200×10⁻³   1.2230×10⁹    4.9280×10⁻⁵    8.5855×10⁶
6     8.6700×10⁻³   1.5290×10⁹    7.5169×10⁻⁵    1.3256×10⁷
7     1.0244×10⁻²   1.8350×10⁹    1.0494×10⁻⁴    1.8798×10⁷
8     1.1774×10⁻²   2.1400×10⁹    1.3863×10⁻⁴    2.5196×10⁷
9     1.3290×10⁻²   2.4460×10⁹    1.7662×10⁻⁴    3.2507×10⁷
10    1.4790×10⁻²   2.7520×10⁹    2.1874×10⁻⁴    4.0702×10⁷
11    1.5000×10⁻²   2.7670×10⁹    2.2500×10⁻⁴    4.1505×10⁷
12    1.5600×10⁻²   2.8960×10⁹    2.4336×10⁻⁴    4.5178×10⁷
Σ                                 1.2764×10⁻³    2.3337×10⁸
With
$$\sum_{i=1}^{12} \epsilon_i^2 = 1.2764 \times 10^{-3}$$
and
$$\sum_{i=1}^{12} \sigma_i \epsilon_i = 2.3337 \times 10^{8}$$

using
$$E = \frac{\sum_{i=1}^{12} \sigma_i \epsilon_i}{\sum_{i=1}^{12} \epsilon_i^2} = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} = 182.84 \ \text{GPa}$$
Example 2 Results

The equation $\sigma = 182.84 \, \epsilon$ GPa describes the data.
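The result can be reproduced with a short script (variable names are mine; the data is from the table in Example 2, converted to SI units):

```python
# For the no-intercept model sigma = E*eps, least squares gives
# E = sum(sigma_i * eps_i) / sum(eps_i^2), as derived above.
strain_pct = [0.0, 0.183, 0.36, 0.5324, 0.702, 0.867,
              1.0244, 1.1774, 1.329, 1.479, 1.5, 1.56]
stress_mpa = [0.0, 306.0, 612.0, 917.0, 1223.0, 1529.0,
              1835.0, 2140.0, 2446.0, 2752.0, 2767.0, 2896.0]

eps = [s / 100.0 for s in strain_pct]       # percent -> m/m
sigma = [s * 1.0e6 for s in stress_mpa]     # MPa -> Pa

E = sum(s * e for s, e in zip(sigma, eps)) / sum(e * e for e in eps)
print(f"E = {E / 1.0e9:.2f} GPa")  # ~182.84 GPa
```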
THE END