
Biology 300, Biometrics Name:

Final Exam B, Winter Quarter 2010

PART I
A. Multiple Choice (60 points). Circle the best answer. Only one choice is “best.”

1. Which of the following statements best describes linear regression analysis?


(a) Both the dependent variable and the independent variable are numeric and are subject to
random error.
B (b) The objective is to find a “best fit” straight line for a numeric dependent variable as a
function of a numeric independent variable.
(c) The null hypothesis is that the linear relationship between a numeric dependent variable as
a function of a numeric independent variable has a y-intercept of zero.
(d) All of the above.
(e) None of the above.

2. Which of the following statements is true?


(a) The arithmetic mean is always less than or equal to the geometric mean.
(b) The geometric mean is always less than or equal to the harmonic mean.
(c) The median is more sensitive than the arithmetic mean to very large or small observations.
D (d) The range is a measure of dispersion.
(e) All of the above are true.

3. In a study, 160 individuals were tested for the presence or absence of some hearing loss. It was
also noted whether or not they listened to music using earbuds. What analysis would you
recommend to see if the likelihood of hearing loss is associated with the use of earbuds?
(a) paired t-test
(b) t-test for two independent groups
(c) two factor ANOVA
(d) correlation analysis
E (e) contingency table

4. In two-factor ANOVA, an interaction that is not statistically significant suggests that


(a) the effects of the different levels of factor A are not significant.
(b) the effects of the different levels of factor B are not significant.
C (c) the effects of the different levels of the two factors are additive.
(d) all of the above
(e) none of the above

5. What statistical procedure would you use to test for an association between two categorical
variables?
A (a) contingency table
(b) paired t-test
(c) t-test for two independent groups
(d) two factor ANOVA
(e) correlation analysis

6. An evolutionary biologist measured the shell diameters of thirty tortoises from each of four
different islands. What analysis would you recommend to test for differences in mean shell
diameters among the four islands?
(a) chi-square goodness-of-fit
(b) paired t-test
(c) correlation analysis
D (d) one factor ANOVA
(e) two factor ANOVA

7. In some situations, we use a test statistic based on the Student’s t-distribution instead of the
standard normal distribution. This is because
(a) the Student’s t-distribution has a positive mean.
B (b) the Student’s t-distribution accounts for error associated with the estimation of the
standard deviation.
(c) the Student’s t-distribution has a smaller standard deviation.
(d) all of the above
(e) none of the above

8. In a one factor ANOVA, the F-ratio test statistic equals the mean squares among groups
divided by the error mean squares. If the null hypothesis of no differences among groups is true,
then the expected value for the test statistic will be approximately equal to
(a) the level of significance.
(b) the appropriate critical value from the F-distribution table.
(c) the degrees of freedom for the numerator of the F-ratio.
(d) zero.
E (e) one.

9. Volunteers were used to test the effectiveness of multivitamins. One group took a daily
multivitamin and a second control group took a daily placebo. Changes in plasma
homocysteine concentrations were measured six months after the start of the study. What
analysis would you recommend to test for a significant effect of multivitamins on plasma
homocysteine concentrations?
(a) paired t-test
B (b) t-test for two independent groups
(c) contingency table
(d) two factor ANOVA
(e) correlation analysis

10. The sample correlation coefficient, r,


A (a) is a measure of association between two numeric variables.
(b) has a chi-squared distribution.
(c) will fall between the values of zero and one.
(d) all of the above
(e) none of the above

11. An increase in sample size would


(a) increase the standard error of the mean.
(b) increase the width of a confidence interval.
(c) increase the probability of a Type I error of a statistical test.
(d) increase the probability of a Type II error of a statistical test.
E (e) increase the power of a statistical test.

12. In regression analysis, the “method of least squares” involves choosing parameters which
minimize the sum of the squared differences between the
A (a) dependent variable and its predicted value.
(b) dependent variable and its overall mean value.
(c) dependent variable and the independent variable.
(d) independent variable and its overall mean value.
(e) independent variable and its predicted value.

13. A marine biologist hypothesizes that zooplankton are randomly distributed in the upper 10
meters of the water column. He collects water samples at various depths and locations. If his
hypothesis is true, the number of zooplankton per sample can be described using a
(a) binomial distribution.
B (b) Poisson distribution.
(c) chi-square distribution.
(d) Student’s t-distribution.
(e) F-distribution

14. A biologist measured the adult body size of four genetic strains of mice. He randomly chose
nine males and females from each genetic strain. He wants to know how genetic strain and sex
influence body size. What analysis would you recommend?
(a) paired t-test
(b) t-test for two independent groups
(c) one factor ANOVA
D (d) two factor ANOVA
(e) contingency table

15. You would use correlation analysis to


(a) compare observed frequencies of categorical variables to expected frequencies.
B (b) test for an association between two numeric variables.
(c) test for significant differences among the means of treatment groups.
(d) test for an association between two categorical variables.
(e) test the fit of a functional relationship between a dependent and independent variable.

16. We use a contingency table


(a) to analyze Type I and Type II errors in hypothesis testing.
B (b) to test for an association between two categorical variables.
(c) to partition the variation in a factorial experiment.
(d) to test for a linear association between the two numeric variables.
(e) to test the fit of data to a probability distribution.

17. Which of the following statements is false?


A (a) Increasing the sample size decreases the probability of a Type I error.
(b) The mean deviation is a measure of dispersion.
(c) A binomial distribution is an example of a discrete probability distribution.
(d) The median is the same as the 50’th percentile.
(e) A 99% confidence interval is wider than a 95% confidence interval.

18. An entomologist placed a beetle in a petri dish with four types of leaves and noted which type
of leaf the insect began eating. He repeated the procedure 100 times using different insects of
the same species. He wants to know if there is a preference by the beetle species for type of
leaf. Which of the following analyses would you recommend?
(a) one factor ANOVA
(b) two factor ANOVA
C (c) chi-square goodness-of-fit
(d) linear regression
(e) correlation analysis

19. A biogeographer hypothesizes that the number of bird species found on an island will decrease
in proportion to the island’s distance from the mainland. She collates data collected from 43
islands to test her hypothesis. What analysis would you recommend?
(a) a paired t-test
(b) one-factor ANOVA
(c) two factor ANOVA
D (d) linear regression
(e) a contingency table

20. When one conducts an ANOVA for linear regression, the alternative hypothesis is that
(a) the dependent variable has a normal distribution.
(b) there is no association between the dependent variable and the independent variable.
C (c) the dependent variable is a linear function of the independent variable.
(d) the mean of the dependent variable is not equal to the mean of the independent variable.
(e) the independent variable is a linear function of the dependent variable.

PART II
B. Answer each question and show intermediate calculations. Be Neat! You may use your
statistical tables, prepared notes (3 sheets), and your calculator. Use the reverse side of these
pages if more space is needed.

1. A random sample of six snails was collected from a pond, and their shell weights and
operculum diameters were measured. Test for a significant correlation between shell
weight and operculum diameter at the 5% level of significance. (20 points)

Body weight (g)   Operculum diameter (mm)
      10                    8
       9                    7
      12                    9
       7                    4
      12                    8
      10                    6

        X     Y    X*X   Y*Y   X*Y
       10     8    100    64    80
        9     7     81    49    63
       12     9    144    81   108
        7     4     49    16    28
       12     8    144    64    96
       10     6    100    36    60
SUM:   60    42    618   310   435

Sample size: n = 6

Sums of Squares: SSX = ΣX² – (ΣX)²/n
= 618 – (60)²/6
= 618 – 3600/6
= 618 – 600 = 18.0
SSY = ΣY² – (ΣY)²/n
= 310 – (42)²/6
= 310 – 1764/6
= 310 – 294 = 16.0
SSXY = ΣXY – (ΣX)(ΣY)/n
= 435 – (60)(42)/6
= 435 – 2520/6
= 435 – 420 = 15.0

Correlation: r = SSXY / Sqrt[(SSX)(SSY)]
= 15.0 / Sqrt[(18.0)(16.0)]
= 15.0 / Sqrt[288.0]
= 15.0 / 16.9706 = 0.883883 ≈ 0.884

(1) H0: ρ = 0
H1: ρ ≠ 0 (two-sided)

(2) α = 0.05

(3) Student’s t-test for a zero correlation with df = n – 2 = 6 – 2 = 4. Two-sided alternative hypothesis.

(4) Reject H0 if t < –t4 = –2.776 or t > t4 = 2.776.

(5) t = r / Sqrt[(1 – r²) / (n–2)]
= 0.883883 / Sqrt[(1 – (0.883883)²) / (6–2)]
= 0.883883 / Sqrt[(1 – 0.781249) / 4]
= 0.883883 / Sqrt[0.218751 / 4]
= 0.883883 / Sqrt[0.0546878]
= 0.883883 / 0.233854 = 3.780

(6) Reject H0. There is evidence for a correlation between body weight and operculum diameter.
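
As a cross-check on the hand calculation above, the following Python sketch recomputes the sums of squares, r, and t from the six data pairs. The variable names are illustrative assumptions rather than part of the original key, and the critical value is copied from step (4), not computed.

```python
import math

# Snail data from the table above: X = weight (g), Y = operculum diameter (mm).
x = [10, 9, 12, 7, 12, 10]
y = [8, 7, 9, 4, 8, 6]
n = len(x)

# Corrected sums of squares and cross-products, as in the hand calculation.
ssx = sum(v * v for v in x) - sum(x) ** 2 / n
ssy = sum(v * v for v in y) - sum(y) ** 2 / n
ssxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

# Sample correlation and the t statistic for H0: rho = 0 (df = n - 2).
r = ssxy / math.sqrt(ssx * ssy)
t = r / math.sqrt((1 - r * r) / (n - 2))

T_CRIT = 2.776  # two-sided 5% critical value for df = 4, copied from the key
reject = abs(t) > T_CRIT
print(f"SSX={ssx:.1f} SSY={ssy:.1f} SSXY={ssxy:.1f} r={r:.3f} t={t:.3f} reject={reject}")
```

The script keeps full precision throughout, so its t value may differ from a hand calculation in the last decimal place if r is rounded early.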

2. Trees were randomly selected from three sites and their trunk circumferences were measured.
Test to see if there are differences in mean circumference among the three locations at the 5%
level of significance. (25 points)

Tree Trunk Circumference (m)

Site A   Site B   Site C
 0.8      1.2      0.7
 1.0      1.0      0.6
 0.9               0.8
 1.1

Sample sizes: n1 = 4
n2 = 2
n3 = 3
n = n1 + n2 + n3 = 4 + 2 + 3 = 9

Sums: X1• = 0.8 + 1.0 + 0.9 + 1.1 = 3.8


X2• = 1.2 + 1.0 = 2.2
X3• = 0.7 + 0.6 + 0.8 = 2.1
X•• = X1• + X2• + X3• = 3.8 + 2.2 + 2.1 = 8.1

SST = ΣΣ Xij² – (X••)²/n
= 0.8² + 1.0² + 0.9² + 1.1² + 1.2² + 1.0² + 0.7² + 0.6² + 0.8² – (8.1)²/9
= 0.64 + 1.00 + 0.81 + 1.21 + 1.44 + 1.00 + 0.49 + 0.36 + 0.64 – 65.61/9
= 7.59 – 7.29 = 0.30

SSG = (X1•)²/n1 + (X2•)²/n2 + (X3•)²/n3 – (X••)²/n
= (3.8)²/4 + (2.2)²/2 + (2.1)²/3 – (8.1)²/9
= 14.44/4 + 4.84/2 + 4.41/3 – 65.61/9
= 3.61 + 2.42 + 1.47 – 7.29 = 0.21

SSE = SST – SSG = 0.30 – 0.21 = 0.09

One Factor ANOVA


Source SS DF MS F
Sites 0.21 2 0.105 7.00
Error 0.09 6 0.015
Total 0.30 8

(1) H0: no differences in mean tree circumferences among the sites


H1: at least one site is different

(2) α = 0.05

(3) One factor ANOVA with df = 2, 6.

(4) Reject H0 if F > F2,6 = 5.14.

(5) F = 7.00. (See ANOVA table.)

(6) Reject H0. There is evidence for differences in mean tree circumferences among the sites.
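
The one-factor ANOVA above can be re-derived with a short Python sketch. It uses the same computational formulas (group totals squared over group sizes, minus the correction term) and handles the unequal sample sizes directly. The dictionary layout and names are assumptions made for illustration, and the critical value is taken from step (4) rather than computed.

```python
# Trunk circumferences (m) by site, from the table above.
groups = {
    "A": [0.8, 1.0, 0.9, 1.1],
    "B": [1.2, 1.0],
    "C": [0.7, 0.6, 0.8],
}

all_values = [v for vals in groups.values() for v in vals]
n = len(all_values)
correction = sum(all_values) ** 2 / n  # (X..)^2 / n

# Total, among-group, and error sums of squares.
sst = sum(v * v for v in all_values) - correction
ssg = sum(sum(vals) ** 2 / len(vals) for vals in groups.values()) - correction
sse = sst - ssg

df_groups = len(groups) - 1
df_error = n - len(groups)
f_ratio = (ssg / df_groups) / (sse / df_error)

F_CRIT = 5.14  # 5% critical value for df = 2, 6, copied from the key
reject = f_ratio > F_CRIT
print(f"SST={sst:.2f} SSG={ssg:.2f} SSE={sse:.2f} F={f_ratio:.2f} reject={reject}")
```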

3. In the development of a standard curve for a protein assay, a reagent was added to known
concentrations of protein and the absorbance was measured for each solution. The data appear
below. Estimate a linear regression for absorbance as a function of protein concentration. Write
out the equation for the fitted line. See if the slope is significantly different from zero at the 5%
level of significance. (You do not need to plot the data.) (30 points)

Protein concentration (mg/ml)   Absorbance (AU)
             4                       0.4
            10                       0.5
            16                       0.8
            22                       1.1

         X      Y     X*X    Y*Y    X*Y
         4     0.4     16    0.16    1.6
        10     0.5    100    0.25    5.0
        16     0.8    256    0.64   12.8
        22     1.1    484    1.21   24.2
SUM:    52     2.8    856    2.26   43.6
MEAN:   13.0   0.70

Sample size: n = 4

Arithmetic means: x̄ = (ΣX)/n = 52/4 = 13.0 mg/ml
ȳ = (ΣY)/n = 2.8/4 = 0.70 AU

Sums of Squares: SSX = ΣX² – (ΣX)²/n
= 856 – (52)²/4
= 856 – 2704/4
= 856 – 676 = 180
SSY = ΣY² – (ΣY)²/n
= 2.26 – (2.8)²/4
= 2.26 – 7.84/4
= 2.26 – 1.96 = 0.30
SSXY = ΣXY – (ΣX)(ΣY)/n
= 43.6 – (52)(2.8)/4
= 43.6 – 145.6/4
= 43.6 – 36.4 = 7.2

Slope: b = SSXY / SSX = 7.2 / 180 = 0.040

Intercept: a = ȳ – b x̄ = 0.70 – (0.040)(13.0) = 0.18

Equation: Yi = 0.18 + 0.040 Xi

SSR = (SSXY)² / SSX = (7.2)² / 180 = 51.84 / 180 = 0.288

SSE = SSY – SSR = 0.300 – 0.288 = 0.012

ANOVA for Regression
Source SS DF MS F
Regression 0.288 1 0.288 48.00
Error 0.012 2 0.006
Total 0.300 3

(1) H0: b = 0
H1: b ≠ 0

(2) α = 0.05

(3) ANOVA for regression with df = 1, 2.

(4) Reject H0 if F > F1,2 = 18.51.

(5) F = 48.00 (from ANOVA table)

(6) Reject H0. There is evidence for a linear change in absorbance with protein concentration.
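
The regression fit and its ANOVA can be checked with the Python sketch below, which follows the same steps as the worked solution: corrected sums of squares, slope and intercept, then the partition of SSY into regression and error components. The names are illustrative assumptions, and the critical value is copied from step (4).

```python
# Standard-curve data: X = protein concentration (mg/ml), Y = absorbance (AU).
x = [4, 10, 16, 22]
y = [0.4, 0.5, 0.8, 1.1]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Corrected sums of squares and cross-products.
ssx = sum(v * v for v in x) - sum(x) ** 2 / n
ssy = sum(v * v for v in y) - sum(y) ** 2 / n
ssxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

# Least-squares slope and intercept.
b = ssxy / ssx
a = y_mean - b * x_mean

# Partition the variation in Y and form the F-ratio with df = 1, n - 2.
ssr = ssxy ** 2 / ssx
sse = ssy - ssr
f_ratio = (ssr / 1) / (sse / (n - 2))

F_CRIT = 18.51  # 5% critical value for df = 1, 2, copied from the key
reject = f_ratio > F_CRIT
print(f"Y = {a:.2f} + {b:.3f} X, F = {f_ratio:.2f}, reject = {reject}")
```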

4. Two environmental chambers were used for plant growth experiments. One chamber was
maintained at ambient CO2 and the second was kept at a high CO2 concentration. Each chamber
held two different plant varieties (mountain and valley), with three plants of each variety. The
growth rate was measured for each of the plants. Test for significant differences in plant growth
rate between the ambient and high CO2 and between the mountain and valley varieties, and test
for a significant CO2 treatment-variety interaction. Use a 5% level of significance for each test.
(35 points)

Growth Rate (mm/d)

              CO2 Treatment
Variety     Ambient    High
Mountain       7         9
               8        11
               6        10
Valley         8         8
               6         9
               7         7

a = 2 (CO2 treatments), b = 2 (varieties), s = 3 (plants per cell), n = abs = 12

Sums: X11• = 7 + 8 + 6 = 21
X12• = 8 + 6 + 7 = 21
X21• = 9 + 11 + 10 = 30
X22• = 8 + 9 + 7 = 24
X1•• = 21 + 21 = 42
X2•• = 30 + 24 = 54
X•1• = 21 + 30 = 51
X•2• = 21 + 24 = 45
X••• = 42 + 54 = 96

SST = ΣΣΣ Xijl² – (X•••)²/n
= 7² + 8² + 6² + 8² + 6² + 7² + 9² + 11² + 10² + 8² + 9² + 7² – (96)²/12
= 49 + 64 + 36 + 64 + 36 + 49 + 81 + 121 + 100 + 64 + 81 + 49 – 9216/12
= 794 – 768 = 26

SSA = [(X1••)² + (X2••)²]/(bs) – (X•••)²/n
= [42² + 54²]/6 – (96)²/12
= [1764 + 2916]/6 – 9216/12
= 4680/6 – 9216/12
= 780 – 768 = 12

SSB = [(X•1•)² + (X•2•)²]/(as) – (X•••)²/n
= [51² + 45²]/6 – (96)²/12
= [2601 + 2025]/6 – 9216/12
= 4626/6 – 9216/12
= 771 – 768 = 3

SSI = [(X11•)² + (X12•)² + (X21•)² + (X22•)²]/s – SSA – SSB – (X•••)²/n
= [21² + 21² + 30² + 24²]/3 – 12 – 3 – (96)²/12
= [441 + 441 + 900 + 576]/3 – 12 – 3 – 9216/12
= 2358/3 – 12 – 3 – 768
= 786 – 12 – 3 – 768 = 3

SSE = SST – SSA – SSB – SSI
= 26 – 12 – 3 – 3 = 8

DFA = a–1 = 2–1 = 1. DFB = b–1 = 2–1 = 1.
DFI = (a–1)(b–1) = (2–1)(2–1) = 1. DFE = ab(s–1) = 2(2)(3–1) = 8.

(1) H0: no effect due to CO2 treatment
H1: an effect due to CO2 treatment exists
(2) α = 0.05
(3) Two factor ANOVA with df = 1, 8.
(4) Reject H0 if F > F1,8 = 5.32.
(5) F = 12.00. (See ANOVA table.)
(6) Reject H0.

(1) H0: no effect due to variety
H1: an effect due to variety exists
(2) α = 0.05
(3) Two factor ANOVA with df = 1, 8.
(4) Reject H0 if F > F1,8 = 5.32.
(5) F = 3.00. (See ANOVA table.)
(6) Do not reject H0.

(1) H0: no CO2 treatment × variety interaction
H1: an interaction exists
(2) α = 0.05
(3) Two factor ANOVA with df = 1, 8.
(4) Reject H0 if F > F1,8 = 5.32.
(5) F = 3.00. (See ANOVA table.)
(6) Do not reject H0.

Two-Factor ANOVA
Source SS DF MS F
CO2 12.0 1 12.0 12.00
Variety 3.0 1 3.0 3.00
Interaction 3.0 1 3.0 3.00
Error 8.0 8 1.0
Total 26.0 11
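
The two-factor partition can be verified with the Python sketch below. It assumes a balanced design with s = 3 plants per cell, as in this problem, and keys each cell by (CO2 treatment, variety). The helper name `factor_ss` and the data layout are hypothetical choices for illustration, and the critical value is copied from the key.

```python
# Growth rates (mm/d) keyed by (CO2 treatment, variety), from the table above.
cells = {
    ("ambient", "mountain"): [7, 8, 6],
    ("ambient", "valley"): [8, 6, 7],
    ("high", "mountain"): [9, 11, 10],
    ("high", "valley"): [8, 9, 7],
}

co2_levels = sorted({key[0] for key in cells})
varieties = sorted({key[1] for key in cells})
a, b, s = len(co2_levels), len(varieties), 3  # balanced design assumed
n = a * b * s

all_values = [v for vals in cells.values() for v in vals]
correction = sum(all_values) ** 2 / n  # (X...)^2 / n


def factor_ss(index, levels, obs_per_level):
    """Main-effect sum of squares for the factor at position `index` of the key."""
    totals = [
        sum(sum(vals) for key, vals in cells.items() if key[index] == level)
        for level in levels
    ]
    return sum(t * t for t in totals) / obs_per_level - correction


sst = sum(v * v for v in all_values) - correction
ssa = factor_ss(0, co2_levels, b * s)  # CO2 treatment
ssb = factor_ss(1, varieties, a * s)   # variety
ss_cells = sum(sum(vals) ** 2 for vals in cells.values()) / s - correction
ssi = ss_cells - ssa - ssb             # interaction
sse = sst - ss_cells

mse = sse / (a * b * (s - 1))
f_values = {
    "CO2": (ssa / (a - 1)) / mse,
    "Variety": (ssb / (b - 1)) / mse,
    "Interaction": (ssi / ((a - 1) * (b - 1))) / mse,
}

F_CRIT = 5.32  # 5% critical value for df = 1, 8, copied from the key
for source, f in f_values.items():
    print(f"{source}: F = {f:.2f}, reject H0 = {f > F_CRIT}")
```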

5. Dilithium chloride concentrations were determined from 100 algal cells randomly sampled
from a stream. The results are presented as grouped data in the table below. For these data, the
sample mean is x̄ = 5.82 µg/g and the sample standard deviation is s = 0.50 µg/g. Test the
hypothesis that the dilithium chloride concentrations follow a normal distribution. Use a 5%
level of significance. (30 points)

Dilithium chloride (µg/g)   Number of algal cells
≤ 5.0                         4
5.0–5.5                      19
5.5–6.0                      33
6.0–6.5                      32
6.5–7.0                      10
≥ 7.0                         2

Lower   Upper   Observed     Z1      Z2      Prob    Expected   Chi sq
 –∞      5.0        4        –∞    –1.64    0.0505     5.05     0.218
 5.0     5.5       19      –1.64   –0.64    0.2106    21.06     0.202
 5.5     6.0       33      –0.64    0.36    0.3795    37.95     0.646
 6.0     6.5       32       0.36    1.36    0.2725    27.25     0.828
 6.5     7.0       10       1.36    2.36    0.0778     7.78     1.261
 7.0     +∞         2       2.36    +∞      0.0091     0.91
TOTALS            100                       1.0000   100.00     3.154

Use the sample mean x̄ = 5.82 µg/g and the sample standard deviation s = 0.50 µg/g to convert the X intervals into Z intervals.
Prob{X ≤ 5.0} = Prob{Z ≤ (5.0 – 5.82)/0.50} = Prob{Z ≤ –1.64} = 0.5 – Prob{–1.64 < Z ≤ 0} = 0.5 – 0.4495 = 0.0505.
Prob{5.0 < X ≤ 5.5} = Prob{(5.0 – 5.82)/0.50 < Z ≤ (5.5 – 5.82)/0.50} = Prob{–1.64 < Z ≤ –0.64}
= Prob{–1.64 < Z ≤ 0} – Prob{–0.64 < Z ≤ 0} = 0.4495 – 0.2389 = 0.2106.
Prob{5.5 < X ≤ 6.0} = Prob{(5.5 – 5.82)/0.50 < Z ≤ (6.0 – 5.82)/0.50} = Prob{–0.64 < Z ≤ 0.36}
= Prob{–0.64 < Z ≤ 0} + Prob{0 < Z ≤ 0.36} = 0.2389 + 0.1406 = 0.3795.
Prob{6.0 < X ≤ 6.5} = Prob{(6.0 – 5.82)/0.50 < Z ≤ (6.5 – 5.82)/0.50} = Prob{0.36 < Z ≤ 1.36}
= Prob{0 < Z ≤ 1.36} – Prob{0 < Z ≤ 0.36} = 0.4131 – 0.1406 = 0.2725.
Prob{6.5 < X ≤ 7.0} = Prob{(6.5 – 5.82)/0.50 < Z ≤ (7.0 – 5.82)/0.50} = Prob{1.36 < Z ≤ 2.36}
= Prob{0 < Z ≤ 2.36} – Prob{0 < Z ≤ 1.36} = 0.4909 – 0.4131 = 0.0778.
Prob{X ≥ 7.0} = Prob{Z ≥ (7.0 – 5.82)/0.50} = Prob{Z ≥ 2.36} = 0.5 – Prob{0 < Z ≤ 2.36} = 0.5 – 0.4909 = 0.0091.

Expected values = n × Prob and Chi sq = (Expt – Obs)² / Expt. (See table above.)

To avoid expected numbers less than one, we must pool the last two groups:

Obs = 10 + 2 = 12. Expt = 7.78 + 0.91 = 8.69. Chi sq = (12 – 8.69)² / 8.69 = (3.31)² / 8.69 = 10.9561 / 8.69 = 1.261

(1) H0: data come from a normal distribution


H1: data do not come from a normal distribution

(2) α = 0.05

(3) Chi-squared test with df = k – m – 1 = 5 – 2 – 1 = 2.


Note: must pool the last two groups
to avoid Expt < 1, so k = 5 groups after pooling.

(4) Reject H0 if χ² > 5.991.

(5) χ² = 3.154. (See table above.)

(6) Do not reject H0. No evidence to reject the hypothesis that the data come from a normal distribution.
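
The goodness-of-fit calculation can be reproduced with the Python sketch below. It is an approximation of the hand method, not a copy of it: it uses the exact normal CDF from the standard library's `statistics.NormalDist` instead of four-decimal table values, so the statistic differs from 3.154 in the third decimal place. Pooling of the last two classes is hard-coded to match the key, the names are illustrative assumptions, and the critical value is copied from step (4).

```python
from statistics import NormalDist

mean, sd = 5.82, 0.50                   # sample mean and standard deviation
upper_bounds = [5.0, 5.5, 6.0, 6.5, 7.0]  # class boundaries; last class is open-ended
observed = [4, 19, 33, 32, 10, 2]
n = sum(observed)

# Cumulative probabilities at each boundary, padded with 0 and 1 for the open tails.
dist = NormalDist(mean, sd)
cdf = [0.0] + [dist.cdf(u) for u in upper_bounds] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

# Pool the last two classes so that no expected count is below one.
obs_pooled = observed[:-2] + [observed[-2] + observed[-1]]
exp_pooled = expected[:-2] + [expected[-2] + expected[-1]]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 2 - 1  # k classes, minus 2 estimated parameters, minus 1

CHI2_CRIT = 5.991  # 5% critical value for df = 2, copied from the key
reject = chi2 > CHI2_CRIT
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0 = {reject}")
```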
