Biostatistics 2, Homework 2

Date Assigned: 05 April 2016


Date Due: 25 April 2016
Part 1 (for Questions 1-5): Emphysema is a swelling of the air sacs in the lungs that is
characterized by labored breathing and an increased susceptibility to infection. Carbon
monoxide diffusing capacity, denoted D1co, is a measure of lung function that has been
tested as a possible diagnostic tool for detecting emphysema.
Consider the distributions of CO-diffusing capacity for the population of healthy
individuals and for the population of patients with emphysema. We are interested in
finding out whether the diffusing capacity is the same in the two populations but are not
willing to assume that these distributions are necessarily normal. Use a two-sided test
conducted at the α = 0.05 level of significance.
The data are contained in emphysema.dta.
Question 1: Are the data independent or dependent?
Consider that d1co is
A t-test looks at the difference in means of a continuous variable between two
groups. In this case, the null hypothesis H0 is that there is no difference in the
means (i.e., μ1 = μ2) and the alternative hypothesis is that the means differ.

χ² (chi-square) tests for relationships between variables. The
null hypothesis (H0) is that there is no relationship between
the two variables (d1co and emph). To reject this we need
Pr < 0.05 (at the 95% confidence level).
Stata command:
tab d1co emph, column row chi2

Results: Pearson chi2(35) = 36.0000, Pr = 0.422

Here the chi2 statistic is not significant (Pr = 0.422 > 0.05), so we
fail to reject the null hypothesis: the data show no evidence of a
relationship between CO-diffusing capacity and emphysema status
(healthy individuals versus patients with emphysema).
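
As a cross-check on the reported result, the upper-tail probability of a chi-square statistic can be computed directly. The sketch below is written in Python rather than Stata and uses only the standard library; the function name chi2_sf is illustrative and not part of the assignment. It evaluates the tail area for chi2 = 36 with 35 degrees of freedom, the values reported in the Stata output.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function,
    which converges for any positive x (it just needs more terms when x is large).
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # gamma(a, z) = z^a * e^(-z) * sum_{k>=0} z^k / (a * (a+1) * ... * (a+k))
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# Values taken from the Stata output: Pearson chi2(35) = 36.0000
p_value = chi2_sf(36.0, 35)
print(round(p_value, 3))
```

The printed value should agree with the Pr reported by Stata up to rounding. Since it is above 0.05, the chi-square test does not reject the null hypothesis.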

Question 2: What type of statistical test is most appropriate for this data and why?
The most appropriate statistical test is logistic (logit)
regression, which evaluates the relationship between a
continuous independent variable and a dichotomous (dummy)
dependent variable.
Question 3: What are your null and alternative hypotheses?
H0: There is no relationship between d1co and Emphysema.
H1: There is a relationship between d1co and Emphysema.
Question 4: Perform the test with Stata. Be sure to include your output.

Question 5: Do you reject your hypothesis? Draw a conclusion.

Part 2 (for Questions 6-10): A company has developed an approved drug that they
believe has the unintended side-effect of improving hemoglobin levels in anemic women.
They identify 20 women (non-pregnant) who have anemia. They measure hemoglobin
levels at baseline and again after taking the pill for three weeks.
The data are contained in anemiapill.dta.
Question 6: Are the data independent or dependent?

Question 7: If we cannot assume that the underlying population is normally distributed,
what type of statistical test is most appropriate for this data?
Question 8: What are your null and alternative hypotheses?
Question 9: Perform the test with Stata. Be sure to include your output.
Question 10: Do you reject your hypothesis? Draw a conclusion.
Part 3 (for Questions 11-15): We are interested in the relationship between a child's
birth weight and the mother's weight gain during pregnancy. The data are contained in
gestation.dta.
Question 11: Create a new variable, change, that is the difference between the mother's
weight at the end of pregnancy and her weight at the beginning of pregnancy. What is the
mean change in weight of mothers?
Question 12: Create a scatter plot of the birth weight against the change in weight of the
mother. Be sure to include the scatter plot in your homework.
Question 13: Does there appear to be a linear relationship between the two variables?
Is it a positive or negative correlation?
Question 14: What is the Pearson correlation coefficient for birwei and change?
Question 15: Test the null hypothesis that the variables are uncorrelated. a) What is your
test statistic (by hand)? b) What is your p-value? c) What do you conclude?
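
For the hand calculation in Question 15, the usual test of zero correlation uses t = r * sqrt((n - 2) / (1 - r^2)), compared against a t distribution with n - 2 degrees of freedom. The Python sketch below (standard library only) shows the arithmetic. The function names are illustrative, and the four data pairs are made-up numbers for demonstration only; they are NOT taken from gestation.dta.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_t_statistic(r, n):
    """t statistic for H0: population correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical demonstration values, not the homework data.
change_demo = [1.0, 2.0, 3.0, 4.0]
birwei_demo = [2.0, 4.0, 5.0, 8.0]

r = pearson_r(change_demo, birwei_demo)
t = corr_t_statistic(r, len(change_demo))
print("r =", round(r, 4), " t =", round(t, 4), " df =", len(change_demo) - 2)
```

With the real data, r would come from the Stata correlation output for birwei and change, and the p-value would be read from a t table (or Stata) using n - 2 degrees of freedom.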

Part 4 (for Questions 16-19): We are now interested in developing a linear model to
help predict the relationship between the birth weight of the child and the weight gain of
the mother. These questions will rely on the same data as Part 3, including the new
variable change.
Question 16: What would be the equation for the true population regression line for the
relationship between birth weight and weight gain?
Question 17: Use Stata to find the least squares regression line. a) What is the least
squares estimate of the true population intercept ( )? Interpret this value in words. b)
What is the least squares estimate of the true population slope ( )? Interpret this value
in words.
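
To see what the least squares fit computes, the slope is the sum of cross-products of deviations divided by the sum of squared deviations of the predictor, and the intercept makes the line pass through the point of means. The following Python sketch (standard library only) illustrates this. The function names and the three data points are hypothetical and chosen only so the arithmetic is easy to follow; they are not from gestation.dta.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Fitted value on the least squares line at x_new."""
    return intercept + slope * x_new


# Hypothetical points that lie exactly on y = 1 + 2x, for illustration only.
x_demo = [0.0, 1.0, 2.0]
y_demo = [1.0, 3.0, 5.0]

a_hat, b_hat = least_squares(x_demo, y_demo)
print("intercept:", a_hat, "slope:", b_hat)
print("fitted value at x = 20:", predict(a_hat, b_hat, 20.0))
```

In the homework itself the predictor would be change and the response birwei. The intercept is then the estimated mean birth weight when weight gain is zero, and the slope is the estimated change in mean birth weight per unit of weight gain.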
Question 18: Test if there is a significant linear relationship between the birth weight and
the mother's weight gain. State the null and alternative hypotheses, calculate the test statistic,
state the distribution of your test statistic, state the p-value, and draw a conclusion.
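
One common way to carry out the test in Question 18 is a t test on the slope: t = b / se(b), where se(b) is the residual standard deviation divided by the square root of the sum of squared deviations of the predictor, with n - 2 degrees of freedom. The Python sketch below (standard library only) shows that calculation. The function name is illustrative and the data are made-up values, not the contents of gestation.dta.

```python
import math


def slope_t_statistic(x, y):
    """Return (t, df) for testing H0: slope = 0 in simple linear regression."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    residual_sd = math.sqrt(sse / (n - 2))
    se_slope = residual_sd / math.sqrt(sxx)
    return slope / se_slope, n - 2


# Hypothetical demonstration values, not the homework data.
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [2.0, 4.0, 5.0, 8.0]

t_stat, df = slope_t_statistic(x_demo, y_demo)
print("t =", round(t_stat, 4), "with", df, "degrees of freedom")
```

In simple linear regression this statistic equals the t statistic for testing zero correlation, so the two tests give the same p-value. The t value in the coefficient table of Stata's regression output is the one to compare against.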
Question 19: What is the expected birth weight of the baby if the mother's weight gain
was 20 kg?
