
HUSAIN KHUZEMA

646167

SUMMER SEMESTER 2018


HSC 4493-A: RESEARCH METHODS AND BIOSTATISTICS
INSTRUCTOR: DR. GABRIEL OKELLO
DATE: MAY 23, 2018 DUE DATE: JUNE 06, 2018
INDIVIDUAL ASSIGNMENT 1
INSTRUCTIONS
Attempt All Questions

1. A student argues that pre-clinical studies are done in Phase II and that there is
randomized large scale testing in Phase IV. As a Research Methods and Biostatistics
student, what would be your advice to the student? (4 marks)

I would tell the student that pre-clinical studies are done at the beginning, before any phase of
human testing, and that Phase I studies assess the safety of the drug. Phase II tests the efficacy of
the drug. Phase III is the randomized large-scale testing, and Phase IV is post-market surveillance.

2. The following data provides the number of hospital visits in one of the counties in
Kenya.

44 27 31 36 40 38 32 39 22 34
61 36 24 45 38 43 32 28 31 55
37 34 20 23 34 47 25 31 57 61
34 30 43 22 37 26 31 40 25 33
61 54 59 55 53 44 46 29 42 29
42 31 24 35 125 29 34 21 29 45
60 58 52 58 59 58 51 14 36 44
31 30 134 26 45 24 40 43 52 30
35 35 34 20 42 34 27 40 59 31
47 42 56 57 49 41 43 58 47 25
30 41 28 14 40 36 38 24 21 26

a. Key in the data into SPSS: (print screen the variable view and data view) (2 marks)

b. Generate an appropriate grouped frequency distribution table with intervals (4 marks)

Number of classes (Sturges' rule):
k = 1 + 3.322 log10(N)
N = 110
k = 1 + 3.322 log10(110)
k = 7.78
k ≈ 8

Class width:
W = (highest value - lowest value) / k
W = (134 - 14) / 7.8 = 120 / 7.8
W = 15.4
W ≈ 16
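
The arithmetic above can be checked with a short script. This is only a sketch: the variable names are illustrative, and rounding both results up with `math.ceil` is an assumption that happens to match the rounding used in the text.

```python
import math

n = 110                      # number of observations, from the text
lowest, highest = 14, 134    # smallest and largest values in the data

# Sturges' rule with the 3.322 constant used in the text
k_exact = 1 + 3.322 * math.log10(n)
k = math.ceil(k_exact)       # assumption: round up to a whole number of classes

# Class width: range divided by the unrounded class count, then rounded up
width_exact = (highest - lowest) / k_exact
width = math.ceil(width_exact)

print(f"k = {k_exact:.2f}, rounded up to {k}")
print(f"W = {width_exact:.1f}, rounded up to {width}")
```

Dividing by the rounded k = 8 instead would give a width of exactly 15, and eight classes of width 15 starting at 14 would stop at 133, just short of the maximum of 134.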

Interval groups

        Interval   Frequency   Percent   Valid Percent   Cumulative Percent
Valid   14-29             27      24.5            24.5                 24.5
        30-45             55      50.0            50.0                 74.5
        46-61             26      23.6            23.6                 98.2
        110-125            1       0.9             0.9                 99.1
        126-141            1       0.9             0.9                100.0
        Total            110     100.0           100.0
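
As a cross-check outside SPSS, the same grouping can be reproduced in a few lines of Python. This is a sketch, not the SPSS procedure: the function name `grouped_frequencies` is hypothetical, and the starting value 14, width 16 and 8 classes are taken from the calculation in part (b).

```python
from collections import Counter

# The 110 hospital-visit counts from question 2, row by row
visits = [
    44, 27, 31, 36, 40, 38, 32, 39, 22, 34,
    61, 36, 24, 45, 38, 43, 32, 28, 31, 55,
    37, 34, 20, 23, 34, 47, 25, 31, 57, 61,
    34, 30, 43, 22, 37, 26, 31, 40, 25, 33,
    61, 54, 59, 55, 53, 44, 46, 29, 42, 29,
    42, 31, 24, 35, 125, 29, 34, 21, 29, 45,
    60, 58, 52, 58, 59, 58, 51, 14, 36, 44,
    31, 30, 134, 26, 45, 24, 40, 43, 52, 30,
    35, 35, 34, 20, 42, 34, 27, 40, 59, 31,
    47, 42, 56, 57, 49, 41, 43, 58, 47, 25,
    30, 41, 28, 14, 40, 36, 38, 24, 21, 26,
]

def grouped_frequencies(values, lowest=14, width=16, k=8):
    """Return (interval, frequency, percent, cumulative percent) per class."""
    # Map each value to the index of the class it falls in
    counts = Counter((v - lowest) // width for v in values)
    n = len(values)
    rows, cumulative = [], 0
    for i in range(k):
        lo = lowest + i * width
        hi = lo + width - 1
        freq = counts.get(i, 0)
        cumulative += freq
        rows.append((f"{lo}-{hi}", freq,
                     round(100 * freq / n, 1),
                     round(100 * cumulative / n, 1)))
    return rows

table = grouped_frequencies(visits)
for interval, freq, pct, cum in table:
    print(f"{interval:>8} {freq:>4} {pct:>6} {cum:>6}")
```

Unlike the SPSS output, this listing also shows the three empty classes (62-77, 78-93 and 94-109), which makes the gap between the bulk of the data and the two largest values visible.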

c. Does the data contain any outlier? Explain (2 marks)

Yes. The values 125 and 134 are outliers: all the other values are 61 or below, so these two lie far
outside the 14-61 range of the rest of the data. On the box plot, positions 109 and 110 are flagged as
outliers; these correspond to the values 125 and 134.

[DataSet1] C:\Users\Husain\Documents\ind assgnment 1.sav



3. Using SPSS data diabetes_costs.sav, generate and interpret the following descriptive
measures for variable Glycated hemoglobin level (4 marks)

i. Skewness

Statistics
Glycated hemoglobin level

N   Valid                     250
    Missing                     0
Skewness                     .437
Std. Error of Skewness       .154

Skewness is positive (0.437), so the distribution is skewed to the right: the tail is longer on the
right and most of the values are concentrated on the left.

ii. 10th percentile

Statistics
Glycated hemoglobin level

N   Valid                     250
    Missing                     0
Percentiles   10            5.860

10% of patients have a glycated hemoglobin level of 5.860 or below.



4. Using patient_loss.sav data, generate an appropriate graph that shows the obesity by the
different age categories (4 marks)

[DataSet1] C:\Program Files (x86)\IBM\SPSS\Statistics\20\Samples\English\patient_los.sav

5. A regional director responsible for health projects and programs in Nairobi County is
concerned about the number of health project failures. If the mean number of health
project failures per year is 9, what is the probability that more than 4 health projects
will fail during a given month? (3 marks)
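
One common way to approach a question like this is a Poisson model. The sketch below assumes that failures occur independently at a constant rate and that a month is one twelfth of a year, so the monthly mean is 9/12 = 0.75; neither assumption is stated in the question, and the helper `poisson_pmf` is an illustrative name.

```python
import math

failures_per_year = 9
lam = failures_per_year / 12    # assumed monthly mean: 0.75 failures

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# P(X > 4) = 1 - P(X <= 4)
p_at_most_4 = sum(poisson_pmf(x, lam) for x in range(5))
p_more_than_4 = 1 - p_at_most_4

print(f"P(X > 4) = {p_more_than_4:.4f}")
```

Working through the complement keeps the sum finite: only the five terms for x = 0 to 4 are needed.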

6. Consider SPSS data patient_loss.sav. Let age in years be represented by a random
variable X. Assuming that the age in years is normally distributed with the mean and
standard deviation shown below, determine the following:

Statistics
Age in years

N   Valid            10000
    Missing              0
Mean                 62.33
Std. Deviation       8.959

a. The probability that age is between 51 and 61 years (4 marks)

b. How old should a patient be so that he or she is among the bottom 78%?
(4 marks)
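
Both parts can be sketched with Python's standard-library `statistics.NormalDist`, using the mean 62.33 and standard deviation 8.959 from the SPSS output above. Reading "bottom 78%" as "at or below the 78th percentile" is an assumption about what the question intends.

```python
from statistics import NormalDist

# Age in years, assumed normal with the mean and SD from the SPSS output
age = NormalDist(mu=62.33, sigma=8.959)

# (a) P(51 < X < 61) as a difference of two cumulative probabilities
p_between = age.cdf(61) - age.cdf(51)

# (b) Age at the 78th percentile: 78% of patients are this age or younger
age_78th = age.inv_cdf(0.78)

print(f"P(51 < X < 61) = {p_between:.4f}")
print(f"78th percentile of age = {age_78th:.2f} years")
```

The same numbers can be obtained by hand by standardizing to z = (x - 62.33) / 8.959 and using a standard normal table.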

7. The proportion of individuals insured by the Britam Insurance Company who received
at least one traffic ticket during a five-year period is 15%.

a. Show the sampling distribution of p̄ if a random sample of 150 insured individuals is
used to estimate the proportion having received at least one ticket. (3 marks)

b. What is the probability that the sample proportion will be within ±0.03 of the
population proportion? (3 marks)
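
A sketch of both parts, assuming the usual normal approximation to the sampling distribution of the sample proportion (reasonable here because np = 22.5 and n(1 - p) = 127.5 are both at least 5). Variable names are illustrative.

```python
import math
from statistics import NormalDist

p = 0.15    # population proportion with at least one ticket
n = 150     # sample size

# (a) Sampling distribution of the sample proportion:
#     approximately normal with mean p and standard error sqrt(p(1-p)/n)
std_error = math.sqrt(p * (1 - p) / n)
p_bar = NormalDist(mu=p, sigma=std_error)

# (b) P(p - 0.03 <= p_bar <= p + 0.03)
p_within = p_bar.cdf(p + 0.03) - p_bar.cdf(p - 0.03)

print(f"mean = {p_bar.mean:.2f}, standard error = {std_error:.4f}")
print(f"P(within 0.03 of p) = {p_within:.4f}")
```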

8. A certain study reported the percentage of people 18 years of age and older who smoke.
Suppose that a study designed to collect new data on smokers and nonsmokers uses a
preliminary estimate of the proportion who smoke as 30%.

a. How large should the sample be to be able to estimate the proportion of smokers in the
population with a margin of error of 0.02? Use 96% confidence. (3 marks)

b. Assume that the study uses your sample size recommendation in part (a) and finds 520
smokers. What is the point estimate of the proportion of smokers in the population? (2
marks)

c. What is the 98% confidence interval for the proportion of smokers in the population? (3
marks)
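
The three parts can be sketched as follows, assuming the standard large-sample formulas for a proportion: n = z² p(1 - p) / E² for the sample size and p̄ ± z √(p̄(1 - p̄)/n) for the interval. The z values come from `NormalDist().inv_cdf` rather than a printed table, so results may differ slightly from a hand calculation that uses table-rounded z values.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

# (a) Sample size for margin of error 0.02 at 96% confidence
p_planning = 0.30                    # preliminary estimate from the question
margin = 0.02
z_96 = std_normal.inv_cdf(1 - 0.04 / 2)
n = math.ceil(z_96 ** 2 * p_planning * (1 - p_planning) / margin ** 2)

# (b) Point estimate if 520 smokers are found in a sample of that size
smokers = 520
p_hat = smokers / n

# (c) 98% confidence interval for the population proportion
z_98 = std_normal.inv_cdf(1 - 0.02 / 2)
half_width = z_98 * math.sqrt(p_hat * (1 - p_hat) / n)
ci_low, ci_high = p_hat - half_width, p_hat + half_width

print(f"(a) n = {n}")
print(f"(b) point estimate = {p_hat:.4f}")
print(f"(c) 98% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

The sample size is rounded up rather than to the nearest integer so that the margin of error is not exceeded.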

9. In recent years more people have been working past the age of 65. In 2015, 27% of people
aged 65–69 worked. A recent report from the Labor Organization claimed that percentage
working had increased. The findings from a sample of 600 people aged 65–69 showed that 180
of them were working. Develop a hypothesis test such that the rejection of 𝐻0 will allow you to
conclude that the proportion of people aged 65–69 working has increased from 2015. Conduct
your hypothesis test using 𝛼=0.03. What is your conclusion? (5 marks)
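
A sketch of one way to set this up: an upper-tail one-sample z test for a proportion with H0: p ≤ 0.27 and Ha: p > 0.27, so that rejecting H0 supports the claim of an increase. The large-sample normal approximation is assumed, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

p0 = 0.27        # 2015 proportion (value under H0)
n = 600          # sample size
working = 180    # people in the sample who were working
alpha = 0.03

p_hat = working / n

# Test statistic, using the standard error under H0
std_error = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / std_error

# Upper-tail p-value, because Ha is p > 0.27
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value <= alpha

print(f"sample proportion = {p_hat:.2f}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Under these assumptions the p-value is larger than 0.03, so H0 is not rejected: the sample does not provide sufficient evidence at this significance level that the proportion has increased.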
