
Introduction to Medical Statistics

Lecture 1
John J. Chen, Ph.D. Assistant Professor in Biostatistics Stony Brook University School of Medicine
Biostatistics for Fellows and Residents of GCRC & Surgery Dept. December 1, 2004

JAMA (Nov. 17, 2004)

Vol. 292 (19) (Six Original Contribution Papers)

Statistical methods and terms appearing in these six papers: mean (SD) for continuous variables; proportions for categorical variables; repeated-measures analysis of variance; Cox proportional hazards models; adjusted hazard ratios; 95% confidence interval; standard errors; Wald χ² statistics; Kaplan-Meier method; 2-tailed P<.05 statistical significance; to have 90% power; a 2-tailed α level of less than .05; 2x2 analyses; χ² tests; logistic regression models; P<.05 considered statistically significant; compared using t tests; the Wilcoxon rank sum test; generalized estimating equation methods; 2-stage statistical model; Poisson regression models; the Bayesian estimates; the degrees of freedom; sample size; Pearson χ²; Fisher exact test; Mann-Whitney test; the log-rank test; a 2-factor analysis of variance; the Z test; stratified Mantel-Haenszel analysis

Outline of The Three Lectures

1. Goals of statistics; descriptive statistics; normal distribution; AUC
2. Sampling distribution; CI; hypothesis testing; p-value; power
3. Common statistical tests: one-sample t-test, two independent samples t-test, two paired samples t-test, chi-sq. test, Fisher's exact test


Outline of Lecture 1

1. The importance of medical statistics
2. The goal of statistics
3. How to obtain a good sample?
4. Types of data
5. Descriptive statistics
6. Graphical ways of presenting data
7. Normal distribution
8. Area under the curve

Objectives of Lecture 1

1. To be able to define statistics and to distinguish between population and sample
2. To be able to use real life examples to illustrate the goal of statistics
3. To be able to define and calculate different descriptive statistics
4. To be able to calculate AUC of a normal distribution

Research, Media, and the Public

Lies, Damned Lies, And Statistics

Warning Signs

[Figure: warning signs for the reader and for the researcher]

Statistics in Medical Training


Statistics is above all the subject most disliked by (medical) students.
From Making Doctors: An Institutional Apprenticeship by Simon Sinclair, 1997 (Berg Publishers).

Medical students may not like statistics, but as doctors they will.
Martin Bland, Letter to the Editor, 1998. BMJ; 316:1674.

Medical students may not like statistics, but as good doctors they will have to understand statistics.
John Chen, 2004, Advice to GCRC & Surgery Fellows and Residents

Definition of Statistics

The theory and methodology for study design and for describing, analyzing, and interpreting data generated from such studies.

The Goal of Statistics

  Population                          Sample
  Parameters (μ, σ)                   Statistics (X̄, S)

  Population → Sample: sampling (probability)
  Sample → Population: inference (inferential statistics)
  Summarizing the sample itself: descriptive statistics

Properties of A Good Sample

- Adequate sample size (statistical power)
- Random selection (representative)

Sampling Techniques

1. Simple random sample
2. Stratified sample
3. Systematic sample
4. Cluster sample
5. Convenience sample

Types of Data & Scales of Measurement

Variable                               Scale      Example
Qualitative (categorical, discrete)    nominal    blood group
                                       ordinal    grade of tumor
Quantitative (numeric, continuous)     interval   body temp.
                                       ratio      body weight

Notes: Dependent (response) versus independent (explanatory) variable

Summarizing Quantitative Data

Descriptive statistics:

- Measures of central tendency: mean, median, mode
- Measures of variability: standard deviation, variance, range

Measures of Central Tendency

1. Mean - the average

X̄ = ΣXᵢ / n   (sample mean, summing over i = 1, …, n)

μ = ΣXᵢ / N   (population mean, summing over i = 1, …, N)

Measures of Central Tendency (cont.)

2. Median - the 50th percentile point (the middle value)

If values are in ascending order, the median is:
- the ((n + 1)/2)th term, if n is an odd number
- the average of the (n/2)th and (n/2 + 1)th terms, if n is an even number

The median is not affected by outliers.

Measures of Central Tendency (cont.)

3. Mode - the value that occurs most frequently

- Unimodal
- Multimodal

Measures of Variability

1. Standard deviation (SD)

S = √[ Σ(Xᵢ − X̄)² / (n − 1) ]   (sample SD)

σ = √[ Σ(Xᵢ − μ)² / N ]   (population SD)

Measures of Variability (cont.)

2. Variance - the square of the standard deviation

s² = square of the sample SD
σ² = square of the population SD

Measures of Variability (cont.)

3. Range = highest value - lowest value

An Example
Consider the following values {2, 3, 6, 9, 2} and calculate the following:
Mean? Median? Mode? Range? SD? Variance?

An Example (cont.)

{2, 3, 6, 9, 2}

Mean = ΣXᵢ / n = (2 + 2 + 3 + 6 + 9) / 5 = 4.4

An Example (cont.)

{2, 3, 6, 9, 2}  Order from low to high: { 2, 2, 3, 6, 9 }

Median = the ((n + 1)/2)th term = the ((5 + 1)/2)th = 3rd term = 3

An Example (cont.)

{2, 3, 6, 9, 2}

Mode = value occurring most frequently = 2

An Example (cont.)

{2, 3, 6, 9, 2}  Order from low to high: { 2, 2, 3, 6, 9 }

Range = highest value − lowest value = 9 − 2 = 7

An Example (cont.)

{2, 3, 6, 9, 2}

Standard deviation = √[ Σ(Xᵢ − X̄)² / (n − 1) ]
= √[ ((2 − 4.4)² + (2 − 4.4)² + (3 − 4.4)² + (6 − 4.4)² + (9 − 4.4)²) / (5 − 1) ]
= √(37.2 / 4) = 3.05

An Example (cont.)

{2, 3, 6, 9, 2}

Variance = (standard deviation)² = 3.05² = 9.30
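These summaries are quick to verify in software; a minimal sketch (not part of the original lecture) using Python's standard-library statistics module on the same five values:

```python
import statistics

data = [2, 3, 6, 9, 2]

print(statistics.mean(data))      # 4.4
print(statistics.median(data))    # 3
print(statistics.mode(data))      # 2
print(max(data) - min(data))      # range: 7
print(statistics.stdev(data))     # sample SD: ~3.05
print(statistics.variance(data))  # sample variance: 9.3
```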

Ways of Presenting Data


Data table:

Ways of Presenting Data (cont.)

Summary table: one categorical variable

SEX      Frequency   Percent   Cumulative Percent
F        496         77.4      77.4
M        145         22.6      100.0
Total    641         100.0

Ways of Presenting Data (cont.)

Bar chart: one categorical variable

[Bar chart: count (0-600) by SEX, F vs. M]

Ways of Presenting Data (cont.)


Pie chart: one categorical variable

Ways of Presenting Data (cont.)

Histogram: one continuous variable

[Histogram of Age at Event (month): N = 641, Mean = 76.3, Std. Dev = 42.57]

Ways of Presenting Data (cont.)

Box plot: one continuous variable, one categorical variable

[Box plots of Age at Event (month) by SEX: N = 496 (F), 145 (M)]

Ways of Presenting Data (cont.)

Cross-tabulation: two categorical variables

RACE * SEX Crosstabulation (count)

RACE     F     M     Total
B        31    31    62
C        463   112   575
O        2     2     4
Total    496   145   641

Ways of Presenting Data (cont.)

Scatterplot: two continuous variables

[Scatterplot of Height (cm) versus Weight (kg)]

Different Ways of Presentation

The Dramatizing Way

The Normal Distribution

X ~ N(μ, σ²)

[Figure: normal curve centered at μ, 50% of the area on each side, with ticks at μ − 2σ and μ + 2σ]

The Normal Distribution (cont.)

Carl Friedrich Gauss (1777-1855)

- Bell shaped curve with highest point at μ
- Symmetric about μ
- Unimodal
- Continuous distribution
- Approaches the horizontal axis but never touches it

The Normal Distribution (cont.)

The most useful feature: AUC (area under the curve)

[Figure: normal curve with the area between μ − 2σ and μ + 2σ shaded]

The Normal Distribution (cont.)

Standard normal distribution: Z ~ N(μ = 0, σ² = 1)

[Figure: standard normal curve, 50% of the area on each side of 0, with ticks at −2, −1, +1, +2]

Given X ~ N(μ, σ²), we have Z = (X − μ)/σ.

Standard Normal Table (Two-sided Tail Probabilities of the Normal Curve)

[Full-page table: for z from 0.00 to 2.99 (rows in steps of 0.1, columns 0.00-0.09), entries give the two-sided tail probability P(|Z| ≥ z); the excerpts needed for the examples are reproduced on the slides that follow. Key entries: P(|Z| ≥ 1.00) = .3173, P(|Z| ≥ 1.96) = .0500, P(|Z| ≥ 2.00) = .0455, P(|Z| ≥ 2.58) = .0099]

[Figure: standard normal curve with the tails beyond −z and +z shaded; μ = 0]

Area Under Curve Within 1 Standard Deviation

[Figure: standard normal curve with the area between −1 and +1 shaded; μ = 0]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    0.01    0.02   ...   0.09
.0     1.000   .9920   .9840  ...
...
.9     .3681   .3628   .3576  ...
1.0    .3173   .3125   .3077  ...
1.1    .2713   .2670   .2627  ...
...

AUC (1 Standard Deviation)

[Figure: standard normal curve; each tail beyond ±1 has area 0.3173 / 2 = 0.1587]

AUC Within 1 Standard Deviation

AUC between −1 and +1 = 1 − 0.3173 = 0.6827

[Figure: 0.6827 in the middle, 0.3173 / 2 in each tail]

AUC Within 2 Standard Deviations

[Figure: standard normal curve with the area between −2 and +2 shaded; μ = 0]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    0.01    0.02   ...   0.09
.0     1.000   .9920   .9840  ...
...
1.9    .0574   .0561   .0549  ...
2.0    .0455   .0444   .0434  ...
2.1    .0357   .0349   .0340  ...
...

AUC Within 2 Standard Deviations

[Figure: each tail beyond ±2 has area 0.0455 / 2 = 0.0227]

AUC Within 2 Standard Deviations

AUC between −2 and +2 = 1 − 0.0455 = 0.9545

[Figure: 0.9545 in the middle, 0.0455 / 2 in each tail]

AUC For Normal Distribution

The Rule of Thumb:
- Within one s.d.: about 2/3
- Within two s.d.: about 95%
- Within three s.d.: about 99%
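These rule-of-thumb figures are easy to check numerically; a short sketch using Python's statistics.NormalDist (standard library, Python 3.8+):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1
for k in (1, 2, 3):
    within = Z.cdf(k) - Z.cdf(-k)  # AUC between -k and +k
    print(f"within {k} s.d.: {within:.4f}")
# within 1 s.d.: 0.6827
# within 2 s.d.: 0.9545
# within 3 s.d.: 0.9973
```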

The Normal Distribution

Problem: The serum level of 1,25-dihydroxyvitamin D in adolescent girls is believed to be normally distributed with mean 65 pg/ml and standard deviation 12.5 pg/ml.
A) What percent of adolescent girls will have a level higher than 65 pg/ml?
B) What percent are lower than 65 pg/ml?
C) What percent are between 40 pg/ml and 90 pg/ml?

The Normal Distribution

Solution (A and B): The distribution is symmetric about its mean, so 50% of girls have levels above, and 50% below, μ = 65 pg/ml.

[Figure: normal curve centered at 65 pg/ml, 50% of the area on each side]

The Normal Distribution

Solution (C): What percent are between 40 pg/ml and 90 pg/ml?

Z1 = (40 − 65) / 12.5 = −2
Z2 = (90 − 65) / 12.5 = 2

[Figure: the region from 40 to 90 pg/ml corresponds to −2 to +2 on the Z scale]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    0.01    0.02   ...   0.09
.0     1.000   .9920   .9840  ...
...
1.9    .0574   .0561   .0549  ...
2.0    .0455   .0444   .0434  ...
2.1    .0357   .0349   .0340  ...
...

The Normal Distribution

Solution (C, cont.): Each tail beyond ±2 has area 0.0455 / 2 = 0.0227, so the area between Z = −2 and Z = +2 is 1 − 0.0455 = 0.9545. About 95% of adolescent girls have levels between 40 and 90 pg/ml.
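The table lookup can be double-checked with the normal CDF directly; a sketch in Python (the distribution parameters are those stated in the problem):

```python
from statistics import NormalDist

X = NormalDist(mu=65, sigma=12.5)        # serum vitamin D level, pg/ml
print(round(X.cdf(90) - X.cdf(40), 4))   # P(40 < X < 90) = 0.9545
```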

The Normal Distribution

Problem: A survey finds that the number of hours primary school children watch television per week is normally distributed with mean 10 and s.d. 2. If a primary school child is chosen at random, what is the probability that he/she watches TV for:
A. Less than 4 hrs/week?
B. More than 12 hrs/week?
C. Between 8 and 14 hrs/week?

The Normal Distribution

Solution (A): P(X < 4 hrs/week)?

Z1 = (4 − 10) / 2 = −3

[Figure: 4 hrs/week corresponds to Z = −3]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    0.01    0.02   ...   0.09
.0     1.000   .9920   .9840  ...
...
2.9    .0037   .0036   .0035  ...
3.0    .0027   .0026   .0025  ...
3.1    .0019   .0019   .0018  ...
...

The Normal Distribution

Solution (A, cont.): P(X < 4 hrs/week) = one tail beyond Z = −3 = 0.0027 / 2 = 0.00135

[Figure: the small left-tail area below 4 hrs/week (Z = −3)]

The Normal Distribution

Solution (B): P(X > 12 hrs/week)?

Z1 = (12 − 10) / 2 = 1

[Figure: 12 hrs/week corresponds to Z = 1]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    0.01    0.02   ...   0.09
.0     1.000   .9920   .9840  ...
...
.9     .3681   .3628   .3576  ...
1.0    .3173   .3125   .3077  ...
1.1    .2713   .2670   .2627  ...
...

The Normal Distribution

Solution (B, cont.): P(X > 12 hrs/week) = one tail beyond Z = 1 = 0.3173 / 2 = 0.1587

The Normal Distribution

Solution (C): P(8 < X < 14 hrs/week)?

Z1 = (8 − 10) / 2 = −1,  Z2 = (14 − 10) / 2 = 2

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    0.01    0.02   ...   0.09
.0     1.000   .9920   .9840  ...
...
1.0    .3173   .3125   .3077  ...
...
2.0    .0455   .0444   .0434  ...
...

The Normal Distribution

Solution (C, cont.):

P(8 < X < 14) = 1 − 0.3173 / 2 − 0.0455 / 2 = 1 − 0.1587 − 0.0227 = 0.8186

Review of Lecture 1

1. The importance of medical statistics
2. The goal of statistics
3. How to obtain a good sample?
4. Types of data
5. Descriptive statistics
6. Graphical ways of presenting data
7. Normal distribution
8. Area under the curve

Achieving Objectives

1. To be able to define statistics and to distinguish between population and sample
2. To be able to use real life examples to illustrate the goal of statistics
3. To be able to define and calculate different descriptive statistics
4. To be able to calculate AUC of a normal distribution

Next Month

1. Goals of statistics; descriptive statistics; normal distribution; AUC
2. Sampling distribution; CI; hypothesis testing; p-value; power
3. Common statistical tests: one-sample t-test, two independent samples t-test, two paired samples t-test, chi-sq. test, Fisher's exact test

Introduction to Medical Statistics


Lecture 2
John J. Chen, Ph.D. Assistant Professor in Biostatistics Stony Brook University School of Medicine
Biostatistics for Fellows and Residents of GCRC & Surgery Dept. January 5, 2005

Three Lectures

1. Goals of statistics; descriptive statistics; normal distribution; AUC
2. Sampling distribution; CI; hypothesis testing; p-value; power
3. Common statistical tests: one-sample t-test, two independent samples t-test, two paired samples t-test, chi-sq. test, Fisher's exact test

Lecture notes:

http://ms.cc.sunysb.edu/~jjchen

The Goal of Statistics

  Population                          Sample
  Parameters (μ, σ)                   Statistics (X̄, S)

  Population → Sample: sampling (probability)
  Sample → Population: inference (inferential statistics)
  Summarizing the sample itself: descriptive statistics

The Normal Distribution

Standard normal distribution: Z ~ N(μ = 0, σ² = 1)

[Figure: standard normal curve, 50% of the area on each side of 0, with ticks at −2, −1, +1, +2]

Given X ~ N(μ, σ²), we have Z = (X − μ)/σ.

An Example

Establishing a Serum Creatinine Reference Range: 200 healthy volunteers of age 25 to 35 yrs old were evaluated for sCr. The values (mg/dL) follow approximately a normal distribution, N(1.2, 0.2²). What will be a 95% (middle range) reference (or normal) range?

Sol.: Z = (X − μ) / σ, so X = μ + Z·σ.
Therefore, X(low) = 1.2 − 1.96 × 0.2 = 0.8 mg/dL, and X(high) = 1.2 + 1.96 × 0.2 = 1.6 mg/dL.
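The 1.96 multiplier is the standard normal quantile for the middle 95%; a minimal sketch reproducing the range via the inverse CDF:

```python
from statistics import NormalDist

sCr = NormalDist(mu=1.2, sigma=0.2)   # sCr in mg/dL
low, high = sCr.inv_cdf(0.025), sCr.inv_cdf(0.975)
print(round(low, 1), round(high, 1))  # 0.8 1.6
```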

Outline of Lecture 2

1. Sampling distribution
2. Central Limit Theorem
3. Confidence interval
4. t-distribution
5. Hypothesis testing
6. Types I & II errors, statistical power
7. p-value

Lecture notes:

http://ms.cc.sunysb.edu/~jjchen

Objectives of Lecture 2

1. To describe sampling distribution and Central Limit Theorem, and comprehend their importance in Statistics
2. To correctly construct and interpret confidence intervals for population means
3. To describe basic steps of hypothesis testing, using real life examples
4. To correctly define type I & II errors, statistical power, effect size
5. To correctly interpret p-value of a statistical test

Sampling Distribution

The distribution of individual observations versus the distribution of sample means

[Figure: repeated samples drawn from a population, each yielding its own sample mean]

Central Limit Theorem

The distribution of sample means (the sampling distribution) from a population is approximately normal if the sample size is large, i.e., X̄ ~ N(μ_X̄, σ²_X̄) and Z = (X̄ − μ) / (σ/√n) ~ N(0, 1).

1. The population distribution can be non-normal.
2. Given the population has mean μ, the mean of the sampling distribution is μ_X̄ = μ.
3. If the population has variance σ², the standard deviation of the sampling distribution, or the standard error (a measure of the amount of sampling error), is s.e.(X̄) = σ_X̄ = σ/√n.

CLT & Sampling Distribution


A simulation demo:
http://www.ruf.rice.edu/%7Elane/stat_sim/sampling_dist/index.html
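In the same spirit as the demo, a minimal simulation sketch (an exponential population is used here only as an example of a clearly non-normal distribution):

```python
import random
import statistics

n, reps = 30, 2000
# Exponential population with mean 1 and SD 1 -- strongly skewed.
means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

print(round(statistics.mean(means), 2))   # close to mu = 1
print(round(statistics.stdev(means), 2))  # close to sigma/sqrt(n) = 0.18
```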

Confidence Intervals

95% CI for μ?  Prob( ?? < μ < ?? ) = 0.95
Prob( ?? < Z < ?? ) = 0.95

(Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    ...   0.05    0.06    0.07   ...   0.09
.0     1.000   ...   .9601   .9522   .9442  ...
...
1.8    .0719   ...   .0643   .0629   .0615  ...   .0588
1.9    .0574   ...   .0512   .0500   .0488  ...   .0466
2.0    .0455   ...   .0404   .0394   .0385  ...   .0366
...
Confidence Intervals (cont.)

Therefore, Prob( −1.96 < Z < 1.96 ) = 0.95.

Since X̄ ~ N(μ_X̄, σ²_X̄), a 95% CI for μ:

Prob( −1.96 < (X̄ − μ) / (σ/√n) < 1.96 ) = 0.95

Prob( X̄ − 1.96·σ/√n < μ < X̄ + 1.96·σ/√n ) = 0.95

[Figure: standard normal curve with 95% between −1.96 and 1.96, and 2.5% in each tail]

Confidence Intervals

95% Confidence Interval for μ:  X̄ ± 1.96·σ/√n

Definition 1: You can be 95% sure that the true mean (μ) will fall within the upper and lower bounds.
Definition 2: 95% of the intervals constructed using sample means (x̄) will contain the true mean (μ).

Confidence Interval
A simulation demo:
http://www.ruf.rice.edu/%7Elane/stat_sim/conf_interval/index.html

Confidence Intervals

90% CI for μ?

(Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    ...   0.04    0.05    0.06   ...   0.09
...
1.5    .1336   ...   .1236   .1211   .1188  ...   .1118
1.6    .1096   ...   .1010   .0989   .0969  ...   .0910
1.7    .0891   ...   .0819   .0801   .0784  ...   .0735
...

Confidence Intervals

CIs for μ:
90% CI: x̄ ± 1.65 (σ/√n)
95% CI: x̄ ± 1.96 (σ/√n)
99% CI: x̄ ± 2.58 (σ/√n)

Confidence Intervals

Problem: A fellow wanted to determine the average serum creatinine level among healthy elderly male subjects from Stony Brook village. From the literature he found that the standard deviation of serum creatinine is around 0.15 mg/dL for various studied patient groups. But he could not find any information about the mean (μ) of serum creatinine among local elderly males. The fellow decided to measure 30 healthy elderly male volunteers from Stony Brook, and the average creatinine level was 0.94 mg/dL. What is the 95% CI for μ?

Confidence Intervals

Solution: 95% CI = x̄ ± 1.96 (σ/√n) = 0.94 ± 1.96 (0.15/√30) = (0.89, 0.99)

Confidence Intervals
Problem: Total knee replacement usually requires a few days of hospital stay after the surgery. The length of stay was recorded for 90 patients with total knee replacement at Hospital XYZ. The sample mean was 4.20 days and the sample s.d.=1.05. Construct a 90% CI for the population mean length of hospital stay for total knee replacement.

Confidence Intervals

Solution: As n = 90 is relatively large, the sample s.d. can be used to approximate the population s.d.

90% CI = x̄ ± 1.65 (σ/√n) = 4.2 ± 1.65 (1.05/√90) = (4.02, 4.38)
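Both intervals follow the same recipe; a small helper (the function name is ours, not from the lecture) that computes x̄ ± z·σ/√n:

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """CI for a mean with known (or large-sample) sigma."""
    z = NormalDist().inv_cdf((1 + level) / 2)  # 1.96 for 95%, 1.645 for 90%
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half

print(z_interval(0.94, 0.15, 30))              # ~(0.89, 0.99)
print(z_interval(4.20, 1.05, 90, level=0.90))  # ~(4.02, 4.38)
```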

The t - Distribution

William S. Gosset (1876-1937)

A small sample from a normal distribution, with unknown population standard deviation:

t = (x̄ − μ) / (s/√n), with n − 1 degrees of freedom

The (Student's) t-distribution is very similar to the normal distribution, but with heavier tails.

t Table (tail probabilities of the t-distributions)

Degrees of                         2Q (Q)
Freedom   0.10 (0.05)  0.05 (0.025)  0.01 (0.005)  0.005 (0.0025)  0.001 (0.0005)
1         6.3138       12.706        63.657        127.32          636.62
2         2.9200       4.3026        9.9251        14.0911         31.6075
3         2.3534       3.1825        5.8408        7.4533          12.9258
4         2.1318       2.7764        4.6040        5.5980          8.6087
5         2.0151       2.5706        4.0323        4.7734          6.8701
6         1.9432       2.4469        3.7075        4.3169          5.9590
7         1.8946       2.3646        3.4995        4.0293          5.4088
8         1.8595       2.3060        3.3555        3.8326          5.0421
9         1.8331       2.2621        3.2498        3.6895          4.7805
10        1.8125       2.2281        3.1693        3.5814          4.5871
11        1.7959       2.2010        3.1057        3.4967          4.4374
12        1.7823       2.1788        3.0545        3.4285          4.3184
13        1.7709       2.1604        3.0122        3.3726          4.2215
14        1.7613       2.1448        2.9768        3.3258          4.1412
15        1.7530       2.1314        2.9467        3.2862          4.0735
16        1.7459       2.1199        2.9207        3.2521          4.0157
17        1.7396       2.1098        2.8982        3.2226          3.9659
18        1.7341       2.1009        2.8784        3.1967          3.9224
19        1.7291       2.0930        2.8609        3.1738          3.8841
20        1.7247       2.0860        2.8453        3.1535          3.8502
21        1.7207       2.0796        2.8313        3.1353          3.8200
22        1.7171       2.0739        2.8187        3.1189          3.7928
23        1.7139       2.0687        2.8073        3.1041          3.7683
24        1.7109       2.0639        2.7969        3.0906          3.7461
25        1.7081       2.0595        2.7874        3.0783          3.7258
26        1.7056       2.0555        2.7787        3.0670          3.7073
27        1.7033       2.0518        2.7707        3.0566          3.6903
28        1.7011       2.0484        2.7632        3.0470          3.6746
29        1.6991       2.0452        2.7564        3.0382          3.6601
30        1.6973       2.0423        2.7500        3.0299          3.6466

Confidence Intervals

Problem: A fellow wanted to determine the average serum creatinine level among healthy elderly adult male subjects from Stony Brook village. From the literature she could not find any information on the mean (μ) or standard deviation (σ) of sCr among local healthy elderly males. She measured 15 healthy elderly male volunteers from Stony Brook, and the sample mean sCr is 0.94 mg/dL with a sample standard deviation of 0.15 mg/dL. What should be the 95% CI for μ?
Critical t-value (tail probabilities of the t-distributions)

Degrees of   0.10 (0.05)   0.05 (0.025)   0.01 (0.005)
Freedom
...
13           1.7709        2.1604         3.0122
14           1.7613        2.1448         2.9768
15           1.7530        2.1314         2.9467
...
Confidence Intervals

Solution: 95% CI for μ = x̄ ± 2.14 (s/√n) = 0.94 ± 2.14 × (0.15/√15) = 0.94 ± 0.08 = (0.86, 1.02)
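The only change from the z-based interval is the multiplier, which now comes from the t-distribution with n − 1 d.f.; a sketch assuming scipy is available:

```python
import math
from scipy import stats

xbar, s, n = 0.94, 0.15, 15
t_crit = stats.t.ppf(0.975, df=n - 1)  # 2.1448 for 14 d.f.
half = t_crit * s / math.sqrt(n)
print(round(xbar - half, 2), round(xbar + half, 2))  # 0.86 1.02
```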

Hypothesis Testing

Example: The serum creatinine normal range depends on the population studied. A fellow wanted to evaluate the mean serum creatinine among adult males living in Stony Brook. From the literature she found that one well-established study showed an average sCr of 1.18 mg/dL for adult males living on the west coast. But based on her knowledge and experience, she believes that the mean (μ) of sCr among local adult males should be different. She decided to check this by measuring sCr for 49 local adult male volunteers.

Hypothesis Testing

Basic steps of hypothesis testing:
1. State the null (H0) and alternative (H1) hypotheses
2. Choose a significance level, α (usually 0.05 or 0.01)
3. Determine the critical (or rejection) region and the non-rejection region, based on the sampling distribution
4. Based on the sample, calculate the test statistic and compare it with the critical values
5. Make a decision, and state the conclusion

Hypothesis Testing (cont.)

[Figure: sampling distribution under H0, with the non-rejection region (area 1 − α) and the rejection region (area α) beyond the critical value]

Hypothesis Testing (cont.)

Example (cont.): Say, the average sCr of the sample of 49 locals is 1.22 mg/dL and the population standard deviation is 0.15 mg/dL (based on literature on other similar studies).

Step 1. State H0 and H1: H0: μ = 1.18 vs. H1: μ ≠ 1.18.
Step 2. Choose a significance level, say α = 0.05.

Hypothesis Testing (cont.)

Step 3. Determine the critical region and the non-rejection region: the critical value is 1.96; the rejection region is |Z| ≥ 1.96; the non-rejection region is |Z| < 1.96.

Step 4. Calculate the test statistic:

Z = (X̄ − μ) / (σ/√n) = (1.22 − 1.18) / (0.15/√49) = 1.87

Step 5. Make a decision, based on the sample, and state the conclusion: as the test statistic Z = 1.87 < 1.96, it falls within the non-rejection region. Therefore, we do not reject the null hypothesis. We conclude that there is no evidence that the average sCr among local adult males is different from 1.18 mg/dL.
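The whole five-step test reduces to a few lines of arithmetic; a minimal sketch of the same z-test:

```python
import math
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 1.18, 0.15, 49, 1.22, 0.05

z = (xbar - mu0) / (sigma / math.sqrt(n))     # 1.87
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96
print("reject H0" if abs(z) >= z_crit else "do not reject H0")
```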

One-tailed vs. Two-tailed Test

[Figures: the rejection region (α) falls in the right tail for a right-sided test, in the left tail for a left-sided test, and is split between both tails for a two-sided test]

Errors & Power

- Type I Error (α): false positives, errors due to chance - reject H0 when H0 is true
- Type II Error (β): false negatives - accept H0 when H0 is false
- Power: (1 − β) = 1 − P(Type II Error)

Statistical Decision

[Figure: distributions under H0 and H1 around the critical value, showing the non-rejection region (1 − α), the rejection region (Type I error, α), the Type II error (β), and power (1 − β)]

Statistical Decision

                          Truth
Decision            H0 True              H0 False
Reject H0           Type I error (α)     Power (1 − β)
Not reject H0       Correct (1 − α)      Type II error (β)

Note: Statistically significant does not necessarily mean biologically (or clinically) significant!!!

Hypothesis Testing vs. CI

There is a strong relationship between a CI and a hypothesis test. In general, if the hypothesized value (say, H0: μ = μ0) is included in the (1 − α)·100% confidence interval, then H0 is not rejected at level α.

p-Values
Interpretation: The p-value is the probability of obtaining a result as extreme or more extreme than the one observed based on the current sample, given the null hypothesis is true.

p-Values

Stony Brook Adult Male sCr Example: H0: μ = 1.18, σ = 0.15, X̄ = 1.22

[Figure: the p-value is the two tail areas beyond ±1.87; X̄ = 1.22 corresponds to Z = 1.87]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    ...   0.06    0.07    0.08   ...
...
1.7    .0891   ...   .0784   .0767   .0751  ...
1.8    .0719   ...   .0629   .0615   .0610  ...
...

For Z = 1.87, the two-sided tail probability is .0615 ≈ 0.061.

p-Values

Stony Brook Adult Male sCr Example: H0: μ = 1.18, σ = 0.15, X̄ = 1.22

p-value = 0.061

[Figure: the tail areas beyond −1.87 and +1.87]

p-Values

What if X̄ = 1.23? H0: μ = 1.18, σ = 0.15, n = 49.

Z = (X̄ − μ) / (σ/√n) = (1.23 − 1.18) / (0.15/√49) = 2.33

[Figure: the tail areas beyond ±2.33; X̄ = 1.23 corresponds to Z = 2.33]

Checking the Z-table, the p-value ≈ 0.02.
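The table lookups correspond to the two-sided tail probability 2·(1 − Φ(|Z|)); a quick check of both p-values (with unrounded Z the first comes out near 0.062 rather than the table's 0.061):

```python
import math
from statistics import NormalDist

def two_sided_p(xbar, mu0=1.18, sigma=0.15, n=49):
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_sided_p(1.22), 3))  # ~0.062 (0.061 with Z rounded to 1.87)
print(round(two_sided_p(1.23), 3))  # ~0.020
```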

Review of Lecture 2

1. Sampling distribution
2. Central Limit Theorem
3. Confidence interval
4. t-distribution
5. Hypothesis testing
6. Types I & II errors, statistical power
7. p-value

Achieving Objectives

1. To describe sampling distribution and Central Limit Theorem, and comprehend their importance in Statistics
2. To correctly construct and interpret confidence intervals for population means
3. To describe basic steps of hypothesis testing, using real life examples
4. To correctly define type I & II errors, statistical power, effect size
5. To correctly interpret p-value of a statistical test

Next Month

1. Goals of statistics; descriptive statistics; normal distribution; AUC
2. Sampling distribution; CI; hypothesis testing; p-value; power
3. Common statistical tests: one-sample t-test, two independent samples t-test, two paired samples t-test, chi-sq. test, Fisher's exact test

Lecture notes:

http://ms.cc.sunysb.edu/~jjchen

Introduction to Medical Statistics


Lecture 3
John J. Chen, Ph.D. Assistant Professor in Biostatistics Stony Brook University School of Medicine
Biostatistics for Fellows and Residents of GCRC & Surgery Dept. February 2, 2005

Three Lectures

1. Goals of statistics; descriptive statistics; normal distribution; AUC
2. Sampling distribution; CI; hypothesis testing; p-value; power
3. Common statistical tests: one-sample t-test, two independent samples t-test, two paired samples t-test, chi-sq. test, Fisher's exact test

Outline of Lecture 3

1. Hypothesis testing
2. Types I & II errors, statistical power
3. p-value
4. One-sample t-test
5. Two independent samples t-test
6. Two paired samples t-test
7. Chi-squared test & Fisher's exact test
8. Local biostatistical resources

Lecture notes:

http://ms.cc.sunysb.edu/~jjchen

Central Limit Theorem

The distribution of sample means (the sampling distribution) from a population is approximately normal if the sample size is large, i.e., X̄ ~ N(μ_X̄, σ²_X̄) and Z = (X̄ − μ) / (σ/√n) ~ N(0, 1).

1. The population distribution can be non-normal.
2. Given the population has mean μ, the mean of the sampling distribution is μ_X̄ = μ.
3. If the population has variance σ², the standard deviation of the sampling distribution, or the standard error (a measure of the amount of sampling error), is s.e.(X̄) = σ_X̄ = σ/√n.

Hypothesis Testing

Example: The serum creatinine normal range depends on the population studied. A fellow wanted to evaluate the mean serum creatinine among adult males living in Stony Brook. From the literature she found that one well-established study showed an average sCr of 1.18 mg/dL for adult males living on the west coast. But based on her knowledge and experience, she believes that the mean (μ) of sCr among local adult males should be different. She decided to check this by measuring sCr for 49 local adult male volunteers.

Hypothesis Testing (cont.)

Basic steps of hypothesis testing:
1. State the null (H0) and alternative (H1) hypotheses
2. Choose a significance level, α (usually 0.05 or 0.01)
3. Determine the critical (or rejection) region and the non-rejection region, based on the sampling distribution under the null hypothesis
4. Based on the sample, calculate the test statistic and compare it with the critical values
5. Make a decision, and state the conclusion

Hypothesis Testing (cont.)

[Figure: sampling distribution under H0, with the non-rejection region (area 1 − α) and the rejection region (area α) beyond the critical value]

Hypothesis Testing (cont.)

Example (cont.): Say, the average sCr of the sample of 49 locals is 1.22 mg/dL and the population standard deviation is 0.15 mg/dL (based on literature on other similar studies).

Step 1. State H0 and H1: H0: μ = 1.18 vs. H1: μ ≠ 1.18.
Step 2. Choose a significance level, say α = 0.05.

Hypothesis Testing (cont.)

Step 3. Determine the critical region and the non-rejection region: the critical value is 1.96; the rejection region is |Z| ≥ 1.96; the non-rejection region is |Z| < 1.96.

Step 4. Calculate the test statistic:

Z = (X̄ − μ) / (σ/√n) = (1.22 − 1.18) / (0.15/√49) = 1.87

Step 5. Make a decision, based on the sample, and state the conclusion: as the test statistic Z = 1.87 < 1.96, it falls within the non-rejection region. Therefore, we do not reject the null hypothesis. We conclude that there is no evidence that the average sCr among local adult males is different from 1.18 mg/dL.

Errors & Power

- Type I Error (α): false positives, errors due to chance - reject H0 when H0 is true
- Type II Error (β): false negatives - don't reject H0 when H0 is false
- Power: (1 − β) = 1 − P(Type II Error)

Statistical Decision

Design factors:
- effect size
- power
- alpha level
- std. dev.
- sample size

[Figure: non-rejection region (1 − α), rejection region (Type I error, α), Type II error (β), and power (1 − β) around the critical value]

Now, what is a p-value?

p-Values
Interpretation: The p-value is the probability of obtaining a result as extreme or more extreme than the one observed based on the current sample, given the null hypothesis is true.

p-Values

Stony Brook Adult Male sCr Example: H0: μ = 1.18, σ = 0.15, X̄ = 1.22

[Figure: the p-value is the two tail areas beyond ±1.87; X̄ = 1.22 corresponds to Z = 1.87]

Standard Normal Table (Two-Sided Tail Probabilities of the Normal Curve)

Z      0.00    ...   0.06    0.07    0.08   ...
...
1.7    .0891   ...   .0784   .0767   .0751  ...
1.8    .0719   ...   .0629   .0615   .0610  ...
...

For Z = 1.87, the two-sided tail probability is .0615 ≈ 0.061.

p-Values

Stony Brook Adult Male sCr Example: H0: μ = 1.18, σ = 0.15, X̄ = 1.22

p-value = 0.061

[Figure: the tail areas beyond −1.87 and +1.87]

p-Values

What if X̄ = 1.23? H0: μ = 1.18, σ = 0.15, n = 49.

Z = (X̄ − μ) / (σ/√n) = (1.23 − 1.18) / (0.15/√49) = 2.33

[Figure: the tail areas beyond ±2.33; X̄ = 1.23 corresponds to Z = 2.33]

Checking the Z-table, the p-value ≈ 0.02.

One Sample t Test

Problem: Neonates gain (on average) 100 grams/wk in the first 4 weeks. A sample of 25 infants given a new nutrition formula gained 112 grams/wk (on average) with standard deviation = 30 grams. Is this statistically significant?

Test: H0: μ = 100 vs. H1: μ ≠ 100

One Sample t Test (cont.)

Solution:

t = (112 − 100) / (30/√25) = 2.0

[Figure: 112 grams/wk corresponds to t = 2.0]

t(24, 0.05) = ?

Critical t Value (tail probabilities of the t-distributions)

Degrees of   0.10 (0.05)   0.05 (0.025)
Freedom
...
23           1.7139        2.0687
24           1.7109        2.0639
25           1.7081        2.0595
...

One Sample t - Test

Solution:

t = (112 − 100) / (30/√25) = 2.0 < t(24, 0.05) = 2.06

[Figure: t = 2.0 falls inside the non-rejection region (−2.06, 2.06)]

Therefore, do not reject.
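The same test can be reproduced from the summary statistics alone; a sketch assuming scipy is available:

```python
import math
from scipy import stats

mu0, xbar, s, n = 100, 112, 30, 25
t = (xbar - mu0) / (s / math.sqrt(n))  # 2.0
p = 2 * stats.t.sf(abs(t), df=n - 1)   # two-sided p ~ 0.057
print(t, round(p, 3))                  # not significant at alpha = 0.05
```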

Two Independent Samples t - Test

Population 1: (μ1, σ) → Sample 1: (X̄1, S1)
Population 2: (μ2, σ) → Sample 2: (X̄2, S2)

Test: H0: μ1 = μ2 versus H1: μ1 ≠ μ2, assuming σ1² = σ2² = σ².

Two Independent Samples t - Test

Test statistic, with (n1 + n2 − 2) d.f.:

t = [ (x̄1 − x̄2) − (μ1 − μ2) ] / S(x̄1 − x̄2)

S(x̄1 − x̄2) = Sp · √(1/n1 + 1/n2)

Sp² = [ s1²(n1 − 1) + s2²(n2 − 1) ] / (n1 + n2 − 2)

Two Independent Samples t - Test

Problem: Two headache remedies
Brand A: X̄1 = 20.1, S1 = 8.7, n1 = 12
Brand B: X̄2 = 18.9, S2 = 7.5, n2 = 12

Test: H0: μ1 = μ2 (i.e., μ1 − μ2 = 0) vs. H1: μ1 ≠ μ2

Two Independent Samples t - Test

Solution:

Sp² = [ (12 − 1)·8.7² + (12 − 1)·7.5² ] / (12 + 12 − 2) = (832.59 + 618.75) / 22 = 65.97
Sp = √65.97 = 8.12
S(x̄1 − x̄2) = 8.12 · √(1/12 + 1/12) = 3.3
t = (x̄1 − x̄2) / S(x̄1 − x̄2) = (20.1 − 18.9) / 3.3 = 1.2 / 3.3 = 0.36

t(0.05, 22) = ?

t Table (tail probabilities of the t-distributions)

Degrees of   0.10 (0.05)   0.05 (0.025)
Freedom
...
21           1.7207        2.0796
22           1.7171        2.0739
23           1.7139        2.0687
...

Two Independent Samples t - Test

Solution (cont.): t = 1.2 / 3.3 = 0.36 < t(0.05, 22) = 2.07

[Figure: t = 0.36 falls inside the non-rejection region (−2.07, 2.07)]

Therefore, do not reject the null, i.e., no statistically significant difference was found between the two remedies.
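scipy.stats can run the pooled test directly from these summary statistics; a sketch (assumes scipy is installed):

```python
from scipy import stats

res = stats.ttest_ind_from_stats(mean1=20.1, std1=8.7, nobs1=12,
                                 mean2=18.9, std2=7.5, nobs2=12,
                                 equal_var=True)  # pooled-variance t-test
print(round(res.statistic, 2), round(res.pvalue, 2))  # 0.36 0.72
```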

Two Paired Samples t - Test

Same subject for both treatments:
- placebo (X1) versus active (X2)
- before (X1) versus after (X2)
- intra-individual comparison, e.g., left (X1) versus right (X2)

Approach: reduce the data to a one-sample t-test problem. First, calculate the difference, d = X1 − X2, for each subject; then perform a one-sample t-test on the d scores, with d.f. = n − 1.

Formulas for the Paired t - Test

Test: H0: the population mean difference is zero.

Test statistic: t = (d̄ − 0) / S(d̄), where

d̄ = Σdᵢ / n
Sd = √[ Σ(dᵢ − d̄)² / (n − 1) ]
S(d̄) = Sd / √n

Paired t - Test

Problem: Does the medication significantly lower blood pressure?

Subject   Reaction to Placebo   Reaction to Medication   d
1         150                   130                      20
2         180                   148                      32
3         148                   126                      22
4         172                   150                      22
5         160                   136                      24
Total                                                    120

Paired t - Test

Solution:

d̄ = Σdᵢ / n = 120 / 5 = 24
Sd = √[ Σ(dᵢ − d̄)² / (n − 1) ] = √22 = 4.69
S(d̄) = Sd / √n = 4.69 / √5 = 2.1
t = (d̄ − 0) / S(d̄) = 24 / 2.1 = 11.4

Critical t value: t(5−1, 0.05) = ?

(tail probabilities of the t-distributions)

Degrees of   0.10 (0.05)   0.05 (0.025)   0.01 (0.005)
Freedom
...
3            2.3534        3.1825         5.8408
4            2.1318        2.7764         4.6040
5            2.0151        2.5706         4.0323
...
Paired t - Test

At the 5% level, the critical t value = 2.78. For a two-sided test, t = 11.4 > 2.78: reject!

There is a statistically significant effect!

[Figure: t = 11.4 falls far beyond the rejection limit at 2.78]
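With the raw paired measurements available, scipy.stats runs the same test in one call; a sketch:

```python
from scipy import stats

placebo    = [150, 180, 148, 172, 160]
medication = [130, 148, 126, 150, 136]

res = stats.ttest_rel(placebo, medication)
print(round(res.statistic, 1), round(res.pvalue, 4))  # 11.4 0.0003
```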

χ² Tests

Karl Pearson (1857-1936)

Expected and observed frequencies are compared:
- Goodness of Fit of a single variable
- Test of Independence of two variables

Percentiles of the χ² - Distributions Table

Degrees of
freedom   0.100     0.050     0.010     0.005     0.001
1         2.7055    3.8415    6.6349    7.8944    10.828
2         4.6052    5.9915    9.2102    10.5963   13.8173
3         6.2514    7.8147    11.3447   12.8383   16.2672
4         7.7795    9.4877    13.2768   14.8605   18.4667
5         9.2363    11.0705   15.0864   16.7495   20.5165
6         10.6447   12.5916   16.8118   18.5479   22.4599
7         12.0171   14.0671   18.4751   20.2776   24.3219
8         13.3616   15.5073   20.0900   21.9549   26.1237
9         14.6836   16.9190   21.6658   23.5891   27.8768
10        15.9872   18.3071   23.2095   25.1886   29.5871
11        17.2750   19.6751   24.7250   26.7569   31.2628
12        18.5493   21.0261   26.2170   28.2999   32.9099
13        19.8120   22.3621   27.6882   29.8195   34.5283
14        21.0641   23.6848   29.1409   31.3198   36.1258
15        22.3071   24.9958   30.5778   32.8014   37.6973
16        23.5418   26.2962   31.9998   34.2675   39.2520
17        24.7690   27.5871   33.4086   35.7186   40.7908
18        25.9894   28.8693   34.8052   37.1562   42.3131
19        27.2036   30.1435   36.1912   38.5823   43.8206
20        28.4120   31.4104   37.5660   39.9970   45.3141
21        29.6151   32.6706   38.9321   41.4017   46.7982
22        30.8133   33.9244   40.2893   42.7955   48.2678
23        32.0069   35.1725   41.6383   44.1808   49.7262
24        33.1962   36.4151   42.9797   45.2291   51.1831
25        34.3816   37.6525   44.3144   46.9280   52.6165

χ² Tests

Goodness of Fit: observed frequencies on a single variable are compared with a corresponding set of expected values (or theoretical frequencies):

χ² = Σ (O − E)² / E,  with (# of categories − 1) d.f.

O = observed frequency
E = expected frequency

χ² Tests (One variable - Goodness of fit)

Example. Rolling a die 60 times:

Face               1     2     3     4     5     6
obs                6     8     12    15    14    5
exp                10    10    10    10    10    10
(obs − exp)        -4    -2    2     5     4     -5
(obs − exp)²       16    4     4     25    16    25
(obs − exp)²/exp   1.6   0.4   0.4   2.5   1.6   2.5

χ² Tests

χ² = 1.6 + 0.4 + 0.4 + 2.5 + 1.6 + 2.5 = 9.0, with d.f. = (# of categories) − 1 = 6 − 1 = 5.

From the Chi-sq. table, χ²(5, 0.10) = 9.2363 > 9.0, i.e., do not reject the null hypothesis; the p-value is just above 0.10 (≈ 0.11).
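A sketch of the same goodness-of-fit test with scipy (when expected frequencies are omitted, scipy assumes they are uniform, i.e., 10 per face here):

```python
from scipy import stats

observed = [6, 8, 12, 15, 14, 5]  # 60 rolls of a die
chi2, p = stats.chisquare(observed)
print(chi2, round(p, 3))          # 9.0, ~0.109
```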

χ² Tests

Test of Independence: two categorical variables are involved, and the observed and expected frequencies are compared. Here the expected frequencies are those the researcher would expect if the two variables were independent of each other.

χ² Tests of Independence

Observed:

         C1      C2      Total
R1       A       C       A+C
R2       B       D       B+D
Total    A+B     C+D     A+B+C+D

Expected: E = (row total) × (column total) / (grand total)

d.f. = (r − 1) × (c − 1)

χ² Tests of Independence (Two variables of the same sample)

Problem: Is the digital rectal exam result (DRE) independent of the biopsy result (BIOP)?

Observed:

         DRE+    DRE-    Total
BIOP+    50      20      70
BIOP-    10      20      30
Total    60      40      100

χ² Tests of Independence

Solution: expected frequency for the (BIOP+, DRE+) cell:

E = 60 × 70 / 100 = 42

(and similarly for the other three cells, using each cell's row and column totals)

χ² Tests of Independence

Solution:

O     E     (O − E)   (O − E)²   (O − E)²/E
50    42    8         64         1.52
10    18    -8        64         3.56
20    28    -8        64         2.29
20    12    8         64         5.33
                               χ² = 12.70

χ² with (r − 1) × (c − 1) d.f. = (2 − 1) × (2 − 1) = 1 d.f.

χ² Table (Percentiles of the χ² - Distributions Table)

Degrees of
Freedom   0.100    0.050    0.010     0.005     0.001
1         2.7055   3.8415   6.6349    7.8944    10.828
2         4.6052   5.9915   9.2102    10.5963   13.8173
3         6.2514   7.8147   11.3447   12.8383   16.2672
4         7.7795   9.4877   13.2768   14.8605   18.4667
...

Since χ² = 12.70 > 10.828 = χ²(1, 0.001), the two test results are not independent (p < 0.001).
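The full calculation, expected counts included, is one scipy call; note that correction=False reproduces the uncorrected χ² above (scipy applies Yates' continuity correction to 2x2 tables by default):

```python
from scipy import stats

table = [[50, 20],   # BIOP+: DRE+, DRE-
         [10, 20]]   # BIOP-: DRE+, DRE-

chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
print(round(chi2, 2), dof, p)  # 12.7, 1 d.f., p ~ 0.0004
print(expected)                # [[42. 28.] [18. 12.]]
```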

Fisher's Exact Tests

Fisher's Exact Test for small expected frequencies: if any expected frequencies are < 2, or if half of the expected frequencies are < 5, you should use Fisher's Exact Test instead of χ².

Fisher's Exact Tests

R.A. Fisher (1890-1962)

Fisher's Tea Tasting Experiment:

                 Poured First
Guess        Milk    Tea    Total
Milk         3       1      4
Tea          1       3      4
Total        4       4      8

Fisher's Exact Tests

Fisher's Tea Tasting Experiment (cont.): Based on the hypergeometric distribution, the p-value is the sum of the probabilities of all tables that give as much or more evidence in favor of the lady's claim.

Observed table (3 correct "milk" guesses) and the more extreme table (4 correct):

             Poured First              Poured First
Guess        Milk   Tea                Milk   Tea
Milk         3      1                  4      0
Tea          1      3                  0      4

p-value = P(cell (1,1) = 3) + P(cell (1,1) = 4) = 0.229 + 0.014 = 0.243

Therefore, the experiment did not establish a significant association between the actual order of pouring and the woman's guess.
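The same one-sided p-value from scipy's implementation (alternative='greater' matches the direction of the lady's claim):

```python
from scipy import stats

table = [[3, 1],   # guessed "milk": milk poured first, tea first
         [1, 3]]   # guessed "tea"

odds, p = stats.fisher_exact(table, alternative='greater')
print(round(p, 3))  # 0.243
```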

A Summary of Statistics

Population (parameters μ, σ) → sampling (probability) → Sample (statistics X̄, S). Descriptive statistics summarize the sample; inference (inferential statistics) runs back from sample to population, built on the normal, t, and chi-sq distributions, the sampling distribution and Central Limit Theorem, CIs, hypothesis testing, p-values, and Type I & II errors.

Choosing an analysis by the type of outcome and independent variables:
- Continuous outcome: one sample t-test; two samples t-test (paired t-test if paired)
- Categorical outcome: one categorical variable - chi-sq GOF test; two categorical variables - chi-sq independence test, or Fisher's exact test (small expected frequencies)
- Continuous independent variables: linear regression
- Categorical independent variables: analysis of variance (ANOVA)
- A mixture of cont. and categ. independent variables: general linear models
- Binary outcome with a mixture of independent variables: logistic regression
- Time to event data: survival analysis
- If no distribution assumptions: non-parametrics
- Others: longitudinal analysis, factor analysis, cluster analysis, principal component analysis, etc.

Local Biostatistical Resources

Courses:
1. SOM: Introduction to Preventive Medicine - Biostatistics section: 6 hours
2. Preventive Medicine Residency Program: Principles of Biostatistics: ~24 hours
3. MPH: Introduction to Biostatistics; Advanced Biostatistics
4. Clinical Scholar Training Program

GCRC: biostatistical support for GCRC projects

Biostatistical Consulting Core:
http://www.uhmc.sunysb.edu/prevmed/biostat.htm

Review of Lecture 3

1. Hypothesis testing
2. Types I & II errors, statistical power
3. p-value
4. One-sample t-test
5. Two independent samples t-test
6. Two paired samples t-test
7. Chi-squared test & Fisher's exact test
8. Local biostatistical resources

Lecture notes:

http://ms.cc.sunysb.edu/~jjchen
