E370

10/26/2014
Inferential Methods:
Confidence Interval
Estimation
Discrete w/ Probability Distribution: Single variable with probability distribution; E(X), V(X)

Linear Combinations: Multiple variables with known expected values; E(aX + bY), V(aX + bY)

Bernoulli: Single trial resulting in one of two mutually exclusive and collectively exhaustive outcomes.
One parameter, $\pi$; $E(X) = \pi$, $V(X) = \pi(1-\pi)$

Binomial: $X \sim B(n, \pi)$. Repeated, independent Bernoulli trials with constant probability of success.
Parameters $n$, $\pi$; $E(X) = n\pi$, $V(X) = n\pi(1-\pi)$

Uniform: $X \sim U(a, b)$. Simplest continuous distribution; pdf is horizontal, parallel to X axis.
Parameters $a$, $b$; $E(X) = \dfrac{a+b}{2}$, $V(X) = \dfrac{(b-a)^2}{12}$

Normal: $X \sim N(\mu, \sigma)$. Bell-shaped and symmetric. Parameters $\mu$, $\sigma$

Utility Distributions: Standard Normal, $Z \sim N(0, 1)$; Student's t, $t(df)$

Sampling Distributions:
Sample mean $\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$ if X~N or if n > 30.
Sample proportion $p \sim N\left(\pi, \sqrt{\dfrac{\pi(1-\pi)}{n}}\right)$ if X is a binomial, and $n\pi > 5$ AND $n(1-\pi) > 5$

Confidence Interval Estimation

To estimate the Population Mean when $\sigma$ is known: $\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

To estimate the Population Mean when $\sigma$ is unknown: $\bar{x} \pm t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}}$

To estimate the Population Proportion when $n\pi > 5$ and $n(1-\pi) > 5$: $p \pm z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$
Parameters are important!
We never know parameters,
but, we WANT to know parameters!
About the closest we are going to get is to make
educated guesses about parameter values.
Our first inferential task is to figure out how to
estimate parameters.
We will focus on estimating the population mean, $\mu$,
and the population proportion, $\pi$.

The first real inferential method


There are two kinds of estimators we will think
about in this class: point estimators
and interval estimators.
Point estimators are single values used to estimate
the parameter of interest.
There are several possible estimators for each
parameter; the population mean can be estimated by a
single observation drawn from the population, the sample
mean, or the sample median.
Interval estimators produce a range of values a
parameter might reasonably take. The most common
interval estimator is the Confidence Interval.

Estimators
We have options when we choose point
estimators, so criteria have been developed for
selecting the best point estimator.
The best estimators are unbiased; on average they
hit the parameter they estimate.
An estimator is unbiased if its expected value is equal to the
parameter being estimated.
The best unbiased estimators are those that vary the
least.
An efficient estimator is one that has the smallest variance
of all unbiased estimators.

A word about point estimators


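The unbiased estimator for the population
mean with the smallest variance is the sample
mean.
$E(\bar{X}) = \mu$ and $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$, which is smaller than $\sigma$ (and smaller than the standard error of the sample median)

The unbiased estimator for the population
proportion with the smallest variance is the
sample proportion.
$E(p) = \pi$ and $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$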
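The claim about the sample mean can be checked with a small simulation. This is only a sketch: the normal population, its parameters ($\mu = 10$, $\sigma = 2$), the sample size, and the function name are all assumptions chosen for illustration, not values from the course.

```python
import random
import statistics

def estimator_variances(n=25, reps=2000, mu=10.0, sigma=2.0, seed=1):
    """Variance of three estimators of the mean across repeated samples.

    Assumes a normal population; mu, sigma, n, and reps are illustrative.
    """
    rng = random.Random(seed)
    singles, means, medians = [], [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        singles.append(sample[0])                    # a single observation
        means.append(statistics.fmean(sample))       # the sample mean
        medians.append(statistics.median(sample))    # the sample median
    return {
        "single": statistics.pvariance(singles),
        "mean": statistics.pvariance(means),
        "median": statistics.pvariance(medians),
    }

variances = estimator_variances()
print(variances)
```

Under these assumptions the sample mean should show the smallest variance of the three, close to $\sigma^2/n$.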

So, the best are . . .?


. . .about these best estimators.

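We know the circumstances under which their
distributions will be normal.

$\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$ if the population is normally distributed or if
n > 30.
$p \sim N\left(\pi, \sqrt{\dfrac{\pi(1-\pi)}{n}}\right)$ if $n\pi > 5$ and $n(1-\pi) > 5$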

We know something else . . .


Point estimators are good as far as they go,
which is to give us a single best guess for the
value of the parameter we want to know.
But we know from our work with sampling
distributions that sample means and sample
proportions change with the sample.
So, while a sample mean is the best single value we
have, we don't know how good it is. We need some
idea of just how close our guess is to the parameter.

But we don't stop here . . .


Turns out that using an interval estimator with
the best point estimators gives us just what we
need.
A Confidence Interval is a point estimate in the
middle of an interval estimate.
The interval provides wiggle room which captures
the variation due to the randomness of the point
estimate.
We select how precise we want our estimate to be
and calculate the correct amount of wiggle room.

A point estimator with friends


A Confidence Interval is an interval estimate centered at a point estimate.
It is most often (and always in this class) a
statistic (point estimate) plus and minus a margin of
error (the wiggle room).
The distance between the point estimate and the
lower end of the interval is always the same as the
distance between the point estimate and the upper
end of the interval.
We pick the level of precision we want and calculate
an interval in the units we need to solve our problem.

The Confidence Interval


Point Estimate: the center point of the interval
estimate.
Level of Confidence: how sure we are of our estimate,
expressed as a probability; the area under the curve above the interval.
Critical Values: the level of confidence translated into
a number of standard deviations away from the point
estimate.
Alpha ($\alpha$): the area under the curve outside the interval,
1 - Level of Confidence.
Margin of Error: the wiggle room; the distance from
the point estimate to either end of the interval.
The Interval: the upper and lower limits of the
interval, or the point estimate $\pm$ the margin of error.

Confidence Interval Language


Parts:
Level of Confidence
Alpha
Critical Values
Point Estimate
Margin of Error
The Interval

Anatomy of a Confidence Interval
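Mean, $\sigma$ known: $e = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
$z_{\alpha/2}$ =ABS(NORM.S.INV(α/2)) or =NORM.S.INV(1-α/2)

Mean, $\sigma$ unknown: $e = t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}}$
$t_{\alpha/2,\,n-1}$ =ABS(T.INV(α/2, n-1)) or =T.INV(1-α/2, n-1)

Proportion: $e = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$
$z_{\alpha/2}$ =ABS(NORM.S.INV(α/2)) or =NORM.S.INV(1-α/2)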
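For readers working outside Excel, the two equivalent Z critical-value expressions above can be sketched in Python with the standard library's `statistics.NormalDist`. The function name `z_critical` is illustrative. The t critical value is left out because the standard library has no t distribution.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Z critical value for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    std_normal = NormalDist()  # Z ~ N(0, 1)
    from_lower_tail = abs(std_normal.inv_cdf(alpha / 2))   # like =ABS(NORM.S.INV(alpha/2))
    from_upper_tail = std_normal.inv_cdf(1 - alpha / 2)    # like =NORM.S.INV(1-alpha/2)
    # Both routes give the same number because Z is symmetric about 0.
    return from_lower_tail, from_upper_tail

for level in (0.90, 0.95, 0.99):
    lower_route, upper_route = z_critical(level)
    print(level, round(lower_route, 3), round(upper_route, 3))
```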

Margins of Error
The Cherry Farmer's Co-op on the peninsula in
Grand Traverse Bay in Michigan is responsible for
packing and shipping the annual cherry harvest.
The co-op coordinator needs to know the average
diameter of the cherries at harvest so that
appropriate packing boxes can be ready for the
different cherry varieties. A random sample of 100
Bing cherries was drawn and measured. The mean
of the sample was 1.2 inches. The standard
deviation of Bing cherries is believed to be 0.1
inches. Calculate a 99% confidence interval for the
population mean diameter of Bing cherries.

An application
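What we know:
n = 100, $\bar{x} = 1.2$, $\sigma = 0.1$
What we want: An estimate of mean Bing cherry
diameter at a level of confidence of 99%.
Point Estimate
1.2 inches
Margin of error, e
$e = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
$z_{0.005}$ =NORM.S.INV(0.995)= 2.576
$e = 2.576 \times \dfrac{0.1}{\sqrt{100}} = 2.576 \times 0.01 = 0.0258$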
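The cherry calculation can be reproduced with a short Python sketch. The function name `mean_ci_known_sigma` is made up for illustration; the inputs are the values given in the problem.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    e = z * sigma / sqrt(n)                             # margin of error
    return x_bar - e, x_bar + e, e

# Bing cherries: n = 100, sample mean 1.2 in, sigma 0.1 in, 99% confidence
lower, upper, e = mean_ci_known_sigma(1.2, 0.1, 100, 0.99)
print(f"e = {e:.4f}, interval = [{lower:.3f}, {upper:.3f}]")
```

Rounded to two decimals, the limits match the 1.17 to 1.23 inches quoted in the interpretation.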

Calculations
We have 99% confidence that the mean
diameter of Bing cherries is between 1.17 and
1.23 inches.
State the level of confidence.
Specify the parameter you are estimating.
State the interval.
You should always know the sample size that was
used in the estimate.

Required Components for an interpretation
In Denver, Colorado, there was a proposal to establish
an Extraterrestrial Affairs Commission to track space
aliens. "We hope to get the message out that
technology from extraterrestrial origins has been
withheld, and that it can cure cancer," says Jeff
Peckman, the director of the ballot initiative. Mr.
Peckman paid for his own polling. He questioned a
random sample of 199 of the 3974 signatures he
obtained to get the initiative on the ballot. Of those
questioned, 102 were in favor of the initiative.
Calculate a 90% confidence interval for the proportion
of Denver residents in favor of this proposal.

Extraterrestrial Affairs in Denver


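What we know:
n = 199, x = 102
What we want: An estimate of the proportion of Denver residents
in favor of establishing an Extraterrestrial Affairs Commission at a
level of confidence of 90%.
Alpha?
$\alpha$ = 0.10
Point Estimate
$p = \dfrac{x}{n} = \dfrac{102}{199} = 0.5126$
The Random Variable?
$p \sim N\left(0.5126,\ \sigma_p = \sqrt{\dfrac{0.5126(0.4874)}{199}} = 0.0354\right)$
Margin of error, e
$e = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$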

What we know

$z_{0.05}$ =NORM.S.INV(0.95)= 1.645
$e = z_{0.05}\,\sigma_p = 1.645 \times 0.0354 = 0.0582$

The 90% confidence interval:
0.5126 $\pm$ 0.0582
OR
[0.4544, 0.5708]
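The same steps in Python, as a sketch; `proportion_ci` is an illustrative name, and the inputs are the poll counts from the problem.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Confidence interval for a proportion using the normal approximation."""
    p = successes / n                                   # point estimate
    se = sqrt(p * (1 - p) / n)                          # standard error of p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    e = z * se                                          # margin of error
    return p, e, (p - e, p + e)

# Denver poll: 102 of 199 in favor, 90% confidence
p, e, (lower, upper) = proportion_ci(102, 199, 0.90)
print(f"p = {p:.4f}, e = {e:.4f}, interval = [{lower:.4f}, {upper:.4f}]")
```

The last digit can differ slightly from the hand calculation because the code carries unrounded values of z and the standard error.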

Calculations
Mr. Peckman is 90% confident that the
proportion of all Denver residents in favor of
the establishment of an Extraterrestrial Affairs
Commission is 51% plus or minus 6%.

OR

Mr. Peckman is 90% confident that the
proportion of all Denver residents in favor of
the establishment of an Extraterrestrial Affairs
Commission is between 45% and 57%.

Interpret this interval.


Take a look at this set of 100 90% confidence
intervals, calculated assuming that the true
proportion of support for Mr. Peckman's
proposal is 51%.
What do you think it means to be 90%
confident?
Let's take a look at the picture.
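A simulation gives one way to see what the picture is showing. The sketch below assumes a true proportion of 0.51 and samples of 199. It draws 2000 samples rather than the slide's 100 so the count is more stable; that number and the random seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(true_pi=0.51, n=199, confidence=0.90, reps=2000, seed=7):
    """Share of simulated confidence intervals that contain the true proportion."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(reps):
        # Draw one sample of n yes/no answers and build its interval.
        successes = sum(rng.random() < true_pi for _ in range(n))
        p = successes / n
        e = z * sqrt(p * (1 - p) / n)
        if p - e <= true_pi <= p + e:
            hits += 1
    return hits / reps

rate = coverage_rate()
print(f"{rate:.1%} of the simulated intervals contain the true proportion")
```

Roughly 90% of the intervals should capture 0.51. The confidence level describes how often the procedure works across samples, not the chance that any one interval is right.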

What does it mean to be confident?
Mr. Peckman's prognostication was unsound.
The actual proportion of Denver residents in
favor of his initiative was 15%.
Let's calculate a 90% confidence interval using
the actual percentage and compare the two
intervals.
The center of the interval changes and so does the
standard error of the proportion.
$p \sim N\left(0.15,\ \sigma_p = \sqrt{\dfrac{0.15(0.85)}{199}} = 0.0253\right)$
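To compare the two intervals side by side, here is a small sketch. It assumes the same sample size, n = 199, for both cases; the helper name `standard_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(p, n):
    """Standard error of a sample proportion."""
    return sqrt(p * (1 - p) / n)

n = 199                         # assumption: same sample size as the poll
z = NormalDist().inv_cdf(0.95)  # critical value for 90% confidence

for label, p in (("poll", 102 / 199), ("actual", 0.15)):
    se = standard_error(p, n)
    e = z * se
    print(f"{label}: p = {p:.4f}, se = {se:.4f}, "
          f"interval = [{p - e:.4f}, {p + e:.4f}]")
```

The interval built around 0.15 is both shifted and narrower, since p(1 - p) shrinks as p moves away from 0.5.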

Unfortunately,
Dr. Gupta of CNN's "Paging Dr. Gupta" reported that a poll
of 49 American adults reported 48% exceeding the
recommended dose of over-the-counter (non-prescription)
drugs. Calculate a 95% confidence interval for the
proportion of American adults who exceed the
recommended dose of over-the-counter (non-prescription)
drugs.
What is the random variable?

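$p \sim N\left(0.48,\ \sigma_p = \sqrt{\dfrac{0.48(0.52)}{49}} = 0.0714\right)$

What is e?
$e = 1.96 \times 0.0714 = 0.1399$, so the interval is $0.48 \pm 0.1399$
Or [0.3401, 0.6199]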
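With a sample this small it is worth checking the normal-approximation conditions before computing the interval. The sketch below does both. It uses the sample proportion as a stand-in for the unknown population proportion in the check, which is an assumption, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def checked_proportion_ci(p, n, confidence):
    """Proportion interval that first checks the n*pi > 5 style conditions."""
    # Assumption: the sample proportion p stands in for the unknown
    # population proportion when checking the conditions.
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation conditions are not met")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    e = z * sqrt(p * (1 - p) / n)
    return p - e, p + e

# Poll of 49 adults, 48% exceeding the recommended dose, 95% confidence
lower, upper = checked_proportion_ci(0.48, 49, 0.95)
print(f"[{lower:.4f}, {upper:.4f}]")
```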

Another application
An important use of confidence intervals is
calculating the minimum n for a particular level
of confidence. We use the margin of error, e, to
estimate n.

$e = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

Solve for n: $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{e}\right)^2$

Minimum Sample Size


Lara Giddings, the attorney general for the island
state of Tasmania, stated that Australian wallabies
had been found creating crop circles in fields of
poppies after consuming some of the opiate-laden
crop and running in circles. In order to estimate the
average length of time the wallabies would run in
circles after consuming poppies, a small study
revealed a standard deviation of 5.2 minutes. At a
level of 98%, what size sample is necessary to
estimate the mean circle time for all wallabies
within 30 seconds?

Some practice


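$n = \left(\dfrac{z_{\alpha/2}\,\sigma}{e}\right)^2$

$\sigma = 5.2$ minutes; at a level of 98%, $z_{0.01} = 2.326$

30 seconds = 0.5 minutes, so $e = 0.5$

$n = \left(\dfrac{2.326 \times 5.2}{0.5}\right)^2 = 585.2$, so round up to 586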
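A sketch of the same calculation in Python. `min_sample_size_mean` is an illustrative name, and rounding up with `ceil` reflects the fact that a fraction of a wallaby cannot be sampled.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_mean(sigma, e, confidence):
    """Smallest n that keeps the margin of error for a mean at or below e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # always round up to a whole observation

# sigma = 5.2 minutes, e = 30 seconds = 0.5 minutes, 98% confidence
n_wallabies = min_sample_size_mean(5.2, 0.5, 0.98)
print(n_wallabies)
```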

How many Wallabies?


Proportions work the same way as the process
for means.
$e = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$

Solve for n: $n = p(1-p)\left(\dfrac{z_{\alpha/2}}{e}\right)^2$

. . .and proportions. . .
In 2008, when Doritos broadcast the first ever
advertisement directed towards potential
extraterrestrial life, 61% of UK residents sampled by
BBC-Lite thought that regular communication
with an alien species was an excellent idea. At
a level of 99%, how many UK residents would
need to be polled to estimate the proportion of
UK residents in favor of communication with
aliens with an interval no wider than 5%?

more practice

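$n = p(1-p)\left(\dfrac{z_{\alpha/2}}{e}\right)^2$

$p = 0.61$; at a level of 99%, $z_{0.005} = 2.576$

A width of 5%? What is e?
$e = \dfrac{0.05}{2} = 0.025$

$n = 0.61(0.39)\left(\dfrac{2.576}{0.025}\right)^2 = 2525.8$, so round up to 2526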
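The proportion version can be sketched the same way. The function name is illustrative, and the margin of error is taken as half of the requested interval width.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_proportion(p, width, confidence):
    """Smallest n that keeps a proportion interval no wider than `width`."""
    e = width / 2                                       # margin of error is half the width
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p * (1 - p) * (z / e) ** 2)

# Prior estimate p = 0.61, interval no wider than 5%, 99% confidence
n_residents = min_sample_size_proportion(0.61, 0.05, 0.99)
print(n_residents)
```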

How many UK residents?


Precision! a function of . . .
a high level of confidence
a narrow interval

How do we get these?


Three things affect the width of the interval
level of confidence
sample size
population standard deviation/population
proportion

What do we want in a Confidence Interval?
Two are under our control
level of confidence
sample size
Level of Confidence: Ceteris paribus, the higher the
level of confidence, the wider the interval.
Sample Size, n: Ceteris paribus, the larger the
sample size, the narrower the interval.
Population Standard Deviation/Population Proportion
Ceteris paribus, the larger the Population Standard
Deviation, the wider the interval.
Ceteris paribus, the closer $\pi$ is to 0.5, the wider the
interval.
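These ceteris paribus statements can be checked numerically. The sketch below uses made-up inputs (a standard deviation of 0.1, sample sizes of 100 and 400) purely for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_crit(confidence):
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def mean_ci_width(sigma, n, confidence):
    """Full width (2e) of a known-sigma interval for a mean."""
    return 2 * z_crit(confidence) * sigma / sqrt(n)

def proportion_ci_width(p, n, confidence):
    """Full width (2e) of an interval for a proportion."""
    return 2 * z_crit(confidence) * sqrt(p * (1 - p) / n)

# Higher confidence gives a wider interval.
print([round(mean_ci_width(0.1, 100, c), 4) for c in (0.90, 0.95, 0.99)])
# Larger n gives a narrower interval (4 times the n halves the width).
print([round(mean_ci_width(0.1, n, 0.95), 4) for n in (100, 400)])
# Larger sigma gives a wider interval.
print([round(mean_ci_width(s, 100, 0.95), 4) for s in (0.1, 0.2)])
# Proportions closer to 0.5 give a wider interval.
print([round(proportion_ci_width(p, 100, 0.95), 4) for p in (0.1, 0.3, 0.5)])
```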

How do they affect it?


It is often forgotten that the choice of Z or t
affects the width of an interval.
Let's review the relationship between Z and t.
t is an approximation of Z.
They are both centered at 0.
Their units are the number of standard deviations
away from the mean.
The t is shorter in the center and taller in the tails
than the Z, that is, there is less probability in the center
of a t and more in the tails compared to the Z.
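The shape comparison can be made concrete by evaluating both density curves. This sketch writes out the standard t density formula by hand, since the standard library has no t distribution. The choice of 5 degrees of freedom is arbitrary.

```python
from math import exp, lgamma, pi, sqrt

def z_pdf(x):
    """Density of the standard normal, Z ~ N(0, 1)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    # lgamma avoids overflow in the gamma function for large df.
    coeff = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

for x in (0.0, 3.0):
    print(f"x = {x}: Z density = {z_pdf(x):.4f}, t(5) density = {t_pdf(x, 5):.4f}")
```

At the center the t curve sits below Z, and out in the tail it sits above, so a t critical value is larger than the matching Z value and the interval is wider.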

Anything else?
[Figure: density curves of Z and t]
