DeAnna Brewster
Math 1040
Professor Hilton
12/5/2016
Math 1040 Skittles Term Project-Part 2
Candy Color is a Qualitative Variable because it describes the color (a characteristic) of the candy. The level of
measurement is Nominal because color describes a category: there is no natural order, and neither differences
nor ratios make sense.
Number of Candies per bag is a Quantitative-Discrete Variable because it is a numerical measure and is
countable without skipping any values. The level of measurement is Ratio because it is quantitative data, there
is a natural order, differences do make sense and the natural zero means none.
Total number of Skittles Counted = Sample Size = 3551
Summary:
Color     Count   Percent
Red         716    20.16%
Orange      698    19.66%
Yellow      726    20.44%
Green       710    19.99%
Purple      701    19.74%
Total      3551   100%
Pie Chart
Pareto Chart
Summary statistics:
Column: Total Candies Per Bag
n = 60
Mean = 59.183333
Std. dev. = 3.1108976
Median = 59
Range = 14
Min = 50
Max = 64
Q1 = 58
Q3 = 61.5
IQR = 3.5
Mode = 59
Fences:
Lower Fence = Q1 - 1.5(IQR) = 58 - 1.5(3.5) = 52.75
Upper Fence = Q3 + 1.5(IQR) = 61.5 + 1.5(3.5) = 66.75
Outliers = 50, 52
The total number of candies in my bag was 61, so my bag was not one of the outliers.
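The fence arithmetic above can be checked with a short script. This is only an illustrative sketch: the function name `tukey_fences` and the short list of bag counts are made up for the example, and the quartiles are the ones reported in the summary statistics.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported for candies per bag in the class data.
lower, upper = tukey_fences(58, 61.5)

# Hypothetical short list: the two flagged bags plus the author's own bag (61).
bag_counts = [50, 52, 61]
outliers = [c for c in bag_counts if c < lower or c > upper]

print(lower, upper, outliers)
```

Any bag count below the lower fence or above the upper fence is flagged, which is why 50 and 52 are outliers and 61 is not.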
Shape: For the qualitative data, the graphs don't show dispersion (how spread out the data is), and
on a Pareto chart the categories are ordered from highest to lowest frequency. The color counts look
roughly uniform.
For the quantitative data, the mean (59.18) is a little larger than the median (59), which would
suggest a slight right skew. Just by looking at the graphs, though, they look a little skewed left,
probably because of the two low outliers.
Systematic sample:
Positions selected: 2, 12, 22, 32, 42, 52
Height: 64, 70, 61, 80, 65, 66
Candies: 52, 57, 58, 59, 61, 62
Problem 1: Suppose you are going to randomly select two Skittles from the bag YOU purchased.
(a) What is the probability that both Skittles are purple if you select them with replacement?
9/61 * 9/61 = .0218
(b) What is the probability that both Skittles are purple if you select them without replacement?
9/61 *8/60 = .0197
(c) What is the probability that at least one Skittle is purple if you select them with replacement?
1 - P(neither is purple) = 1 - (52/61 * 52/61) = .2733
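As a cross-check on Problem 1, the three probabilities can be computed with exact fractions. This is a sketch rather than the calculator method used in the report; the variable names are illustrative, and the counts (9 purple out of 61) are taken from the answers above.

```python
from fractions import Fraction

total, purple = 61, 9  # counts from the purchased bag, as used above

# (a) Both purple, with replacement: the two draws are independent.
with_replacement = Fraction(purple, total) ** 2

# (b) Both purple, without replacement: the second draw has one fewer purple.
without_replacement = Fraction(purple, total) * Fraction(purple - 1, total - 1)

# (c) At least one purple, with replacement: complement of "neither is purple".
at_least_one = 1 - Fraction(total - purple, total) ** 2

print(round(float(with_replacement), 4))
print(round(float(without_replacement), 4))
print(round(float(at_least_one), 4))
```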
Problem 2: Suppose all of the Skittles in the class data set are combined into one large bowl and you are going
to randomly select one Skittle.
(a) What is the probability that you select a green Skittle?
710/3551 = .1999
(b) What is the probability that you select a Skittle that is NOT green?
1 - P(green) = 1 - 710/3551 = .8001
(c) What is the probability that you select a Skittle that is red OR yellow?
716/3551 +726/3551 = .4061
(d) What is the probability that you select a Skittle that is orange GIVEN that it is a secondary color
(secondary colors are green, orange and purple)?
698/2109 = .3310
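The Problem 2 answers all follow from the class color counts, so they can be reproduced in a few lines. This is a minimal sketch; the dictionary layout and variable names are assumptions made for the example.

```python
# Class totals by color, from the summary table.
counts = {"red": 716, "orange": 698, "yellow": 726, "green": 710, "purple": 701}
total = sum(counts.values())

p_green = counts["green"] / total
p_not_green = 1 - p_green  # complement rule

# Red and yellow are mutually exclusive, so the probabilities add.
p_red_or_yellow = (counts["red"] + counts["yellow"]) / total

# Conditional probability: restrict the sample space to secondary colors.
secondary = ("green", "orange", "purple")
p_orange_given_secondary = counts["orange"] / sum(counts[c] for c in secondary)

print(round(p_green, 4), round(p_not_green, 4))
print(round(p_red_or_yellow, 4), round(p_orange_given_secondary, 4))
```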
Problem 3: Suppose all of the Skittles in the class data set are combined into one large bowl and you are going
to randomly select ten Skittles with replacement and count how many are yellow.
(a) Show that this meets the requirements of the binomial probability distribution and identify n and p.
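n = 10, p = .2044
1. There is a fixed number of trials (n = 10).
2. The trials are independent because the Skittles are selected with replacement.
3. Each trial has only two outcomes: yellow or not yellow.
4. The probability of success is the same for every trial (p = .2044).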
(b) What is the probability that exactly 4 of the 10 Skittles are yellow? Calc>Vars>Binompdf
Trials = 10, p = .2044, x = 4
P(exactly 4 of 10 are yellow) = .0930
(c) For samples of size 10, what is the expected value and standard deviation for the number of yellow
skittles that will be included?
n = 10, p = .2044
Expected value = n*p = 10*.2044 = 2.044
Standard deviation = $\sqrt{np(1-p)} = \sqrt{10(.2044)(.7956)}$ = 1.2752
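The binomial results in Problem 3 can also be reproduced without a calculator. The sketch below writes out the binomial probability formula directly instead of calling Binompdf; the helper name `binom_pmf` is hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.2044  # ten draws with replacement; p = proportion of yellow in the class data

prob_exactly_4 = binom_pmf(4, n, p)
expected_value = n * p
std_dev = math.sqrt(n * p * (1 - p))

print(round(prob_exactly_4, 4), round(expected_value, 3), round(std_dev, 4))
```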
Problem 4: For this problem, treat a 2.17 ounce bag of Skittles as an individual. Suppose the values for our class
data are the parameter values for all 2.17 ounce bags of Skittles. In other words, assume $\mu$ = mean number of
candies per bag in our class data set and $\sigma$ = standard deviation of number of candies per bag in our class data.
$\mu$ = 59.18, $\sigma$ = 3.111
(a) Describe the sampling distribution for the mean number of candies per bag for samples of 32 bags.
Include center, spread and shape. Note: The shape of the SAMPLING DISTRIBUTION is different
from the shape of the population, which you determined in Part 2 of the project.
Center: The mean number of candies per bag for the sample size of 32 bags equals the mean of the
population at 59.18. The balancing point stays the same.
Spread: The standard deviation of the distribution of the sample mean = $\sigma/\sqrt{n}$ = 3.111/$\sqrt{32}$ =
.5499522991, less than the standard deviation of the population. The spread will get smaller as the sample
size gets bigger.
Shape: The shape is approximately normal since the sample size is greater than 30; in this case the
sample size is 32.
(b) What is the probability that the mean number of candies per bag for a sample of 32 bags is greater than
58.5? Calc>2nd>Vars>normalcdf
lower 58.5, upper 1E99, $\mu$ = 59.18, $\sigma_{\bar{x}}$ = 3.111/$\sqrt{32}$
P($\bar{x}$ > 58.5) = 1 - .1081418575 = .8919, or approximately 89.2%
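The sampling-distribution calculation can be sketched with Python's standard library in place of normalcdf. The variable names are illustrative, and the parameter values are the ones stated for Problem 4.

```python
import math
from statistics import NormalDist

mu, sigma, n = 59.18, 3.111, 32  # population values assumed in Problem 4

# Standard deviation of the sample mean (standard error).
standard_error = sigma / math.sqrt(n)

# The sample mean is approximately normal with mean mu and this standard error.
sampling_dist = NormalDist(mu, standard_error)
p_mean_above = 1 - sampling_dist.cdf(58.5)

print(round(standard_error, 4), round(p_mean_above, 4))
```

Note that the spread passed to the normal model is the standard error, not the population standard deviation of 3.111.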
Population Proportion
Verify that $n\hat{p}(1-\hat{p}) \geq 10$ (the normality condition).
Verify that $n \leq 0.05N$ (the sample size is no more than 5% of the population size, the
independence condition).
Population Mean
Sample data came from a Simple Random Sample or randomized experiment.
Sample size is small relative to the population size ($n \leq 0.05N$).
The data came from a population that is normally distributed, or the sample size is large.
A $(1-\alpha) \cdot 100\%$ confidence interval for $\mu$ is given by
Lower Bound: $\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Upper Bound: $\bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
where $t_{\alpha/2}$ is computed using $n - 1$ degrees of freedom.
Using values for the class data that you computed in Part 2 of the project, construct a 99% confidence
interval estimate for the true proportion of yellow candies using the class data as your sample.
Remember that for this computation, n is the number of CANDIES for the entire class data. Include all
your work, showing the formula used and appropriate values inserted (neatly written and scanned or
typed). (10 points) Calc>Stats>Tests> 1-PropZInt
x = 726 (number of yellow Skittles in the class data)
n = 3551 (total number of Skittles in the class data)
C-Level = .99
Lower Bound = .18702, Upper Bound = .22188, giving the interval (.18702, .22188)
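For reference, the same interval can be reproduced by hand with the usual one-proportion z-interval formula, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below assumes that formula is what 1-PropZInt computes; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, confidence = 726, 3551, 0.99  # yellow count, total Skittles, confidence level

p_hat = x / n
# Critical value z_{alpha/2} from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 5), round(upper, 5))
```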
Based on your interval for the true proportion of yellow candies, was the proportion of yellow candies in
the single bag of candy you purchased a likely value for the true population proportion? Explain how
you know using actual values from your data and computations. (5 points)
No. The bag I purchased had 11 yellow candies out of a total of 61, for a proportion of .180. That falls
just below the lower bound of .187, so it is not within the interval from .187 to .222.
Using values you computed in Part 2 of the project, construct a 95% confidence interval estimate for the
true mean number of candies per bag using the class data as your sample, but for this computation, n is
the number of BAGS. Include all your work, showing the formula used and appropriate values inserted
(neatly written and scanned or typed). (10 points)
Calc>Stats> Tests> Tinterval> Stats
n = 60, which is greater than 30, so I did not take out the outliers
$\bar{x}$ = 59.183333
$s$ = 3.1108976
C-Level = .95
Lower Bound = 58.38, Upper Bound = 59.987, giving the interval (58.38, 59.987)
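The t-interval can be sketched by hand as $\bar{x} \pm t_{\alpha/2} \cdot s/\sqrt{n}$. Python's standard library has no t-distribution, so the critical value below is hard-coded as an assumption: 2.0010 is the t-table value for 59 degrees of freedom and 95% confidence.

```python
import math

n, x_bar, s = 60, 59.183333, 3.1108976  # number of bags, sample mean, sample std. dev.

# ASSUMPTION: critical value looked up from a t-table (df = n - 1 = 59, 95% confidence).
t_crit = 2.0010

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(lower, 3), round(upper, 3))
```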
Based on your interval for the true mean number of candies per bag, was the total number of candies in
the single bag you purchased a likely value for the population mean? Explain how you know using
actual values from your data and computations. (5 points)
My bag of Skittles contained 61 candies, so it does not fall within the interval of likely values for the
population mean, 58.38 to 59.987.
A hypothesis test is a procedure based on sample results and probability that tests hypotheses about a
population. It is used to determine whether there is enough evidence in a sample of data to infer that a certain
condition is true for the entire population.
A hypothesis test examines two opposing hypotheses about a population: the null hypothesis and the alternative
hypothesis. The null hypothesis is the statement being tested indicating no change, effect, difference or
relationship in the population. It is assumed to be true until evidence indicates otherwise. The alternative
hypothesis is the statement that you are trying to find evidence to support.
Based on the sample data, the test determines whether to reject the null hypothesis.
Using values for the class data that you computed in Part 2 of the project and a 0.05 significance level,
test the claim that 20% of all Skittles candies are red. Show all the steps (neatly written and scanned,
typed, or copied from StatCrunch) including:
1. The hypotheses with correct notation (4 points)
$H_0: p = 0.20$
$H_1: p \neq 0.20$
2. The conditions for performing the hypothesis test, along with checking that they are met (hint:
they are not all met!) (5 points)
1. Simple Random Sample: This requirement was not met. Our entire class was assigned
to purchase a bag of Skittles; we did not use chance or an objective device to select who would
purchase bags of Skittles from the population for the sample. This was a
convenience sample.
2. $np_0(1-p_0) \geq 10$: 3551(.20)(1-.20) = 568.16 $\geq$ 10
3. $n \leq .05N$: The sample size of 3551 Skittles is less than 5% of all the Skittles in the population.
3. The test statistic (2 points)
$z_0$ = .2433
4. The p-value (2 points)
Calc>Stat>Tests> 1-PropZTest ($p_0$: .20, x: 716, n: 3551)
P = .8078
5. The appropriate decision about the null hypothesis and an appropriate conclusion (4 points)
P = .8078, which is greater than .05.
We do not reject the null hypothesis because there is insufficient evidence to conclude that $H_1$ is true.
There is insufficient evidence to conclude that the proportion of red Skittles does not equal .20.
6. Also describe the Type I and Type II errors for this test. (8 points)
Type I Error: We would conclude that the proportion of red Skittles is not equal to .20
when it really is equal to .20.
Type II Error: We would fail to conclude that the proportion of red Skittles
does not equal .20 when the proportion really does not equal .20.
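The test statistic and p-value for the red-proportion test can be reproduced with the standard one-proportion z-test formula, $z_0 = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n}$. This is a sketch under the assumption that the test is two-tailed, as the conclusion above implies; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0 = 716, 3551, 0.20  # red count, total Skittles, claimed proportion

p_hat = x / n
z_stat = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-tailed p-value: probability of a statistic at least this extreme in either direction.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

alpha = 0.05
reject_null = p_value < alpha
print(round(z_stat, 4), round(p_value, 4), reject_null)
```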
Using values for the class data that you computed in Part 2 of the project and a 0.01 significance level,
test the claim that the mean number of candies in a bag of Skittles is more than 58. Show all the steps
(neatly written and scanned, typed, or copied from StatCrunch) including:
1. The hypotheses with correct notation (4 points)
$H_0: \mu = 58$
$H_1: \mu > 58$
2. The conditions for performing the hypothesis test, along with checking that they are met (hint:
they are not all met!) (5 points)
1. The sample is obtained using a Simple Random Sample or from a randomized experiment.
This requirement was not met. Our entire class was assigned to purchase a bag of
Skittles; we did not use chance or an objective device to select who would purchase bags of
Skittles from the population for the sample. This is a convenience sample.
2. The sample has no outliers and comes from a normal population, or the sample size (n) is
$\geq$ 30. There are outliers in our sample, but we have a sample size of 60 $\geq$ 30.
3. The sample values are independent of each other.
3. The test statistic (2 points) Calc>Stats>Tests> T-Test
($\mu_0$: 58, $\bar{x}$: 59.183333, $s_x$: 3.1108976, n: 60)
t = 2.9464
4. The p-value (2 points) Calc>Stats>Tests> T-Test
($\mu_0$: 58, $\bar{x}$: 59.183333, $s_x$: 3.1108976, n: 60)
P = .0023
5. The appropriate decision about the null hypothesis and an appropriate conclusion (4 points)
.0023 < 0.01, therefore we reject the null hypothesis.
There is sufficient evidence to conclude that the mean number of candies per bag is more than 58.
6. Also interpret the p-value for this test. (4 points)
If the true mean number of candies per bag is 58, then the probability of getting a sample mean of
59.183 or more is .0023.
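The t statistic and right-tailed p-value can be checked without a calculator. Because Python's standard library has no t-distribution, the sketch below integrates the t density numerically with Simpson's rule; that approach and the helper names `t_pdf` and `t_upper_tail` are choices made for this example, not what T-Test does internally.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log(1 + t * t / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

mu0, x_bar, s, n = 58, 59.183333, 3.1108976, 60

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)  # right-tailed test, df = 59

print(round(t_stat, 4), round(p_value, 4), p_value < 0.01)
```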