
James Tomsik

Rania Shalabi

Math 1040

12/13/17

The purpose of this assignment is to present statistics on the colors found in bags of Skittles. I will report the color proportions for my bag, for my class, and for all students taking statistics. This paper covers the mean number of Skittles per bag, the standard deviation of Skittles per bag, and the five-number summary. The five-number summary consists of the minimum value, the first quartile, the second quartile (also known as the median), the third quartile, and the maximum value. I will also use confidence intervals and hypothesis tests to analyze the data. Graphs are included, ranging from box plots and histograms to Pareto charts and pie charts.


Proportions for the entire class are listed below:

Red: 20.871%

Orange: 20.254%

Yellow: 18.889%

Green: 20.221%

Purple: 19.766%

Proportions for my bag of skittles are listed below:

Red: 17.74%

Orange: 25.81%

Yellow: 11.29%

Green: 24.19%

Purple: 20.97%

Using this information along with the information gathered from the rest of the bags, I was able to conclude that the most common color across all bags is red, followed by orange, green, purple, and yellow. My bag of Skittles was slightly different, with orange being the most common color and yellow the least common.
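
As a quick check on the ranking, here is a minimal Python sketch that sorts the colors by the percentages listed above. The dictionary and function names are my own choices for illustration; only the percentages come from the data.

```python
# Color proportions (in percent) as listed in this paper.
class_pct = {"Red": 20.871, "Orange": 20.254, "Yellow": 18.889,
             "Green": 20.221, "Purple": 19.766}
bag_pct = {"Red": 17.74, "Orange": 25.81, "Yellow": 11.29,
           "Green": 24.19, "Purple": 20.97}


def rank_colors(pct):
    """Return color names sorted from most common to least common."""
    return sorted(pct, key=pct.get, reverse=True)


class_rank = rank_colors(class_pct)
bag_rank = rank_colors(bag_pct)

print("Class:", class_rank)
print("My bag:", bag_rank)
```

Sorting the percentages directly avoids misreading values that are very close together, such as orange and green in the class data.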


The graphs shown above give the total number of Skittles per bag for the entire class. The mean number of candies per bag was 60.3 Skittles, and the standard deviation was 3.44 Skittles per bag. Along with the mean and standard deviation, I calculated the five-number summary for all bags in the classes. The minimum number of Skittles per bag was 50 and the maximum was 65. The first quartile was 59 Skittles, the second quartile (median) was 61 Skittles, and the third quartile was 62 Skittles. The distribution of Skittles per bag for the entire class was skewed left. The counts vary from bag to bag, but my own bag was close to typical, with 62 Skittles.
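
The summary values above can be used to sanity-check the shape of the distribution. The sketch below is only a rough check under two assumptions of mine: that a mean below the median hints at left skew (a common rule of thumb, not a guarantee), and that outliers are flagged with the usual 1.5 × IQR fences. The variable names are illustrative.

```python
# Summary statistics for Skittles per bag, as reported in this paper.
mean, q1, median, q3 = 60.3, 59, 61, 62
minimum, maximum = 50, 65

# Interquartile range and the 1.5 * IQR fences (assumed outlier rule).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Rule of thumb: a mean below the median suggests a longer left tail.
suggests_left_skew = mean < median
min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence

print("IQR:", iqr, "fences:", lower_fence, upper_fence)
print("Mean below median:", suggests_left_skew)
print("Minimum is a low outlier:", min_is_outlier)
print("Maximum is a high outlier:", max_is_outlier)
```

Under these assumptions the smallest bag (50 Skittles) falls below the lower fence, which fits the left-skewed shape described above.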

Graphs can be very informative when it comes to breaking down and describing data. The charts shown above do a good job of conveying the visual side of statistics: they show how the numbers break down, the proportions of each item, and so on. The right graph depends on the kind of data. Box plots and frequency tables are well suited to quantitative data, such as the specific number of Skittles in each student's bag. If you used a pie chart on a data set that large, you'd have some thin slices! It's a matter of matching the graph to the data. If you have a large set of numbers, use frequencies. Pie charts and similar graphs give a general idea of the proportions.

The general purpose of a confidence interval is to give a range of values that is likely to contain the true value of a population parameter, together with a level of confidence in that range. Put more simply, a confidence interval states how confident you are that the true value falls between two other values. Applying this to the data above: the number of yellow Skittles for all the classes is 581, and the total number of Skittles is 3,076. I am 99% confident that the true proportion of yellow Skittles corresponds to between 526 and 637 Skittles out of 3,076, or roughly 17.1% to 20.7%. The observed count of 581 falls inside this interval. The same idea applies to a mean. Let's say we want a 95% confidence interval for the mean number of Skittles per color. After plugging the data into my trusted TI-84 Plus Silver Edition, I am 95% confident that the true mean number of Skittles per color is between 587.11 and 643.29. The mean number of Skittles per color for all classes is 615.2 (3,076 Skittles divided among five colors).
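
Both intervals can be approximated without a calculator. The sketch below is not the TI-84 procedure itself; it assumes the standard normal-approximation interval for a proportion and a t-interval for a mean. The per-color counts are my reconstruction (each class percentage times 3,076, rounded), and the critical value 2.776 is the usual table value for 95% confidence with 4 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# 99% confidence interval for the proportion of yellow Skittles.
yellow, total = 581, 3076
p_hat = yellow / total
z_star = NormalDist().inv_cdf(1 - 0.01 / 2)  # two-sided 99% critical value
margin = z_star * sqrt(p_hat * (1 - p_hat) / total)
prop_ci = (p_hat - margin, p_hat + margin)
# Same interval expressed as counts out of the total.
count_ci = (prop_ci[0] * total, prop_ci[1] * total)

# 95% t-interval for the mean count per color.
# ASSUMPTION: counts are reconstructed by rounding percentage * total.
class_pct = {"Red": 20.871, "Orange": 20.254, "Yellow": 18.889,
             "Green": 20.221, "Purple": 19.766}
counts = [round(pct / 100 * total) for pct in class_pct.values()]
t_star = 2.776  # table value, 95% confidence, 4 degrees of freedom
x_bar = mean(counts)
half_width = t_star * stdev(counts) / sqrt(len(counts))
mean_ci = (x_bar - half_width, x_bar + half_width)

print("Proportion CI:", prop_ci)
print("As counts:", count_ci)
print("Mean-per-color CI:", mean_ci)
```

Under these assumptions the proportion interval lands within about one Skittle of the 526 and 637 quoted above, and the mean interval agrees with 587.11 and 643.29 to within rounding.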

Next, I'd like to discuss the purpose of hypothesis testing. A hypothesis test evaluates a statement about a characteristic of one or more populations using sample evidence and probability. Applying this to the Skittles project, we first use a significance level of .05 to test the claim that 20% of Skittles are red. Our null hypothesis is H0: p = .20 and our alternative hypothesis is H1: p ≠ .20. Since our p-value equals .23, which is greater than .05, and our z-value equals 1.21, which is less than the two-tailed critical value of 1.96, we fail to reject the null hypothesis. There is not sufficient evidence to reject the claim that 20% of Skittles are red. Next, we use a significance level of .01 to test the claim that the mean number of candies in a bag is 55. Our null hypothesis is H0: μ = 55 and our alternative hypothesis is H1: μ ≠ 55. After inputting the data into the calculator, we find that the p-value for this test is 5.85E-15 and the t-value is 11.00. The p-value is far smaller than our significance level of .01, and the t-value does not fall between the critical values of ±2.675, so we reject the null hypothesis. There is sufficient evidence to reject the claim that the mean number of Skittles per bag is 55.

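
The two test statistics can be reproduced from the summary numbers. This is a minimal sketch rather than the calculator's procedure. It assumes a two-tailed one-proportion z-test and a one-sample t-test, and it assumes 51 bags, which I inferred from 3,076 Skittles divided by a mean of 60.3 per bag. The critical value 2.675 is the one quoted in this paper.

```python
from math import sqrt
from statistics import NormalDist

# One-proportion z-test: H0: p = 0.20 versus H1: p != 0.20.
total = 3076
p_hat = 0.20871  # class proportion of red Skittles
p0 = 0.20
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / total)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value
reject_proportion = p_value < 0.05

# One-sample t-test: H0: mu = 55 versus H1: mu != 55.
# ASSUMPTION: n = 51 bags, inferred from 3076 / 60.3.
n_bags = 51
x_bar, s, mu0 = 60.3, 3.44, 55
t = (x_bar - mu0) / (s / sqrt(n_bags))
t_crit = 2.675  # critical value quoted in this paper
reject_mean = abs(t) > t_crit

print("z =", round(z, 2), "p-value =", round(p_value, 2))
print("Reject H0 for the proportion:", reject_proportion)
print("t =", round(t, 2))
print("Reject H0 for the mean:", reject_mean)
```

The t-test p-value is left out because the Python standard library has no t-distribution; comparing the statistic with the critical value leads to the same decision.
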
To conclude, the conditions that must be met for confidence intervals and hypothesis tests are that the samples are random, that np(1-p) ≥ 10, and that the samples are independent of each other. All the samples taken for this exercise met those conditions. Human error could have occurred at several points during these tests, but most of it was eliminated by using technology. One suggestion would be to increase the sample size by reaching out to other classes, which would make the estimates more precise. The average number of Skittles per bag was 60.3 with a standard deviation of 3.44. Red tends to be the most frequent color, with a proportion of 20.871%.
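
The np(1-p) condition mentioned above is easy to verify for the red-Skittles test. The sketch below assumes n is the total of 3,076 Skittles and p is the hypothesized proportion of .20; the variable names are my own.

```python
# Sample-size condition for the one-proportion test: n * p * (1 - p) >= 10.
total = 3076  # total number of Skittles across all classes
p0 = 0.20     # hypothesized proportion of red Skittles

check = total * p0 * (1 - p0)
meets_condition = check >= 10

print("n*p*(1-p) =", round(check, 2))
print("Condition met:", meets_condition)
```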
