Introduction
Thus far, we have learned how to conduct a hypothesis test for a population mean and a population proportion (Ch. 9), and we have also compared the proportions of successes for two populations (Ch. 10). But what if we want to look at the distribution of a categorical variable in a population? Is there a statistically significant difference between the observed and expected counts? The chi-square test (read "kai"; Greek symbol χ²) allows us to determine whether a hypothesized distribution is valid, the details of which will be fleshed out in this chapter.
Example 1: Mars, Inc. says that the distribution of M&Ms is as follows: 24% Blue, 20% Orange, 16% Green, 14% Yellow, 13% Red, and 13% Brown. Suppose that the following _____-_____ table gives the data from Enze's bag of M&Ms.

Color:  Blue  Orange  Green  Yellow  Red  Brown  Total
Count:  9     8       12     15      10   6      60
Does this differ from the stated distribution? Let's look at the proportion of Blue M&Ms, which happens to be 9/60 = 0.15, while the claimed proportion is 0.24. Using what we have learned before, we could perform a one-sample z test for a proportion (Ch. 9) to test the hypotheses
H0: ___________
Ha: ___________
This is bad. Not only is it inefficient because we would have to test each color separately, but this method also leads to multiple (possibly contradicting![1]) comparisons and treats each color as its own subgroup rather than treating all 60 candies as a single random sample. Therefore, we turn to the χ² goodness-of-fit test, which analyzes the distribution of color collectively.
Parameter: We wish to analyze the color distribution of M&Ms. You can just mention the distribution of interest. No need to search for a specific p or μ.
Hypothesis and Test Statistic
We begin by stating our hypotheses for the categorical variable, color. This can be done in two ways.
Using words:
H0: ________________________________________________________________
________________________________________________________________
Ha: ________________________________________________________________
________________________________________________________________
Using symbols:
H0: ________________________________________________________________
________________________________________________________________
Ha: ________________________________________________________________
________________________________________________________________
Either way is fine and sufficient. However, in the alternative hypothesis, you cannot say that _______ the hypothesized proportions are incorrect; only one has to differ.
Now, while we listed the hypotheses in terms of proportions, we actually work with counts (9 Blue, NOT 0.15 Blue) when running the χ² test. We wish to compare the ____________ counts from our sample with the ____________ counts if H0 is true. The greater the difference between these values, the lower the probability of randomly selecting a sample as extreme as ours, and the greater evidence we have of _____________ H0. One way of analyzing the data is to construct a data table as shown:

Color (Categorical Variable)   Observed   Expected   (Observed - Expected)²/Expected
Blue
Orange
Green
Yellow
Red
Brown

To find the expected counts, multiply each of the hypothesized proportions by the total number of candies in the sample, so for Blue, it would be EBlue = (0.24)(60) = 14.40. DON'T use just proportions!
Finally, we want to add up all the values in the last column to find the chi-square statistic[2].
General Formula: $\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
In the M&M example, this value comes out to be ____________.

[2] See Appendix II for more information.

χ² Distribution and P-values
Now that we have found χ² = 10.180, we wonder if this value is significant. We can locate this value on a chi-square distribution[3] and find the corresponding __________, similar to the procedure we would use for a z-test or t-test. However, it is important to note that the χ² distribution is NOT ___________. In fact, it is ________-___________, with degrees of freedom = __________ ___ _____________ − 1.
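As a check on the table arithmetic for the M&M example, here is a minimal Python sketch that builds the Observed, Expected, and component columns and adds up the last one. The variable names are illustrative choices, not anything from the textbook; the proportions and counts are the ones given in Example 1.

```python
# Hypothesized color proportions claimed by the company (Example 1).
claimed = {"Blue": 0.24, "Orange": 0.20, "Green": 0.16,
           "Yellow": 0.14, "Red": 0.13, "Brown": 0.13}

# Observed counts from the sample bag of candies.
observed = {"Blue": 9, "Orange": 8, "Green": 12,
            "Yellow": 15, "Red": 10, "Brown": 6}

n = sum(observed.values())  # total number of candies in the sample

# Expected count = hypothesized proportion * sample size (counts, not proportions).
expected = {color: p * n for color, p in claimed.items()}

# Each category's component is (Observed - Expected)^2 / Expected.
components = {color: (observed[color] - expected[color]) ** 2 / expected[color]
              for color in observed}

# The chi-square statistic is the sum of the components.
chi_square = sum(components.values())
df = len(observed) - 1

for color in observed:
    print(f"{color:<7} O={observed[color]:>2}  E={expected[color]:5.2f}  "
          f"component={components[color]:.3f}")
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The printed statistic should agree with the value of 10.180 quoted in the text.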
We can find the P-value in two ways, using Table C or the calculator (recommended). The P-value is the probability of getting a value of χ² as large as or larger than the test statistic, in this case 10.180, when H0 is true.
Using Table C, we look in the row with df = ______. Our χ² value of 10.180 lies between 9.24 and 11.07, corresponding to a P-value between _______ and _______ (found in corresponding top row). Usually, Table C can only give us an interval in which P falls.
Using the calculator (pg 683), we use the χ²cdf command in the DISTR menu, asking for the area underneath the χ² distribution with df = ______ to the right of χ² = 10.180. Choosing an arbitrarily large upper bound (e.g., 1000), we input χ²cdf(10.180, 1000, 5) = 0.070293, so P = _______. This method is more precise.
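For readers without a calculator handy, the same right-tail area can be approximated in plain Python. This is only a sketch: the function name `chi_square_upper_tail` is made up for illustration, and the method (a power series for the regularized lower incomplete gamma function) is one standard way to get the area, not necessarily what the calculator does internally.

```python
import math

def chi_square_upper_tail(x, df):
    """Approximate P(X >= x) for a chi-square distribution with df degrees of freedom.

    Sums the power series for the regularized lower incomplete gamma
    function P(a, z) with a = df/2 and z = x/2, then returns 1 - P(a, z).
    Intended for moderate x; very small tail areas lose precision.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a              # first term of the series
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)     # next term: previous * z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# Area to the right of 10.180 under the chi-square curve with df = 5.
p_value = chi_square_upper_tail(10.180, 5)
print(f"P-value = {p_value:.4f}")
```

The result should agree with the calculator value above to about four decimal places. If SciPy is installed, `scipy.stats.chi2.sf(10.180, 5)` computes the same right-tail area.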
In either situation, both of which are valid, we find that our P-value is greater than ___ = _______, and so we _________ ____ ___________ H0. We _________ have sufficient evidence to conclude that the company's claimed color distribution is incorrect.
Assumptions
In order to carry out a χ² test, we need to check the Random, Large Sample Size, and Independent conditions.
Random: The data come from a __________ sample or _____________ experiment.
Large Sample Size: All ____________ counts are at least ______. This is different from the previous tests! Also, these are __________ counts, not ___________.
Independent: Individual observations are ______________. When sampling, check that ___________.
We have now learned all the steps for performing a χ² goodness-of-fit test. We shall demonstrate them in the following two examples, taken from the textbook for guidance. Like other significance tests, we will refer to the _______________ acronym to cover all necessary steps.
Example 2: Birthdays (from pg. 686)
Are births evenly distributed across the days of the week? The one-way table below shows the distribution of births across the days of the week in a random sample of 140 births from local records in a large city:

Day:     Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
Births:  13      23      24       20         27        18      15
Do these data give significant evidence that local births are not equally likely on all days of the week?
Parameter:
Hypotheses: H0:
Ha:
Test Statistic: χ² =
Degrees of Freedom =
Obtain a P-value:
Make a decision:
Statement in Context:
Example 2: Birthdays (continued) If fewer babies are actually born on Saturday and Sunday than on other days, what type of error did we make based on the conclusion drawn?
Using Technology (Refer to pg. 687)
As always, there is the option of calculating the test statistic on your calculator to save time (thank goodness). Enter the observed counts into L1/list1 and the expected counts into L2/list2. Find the appropriate test function, "χ² GOF-Test," and calculate. Be sure to still write down the test statistic, P-value, and degrees of freedom. "χ² GOF-Test" is not available on TI-83 models. Look on Mrs. Carson's website for a program.
Example 3: Genetics
AP Biology aficionados should be familiar with chi-square goodness-of-fit tests in the context of genetics and Punnett squares (the χ² test was a recent addition to the AP Biology Exam). If the ratio GG:Gg:gg is predicted to be 1:2:1 and we observe 23:50:11 out of a total of 84 samples, do these data differ significantly from the predicted values at α = 0.05? (Condensed for space; use PHANTOMS; refer to pg. 689)
Follow-up Analysis
If results are ever significant, perform a follow-up analysis to see which individual components, the (O − E)²/E for each category, affected χ² the most. In the genetics example, it would be the gg group. On the calculator, the components are stored in a list called CNTRB ("contribution").
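The follow-up analysis for the genetics example can be sketched in a few lines of Python. The list names are illustrative; the `components` list plays the role of the calculator's CNTRB list. Because this example has df = 2, the right-tail area has the simple closed form exp(−χ²/2), which the sketch uses for the P-value.

```python
import math

labels = ["GG", "Gg", "gg"]
observed = [23, 50, 11]
ratio = [1, 2, 1]                 # predicted GG:Gg:gg ratio

n = sum(observed)                 # total number of samples
expected = [n * r / sum(ratio) for r in ratio]

# Component for each genotype: (O - E)^2 / E.
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(components)
df = len(observed) - 1

# For df = 2 only, the upper-tail area equals exp(-chi_square / 2).
p_value = math.exp(-chi_square / 2)

# The category with the largest component affected chi-square the most.
largest = labels[components.index(max(components))]

for label, o, e, c in zip(labels, observed, expected, components):
    print(f"{label}: O={o}, E={e:.0f}, component={c:.3f}")
print(f"chi-square = {chi_square:.3f}, df = {df}, P-value = {p_value:.4f}")
print(f"largest contribution: {largest}")
```

Sorting or scanning the components this way is all the follow-up analysis amounts to: the statistic is a sum, so the largest term shows where the observed counts depart most from the prediction.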
Example 1: We will follow the given example in the book, "Does Background Music Influence What Customers Buy?" Below is the data table.

Observed
Music     French wine  Italian wine  Other wine  Total
None      30           11            43          84
French    39           1             35          75
Italian   30           19            35          84
Total     99           31            113         243

Expected
Music     French wine  Italian wine  Other wine  Total
None                                             84
French                                           75
Italian                                          84
Total     99           31            113         243
To analyze the data for similarities and differences, we compute the _______________ distributions of the type of wine sold for each treatment [math omitted]. We see that the proportion of French wine sold is considerably higher when French music is playing, while the proportion of Italian wine sold is considerably lower. Previously, we learned how to perform a two-sample z test for a difference in two proportions, but here we are comparing several groups across several categories and do not want multiple comparisons;[1] instead, we can perform a chi-square test for homogeneity. While different, it has many parallels to the χ² goodness-of-fit test we learned previously.
Parameter: _________________________________________________________________
_________________________________________________________________
Hypothesis and χ²: The hypotheses are stated as follows (statement of no difference):
H0: __________________________________________________________________
__________________________________________________________________
Ha: __________________________________________________________________
__________________________________________________________________
Again, we are looking for an overall difference/deviation, so any significant difference, regardless of whether it is _____-sided or _____-sided, will lead us to ___________ H0.
Fortunately for χ² tests, the χ² statistic is calculated the same way each time. Since we are given our observed counts, we need to find the _____________ counts; here, the two-way table comes in handy.
For example, for French wine bought when no music is playing, we multiply the corresponding row and column totals (the "categorical totals") and divide by the overall total: (99 × 84)/243 = 34.22. Fill out the table above. In our computations for χ², we can just write out the first few terms and the last term, using an ellipsis (...) for the middle terms; the calculator will take care of the rest. The degrees of freedom are given by the product (____________ ___ ________ − 1)(_____________ ___ __________ − 1). In this problem, our χ² = ____________ and df = ______.
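The expected-count rule just described (row total times column total, divided by the overall total) is easy to sketch in Python. The layout below, with music treatments as rows and wine types as columns, is an assumption about how the flattened table should be read; the totals are consistent with either orientation.

```python
music = ["None", "French", "Italian"]      # treatments (rows)
wine = ["French", "Italian", "Other"]      # wine purchased (columns)
observed = [
    [30, 11, 43],   # no music
    [39, 1, 35],    # French music
    [30, 19, 35],   # Italian music
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / overall total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square adds (O - E)^2 / E over every cell of the table.
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

# Degrees of freedom = (number of rows - 1) * (number of columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

for m, exp_row in zip(music, expected):
    print(f"{m:<8}", "  ".join(f"{e:6.2f}" for e in exp_row))
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

The first expected count printed should match the 34.22 worked out by hand above.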
Assumptions
Our assumptions here are the same.
Random: The data come from separate ___________ ___________.
Large Sample Size: All expected counts are at least ______.
Independent: Samples and individual observations are _________________. When sampling, check that _______ < ____.
P-value and Conclusion
We can use either Table C or technology to find the associated P-value given χ² and df. The calculator gives a more precise answer (χ²cdf) and is used here to obtain a P-value of _________.
Since our P-value of _________ is ____________ than α = 0.05, we __________ have sufficient evidence to reject H0 and conclude there ______ a difference in the distributions of wine purchases at this store when no music, French music, or Italian music is played. Remember, we cannot state for Ha that "all the proportions are different"; we can only say "some of the proportions are not equal."
Follow-up Analysis
If we reject the null hypothesis, we should perform a follow-up analysis to see which of the individual components contributed most to the χ² statistic. In the above example, two of the categories accounted for a large proportion of the overall statistic, so we are led to believe that the sale of _________ wine is strongly affected by Italian and French music.
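A follow-up analysis for a two-way table can be sketched the same way as for a goodness-of-fit test: compute (O − E)²/E for every cell and rank the cells. The sketch below repeats the music and wine counts so that it runs on its own; the row/column orientation and all names are illustrative assumptions.

```python
music = ["None", "French", "Italian"]      # treatments (rows)
wine = ["French", "Italian", "Other"]      # wine purchased (columns)
observed = [
    [30, 11, 43],   # no music
    [39, 1, 35],    # French music
    [30, 19, 35],   # Italian music
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Component for each (music, wine) cell: (O - E)^2 / E.
components = {}
for i, m in enumerate(music):
    for j, w in enumerate(wine):
        e = row_totals[i] * col_totals[j] / grand_total
        components[(m, w)] = (observed[i][j] - e) ** 2 / e

chi_square = sum(components.values())

# Rank the cells from largest to smallest contribution.
ranked = sorted(components.items(), key=lambda item: item[1], reverse=True)
top_two_share = (ranked[0][1] + ranked[1][1]) / chi_square

for (m, w), c in ranked:
    print(f"{m} music / {w} wine: {c:.3f}")
print(f"share of chi-square from the two largest cells: {top_two_share:.0%}")
```

Reading the ranked list from the top shows which combinations of music and wine drive the overall statistic.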
χ² test for Homogeneity on the Calculator (Refer to pg. 705-706)
We can perform the χ² test on the calculator (χ² 2-way test), which greatly simplifies calculations. Make sure to write down relevant assumptions, test statistic, degrees of freedom, P-value, components, and conclusion in context.
Two more good examples are found on pages 707 and 710, and there are many more in the homework problems. As with the other hypothesis tests, use PHANTOMS to help guide your thought process through the solution. I will omit them here for the sake of space.
χ² test for Association
The final test you will learn is called a chi-square test for association (also known as a chi-square test for independence; I will stick with the former for consistency/preference). There is a subtle difference between this test and the χ² test for homogeneity (in the Hypotheses and Conclusion), but for the most part the two tests are performed in a similar fashion. In particular, what we did in the first part (in the music and phone calls examples) was take data from several independent samples/groups (different types of music playing, and cell vs. landline) and compare their distributions. What we are about to do next is take a single random sample of individuals from a single population and analyze the relationship between the designated categorical variables.
Let's look at an example. The one given in the book is "Do Angry People Have More Heart Disease?" We wish to compare CHD vs. Anger. However, rather than sampling different groups of individuals based on their level of anger and seeing whether the distribution of CHD is the same (homogeneous) across groups, we sampled 8474 people collectively and recorded both variables, sorting the people into a two-way table. We wish to see whether the sample gives evidence of an association between the variables in the entire population. Ask yourself how the data were collected to decide which of the two tests to use. Do not forget the statistics mantra, "correlation does NOT imply causation." Having an association means that knowing the value of one variable changes the probabilities for the other variable, not necessarily that one causes the other (in this manner, it is helpful to think of "independent" vs. "dependent").
Hypotheses: Like before, the null hypothesis is a statement of no difference (no association; independent).
H0: _____________________________________________________________
_____________________________________________________________
Ha: _____________________________________________________________
_____________________________________________________________
χ² statistic and P-value
These, along with expected counts, are found in the same way as in the test for homogeneity. χ² = _____________, df = ____________, P-value = _____________.
Conclusion: Because our P-value is ________ than α = 0.05, we ________ have sufficient evidence to __________ H0 and conclude that anger level and heart disease _________ associated in the population of people with normal blood pressure.
If we had sufficient evidence to reject the null hypothesis in the previous problem, don't forget to conduct a follow-up analysis. The calculator option is the same as for homogeneity (χ² 2-way test).
Once again, if you haven't noticed, these last two χ² tests are very similar, which makes them easy to perform but difficult to tell apart when deciding which one applies. In my opinion, the best way is to look at the method of sampling and decide from there. On FRQs, be precise with your diction during the Hypothesis and Conclusion steps.
Interesting Tidbits (pg. 720-721)
Sometimes we are given some quantitative data (e.g., income) and wish to conduct a χ² test. To do so, we can group the individual incomes into income ranges, effectively turning income into a categorical variable.
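Grouping a quantitative variable into ranges can be sketched as follows. Everything here is hypothetical and chosen only for illustration: the incomes, the cut points, and the group labels do not come from the textbook.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical incomes in dollars (illustrative data, not from the text).
incomes = [18500, 42000, 67000, 23000, 95000,
           51000, 38000, 120000, 29500, 74000]

# Hypothetical cut points; each income falls into exactly one range.
cut_points = [30000, 60000, 90000]
labels = ["under 30k", "30k to under 60k", "60k to under 90k", "90k and over"]

def income_group(income):
    """Return the label of the range containing this income."""
    return labels[bisect_right(cut_points, income)]

# Observed counts per income range: the input a chi-square test needs.
counts = Counter(income_group(x) for x in incomes)

for label in labels:
    print(f"{label:<17} {counts[label]}")
```

Once the data are reduced to counts per range, the test proceeds exactly as for any other categorical variable.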
Sometimes, depending on how we sample, the large sample size condition may not always be met. In that case, we might be able to combine some rows together, and meet the sample size condition without screwing up the results. Neat.