χ² test for independence: hypothesis test used to evaluate assertions about the relationship
between cross tabulated variables.
Example Research Problem
Miller Brewing Company hires you to determine if people's preferences for light beer depend
upon their demographic characteristics. They intend to use your results to develop an
advertising plan that will target those who prefer light beer. Your research design includes
collecting data about beer preferences, sex, and income from a randomly selected group of
beer drinkers. One of the hypotheses you test is that preference for light beer depends upon
the consumer's income. To cross tabulate the variables and test for a relationship you code the
variables as:
                                            INCOME
                                       low   middle     high    Total
PREFERENCE
dislikes     Count                      22       13        3       38
             Expected Count           13.8     14.3      9.9     38.0
             % within PREFERENCE     57.9%    34.2%     7.9%   100.0%
             % within INCOME         40.7%    23.2%     7.7%    25.5%
indifferent  Count                      18       21       11       50
             Expected Count           18.1     18.8     13.1     50.0
             % within PREFERENCE     36.0%    42.0%    22.0%   100.0%
             % within INCOME         33.3%    37.5%    28.2%    33.6%
prefers      Count                      14       22       25       61
             Expected Count           22.1     22.9     16.0     61.0
             % within PREFERENCE     23.0%    36.1%    41.0%   100.0%
             % within INCOME         25.9%    39.3%    64.1%    40.9%
Total        Count                      54       56       39      149
             Expected Count           54.0     56.0     39.0    149.0
             % within PREFERENCE     36.2%    37.6%    26.2%   100.0%
             % within INCOME        100.0%   100.0%   100.0%   100.0%
Chi-Square Tests
                                  Value    df    Asymp. Sig. (2-sided)
Pearson Chi-Square (a)           18.597     4    .001
Likelihood Ratio                 19.390     4    .001
Linear-by-Linear Association     17.698     1    .000
N of Valid Cases                    149
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.95.
Symmetric Measures
                              Value    Asymp. Std. Error (a)    Approx. T (b)    Approx. Sig.
Ordinal by Ordinal   Gamma     .458    .093                     4.592            .000
N of Valid Cases                149
1) The hypothesis test is based on the presumption that the variables are not related,
i.e., the value of the dependent variable is not being affected by the value of the
independent variable. When this is the case, you would expect the number of
observations from each independent variable category to be approximately
proportionally the same for each dependent variable category.
2) The hypothesis test evaluates the difference between the values actually observed
(O) and the values that would be expected (E) if the variables are not related.
3) As part of the analysis, SPSS calculates the Es and compares them to the Os. The
comparison will find either:
a) little difference between O and E. This means that what was actually observed is
what you would have expected to observe if the variables were not related. In this
case, the variables are independent (not related) because the value of the
dependent variable is not correlated with the value of the independent variable.
b) considerable difference between O and E. This means that what was actually
observed is different than what you would have expected to observe if the variables
were not related. In this case, the variables are not independent (are related)
because the value of the dependent variable is correlated with the value of the
independent variable.
4) STATISTICAL ACCURACY REQUIRES:
The null hypothesis states that no relationship exists between the two variables.
The alternative hypothesis states that a relationship exists between the two variables.
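The comparison of observed and expected counts described in points 1) to 3) can be sketched in a few lines of Python. This is an illustration only, not the SPSS procedure itself: it assumes the usual rule that each expected count equals (row total × column total) / N, and the function names are hypothetical.

```python
# Observed counts from the INCOME by PREFERENCE cross tabulation.
# Rows: dislikes, indifferent, prefers. Columns: low, middle, high income.
observed = [
    [22, 13, 3],
    [18, 21, 11],
    [14, 22, 25],
]


def expected_counts(table):
    """Expected count for each cell if the two variables were not related."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi_square = pearson_chi_square(observed)
# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

for row in expected:
    print([round(e, 1) for e in row])
print(f"Pearson chi-square = {chi_square:.3f}, df = {df}")
```

The rounded expected counts and the statistic can be checked against the Expected Count rows and the Pearson Chi-Square line of the SPSS output.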
TESTING FOR INDEPENDENCE BETWEEN INCOME AND BEER PREFERENCE
                                            INCOME
                                       low   middle     high    Total
PREFERENCE
dislikes     Count                      22       13        3       38
             Expected Count           13.8     14.3      9.9     38.0
             % within PREFERENCE     57.9%    34.2%     7.9%   100.0%
             % within INCOME         40.7%    23.2%     7.7%    25.5%
indifferent  Count                      18       21       11       50
             Expected Count           18.1     18.8     13.1     50.0
             % within PREFERENCE     36.0%    42.0%    22.0%   100.0%
             % within INCOME         33.3%    37.5%    28.2%    33.6%
prefers      Count                      14       22       25       61
             Expected Count           22.1     22.9     16.0     61.0
             % within PREFERENCE     23.0%    36.1%    41.0%   100.0%
             % within INCOME         25.9%    39.3%    64.1%    40.9%
Total        Count                      54       56       39      149
             Expected Count           54.0     56.0     39.0    149.0
             % within PREFERENCE     36.2%    37.6%    26.2%   100.0%
             % within INCOME        100.0%   100.0%   100.0%   100.0%

Chi-Square Tests
                                  Value    df    Asymp. Sig. (2-sided)
Pearson Chi-Square (a)           18.597     4    .001
Likelihood Ratio                 19.390     4    .001
Linear-by-Linear Association     17.698     1    .000
N of Valid Cases                    149
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.95.

Symmetric Measures
                              Value    Asymp. Std. Error (a)    Approx. T (b)    Approx. Sig.
Ordinal by Ordinal   Gamma     .458    .093                     4.592            .000
N of Valid Cases                149
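The Gamma value in the Symmetric Measures output can be reproduced from the cell counts. The sketch below assumes the standard Goodman-Kruskal definition, gamma = (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs, and it assumes the categories are ordered as listed (dislikes < indifferent < prefers; low < middle < high). The function name is hypothetical, and the standard error and T value are not computed here.

```python
# Observed counts from the INCOME by PREFERENCE cross tabulation.
# Rows: dislikes, indifferent, prefers. Columns: low, middle, high income.
observed = [
    [22, 13, 3],
    [18, 21, 11],
    [14, 22, 25],
]


def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a table with ordered rows and columns."""
    concordant = 0
    discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a lower row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:
                        # Higher on both variables: concordant pairs.
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        # Higher on one, lower on the other: discordant pairs.
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


gamma = goodman_kruskal_gamma(observed)
print(f"Gamma = {gamma:.3f}")
```

Pairs tied on either variable are ignored, which is what distinguishes gamma from other ordinal measures of association.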
STEP 2: Conduct the test: Reject H0 because the obtained significance level [Asymp.
Sig. (2-sided)] is .001, indicating highly significant results.
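The significance level used in STEP 2 can be recovered from the Pearson Chi-Square value and its degrees of freedom. The sketch below uses only the standard library and relies on a closed form for the chi-square upper tail that holds for even degrees of freedom, which covers df = 4. The function name and the 0.05 cutoff are assumptions for illustration; the handout does not state a cutoff.

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Closed form: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


ALPHA = 0.05  # assumed cutoff, not taken from the handout

p_value = chi_square_sf_even_df(18.597, 4)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"p = {p_value:.4f} -> {decision}")
```

Rounded to three decimals, the result should agree with the Asymp. Sig. (2-sided) value that SPSS reports for the Pearson Chi-Square.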
2) Cramer's V is used to measure the strength of the relationship when the cross tab
involves a nominal scaled variable.
a) 0 ≤ V ≤ 1
b) The closer V is to 1 (0), the stronger (weaker) the relationship.
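Cramer's V can be computed directly from the Pearson Chi-Square value. The sketch below assumes the standard definition, V = sqrt(χ² / (N × min(r - 1, c - 1))), where r and c are the numbers of rows and columns. The formula is not given in this handout, and the function name is hypothetical.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V from a Pearson chi-square statistic and the table shape."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_square / (n * k))


# Income by beer preference: chi-square 18.597, N = 149, 3 x 3 table.
v_beer = cramers_v(18.597, 149, 3, 3)
# Medium by expenditure (Best Buy example): chi-square 32.247, N = 200, 3 x 3 table.
v_store = cramers_v(32.247, 200, 3, 3)

print(f"Beer example:     V = {v_beer:.3f}")
print(f"Best Buy example: V = {v_store:.3f}")
```

The second value can be checked against the Cramer's V line in the Best Buy output. SPSS does not print V for the beer example, so the first value is a calculation from the reported chi-square rather than a figure from the output.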
Example Research Problem:
The manager of the Onalaska Best Buy hires you to determine whether there is a relationship between the
advertising medium through which customers heard about the store's 24 Hour Sale and the amount they spend.
During the 24 Hour Sale you randomly sample 200 customers who have made purchases. The output from the
analysis of these data appears below. The MEDIUM and EXPENDITURE variables are coded:
                                                       Medium
                                     Newspaper      Radio Television      Total
EXPENDITURE
Under $100    Count                         21         10         25         56
              Expected Count              15.1       18.2       22.7       56.0
              % within EXPENDITURE       37.5%      17.9%      44.6%     100.0%
              % within Medium            38.9%      15.4%      30.9%      28.0%
$100-199.99   Count                         26         27         13         66
              Expected Count              17.8       21.5       26.7       66.0
              % within EXPENDITURE       39.4%      40.9%      19.7%     100.0%
              % within Medium            48.1%      41.5%      16.0%      33.0%
$200 or more  Count                          7         28         43         78
              Expected Count              21.1       25.4       31.6       78.0
              % within EXPENDITURE        9.0%      35.9%      55.1%     100.0%
              % within Medium            13.0%      43.1%      53.1%      39.0%
Total         Count                         54         65         81        200
              Expected Count              54.0       65.0       81.0      200.0
              % within EXPENDITURE       27.0%      32.5%      40.5%     100.0%
              % within Medium           100.0%     100.0%     100.0%     100.0%
Chi-Square Tests
                                  Value    df    Asymp. Sig. (2-sided)
Pearson Chi-Square (a)           32.247     4    .000
Likelihood Ratio                 36.685     4    .000
Linear-by-Linear Association      9.703     1    .002
N of Valid Cases                    200
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 15.12.

Symmetric Measures
                                                Value    Approx. Sig.
Nominal by Nominal   Phi                         .402    .000
                     Cramer's V                  .284    .000
                     Contingency Coefficient     .373    .000
N of Valid Cases                                  200
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
Taking everything into consideration, what do you conclude about the influence advertising media have on
expenditures?
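One way to look past the overall test result is to see which cells differ most from their expected counts. The sketch below is an optional aid, not part of the SPSS output: it breaks the Pearson Chi-Square into the per-cell terms (O - E)² / E and ranks them. The function and variable names are hypothetical.

```python
# Observed counts from the EXPENDITURE by MEDIUM cross tabulation.
row_labels = ["Under $100", "$100-199.99", "$200 or more"]
col_labels = ["Newspaper", "Radio", "Television"]
observed = [
    [21, 10, 25],
    [26, 27, 13],
    [7, 28, 43],
]


def cell_contributions(table):
    """Map each (row, column) index pair to its (O - E)^2 / E term."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = {}
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            contributions[(i, j)] = (o - e) ** 2 / e
    return contributions


contributions = cell_contributions(observed)
# Largest departures from independence first.
ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)

for (i, j), value in ranked:
    print(f"{row_labels[i]:<13} {col_labels[j]:<11} {value:6.3f}")
print(f"Total = {sum(contributions.values()):.3f}")
```

The total of the per-cell terms is the Pearson Chi-Square itself, so the ranking shows how much each combination of medium and spending level adds to it.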