Males      15   10   25
Females     5   20   25
Are frequencies sufficient to allow us to make comparisons about groups? What other information do we need?
Males      15 (60%)   10 (40%)   25 (100%)
Females     5 (20%)   20 (80%)   25 (100%)
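The percentages in this table are row percentages: each cell is divided by its row total. A minimal sketch in Python of that arithmetic, assuming the two unlabeled columns are called `first` and `second` (placeholder names, since the column headings are not shown here):

```python
# Frequencies from the table above; the column names are placeholders.
counts = {
    "Males": {"first": 15, "second": 10},
    "Females": {"first": 5, "second": 20},
}


def row_percentages(table):
    """Return each cell as a percentage of its row total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result


percents = row_percentages(counts)
for row, cells in percents.items():
    print(row, {col: f"{p:.0f}%" for col, p in cells.items()})
```

Both groups have 25 members here, so the raw frequencies happen to be comparable; with unequal group sizes only the percentages would be.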
How would you write a sentence or two to describe what is in this table?
Male
20 (33.3%) 30 ( )
Female
20 ( %)
10 (16.7%) 60 (100.0%)
Questions:
What group had the largest percentage of Ph.Ds?
What are the ways in which you could find the missing numbers? Is it obvious why you would use percentages to make comparisons among two or more groups?
In the following table, were people with drug problems, alcohol problems, or a combination of both most likely to be referred for individual treatment?
Services
           Individual Treatment   Group Treatment   AA         Total
Alcohol    10 (25%)               10 (25%)          20 (50%)   40 (100%)
Drugs      30 (60%)               10 (20%)          10 (20%)   50 (100%)
Both       5 (50%)                2 (20%)           3 (30%)    10 (100%)
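One way to check a reading of this table is to recompute the share of each group referred to a given service. The sketch below uses the counts from the table; the variable names are illustrative only.

```python
# Counts from the services table above (rows: presenting problem).
referrals = {
    "Alcohol": {"Individual Treatment": 10, "Group Treatment": 10, "AA": 20},
    "Drugs": {"Individual Treatment": 30, "Group Treatment": 10, "AA": 10},
    "Both": {"Individual Treatment": 5, "Group Treatment": 2, "AA": 3},
}

service = "Individual Treatment"

# Percentage of each group referred to the chosen service.
share = {
    group: 100 * cells[service] / sum(cells.values())
    for group, cells in referrals.items()
}

# The group with the largest percentage, not the largest raw count.
most_likely = max(share, key=share.get)

print(share)
print(most_likely)
```

Comparing percentages rather than counts matters here because the three groups differ in size (40, 50, and 10 people).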
We can use cross-tabs to visually assess whether independent and dependent variables might be related. We also use cross-tabs to find out whether demographic variables such as gender and ethnicity are related to a second variable.
For example, gender may be related to whether someone votes Democratic or Republican, or to whether their income is high, medium, or low. Ethnicity might be related to where someone lives or to attitudes about whether undocumented workers should receive driver's licenses.
Because we use tables in these ways, we can set up some decision rules about how to use tables.
Independent variables should be column variables. If you are not looking at independent and dependent variable relationships, use the variable that can logically be said to influence the other as your column variable. Using this rule, always calculate column percentages rather than row percentages. Use the column percentages to interpret your results.
For example, if we were looking at the relationship between gender and income, gender would be the column variable and income would be the row variable. Logically, gender can influence income; income does not determine your gender. If we were looking at the relationship between ethnicity and the location of a person's home, ethnicity would be the column variable. However, if we were looking at the relationship between gender and ethnicity, neither one influences the other, so either variable could be the column variable.
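As a rough illustration of the column-percentage rule, the sketch below reuses the male/female frequencies from the first table, with gender placed in the columns as the independent variable. The row labels `first` and `second` are placeholders, not names from the original table.

```python
# Rows hold the dependent variable; columns hold the independent variable.
table = {
    "first": {"Males": 15, "Females": 5},
    "second": {"Males": 10, "Females": 20},
}


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())).keys())
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        row_name: {c: 100 * n / totals[c] for c, n in cells.items()}
        for row_name, cells in table.items()
    }


col_pct = column_percentages(table)
for row_name, cells in col_pct.items():
    print(row_name, {c: f"{p:.0f}%" for c, p in cells.items()})
```

Each column sums to 100%, so the comparison is read across a row: how the share in one category differs between the column groups.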
SPSS will allow you to choose a column variable and row variable and whether or not your table will include column or row percents.
Chi-square is simply an extension of a cross-tabulation that gives you more information about the relationship. However, it provides no information about the direction of the relationship (positive or negative) between the two variables.
High    40   __    50
Low     __   __    50
Total   50   50   100
I have not filled in all of the information because we need to talk about two concepts before we start calculations:
Degrees of Freedom: In any table, there are a limited number of choices for the values in each cell.
Marginals: Total frequencies in columns and rows.
High    40   __    50
Low     __   __    50
Total   50   50   100
High 40 10 50
Low 10 40 50
Total 50 50 100
The rules for determining degrees of freedom in cross-tabulations or contingency tables:
In any two-by-two table (two columns, two rows, excluding marginals), DF = 1.
For all other tables, calculate DF as (c - 1) * (r - 1), where c = columns and r = rows. (So for a table with 3 columns and 4 rows, DF = ____.)
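The degrees-of-freedom rule can be written as a one-line function. This is only a sketch; the function name `degrees_of_freedom` is hypothetical.

```python
def degrees_of_freedom(columns, rows):
    """DF for a contingency table, not counting the marginal row and column."""
    if columns < 2 or rows < 2:
        raise ValueError("a cross-tabulation needs at least 2 columns and 2 rows")
    return (columns - 1) * (rows - 1)


# The two-by-two case is just the general formula with c = 2 and r = 2.
print(degrees_of_freedom(2, 2))  # prints 1
```

Calling the function with other table sizes is a quick way to check a hand calculation.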
Calculating Chi-Square
Formula is

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed value in a cell and E is the expected value in the same cell that we would see if there were no association. The sum is taken over every cell in the table.
First steps
Alternative hypothesis is: There is a relationship between income level and education for respondents in a survey of BA students.
Null hypothesis is: There is no relationship between income level and education for respondents in a survey of BA students.
Significance level set at .05
Rules for determining whether the chi-square statistic and probability are large enough to indicate a statistically significant relationship:
For hand calculations, use the degree(s) of freedom and the significance level you set to check the chi-square table found in most statistics books. For the chi-square to be statistically significant, it must be the same size as or larger than the number in the table.
On an SPSS printout, the p or significance value must be the same size as or smaller than your significance level.
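For a table with 1 degree of freedom, the significance value can be approximated by hand, because a chi-square variable with 1 DF is the square of a standard normal variable. The sketch below relies on that fact and uses only Python's standard library. It is not a description of how SPSS computes its output, and the function names are hypothetical.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


def is_significant(p_value, alpha=0.05):
    """Apply the rule above: p must be the same size as or smaller than alpha."""
    return p_value <= alpha


# Two illustrative statistics, chosen only to show both outcomes.
for statistic in (5.0, 1.0):
    p = p_value_df1(statistic)
    print(statistic, round(p, 4), is_significant(p))
```

For tables with more than 1 DF this shortcut does not apply, and a printed chi-square table or statistical software is needed.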
High 25 25 50
Low 25 25 50
Total 50 50 100
High 40 10 50
Low 10 40 50
Total 50 50 100
Chi-square calculation is
Cell     Expected Values      Chi-square
1        50 * 50/100 = 25     (40-25)^2/25 = 9
2                             9
3                             9
4                             9
Total                         36
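The hand calculation can be reproduced in a few lines. This sketch derives the expected values from the marginals (row total times column total, divided by the grand total) and then sums (O - E)^2 / E over the four cells of the 40/10/10/40 table. The function names are illustrative.

```python
# Observed frequencies: rows are High and Low, without the marginals.
observed = [[40, 10], [10, 40]]


def expected_counts(observed):
    """Expected cell counts under no association, computed from the marginals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square(observed):
    """Sum of (O - E)^2 / E over every cell."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


print(expected_counts(observed))  # [[25.0, 25.0], [25.0, 25.0]]
print(chi_square(observed))       # 36.0
```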
Urban    20   30    50
Rural    40   10    50
Total    60   40   100
Cell 1: 3.33
Total chi-square: 16.67
At 1 DF at .01, chi-square must be greater than 6.64. Do we accept or reject the null hypothesis?
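The Urban/Rural figures can be checked the same way. The sketch below computes each cell's contribution, sums them, and compares the total with the 6.64 critical value quoted above. The constant name `CRITICAL_VALUE` is illustrative.

```python
# Observed frequencies: rows are Urban and Rural, without the marginals.
observed = [[20, 30], [40, 10]]
CRITICAL_VALUE = 6.64  # 1 DF at the .01 level, as given above

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# One (O - E)^2 / E term per cell, in the order cell 1, 2, 3, 4.
contributions = [
    (o - rt * ct / n) ** 2 / (rt * ct / n)
    for row, rt in zip(observed, row_totals)
    for o, ct in zip(row, col_totals)
]
chi_square = sum(contributions)

print([round(c, 2) for c in contributions])  # [3.33, 5.0, 3.33, 5.0]
print(round(chi_square, 2))                  # 16.67
print(chi_square > CRITICAL_VALUE)           # True
```

A statistic larger than the critical value means the result is statistically significant at the chosen level.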
Chi-Square Tests
                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              2.569a    5    .766
Likelihood Ratio                2.590     5    .763
Linear-by-Linear Association     .087     1    .768
N of Valid Cases                336
a. 2 cells (16.7%) have expected count less than 5. The minimum expected count is 1.57.
Recode
To use ratio or interval level variables in a cross-tabulation in SPSS, you need to recode the variable into a categorical (nominal or ordinal) variable. You first need to decide how you will set up categories and assign a number to each one. For example, if the values of your ratio variable Age are 25, 37, 42, 50, and 64, you might decide on two categories:
1 = under 50
2 = 50 and over
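The same recoding logic, sketched in Python rather than SPSS, using the example ages and the two categories above. The function name `recode_age` is hypothetical.

```python
ages = [25, 37, 42, 50, 64]


def recode_age(age):
    """Return 1 for under 50 and 2 for 50 and over."""
    return 1 if age < 50 else 2


age_group = [recode_age(age) for age in ages]
print(age_group)  # [1, 1, 1, 2, 2]
```

The original ages are kept in `ages`, and the categories go into a new list, which mirrors recoding into a different variable rather than overwriting the original.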
Recode Instructions
Go to Transform menu
Go to Recode
Select different variable
Type in new variable name
Click Continue
Enter range of ratio numbers for first category (25 to 49)
Enter number for first category (1) in right hand screen
Click Add
Enter range of ratio numbers (50 to 64) for category two
Enter number for second category (2)
Click Add
Click Continue
Click Change
Click OK