
PL4201 Psychometrics and Psychological Testing Item Analysis

Item Analysis in an Educational Setting

Sometimes, especially in an educational context, we are interested in the relation between an item and the overall test scores (e.g., which items on a math test best discriminate between students who scored high and those who scored low on the test). We might use extreme groups (the top 25-33% versus the bottom 25-33% of the students) to examine some item statistics.

Example: We have 60 students in a class and we take the top and bottom 33% as extreme groups. Therefore, there are 20 students in each group. The table shows the number of students in each group who passed each item.

Item                       1    2    3    4    5    6
Top 33% (T)               15   20    3   10   17   12
Middle 33% (M)             9   20    1   11   10   11
Bottom 33% (B)             7   16    0   16   10   11
Difficulty (T + M + B)    31   56    4   37   37   34
Discrimination (T - B)     8    4    3   -6    7    1

Items 1 and 5 are good items. Items 2 and 3 have extreme difficulty levels: Item 2 is too easy (56 of 60 students passed) and Item 3 is too difficult (4 of 60 passed). Items 4 and 6 have low indices of discrimination: Item 6 barely differentiates top from bottom students, and Item 4 was passed more often by the bottom group than by the top group. One can also translate the frequencies to proportions and obtain the difference in proportions (e.g., for Item 4, T = .50 passing and B = .80 passing, so the difference = -.30).
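The two indices in the table can be recomputed with a few lines of Python. This is only a minimal sketch: the counts are copied from the example, while the function name and the dictionary keys are illustrative choices rather than part of the handout.

```python
# Counts of students passing each item, copied from the example table.
# Each group has 20 students (60 students split into thirds).
GROUP_SIZE = 20
top = [15, 20, 3, 10, 17, 12]
middle = [9, 20, 1, 11, 10, 11]
bottom = [7, 16, 0, 16, 10, 11]


def item_statistics(top, middle, bottom, group_size):
    """Return one dict per item with the difficulty and discrimination indices."""
    stats = []
    for item, (t, m, b) in enumerate(zip(top, middle, bottom), start=1):
        stats.append({
            "item": item,
            "difficulty": t + m + b,               # number passing, all students
            "discrimination": t - b,               # top minus bottom, as frequencies
            "p_top": t / group_size,               # proportion of top group passing
            "p_bottom": b / group_size,            # proportion of bottom group passing
            "p_difference": (t - b) / group_size,  # difference in proportions
        })
    return stats


stats = item_statistics(top, middle, bottom, GROUP_SIZE)
for row in stats:
    print(row)
```

For Item 4 the sketch gives p_top = .50, p_bottom = .80 and p_difference = -.30, matching the worked figures in the text.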

One can also calculate the phi coefficient in the above example by taking the T and the B groups. Hence, for Item 1:

                  Pass
Groups     1            0            Total
T (1)      15 (.375)     5 (.125)    .50
B (0)       7 (.175)    13 (.325)    .50
Total      .55          .45          1.00

NB: Proportions are in parentheses (e.g., for cell11, proportion = 15/40 = .375). With \(p_{ij}\) the proportion in cell11, \(p_i\) the proportion of students in group T, \(p_j\) the proportion passing the item, and \(q = 1 - p\), the phi correlation is

\[ \phi = \frac{p_{ij} - p_i p_j}{\sqrt{(p_i q_i)(p_j q_j)}} = \frac{0.375 - (0.5)(0.55)}{\sqrt{(0.5)(0.5)(0.55)(0.45)}} = 0.40 \]

Or, one can compute a biserial correlation between passing of one item (1, 0) and the total score (a continuous scale) for the 60 students. The biserial correlation is used here because the items are assessing a continuous variable (math ability) that is artificially dichotomized (i.e., pass or fail on a math item).
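As a check on the arithmetic, the phi coefficient for Item 1 can be computed directly from the four cell counts. The sketch below assumes the 2x2 layout used for Item 1 (rows T and B, columns pass and fail); the function name and argument names are illustrative.

```python
from math import sqrt


def phi_from_counts(a, b, c, d):
    """Phi coefficient for a 2x2 table of counts laid out as

                 pass   fail
        top        a      b
        bottom     c      d
    """
    n = a + b + c + d
    p_ij = a / n        # proportion in the top group AND passing (cell11)
    p_i = (a + b) / n   # proportion in the top group
    p_j = (a + c) / n   # proportion passing the item
    q_i, q_j = 1 - p_i, 1 - p_j
    return (p_ij - p_i * p_j) / sqrt(p_i * q_i * p_j * q_j)


# Item 1: 15 of the top 20 and 7 of the bottom 20 passed.
phi_item1 = phi_from_counts(15, 5, 7, 13)
print(round(phi_item1, 2))  # 0.4
```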

In this context, item analysis can be used to identify possible deficiencies in the items (e.g., an item that is too difficult, is ambiguously phrased, or has two answer options that are both plausible in a multiple-choice question) so as to either revise or discard them. Or, the teacher might want to clarify certain teaching materials again to ensure that the students understand them. The underlying assumption in this analysis is that students who score high on the test overall should do well on most of the items (compared to low-scoring students). If an item functions in such a way that high-scoring students do not differ from low-scoring students, then one might want to investigate the item closely. The phi coefficient and the biserial correlation indices for each item can be used to select good items for a subsequent test. As such, we would want to select items with high indices.
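The handout does not spell out a formula for the biserial correlation. One common textbook form is r_b = ((M1 - M0) / s) * (pq / y), where M1 and M0 are the mean total scores of students who passed and failed the item, s is the standard deviation of the total scores, p is the proportion passing, q = 1 - p, and y is the height of the standard normal curve at the point that divides the area into p and q. The sketch below assumes that form and uses the population standard deviation; the eight scores are hypothetical and are not data from the class of 60.

```python
from statistics import NormalDist, mean, pstdev


def biserial(item_scores, total_scores):
    """Biserial correlation between a 1/0 item and a continuous total score.

    Assumes the textbook form r_b = ((M1 - M0) / s) * (p * q / y).
    Requires at least one pass and one fail, otherwise p is 0 or 1.
    """
    passed = [t for x, t in zip(item_scores, total_scores) if x == 1]
    failed = [t for x, t in zip(item_scores, total_scores) if x == 0]
    p = len(passed) / len(item_scores)  # proportion passing the item
    q = 1 - p
    std_normal = NormalDist()
    # Height of the standard normal curve at the cut point for proportion p.
    y = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean(passed) - mean(failed)) / pstdev(total_scores) * (p * q / y)


# Hypothetical data for eight students (illustration only).
item = [1, 1, 1, 1, 0, 1, 0, 0]
total = [28, 25, 24, 22, 20, 18, 15, 12]
r_bis = biserial(item, total)
print(round(r_bis, 2))
```

With real data, the same function would be applied to each item in turn, and items with higher values would be the candidates for a subsequent test.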

Item Analysis Exercise


Dr Smartypants gave a 30-item pop quiz on psychometrics to his class. The table below presents a selection of 5 items from this pop quiz. All 30 questions were multiple-choice questions with 4 alternatives each. Some relevant item statistics are shown in the table.

Item   Key   Prop. Correct   Index of Discrim.   Biserial r   Prop. Endorsing
                                                              A      B      C      D
1      C     .38             .41                  .24         .24    .14    .38    .24
2      D     .67             .28                  .31         .24    .00    .09    .67
3      B     .14             .10                 -.14         .81    .14    .05    .00
4      B     1.00            .00                  .00         .00    1.00   .00    .00
5      A     .51             .48                  .38         .51    .03    .17    .29

The key for each item is the alternative whose endorsement proportion equals the proportion correct.

Questions:
1. Can you identify problematic items? Explain why they are problematic.
2. Which item would you say is the best item? Why?
3. Which items would you choose for a revised test?
