7 July 2015
Applications of Factorial Surveys in Social Sciences
Factorial Surveys: Idea and Motivation
Research Aims and Advantages
Advantages:
Experimental design
Heterogeneous respondents
Less social desirability bias
Example: Discrimination against Female Employees?
Caused by discrimination?
Underlying mechanisms?
Results for Germany: Mean Justice Evaluations
Just Gender Pay Gaps in %
[Bar chart: just gender pay gaps of 8.88, 9.78, 7.25, 7.18, and 5.81 percent]
Hints on Underlying Mechanisms
Or nothing more than artifacts?
Criticism: A Test of Intelligence (Faia 1980)
Criticism: Untenable Results? (Markovsky/Eriksson 2012)
Indirect evaluations could lead to untenable, biased results with regard to fair earnings
(M/E 2012: 199); one reason would be strong range and anchor effects.
Using direct measurements as a reference point (directly asking for fair amounts of earnings
without providing anchors in the vignettes), M/E found indirect measurements to be completely unreliable:
[Scatter plot: indirect measurement [100,000 USD] vs. direct measurement [100,000 USD]; r = -.07 (without outlier: .25)]
Criticism: Untenable Results? (Markovsky/Eriksson 2012)
[Scatter plot: indirect vs. direct measurement [100,000 USD], with fitted justice evaluation function (replication of Jasso/Meyersson Milgrom 2000)]
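For reference, the "justice evaluation function" fitted in this replication follows Jasso's well-known logarithmic specification (a standard formulation in this literature, not reproduced on the original slide):

```latex
% Jasso's justice evaluation function:
%   J     -- justice evaluation of an observed reward
%   A     -- actual reward (e.g. observed earnings)
%   C     -- just reward (the amount deemed fair)
%   theta -- signature constant scaling the expressed judgment
J = \theta \, \ln\!\left(\frac{A}{C}\right)
```

J = 0 indicates a reward judged perfectly just, J < 0 an under-reward, and J > 0 an over-reward.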
Revisiting Research with Split Ballot Experiments
Several online surveys with university students (in total n > 2,000 students)
[Several large-scale population surveys in Germany, Ukraine, Kyrgyzstan]
Simply Artifacts? Construct Validity and Anchor Effects
Reasons for Range and Anchor Effects
One reason could be priming effects caused by the income values indicated in the vignettes
(assimilation or contrast effects; M/E 2012; Groves et al. 2011).
Another reason could be respondents' tendency to use the whole response scale,
including its endpoints (Tourangeau et al. 2000):
Simply Artifacts? Impact of Range (Anchors)
[Figure: mean evaluations with 95% CIs, low vs. high range condition; estimation with robust standard errors]
Simply Artifacts? Impact of Evaluation Task
[Bar chart, euros 0–9,000: vignette earnings vs. fair earnings vs. plausible earnings]
Reliability Revisited
Further experiment, comparing rating scales and direct evaluations; D-efficient sample of
vignettes, simple cross-elasticities to calculate fair earnings, n = 68 university students.
[Two scatter plots: indirect measurement vs. direct measurement [100,000 USD], and indirect vs. direct measurement in euros]
Use Standard Response Scales
Example of a Magnitude Scale
Own Survey (Split Ballot)
Experiment with n = 448 university students; online survey; three-step answer scale: (a) fair
or not? (b) if not: too high or too low? (c) if not: which number would indicate the amount of
injustice?
Respondents didn't use magnitude scales as metric scales.
[Histogram: distribution of responses, in percent (0–25%)]
Own Survey (Split Ballot)
Rating scales clearly outperformed magnitude scales (fewer missing values, fewer response
sets, fewer implausible results).

                                          Rating   Magnitude
Proportion of missing values                3%        14%
Proportion of vignettes evaluated as fair  23%        32%
With magnitude scales we got implausible and unreliable (or untenable) results that
resembled the findings of Markovsky/Eriksson (2012).
At least in self-administered surveys we recommend using standard response scales.
Use Smart Experimental Designs for Sampling Vignettes
Sampling Means Confounding
[Figure: two alternative vignette samples, Sample A and Sample B]
Main effects get confounded with interaction terms. The data do not allow deciding
which equation holds:
Y = β1·D1 + β2·D2
Y = β1·D1 + β2·D1·D2
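As a minimal sketch of this confounding (hypothetical dummy data, numpy only): if a badly chosen vignette sample only ever shows D2 = 1 together with D1 = 1, the column for D2 is identical to the column for the interaction D1·D2, and no regression on these data can tell the two equations apart.

```python
import numpy as np

# Hypothetical vignette sample: dummies D1, D2 for two vignette dimensions.
# In this (badly chosen) sample, D2 = 1 occurs only when D1 = 1.
D1 = np.array([0, 1, 1, 1, 0, 1])
D2 = np.array([0, 0, 1, 1, 0, 1])

# The interaction column equals D2 itself -> perfectly collinear:
interaction = D1 * D2
print(np.array_equal(D2, interaction))  # True

# Consequently the design matrices [D1, D2] and [D1, D1*D2] are identical,
# so Y = b1*D1 + b2*D2 cannot be distinguished from Y = b1*D1 + b2*D1*D2.
X_a = np.column_stack([D1, D2])
X_b = np.column_stack([D1, interaction])
print(np.array_equal(X_a, X_b))  # True
```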
Don't Leave Confounding to Chance!
With random samples one leaves confounding patterns simply to chance (Kuhfeld et al. 1994;
Carson et al. 2009).
In contrast, deliberate, fractionalized D-efficient samples allow controlling which effects get
confounded.
At the same time they allow optimizing further target criteria (like the precision of parameter
estimates): the higher the D-efficiency, the more precise the parameter estimates.
(With one sample being x times as efficient as another, one would need only about 1/x as many respondents
to achieve the same precision of parameter estimates; see also Dülmer 2007, forthcoming.)
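To illustrate how D-efficiency is quantified (a sketch using the common normalization from the design-of-experiments literature, with made-up example designs, not the authors' software): D-efficiency is 100·|X'X|^(1/p)/n for a design matrix X with n rows and p columns, so an orthogonal full factorial in ±1 coding scores 100 and less balanced samples score lower.

```python
import numpy as np

def d_efficiency(X):
    """D-efficiency in percent: 100 * |X'X|^(1/p) / n.
    Equals 100 for an orthogonal design with +/-1 coded columns."""
    n, p = X.shape
    return 100 * np.linalg.det(X.T @ X) ** (1 / p) / n

# Full 2x2 factorial in effects (+/-1) coding, with intercept column:
full = np.array([
    [1, -1, -1],
    [1, -1,  1],
    [1,  1, -1],
    [1,  1,  1],
], dtype=float)

# A lopsided sample of the same size (one vignette drawn twice):
lopsided = np.array([
    [1, -1, -1],
    [1, -1, -1],
    [1,  1, -1],
    [1,  1,  1],
], dtype=float)

print(d_efficiency(full))      # ~100.0
print(d_efficiency(lopsided))  # ~79.4, i.e. clearly less efficient
```

Per the rule of thumb on the slide, the lopsided sample would need roughly 100/79.4 ≈ 1.26 times as many respondents for the same precision.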
Summary and Recommendations
Thank you for your attention!
Further information:
https://study.sagepub.com/auspurg_hinz
http://www.fb03.uni-frankfurt.de/49649875/publikationen
References
Appendix: Censoring of Responses? Vignette Order Effects
Appendix: Simulations Efficiency by Sample Size
With D-efficient samples, far fewer respondents are needed for the same precision
of parameter estimates, and one can control which effects get confounded.