
International Journal of Educational Development 33 (2013) 106-115


International Journal of Educational Development


journal homepage: www.elsevier.com/locate/ijedudev

Achievement versus aptitude in college admissions: A cautionary note based on evidence from Chile
Mladen Koljatic *, Mónica Silva, Rodrigo Cofré
Escuela de Administración, Pontificia Universidad Católica de Chile, Vicuña Mackenna 4860, Santiago, Chile

ARTICLE INFO

Keywords: College admission; Achievement tests; Aptitude tests; Fairness in testing; Chile

ABSTRACT

In recent years there has been a debate over the alleged superiority of achievement tests over aptitude tests on the grounds that the former would be fairer for college admissions and less influenced by family background. The switch from aptitude tests to achievement tests in Chile presented a unique opportunity to examine this claim. Regression analysis was used to assess the impact of the change in test performance using data from seven cohorts of test-takers. The evidence does not support the superiority of achievement tests, particularly when these assess extensive contents.

© 2012 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +56 2 354 4371; fax: +56 2 553 1672.
E-mail addresses: mkoljati@uc.cl (M. Koljatic), msilvara@uc.cl (M. Silva), racofre@uc.cl (R. Cofré).
0738-0593/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.ijedudev.2012.03.001

1. Introduction

Most countries in the world, particularly developing ones, face an expanding number of students who demand college education (World Bank, 2000). Sorting out those who have the abilities to pursue college education is a difficult task. Not only is it necessary to devise a systematic decision-making process to choose the best qualified among applicants, but such a process needs to be equitable with respect to subgroups defined by ethnicity, gender and socioeconomic status.

From Brazil to China, policy makers debate which is the best and fairest method to select students for higher education. In recent years Brazil has moved away from custom-designed institutional admission tests toward a national entrance exam used by public universities (Wildavsky, 2010). In China, some officials at the ministry of education are questioning the fairness of their traditional gaokao exam because of the gulf in quality between rural and urban schools (The Chronicle, 2010).

In the U.S. the quest for fair college admission has a long history and was the starting point of aptitude testing in the early twentieth century, when James B. Conant, president of Harvard University, aimed to open an avenue for the more capable students to gain admittance to the institution. He was aware that the tests in use at the time were much too influenced by the prior academic opportunities of the students and sought to find a way to detect talented individuals independently of the quality of their educational experiences. Unlike achievement tests, which assessed mastery of specific subjects taught in school, aptitude or reasoning tests focused on measuring verbal and mathematical abilities not directly tied to the curriculum, with an emphasis on reasoning skills, critical thinking and problem solving that were deemed relevant for college-level studies. The idea behind the quest for aptitude tests was to help find extraordinarily talented students "whom you'd otherwise miss because they haven't had the chance to go to good schools" (Lemann, 2004, p. 11).

Almost one century later, the controversy over the nature of content and fairness of tests is still unsettled. While some have argued for a return to achievement tests (e.g., see Atkinson, 2001), others believe that aptitude tests deserve consideration if the aim is to identify talent (Lohman, 2004). Still others affirm that the pursuit of academic excellence and the enhancement of diversity are possible through the use of appropriate measurement instruments, albeit not the traditional ones (Sternberg, 2006).

In the U.S., the changes introduced to the SAT in 2005 were fueled, among other things, by concerns of equity and fair access. As expressed by the President of the University of California: "Achievement tests are fairer to students because they measure accomplishment rather than ill-defined notions of aptitude... they are less vulnerable to charges of cultural or socioeconomic bias" (Atkinson, 2001, p. 35). Supporters of achievement tests have argued that admission tests based on material taught in the classroom would yield smaller score differences between economically deprived and affluent groups than traditional aptitude tests, a notion that was questioned by Zwick (2004), who found that the stronger linkage of the ACT test to high school curricula did not translate into smaller score gaps when compared with the SAT.
Even though the controversy regarding the benefits of achievement tests appears to be far from settled, a recent report from the National Association for College Admission Counseling (NACAC) has also claimed benefits of achievement tests over aptitude tests, echoing Atkinson's (2001) view that an emphasis on achievement tests would encourage improvement of high school teaching and also reduce the inequities inherent in the current system (NACAC, 2008, p. 11).

The controversy in the U.S. over the benefits of achievement or content-based tests for students of lower socioeconomic level, and over the advisability of eliminating the SAT I from college admissions, had its reverberations in Chile, where local authorities in charge of the national university admission process decided in 2001 to eliminate the aptitude tests in use and replace them with achievement tests. The arguments proffered for the need to change the existing tests were strikingly similar to those put forward by Atkinson (2001), as pointed out by a high official from the University of California while on a visit to Chile (Henríquez, 2003).

As in the U.S., the new achievement tests were presented as an avenue to increase the opportunities of access for low SES students and to simultaneously improve the quality of secondary education (Bravo et al., 2000; Ministerio de Educación, 2000). Aptitude tests were declared to be unfair since scores reflected the cultural capital of the home (Brunner, 2002). Conversely, achievement tests, by focusing on school learning and personal effort, were assumed to be more equitable. According to one of the leaders behind the move to switch tests, increasing the amount of contents assessed in the admission tests should help advance fairness. He argued that the more contents added to the tests, the greater the chance of making them more equitable, since the impact of factors such as family background and cultural capital, present in aptitude tests, would be reduced (Diario Austral, 2002).

The following sections offer an account of the context of the change of tests in Chile, address the purpose of the study and the methods employed for the analysis, report the results, and discuss the implications of the findings.

2. The context of the change of tests

In the year 2000, an educational reform was underway in Chile. The national secondary education curriculum was being revised and ministry officials were committed to replacing the SAT-type tests (APT) that had been in use for three decades with a new set of achievement tests (ACH) aligned with the revised secondary school curricula. In addition to sorting students for admission purposes, the new achievement tests were intended to promote the sustainability of the secondary education reform and to assess its educational outcomes (Ministerio de Educación, 2000; World Bank, 2001).

Since 1967 and until the change of tests in 2003, the APT Verbal and Mathematics tests in use were similar to the American SAT, tapping what were considered basic reasoning abilities required for success in college. The tests were offered once a year and required for admission to any university receiving public funds. The APT tests were complemented by a set of achievement tests in the vein of the SAT-II Subject Tests that assessed the knowledge of applicants in advanced math, physics, biology, chemistry and social sciences. Still, the pillar of the admission system was the APT, since the subject tests were only required by a few of the more prestigious institutions for admittance to some of their programs.

Unlike the case of the U.S., where admission decisions were made on the basis of a comprehensive review of the background of the student (including test scores, grades, recommendation forms, essays, and diversity considerations), admission decisions in Chile were made solely on the basis of the scores obtained in the admission tests and high-school grade point average (HSGPA). Additionally, scores in the admission tests carried weighty consequences for students, since state-funded scholarship programs and financial aid were tied to outcomes on the national admission tests. Student test score performance was also relevant for institutions of higher education, with universities vying to attract the top scorers in the admission tests because the enrollment of the best translated into additional public funding for the institutions.

3. The development of the new tests

In 2001, the Ministry of Education informed the public that a project to develop new admission tests was under way to eliminate the APT battery and replace it with a set of four mandatory achievement tests in Mathematics, Verbal (Language Skills), Social Sciences, and Science. The new admission battery would examine 100% of the newly reformed national curriculum from grades 9 to 12 that could be assessed via multiple choice tests. All applicants to publicly funded universities would be required to take the four tests in order to be eligible for admission.

The project was the focus of a public debate on the advisability of the change. The reservations stemmed from the nature of the new tests and from the highly segmented secondary education system, characterized by a markedly heterogeneous quality of schools in the nation (Beyer, 2002; Eyzaguirre and Le Foulon, 2002). In Chile, schools fell into three broad categories: municipal, which were state-funded public schools; private-subsidized, which received some funding from the state and charged a small tuition fee; and private-paying schools. The type of school attended was associated with socioeconomic status (SES), and the quality of education varied significantly among municipal, private-subsidized and private-paying schools according to national and international tests. The private-paying schools educated the most socioeconomically advantaged student group; the private-subsidized schools attracted lower to middle-income families, while the municipal schools catered to the poorest sections of society. Both the private-subsidized and the municipal schools offered a dual track curricular system: the general track (GT) and the vocational track (VT).
The new ACH tests were to be aligned with the national curriculum. Although the required curricular or minimum contents in subjects such as Mathematics and Language were allegedly the same, the VT and GT curricula differed in the number of class hours devoted to academic subjects. For example, while the GT curriculum included Philosophy, Biology, Chemistry and Physics in the last 2 years of high school, the VT curriculum did not. Although all schools were expected to cover the minimum contents prescribed in the national curriculum, not all schools achieved this goal. Thus, the quality of the education received by the students and the coverage of school content depended largely on the kind of school attended and track choice, with private-paying schools providing the best quality of education for those who could afford it (OECD, 2009). The students attending the municipal and private-subsidized schools offering the vocational tracks lagged behind in the quality of education they received, particularly those attending the municipal schools that catered to the poorest segment of the population.

Still, admittance to prestigious publicly funded universities was an aspiration for all, including students from the VTs: over 60% of VT graduates took the admission tests and only 9% expressed a desire to enter the labor force straight out of high school (Ministerio de Educación, 2008). Many prestigious universities in the nation offered a variety of technical programs and professional degrees via the centralized admission process that selected students through a combination of standardized test scores and grades.

In light of the public controversy that ensued over the elimination of the APT tests and the concerns expressed by educational experts about the hasty process of change and the risks it entailed for the educationally disadvantaged test-takers (Fontaine, 2002; Eyzaguirre and Le Foulon, 2002), the authorities made some concessions in order to ease the transition between the two batteries. Applicants would only be required to take three of the four ACH tests. The Verbal and Math tests would be compulsory and the choice of the third test would depend on the specific requirements of the academic programs where the applicant sought admission. Additionally, the contents included in the first versions of the ACH tests were reduced. However, the reduction in the contents was only temporary, since additional topics would be added yearly until the admission process of 2007, when the ACH battery would include 100% of the contents of the national GT high-school curriculum.

The ACH tests were first applied for the 2004 university admission process, with a 15% fall in the number of test-takers relative to the previous year (Koljatic and Silva, 2006). The loss of test-takers mostly affected graduates from municipal schools. The number of test-takers recovered and surpassed the 2003 level only after the Ministry of Education started a fee-waiver policy in 2007 aimed at low-income test-takers (Fig. 1 shows some milestones of the change process).

Fig. 1. Selected milestones of the process of change of tests.

In the years that followed the introduction of the ACH tests, the performance gap between municipal and private-paying schools grew. Newspaper headlines underscored the trend, but test developers responded that the growing gap between types of school was merely a consequence of the change in the SES profile of test-takers across the years, particularly after the onset of the fee-waiver policy of 2007, ruling out that the increase could be in part attributable to the type of test or the amount of contents assessed in them (PSU: Consejo de Rectores, 2008; Dalgalarrando, 2007; Molina, 2008).

4. Purpose of the study

The type of admission test and the amount of contents assessed are matters of consequence in developing nations with centralized admission processes that rely heavily on standardized tests for admission purposes. Whereas SES and access to high quality education cannot be readily changed to enhance fair access to higher education, the type of admission tests used and the amount of contents assessed in these can be modified in the short run. Under conditions of equal predictive capacity, the type of test that proves to be less detrimental to the opportunities of the most vulnerable groups should be preferred.

The purpose of this study is to assess whether ACH tests are indeed beneficial for the assessment of students from low socioeconomic groups, as has been claimed in the literature. The superiority of ACH tests when assessing socioeconomically disadvantaged groups has not been examined under conditions of markedly unequal quality of education and varying amounts of contents assessed in the tests. For this reason, the Chilean data offered a good, albeit not ideal, opportunity to test this claim. The ideal way of assessing the relative merits of both types of tests would have been to require all applicants to take both the APT and the ACH tests and directly compare the outcomes in terms of score performance gaps and prediction in the same sample before opting for one type of test. However, test developers in Chile downplayed the need to conduct this sort of comparative study before eliminating the APT tests in use (Visión Universitaria, 2002). The educational authorities opted to abide by the test developers' judgment, forgoing further studies. Still, the data available can be examined to assess whether ACH tests appear to be better suited than APT tests for fair assessment of disadvantaged students.

Two research questions are addressed:

(1) Are low-content ACH tests better suited than APT tests to assess test-takers from low SES? After accounting for differences in SES, does the use of achievement tests favor those exposed to high quality of education?
(2) Are high-content ACH tests better suited than APT tests to assess test-takers from low SES? After accounting for differences in SES, does the use of achievement tests favor those exposed to high quality of education?

5. Data and methods

5.1. Data

The study used a set of data files that contained the official test score records of all test-takers between 2002 and 2008. The files were provided by officials at Pontificia Universidad Católica de Chile, who received the original files directly from the testing agency. In the 2002 and 2003 data files the test scores corresponded to APT tests. From 2004 onwards the scores represented performance in ACH tests. The version of the ACH test that examined the lowest amount of contents was offered in 2004. Additional contents were added to the ACH tests yearly until 2007, when full curricular content was included in all ACH tests.¹ The data files also included socio-demographic variables and high-school grades, but no data were available on motivation to pursue studies or attendance at coaching programs, since this information was not required by the testing agency.

5.2. Measures

High-school achievement was measured by the grade point average (HSGPA) obtained by the student during the 4 years of high school, expressed on a scale that ranged between 200 and 800 points, with a mean of 540 points and a standard deviation of 100 points.
1 The 2006 version of the ACH tests was run separately because of conflicting information as to whether it included full contents. The coefficients for the type of school and test interactions were consistent with those for the high-content ACH tests.


Three variables associated with SES (self-reported family income, mother's education and father's education) were coded as ordinal scales in the database. The number of scale categories for each variable changed across the years, rendering the category descriptions non-equivalent, so they had to be redefined in order to make them comparable. A 6-point scale measured self-reported family income between 2002 and 2007; an 8-point scale was employed in 2008. To have a common and comparable measure of income for the period under study, the data were collapsed into three broad categories: low, middle and high income. Similarly, the number of scale categories for parental education changed from 10 to 13 in 2006. In this case, the categories were made homogeneous by collapsing them into four levels of educational attainment: elementary education, high school, technical training, and university education.

A proxy index for SES was calculated following Donaldson et al. (2008) that linearly combined three demographic variables: self-reported income, father's education and mother's education. The index ranged from 0 to 1, with higher values indicating higher SES.

Schools were categorized into six types according to the type of school (TS) attended (i.e. municipal, private-subsidized or private-paying) and the choice of program (vocational or general track). Yet only five categories were used in the analysis: private-paying/general track (PP_GT), private-subsidized/general track (PS_GT), private-subsidized/vocational track (PS_VT), municipal/general track (MU_GT) and municipal/vocational track (MU_VT). The sixth category, private-paying/vocational track, was not included since over 99% of the students from private-paying schools followed the general track curriculum. The combination of type of school attended and track (vocational vs. general) was considered to be a proxy of the quality of education.
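As an illustration of how a proxy index of this kind can be built, the sketch below recodes the three collapsed ordinal variables to the unit interval and averages them. The equal weights, the category labels and the names `rescale` and `ses_index` are assumptions made for this example; the text states only that the index combines the three variables linearly and ranges from 0 to 1, and the actual weights of Donaldson et al. (2008) are not reproduced here.

```python
# Hypothetical SES proxy: equal-weight average of three rescaled ordinal variables.
# The category labels and the equal weights are illustrative assumptions.

INCOME_LEVELS = ("low", "middle", "high")
EDUCATION_LEVELS = ("elementary", "high school", "technical", "university")


def rescale(value, levels):
    """Map an ordinal category to [0, 1] by its rank among the ordered levels."""
    return levels.index(value) / (len(levels) - 1)


def ses_index(income, father_education, mother_education):
    """Return a 0-1 index; higher values indicate higher SES."""
    parts = (
        rescale(income, INCOME_LEVELS),
        rescale(father_education, EDUCATION_LEVELS),
        rescale(mother_education, EDUCATION_LEVELS),
    )
    return sum(parts) / len(parts)


# Made-up respondents at the bottom, top and middle of the index.
lowest = ses_index("low", "elementary", "elementary")
highest = ses_index("high", "university", "university")
mixed = ses_index("middle", "university", "elementary")
```

Under these assumed weights the index reaches 0 only when all three variables are in their lowest category and 1 only when all three are in their highest.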
The best quality of education was offered by the PP_GT schools and the lower end was represented by the vocational track at the municipal schools.

Finally, in order to control for the changes in the scaling procedures applied to the APT and ACH tests and render scores on a common scale, test scores prior to 2006 were rescaled to approximate the scores to the same percentiles across the years. The re-scaled score distributions had a mean of 500 and a standard deviation of 110 (see Appendix A). The re-scaling procedure was based on the total scores since item-level data were not provided in the databases.

5.3. Methods

The decision of test makers and educational authorities to switch tests without allowing for a period of simultaneous application of both the APT and the ACH tests required the use of statistical control of confounding variables in order to address the relative merits of APT tests, low-content ACH tests, and high-content ACH tests as college admission tools. A hierarchical multiple regression analysis was employed to examine the impact of the selected predictors of interest and their interactions on the dependent variable (test scores).

The regression analysis was conducted in two steps. The first step estimated the impact of type of test (TT), SES, and the interaction variable SES*TT on test scores. Type of test (TT) was included as a dummy variable (APT = 0, ACH = 1). The SES*TT interaction component served to examine the claim that achievement tests are better suited for assessing students from low SES. It has a value equal to zero for APT, while it takes the value of SES for ACH. If the argument espoused by proponents of achievement tests holds (i.e. that achievement tests are better suited for assessing students of lower SES), the regression coefficient associated with SES*TT should be significant and negative.

The second step introduced the four school dummy variables that captured the combinations of type of school and track (PS_VT, PS_GT, MU_GT and PP_GT), where the reference category comprised students attending the vocational track at municipal schools (MU_VT), along with four two-way interaction terms representing the type of school by type of test interactions (PS_VT*TT, PS_GT*TT, MU_GT*TT and PP_GT*TT). This model assessed whether type of school (TS) and type of school by type of test interactions (TS*TT) contributed to the explanation of score variance over and beyond that explained by the variables introduced in the first step. Positive and significant regression coefficients for these interactions represented the gains associated with the use of ACH tests by those attending the different types of schools, after controlling for SES. This strategy provided a stringent test of the additional variance contributed by second-step variables to the prediction, because variables that are entered in the first step capture variance shared with variables entered later in the model.

Given that the focus of the present analysis lies in the interaction components, both standardized and unstandardized regression coefficients will be reported, since the latter are non-arbitrary metrics that allow for meaningful comparisons between groups (as recommended by Jaccard and Turrisi, 2003).

As is often the case in educational research, the predictor variables under study were correlated with each other; therefore the weights and the interpretations arising from them were context specific and subject to change radically with the addition or the deletion of a single predictor. Because of the correlation among SES and type of school predictors, the unique contribution of each depends on the order in which these are entered in the analysis, and thus the customary test of added subsets could be misleading and the magnitude of the regression coefficients per se may not be a good indicator of the usefulness of the predictor.
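The coding of the two steps can be sketched as follows. The function names, the dictionary layout and the figures in the usage lines are assumptions for illustration; only the variable names (TT, SES, the four school dummies and their interactions with TT) come from the description above. The `f_change` helper uses the standard F statistic for the increment in R-squared between nested models, which is assumed here to correspond to the Fchange values reported in the results.

```python
# Sketch of the two-step design coding; MU_VT is the reference category.
SCHOOL_DUMMIES = ("PS_VT", "PS_GT", "MU_GT", "PP_GT")


def design_row(ses, test_type, school):
    """Return (step1, step2) predictor dicts for one test-taker."""
    tt = 1 if test_type == "ACH" else 0          # APT = 0, ACH = 1
    step1 = {"TT": tt, "SES": ses, "SES*TT": ses * tt}
    step2 = {}
    for name in SCHOOL_DUMMIES:
        dummy = 1 if school == name else 0       # all zeros for MU_VT
        step2[name] = dummy
        step2[name + "*TT"] = dummy * tt
    return step1, step2


def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the R-squared gain when k_added predictors join the model.

    n is the sample size and k_full the number of predictors in the larger model.
    """
    numerator = (r2_full - r2_reduced) / k_added
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Hypothetical test-takers (values are made up for illustration).
ach_private = design_row(0.25, "ACH", "PP_GT")
apt_municipal = design_row(0.25, "APT", "MU_VT")

# Hypothetical R-squared values and sample size, not taken from the study.
example_f = f_change(r2_reduced=0.5, r2_full=0.6, n=104, k_full=3, k_added=1)
```

With this coding, every interaction term is zero for APT records, so each interaction coefficient measures how the corresponding slope or group difference shifts under the ACH tests.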
Consequently, in order to correctly interpret the relative importance of the predictors in the model, we opted to also include information on structure coefficients and a commonality analysis, as recommended by Courville and Thompson (2001), Zienteck and Thompson (2006), and Nimon (2010). The benefit of employing a commonality analysis in conjunction with the analysis of structure coefficients is that it is possible to determine how much variance each variable (or set of variables) uniquely contributes and how much each shares with every other variable in the regression. Commonality analysis partitions a regression effect into constituent, non-overlapping components. The partitioning process produces unique and common effects between variables (Nimon, 2010; Zienteck and Thompson, 2006).

The results presented below are based on data for the first-time test-takers who took the tests immediately following high-school graduation. First-timers differ from the deferred cohort of test-takers in terms of demographic variables; the latter are an older group and obtain higher test scores. There is no evidence to gauge whether this pattern is due to differences in motivation, coaching or a combined effect of these and other variables.²

Separate regressions were estimated for the entire sample of first-time test-takers and the subset of higher-performers in high school (i.e. those in the upper tercile of HSGPA). The group of high-performers in high school was deemed to be an important segment of the test-taking population, since needy students in the upper range of high-school grades have a greater opportunity of securing merit grants and financial aid if they perform well in the admission tests.

Statistical analyses were conducted for the scores for the two mandatory tests, Verbal and Mathematics. Since both yielded similar results, with slightly higher variance explained for the Mathematics test, only the latter are reported here.

2 As a validity check, the regression analyses were also run for the group comprising the deferred cohort of test-takers and repeat test-takers. In general, the results were consistent with those obtained for the first-timers, albeit with less variance explained by the model for the deferred cohort.

6. Results

6.1. Descriptive statistics

Table 1 shows the profound socioeconomic stratification prevalent in the Chilean secondary school educational system. Students enrolled in the vocational tracks at the municipal schools ranked lowest in all SES-related variables, followed by those in the vocational tracks at the private-subsidized schools.

Table 2 shows the evolution of test score averages across the years. As expected, high correlations were observed between test scores and SES and between test scores and type of school (see Table 3). The correlation between type of school and SES (r_TS/SES) remained stable throughout the years, indicating the pervasive segmentation of SES by type of school.

The following sections deal with the magnitude of the changes and the extent to which the gap growth can be attributed to both the changes in the type of test and the amount of contents assessed in them.

6.2. APT test vs. low-content ACH test

The results of the regression analysis that examined whether lower-content ACH tests (ACH2004 and ACH2005) are better suited than APT tests (APT2002 and APT2003) to assess disadvantaged groups are presented in the first four columns of Table 4, for all first-time test-takers and the subset of high-performers in high school.

For all test-takers and for the subset of high-performers in high school, the set of variables at the first step explained 22% (F = 36,190; p < .00) and 31.5% (F = 17,313; p < .00) of the variance, respectively. For the full sample, the regression coefficient for the SES*TT interaction, albeit small, appeared to support the claim by proponents of achievement tests that their use favored test-takers from low SES (b = 13.4, t = 11.63, p < .00). However, for high-performers the SES*TT coefficient was non-significant (b = 1.7, t = .98, p < .33).

The inclusion of the type of school (TS) variables and their interactions at Step 2 increased the variance explained by the model by approximately 2.7% (Fchange = 1732; p < .00) for the full sample and 5.6% (Fchange = 1260; p < .00) for the high-performing group, indicating an additional contribution of TS and TS*TT interactions beyond the variance accounted for by Step 1 variables. For both the entire sample and the high-performers, the values of the regression coefficients for SES*TT showed a small positive effect for disadvantaged groups (b = 18.8, t = 12.60, p < .00; b = 11.9, t = 4.93, p < .00). However, the TS*TT interactions indicated small gains in scores for those attending general tracks when assessed by ACH tests, i.e. favoring the more affluent groups of the population. The highest gains when assessed by ACH tests were associated with attendance at private-paying schools (b = 17.2, t = 11.90, p < .00) and the municipal schools (b = 15.8, t = 14.90, p < .00). For

Table 1
Selected socio-economic statistics of test-takers according to type of school, track and year (percent).

                             2002   2003   2004   2005   2006   2007   2008
Father
% Attaining elementary education only
  Private-paying/GT           2.9    2.4    1.8    1.7    1.6    1.6    1.3
  Private-subsidized/GT      14.0   13.4   12.3   11.2   11.0   12.3   12.3
  Private-subsidized/VT      31.8   32.7   31.2   30.5   28.6   31.9   32.4
  Municipal/GT               28.7   27.9   26.7   26.2   26.2   28.4   28.6
  Municipal/VT               38.8   38.4   37.2   35.7   36.7   38.9   39.0
% Of fathers employed in blue collar/menial jobs
  Private-paying/GT           1.4    1.2    1.2    1.2    1.2    1.4    1.2
  Private-subsidized/GT      10.1   10.1   10.5   11.6   12.4   14.5   15.6
  Private-subsidized/VT      26.5   27.2   26.6   29.5   29.6   34.0   35.4
  Municipal/GT               23.6   22.8   23.4   26.2   27.3   31.1   32.9
  Municipal/VT               33.7   33.7   35.3   36.0   38.4   42.0   43.3

Mother
% Attaining elementary education only
  Private-paying/GT           2.8    2.5    1.8    1.5    1.5    1.3    1.3
  Private-subsidized/GT      15.0   14.4   12.6   11.7   10.7   11.5   11.9
  Private-subsidized/VT      35.5   35.1   32.0   30.0   28.1   31.7   31.5
  Municipal/GT               30.8   29.7   28.1   26.0   25.8   27.8   28.1
  Municipal/VT               40.8   41.1   38.9   35.4   36.9   38.2   38.6
% Of mothers employed in blue collar/menial jobs (a)
  Private-paying/GT           1.0    0.9    0.8    0.7    0.8    0.8    0.8
  Private-subsidized/GT       6.1    6.3    6.4    6.3    6.5    7.3    8.0
  Private-subsidized/VT      15.8   15.4   16.1   14.8   15.1   15.7   17.5
  Municipal/GT               10.8   10.8   11.4   10.8   11.6   12.1   13.1
  Municipal/VT               16.1   16.7   16.1   14.7   16.1   16.2   17.5

Family
% Reporting family income of US$ 500/month or less
  Private-paying/GT           9.7    7.6    7.0    8.1    6.4    6.6   15.8
  Private-subsidized/GT      49.8   48.3   45.4   46.9   45.2   48.2   46.1
  Private-subsidized/VT      76.4   74.9   73.8   75.8   73.0   82.4   77.4
  Municipal/GT               71.7   70.1   69.5   71.9   71.1   75.1   72.1
  Municipal/VT               84.3   83.4   82.2   83.2   83.4   88.7   85.4

GT: general track; VT: vocational track.
(a) The relatively lower rates of maternal employment in menial jobs are consistent with data from international studies where Chile ranks among the countries with lower female employment rates (OECD, 2011). Paternal unemployment of timely test-takers was less than 4% across the years while maternal unemployment climbed over 50%. The group attending private-paying schools reported higher rates of maternal employment.

Table 2
Mathematics test score means, standard deviations and sample size of first-time test-takers by year.

                            APT2002  APT2003  ACH2004  ACH2005  ACH2006  ACH2007  ACH2008
Private-paying_GT
  Mean                          588      603      577      586      591      603      609
  SD                            145      145      112      110      111      106      103
  N                          20,917   19,116   18,011   18,433   18,333   18,708   18,871
Private-subsidized_GT
  Mean                          492      495      500      503      502      511      510
  SD                            131      134      100      101      101      101      100
  N                          30,005   32,887   33,566   40,203   46,634   49,460   54,191
Private-subsidized_VT
  Mean                          411      406      430      430      432      434      433
  SD                             93       96       85       88       87       85       85
  N                          12,669   11,930   10,560   11,589   12,364   18,563   19,981
Municipal_GT
  Mean                          445      449      466      476      475      475      475
  SD                            123      130      109      109      108      107      108
  N                          37,969   36,252   32,251   33,913   35,616   39,051   39,780
Municipal_VT
  Mean                          418      411      434      435      431      431      427
  SD                             93       96       83       86       85       82       84
  N                          16,176   16,729   12,956   14,072   14,581   22,040   22,554
Total
  Mean                          475      477      488      493      492      491      491
  SD                            137      141      112      113      112      112      112
  N                         117,736  116,914  107,344  118,210  124,528  148,622  155,377

high-performers in high-school, the same trend was observed, with even larger gains for those attending the general tracks at private-paying schools or municipal schools when assessed by ACH tests (b = 22.5, t = 9.54, p < .00; b = 22.1, t = 12.47, p < .00). The squared structure coefficients showed a pattern consistent with the regression coefficients for both the total sample and the subset of high-performers, indicating that the strongest predictors of scores were SES (r_s²(SES) = 0.887; r_s²(SES) = 0.848) and attendance at private-paying schools (r_s²(PP_GT) = 0.528; r_s²(PP_GT) = 0.518). Among the interactions, the highest squared structure coefficients corresponded to the PP_GT*TT interactions (r_s²(PP_GT*TT) = 0.249; r_s²(PP_GT*TT) = 0.239) and the SES*TT interactions (r_s²(SES*TT) = 0.236; r_s²(SES*TT) = 0.218), indicating their relevance for score prediction in both the total sample and the high-performers. The partitioning of variance through commonality analysis for the full sample revealed that Step 1 variables contributed 6.9% of unique variance, while school-related variables accounted for 2.7% of unique variance. The common contribution of SES and TS variables (i.e. shared variance) amounted to 15.1%. For the high-performers the unique variance accounted for by Step 1 variables was 6.8%; the unique variance associated with TS and interaction variables was 5.6%, while shared variance was 24.7%. Common variance between SES and TS exceeded their unique contributions for both groups. The unique variance of TS variables, albeit smaller than that for SES, was still substantive, particularly for the group of high-performers in high-school.
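As a quick consistency check on the variance partition reported above, the following minimal Python sketch applies the standard two-set commonality identities to the rounded figures for the low-content comparison. The function and argument names are illustrative and are not taken from the authors' analysis; the inputs are the adjusted R² values printed in Table 4 and the unique Step 1 variance quoted in the text.

```python
def partition_two_sets(r2_step1, r2_full, unique_step1):
    """Split the R^2 of a two-step hierarchical model into unique and shared parts.

    Assumes exactly two predictor sets: Step 1 (SES and test-related terms)
    and Step 2 (school-type terms and their interactions). All inputs are
    proportions, e.g. 0.247 for 24.7%.
    """
    # Variance that Step 2 adds beyond Step 1 is, by definition, unique to Step 2.
    unique_step2 = r2_full - r2_step1
    # Whatever Step 1 explains that is not unique to it is shared with Step 2.
    shared = r2_step1 - unique_step1
    return {
        "unique_step1": round(unique_step1, 3),
        "unique_step2": round(unique_step2, 3),
        "shared": round(shared, 3),
    }


# Low-content comparison, full sample (rounded values from Table 4).
full_sample = partition_two_sets(r2_step1=0.220, r2_full=0.247, unique_step1=0.069)
# Low-content comparison, high-performers.
high_performers = partition_two_sets(r2_step1=0.315, r2_full=0.371, unique_step1=0.068)

print(full_sample)
print(high_performers)
```

Under these assumptions the three components add up to the Step 2 R², which is how the reported percentages (6.9, 2.7 and 15.1 for the full sample) relate to the R² and R² change columns.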

6.3. APT test vs. high-content ACH test

The last four columns of Table 4 present the results of the regression analyses that examine whether APT tests (APT2002/APT2003) are better suited than high-content ACH tests (ACH2007/ACH2008) for disadvantaged groups. The set of variables at Step 1 explained 24.6% (F = 47,177; p < .00) and 33.4% of the variance (F = 20,836; p < .00) for the entire sample and the high-performers, respectively. The inclusion of the type of school variables and their interactions at Step 2 increased the variance explained by the model by 2.9% (Fchange = 2190; p < .00) for the entire sample and by 6.3% (Fchange = 1619; p < .00) for the subset of high-performers. The SES*TT interaction was not significant for the full sample (b = 2.3, t = 1.63; p > .01), whereas for high-performers the regression coefficient, albeit small, was in the predicted direction favoring the lower SES students (b = 9.20, t = 4.10; p < .00). However, just as in the case of the lower-content ACH test, the inclusion of type of school dummies along with their interaction variables revealed overriding gains associated with the use of ACH tests for those attending the general tracks, particularly at private-paying schools (b = 25.9, t = 18.9; p < .00), with even larger gains for high-performers (b = 31.7, t = 14.3; p < .00). The squared structure coefficients again showed a pattern relatively consistent with the regression coefficients, revealing that the strongest predictor of scores was SES both for the total sample and the high-performing group (r_s²(SES) = 0.885; r_s²(SES) = 0.837), followed by attendance at private-paying schools (r_s²(PP_GT) = 0.488; r_s²(PP_GT) = 0.496). Among the interactions, the coefficients for PP_GT*TT and SES*TT were consistently high for both groups. The commonality analysis indicated that the unique variance associated with SES and test-related variables amounted to 7.4% for the full sample.
School and test-related variables contributed 2.9% of unique variance. Shared variance between the two sets of predictors was 17.2%. For high-performers the differences between the unique contributions of SES and TS variables were considerably less pronounced (6.8% and 6.3%, respectively), with shared variance between the two amounting to 26.6%.
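The Fchange statistics quoted in this section can be approximated from the rounded R² values with the usual F test for an R² increment in hierarchical regression. The sketch below is only an illustration: the function name is hypothetical, and the counts of predictors (8 added at Step 2, 11 in the full model) are assumptions read off the rows of Table 4 rather than figures stated by the authors.

```python
def f_change(r2_reduced, r2_full, n, k_added, k_full):
    """F statistic for the R^2 increment when k_added predictors join a model.

    r2_reduced: R^2 of the Step 1 model
    r2_full:    R^2 of the Step 2 model
    n:          number of observations
    k_added:    predictors added at Step 2 (assumed: 4 school dummies + 4 interactions)
    k_full:     predictors in the Step 2 model (assumed: 3 from Step 1 + 8 added)
    """
    numerator = (r2_full - r2_reduced) / k_added
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# APT vs. high-content ACH, entire sample (N and R^2 from Table 4).
f_all = f_change(r2_reduced=0.246, r2_full=0.275, n=434_340, k_added=8, k_full=11)
# APT vs. high-content ACH, high-performers.
f_high = f_change(r2_reduced=0.334, r2_full=0.397, n=124_632, k_added=8, k_full=11)

print(round(f_all), round(f_high))
```

Because the inputs are rounded to three decimals, the results land within about one percent of the reported values (2190 and 1619) rather than matching them exactly.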

Table 3
Correlations of Mathematics test scores, SES and type of school (TS).

| Year | r Math/SES | r SES/TS | r Math/TS |
|---|---|---|---|
| 2002 | .48 | .58 | .37 |
| 2003 | .49 | .58 | .39 |
| 2004 | .46 | .58 | .39 |
| 2005 | .45 | .56 | .39 |
| 2006 | .45 | .56 | .40 |
| 2007 | .50 | .58 | .43 |
| 2008 | .52 | .55 | .44 |


Table 4
Regression of math scores for low- and high-content tests (standard errors in parentheses).

| Variable | Low-content, All: b (SE) | β | Low-content, High-performers: b (SE) | β | High-content, All: b (SE) | β | High-content, High-performers: b (SE) | β |
|---|---|---|---|---|---|---|---|---|
| N | 385,579 | | 112,971 | | 434,340 | | 124,632 | |
| Step 1 | | | | | | | | |
| Intercept | 404.9 (.40)* | | 470.1 (.69)* | | 404.9 (.39)* | | 470.1 (.67)* | |
| SES | 194.8 (.80)* | .484 | 198.9 (1.22)* | .563 | 194.8 (.79)* | .480 | 198.9 (1.20)* | .568 |
| TT | 9.9 (.58)* | .043 | 4.6 (.98)* | .021 | 5.54 (.52)* | .024 | 6.6 (.91)* | .030 |
| SES*TT | 13.4 (1.15)* | .034 | 1.7 (1.74) | .005 | 11.7 (1.07)* | .029 | 5.7 (1.63)* | .017 |
| R² adj. | | .220 | | .315 | | .246 | | .334 |
| Step 2 | | | | | | | | |
| Intercept | 403.2 (.64)* | | 451.1 (1.05)* | | 403.23 (.62)* | | 451.1 (1.02)* | |
| SES | 146.3 (1.05)* | .363 | 136.7 (1.69)* | .387 | 146.3 (1.03)* | .360 | 136.7 (1.64)* | .390 |
| TT | 1.8 (.94) | .008 | 6.2 (1.51)* | .028 | 4.6 (.82)* | .020 | 5.5 (1.38)* | .025 |
| SES*TT | 18.8 (1.49)* | .048 | 11.9 (2.41)* | .035 | 2.3 (1.40)* | .006 | 9.2 (2.23)* | .027 |
| PP_GT | 59.0 (.99)* | .193 | 84.4 (1.63)* | .323 | 59.0 (.97)* | .181 | 84.4 (1.58)* | .315 |
| PS_GT | 32.7 (.77)* | .132 | 61.0 (1.29)* | .252 | 32.7 (.75)* | .134 | 61.0 (1.54)* | .259 |
| MU_GT | 12.3 (.72)* | .050 | 44.0 (1.24)* | .177 | 12.3 (.70)* | .049 | 44.1 (1.20)* | .177 |
| PS_VT | 11.4 (.91)* | .03 | 3.0 (1.59) | .008 | 11.4 (.88)* | .033 | 3.0 (1.54) | .008 |
| PP_GT*TT | 17.2 (1.49)* | .041 | 22.5 (2.36)* | .064 | 25.9 (1.37)* | .056 | 31.7 (2.21)* | .087 |
| PS_GT*TT | 8.3 (1.10)* | .027 | 16.0 (1.84)* | .052 | 13.9 (.98)* | .048 | 20.6 (1.68)* | .073 |
| MU_GT*TT | 15.8 (1.06)* | .049 | 22.1 (1.77)* | .067 | 21.7 (.94)* | .068 | 26.6 (1.63)* | .083 |
| PS_VT*TT | 2.4 (1.32) | .005 | 3.9 (2.27) | .007 | 10.7 (1.15)* | .024 | 10.8 (2.04)* | .022 |
| R² adj. | | .247* | | .371* | | .275* | | .397* |
| R² change | | .027* | | .056* | | .029* | | .063* |

| | Low-content, All | Low-content, High-performers | High-content, All | High-content, High-performers |
|---|---|---|---|---|
| Squared structure coefficients at Step 2 | | | | |
| SES | .887 | .848 | .885 | .837 |
| TT | .005 | .001 | .002 | .002 |
| SES*TT | .236 | .218 | .302 | .272 |
| PP_GT | .528 | .518 | .488 | .496 |
| PS_GT | .026 | .019 | .045 | .028 |
| MU_GT | .006 | .028 | .040 | .023 |
| PS_VT | .123 | .155 | .116 | .155 |
| PP_GT*TT | .249 | .239 | .272 | .269 |
| PS_GT*TT | .016 | .012 | .004 | .024 |
| MU_GT*TT | .011 | .005 | .006 | .003 |
| PS_VT*TT | .054 | .077 | .061 | .084 |
| Unique variance (%) | | | | |
| Step 1 variables | 6.9 | 6.8 | 7.4 | 6.8 |
| Step 2 variables | 2.7 | 5.6 | 2.9 | 6.3 |
| Shared variance (%) | 15.1 | 24.7 | 17.2 | 26.6 |

* p < .001.
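For readers less familiar with structure coefficients, a structure coefficient is the correlation between a predictor and the model's predicted scores, which equals the predictor's zero-order correlation with the outcome divided by the multiple correlation R. The short Python sketch below, with hypothetical function names, uses this identity to back out the zero-order correlation implied by the squared structure coefficient for SES and the Step 2 R² in the low-content, full-sample column. It is a plausibility check on the rounded table entries, not a reproduction of the authors' computations.

```python
import math


def squared_structure_coefficient(r_xy, r2_model):
    """Squared structure coefficient: r_xy^2 / R^2 of the full model."""
    return (r_xy ** 2) / r2_model


def implied_zero_order_r(rs_squared, r2_model):
    """Magnitude of the predictor-outcome correlation implied by r_s^2 and R^2.

    The sign is lost because only the squared coefficient is reported.
    """
    return math.sqrt(rs_squared * r2_model)


# SES, low-content comparison, full sample: r_s^2 = .887, R^2 (adj.) = .247.
r_ses = implied_zero_order_r(rs_squared=0.887, r2_model=0.247)
print(round(r_ses, 3))

# Round trip: feeding the implied correlation back recovers the table entry.
print(round(squared_structure_coefficient(r_ses, 0.247), 3))
```

The implied correlation of roughly .47 sits inside the range of the yearly Math/SES correlations in Table 3 for 2002 to 2005 (.45 to .49), although a pooled-sample correlation need not coincide with any single year's value.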

In summary, there is an increase in the variance in test scores explained by SES and type of school variables when achievement tests assess larger amounts of contents. Thus, under conditions of unequal quality of education, the amount of contents assessed in ACH tests does not appear to be an irrelevant variable in terms of test scores. The findings hold for all test-takers, but particularly for the group of high-performers in high-school. The examination of the regression models does not support the claim of ACH test proponents that these tests are inherently superior for the assessment of underprivileged groups when quality of schooling is taken into account. In particular, high-content tests appear to provide an additional edge to test-takers from affluent groups who can afford quality education.

7. Discussion

The change of admission tests that took place in Chile sheds light, from both a practical and a conceptual perspective, on the benefits of achievement tests over aptitude tests to promote equitable access. The findings from the present study, albeit correlational in nature, provide credible evidence that the benefits of achievement tests for the assessment of disadvantaged students do not hold independently of the amount of contents assessed in them and under conditions of unequal access to educational opportunities. Any marginally positive effects of using achievement tests for disadvantaged groups of society may only hold for light-content tests; when achievement tests evaluate larger amounts of contents, the benefits are washed away. The gains associated with the interaction of type of school and type of test were consistent throughout the analyses, with larger increases in scores accruing to those attending private-paying schools when ACH tests assessed larger amounts of contents. The groups least favored by the switch in the tests were those attending the vocational tracks at municipal and private-subsidized schools that cater to the poorer sectors of society.

As expected, SES explained the largest proportion of unique variance in test performance. However, quality of education variables should be taken into account when deciding on the implementation of admission tests. Particularly in nations where the poor receive a lower quality of education, it cannot be assumed, as was done in Chile, that the use of achievement tests per se will be beneficial for disadvantaged students.

It has been argued that the difference between aptitude and achievement tests may have been overstated, since aptitude tests require knowledge of vocabulary, reading skills, and mathematical operations that are taught in schools and quality achievement tests reflect curriculum standards that emphasize reasoning over mere memorization of facts (Bridgeman et al., 2004, p. 286). However, the similarity between aptitude and achievement tests may hold up better when the comparison involves low-content achievement tests. The present findings offer empirical evidence that the switch from aptitude to high-content achievement tests may be detrimental to underprivileged test-takers in nations where access to quality education is the privilege of an elite. The detrimental effects may seem marginal, but in a nation like Chile, where equity of access to higher education has been so hard to achieve, a small step backwards can be of consequence, as reflected by the progressive drop in the enrollment of students from municipal schools at top Chilean universities (Simonsen, 2008). This trend is consistent with findings by Koretz (2000) in the U.S., who reported that when scores count heavily in admissions, differences in scores between extreme groups will have a major impact on the probability that students from the disadvantaged group will be admitted, affecting the composition of the student body.

Although there is a need to guarantee that the tests cover an adequate range of the knowledge and skills required to succeed in college, aligning the tests with the curriculum does not require an unrestricted inclusion of contents, even if these are prescribed in the high-school curriculum. In developing nations that suffer from unequal access to quality education, admission tests may contribute to enhancing equity if focused on selected contents that are demonstrably predictive of success in the first year of college. In Chile, the inclusion of extensive contents in the admission tests neither favored equitable access nor improved prediction. Recent data from a system-wide study of predictive validity for the new ACH tests revealed a substantial drop in the predictive capacity of the Math ACH Test, from an average correlation of .29 for the admission cohort of 2004 (i.e.
the test with the lightest curricular content) to .25 in 2007, when the tests included full contents (Bravo et al., 2010). Thus, the increase of contents assessed in the tests may have been detrimental not only for fair access but also in terms of the tests' predictive capacity.

Although Atkinson (2001, p. 36) expressed that "the movement away from aptitude tests towards achievement tests [was] an appropriate step for U.S. students, schools and universities", the risks in generalizing outside the realm of the U.S. need to be underscored. In nations where educational systems are characterized by widely variable educational inputs and where the poor receive the lowest quality of schooling, the claim that admission criteria that emphasize demonstrated achievement over potential ability are better aligned with the needs of disadvantaged students and schools (Geisser, 2009, p. 18) remains unsupported. On the contrary, the evidence from Chile is more consistent with the assertion that in social contexts where school quality varies widely, the use of achievement tests for selection purposes should be avoided, since these are likely to measure the individual's opportunity to learn rather than ability to learn (Heyneman, 1987).

7.1. Limitations of the study

The limitations of the analyses presented here stem from the nature of the data available to study the question of interest. Test developers did not make provisions to properly examine whether ACH tests would be fairer for the assessment of low SES students. If both APT and ACH tests had been applied simultaneously to the same pool of applicants, the questions regarding the relative merits of APT and ACH tests and the impact of increasing the amounts of contents tested when assessing disadvantaged groups could have been answered in a straightforward manner. This oversight required the analysis of data arising from different samples taking different tests with varying amounts of contents. The analysis was also constrained by lack of access to item-level data, which would have allowed for a more precise rescaling of the tests through the years. The re-scaling procedure employed was the best approximation possible given the limited data access.

Additionally, information on key variables such as coaching and motivation to pursue university education was unavailable. Data on the type of test training, the number of hours devoted to it and the amount of resources invested in coaching for the test, if available, could have improved prediction and provided a fuller understanding of the relevance of coaching for the different types of admission tests. International evidence indicates that the implementation of highly competitive tests may work against disadvantaged groups when these have received a low quality of secondary education, unless they can compensate for their handicap through coaching or tutoring courses (Lewis and Dundar, 2002). Thus, the role of coaching practices in test performance remains to be examined.

7.2. Lessons learned

Policy makers in developing nations that have a national curriculum should be wary of assuming that achievement tests are superior to aptitude tests to promote fair access.
Although proponents of achievement tests for selection purposes have argued that the benefits of national achievement tests may be impossible to attain in the absence of a national curriculum (Geisser, 2009), this statement should be further qualified. Even in nations that have a national curriculum, such as Chile, the alleged benefits of achievement tests for the socially and educationally deprived groups have not been attained. The conception of fairness as opportunity to learn should be critically analyzed when considering the use of achievement tests, particularly high-content tests, for admission purposes in nations where access to quality education is the privilege of an elite. As stated by the Standards for Educational and Psychological Testing (1999, p. 76), "when test-takers have not had the opportunity to learn the material tested, the policy of using their test scores as a basis for [a high-stakes decision] ... is viewed as unfair."

Finally, the Chilean experience with the change of tests should serve to underscore the need to avoid simplistic approaches in the pursuit of equity of access to higher education, and it highlights the costs of ignoring evaluative research in the process of change. Any revisions and modifications in national admission systems should be backed by a solid validity framework in order to guarantee that the expected benefits of the change are attained and to prevent the occurrence of negative unanticipated consequences, particularly for disadvantaged groups.


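Appendix A

Scores corresponding to percentiles of the Math tests for APT and ACH tests (prior to re-scaling).

| Percentile | APT2002 | APT2003 | ACH2004 | ACH2005 | ACH2006 | ACH2007 | ACH2008 | ACH2009 |
|---|---|---|---|---|---|---|---|---|
| 5 | 319 | 308 | 301 | 304 | 310 | 300 | 310 | 312 |
| 10 | 327 | 325 | 342 | 345 | 358 | 353 | 355 | 356 |
| 15 | 344 | 334 | 377 | 378 | 380 | 376 | 375 | 376 |
| 20 | 353 | 351 | 403 | 392 | 398 | 395 | 393 | 395 |
| 25 | 361 | 359 | 414 | 416 | 415 | 413 | 408 | 410 |
| 30 | 378 | 368 | 425 | 435 | 429 | 428 | 433 | 425 |
| 35 | 387 | 385 | 442 | 450 | 441 | 441 | 444 | 449 |
| 40 | 404 | 402 | 457 | 464 | 462 | 463 | 463 | 460 |
| 45 | 412 | 419 | 470 | 476 | 471 | 472 | 471 | 469 |
| 50 | 438 | 436 | 487 | 493 | 486 | 487 | 486 | 485 |
| 55 | 455 | 462 | 497 | 507 | 505 | 501 | 506 | 498 |
| 60 | 480 | 488 | 511 | 520 | 516 | 517 | 517 | 515 |
| 65 | 506 | 514 | 527 | 532 | 534 | 531 | 532 | 529 |
| 70 | 540 | 548 | 541 | 551 | 547 | 549 | 550 | 546 |
| 75 | 574 | 582 | 559 | 569 | 565 | 568 | 567 | 564 |
| 80 | 608 | 617 | 577 | 587 | 586 | 585 | 585 | 584 |
| 85 | 650 | 651 | 602 | 609 | 607 | 609 | 608 | 607 |
| 90 | 693 | 702 | 631 | 636 | 637 | 636 | 636 | 636 |
| 95 | 744 | 754 | 667 | 680 | 680 | 681 | 682 | 677 |
| 96 | 752 | 771 | 680 | 694 | 690 | 692 | 688 | 694 |
| 97 | 769 | 780 | 720 | 713 | 707 | 705 | 708 | 708 |
| 98 | 778 | 788 | 759 | 725 | 740 | 728 | 727 | 724 |
| 99 | 795 | 814 | 780 | 795 | 777 | 763 | 761 | 763 |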

Note: As may be observed from the table, scores in the APT and ACH tests were not comparable before re-scaling. For example, a student who scored at the 95th percentile in the APT 2003 had 754 points. The following year, the score corresponding to the same 95th percentile was 87 points lower: 667. Conversely, the score assigned to the 40th percentile in the ACH 2004 was 55 points higher than the one assigned in the APT 2003. In essence, the new ACH scaling boosted the scores at the lower end of the distribution and deflated those at the upper end. Consequently, scores prior to 2006 were rescaled before conducting the statistical analyses.
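To make the comparison in the note concrete, the sketch below computes the score gap at a given percentile and maps a score from one scale to the other by matching percentiles, with linear interpolation between tabulated points. This is a generic equipercentile-style illustration under stated assumptions; the function names are hypothetical, and it is not claimed to be the re-scaling procedure the authors applied, which is not described in this section.

```python
from bisect import bisect_left

# Scores at selected percentiles, copied from Appendix A (before re-scaling).
PERCENTILES = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60,
               65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99]
APT2003 = [308, 325, 334, 351, 359, 368, 385, 402, 419, 436, 462, 488,
           514, 548, 582, 617, 651, 702, 754, 771, 780, 788, 814]
ACH2004 = [301, 342, 377, 403, 414, 425, 442, 457, 470, 487, 497, 511,
           527, 541, 559, 577, 602, 631, 667, 680, 720, 759, 780]


def score_gap(percentile):
    """ACH2004 score minus APT2003 score at a tabulated percentile."""
    i = PERCENTILES.index(percentile)
    return ACH2004[i] - APT2003[i]


def equipercentile_map(score, from_scores, to_scores):
    """Map a score to the other scale by matching percentile ranks.

    Both lists must be sorted and aligned on the same percentiles.
    Scores outside the tabulated range are clamped to the end points.
    """
    if score <= from_scores[0]:
        return float(to_scores[0])
    if score >= from_scores[-1]:
        return float(to_scores[-1])
    i = bisect_left(from_scores, score)  # first tabulated score >= score
    x0, x1 = from_scores[i - 1], from_scores[i]
    y0, y1 = to_scores[i - 1], to_scores[i]
    # Linear interpolation between the two neighbouring percentile points.
    return y0 + (y1 - y0) * (score - x0) / (x1 - x0)


print(score_gap(95), score_gap(40))
print(equipercentile_map(754, APT2003, ACH2004))
print(equipercentile_map(402, APT2003, ACH2004))
```

The gaps at the 95th and 40th percentiles reproduce the 87-point drop and the 55-point rise mentioned in the note; interpolation only matters for scores that fall between tabulated percentiles.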

References

Atkinson, R., 2001. Achievement versus aptitude in college admissions. Issues in Science and Technology 18 (2), 31-36.
Beyer, H., April 25, 2002. Sobre las pruebas de ingreso a las universidades. El Mercurio, p. A2.
Bravo, D., Bosch, A., del Pino, G., Donoso, G., Manzi, J., Martinez, M., Pizarro, R., 2010. Validez diferencial y sesgo de predictividad de las pruebas de admisión a las universidades chilenas. Mimeo. Documento del Comité Técnico Asesor, Consejo de Rectores de las Universidades Chilenas, Santiago, Chile.
Bravo, D., Manzi, J., Larrañaga, O., Contreras, D., Himmel, E., Rosas, R., Sevilla, M., 2000. Reformulación de las pruebas de selección a la educación superior. Unpublished manuscript. Octavo Concurso Nacional de Proyectos de Investigación y Desarrollo, FONDEF-CONICYT, Santiago, Chile.
Bridgeman, B., Burton, N., Cline, F., 2004. Replacing reasoning tests with achievement tests in university admission: does it make any difference? In: Zwick, R. (Ed.), Rethinking the SAT: The Future of Standardized Testing in University Admissions. Routledge Falmer, New York, pp. 277-288.
Brunner, J., June 19, 2002. SIES: Tres preguntas y la responsabilidad de las universidades. La Segunda, p. 14.
Courville, T., Thompson, B., 2001. Use of structure coefficients in published multiple regression articles: β is not enough. Educational and Psychological Measurement 61 (2), 229-248.
Dalgalarrando, G., January 9, 2007. Crece brecha de puntajes entre educación pública y privada. El Mercurio, p. C1.
Diario Austral, July 21, 2002. El SIES comenzará a ser aplicado a contar del 2003, p. A12.
Donaldson, K., Lichtenstein, G., Sheppard, S., 2008. Socioeconomic status and the undergraduate engineering experience: preliminary findings from four American universities. Paper Presented at the Meeting of the American Society for Engineering Education, 22-25 June, Pittsburgh, PA.
Eyzaguirre, B., Le Foulon, C., 2002. SIES: Un proyecto prematuro, vol. 87. Estudios Públicos, pp. 39-53.
Fontaine, A., 2002. Peligro en el SIES, vol. 86. Estudios Públicos, pp. 25-55.
Geisser, S., 2009. Back to the basics: in defense of achievement (and achievement tests) in college admissions. Change (January/February), 16-23.
Henríquez, J., August 4, 2003. Me impresiona que estemos haciendo el mismo cambio en EE.UU y Chile. La Segunda, p. 9.
Heyneman, S., 1987. Uses of examinations in developing countries: selection, research, and education sector management. International Journal of Educational Development 7 (4), 251-263.
Jaccard, J., Turrisi, R., 2003. Interaction Effects in Multiple Regression. Sage Publications, Thousand Oaks.
Koljatic, M., Silva, M., 2006. Equity issues associated with the change of college admission tests in Chile. Equal Opportunities International 25 (7), 544-561.
Koretz, D., 2000. The impact of scores differences on the admission of minority students: an illustration. NBETPP Statements 1 (5), 1-16.
Lemann, N., 2004. A history of admission testing. In: Zwick, R. (Ed.), Rethinking the SAT: The Future of Standardized Testing in University Admissions. Routledge Falmer, New York, pp. 5-14.
Lewis, D., Dundar, H., 2002. Equity effects of higher education in developing countries: access, choice, and persistence. In: Chapman, D., Austin, A.E. (Eds.), Higher Education in the Developing World: Changing Contexts and Institutional Responses. Greenwood, Westport, CT, pp. 169-194.
Lohman, D., 2004. Aptitude for college: the importance of reasoning tests for minority admissions. In: Zwick, R. (Ed.), Rethinking the SAT: The Future of Standardized Testing in University Admissions. Routledge Falmer, New York, pp. 41-56.
Ministerio de Educación, November 22, 2000. Comisión nuevo curriculum de la enseñanza media y pruebas del sistema de admisión a la educación superior: Informe sometido en consulta previa a la Ministra de Educación. Mimeo, Ministerio de Educación, Chile.
Ministerio de Educación, 2008. Bases para una Política de Formación Técnico-Profesional en Chile. Informe Ejecutivo. Mimeo, Ministerio de Educación, Chile.
Molina, P., December 18, 2008. La PSU aumenta la brecha. El Mercurio, p. D3.
NACAC, 2008. Report of the Commission on the Use of Standardized Tests in Undergraduate Admission. National Association for College Admission Counseling, Arlington, VA.
Nimon, K., 2010. Regression commonality analysis: demonstration of an SPSS solution. Multiple Linear Regression Viewpoints 36 (1), 10-17.
OECD, 2009. Reviews of National Policies for Education: Tertiary Education in Chile. OECD, Paris.
OECD, 2011. Families are changing. OECD, Paris.
PSU: Consejo de Rectores analiza mecanismo de selección universitaria, 2008. http://uchile.cl/un49226 (accessed 20.07.11).
Simonsen, E., November 30, 2008. Ingreso de alumnos municipales a universidades top baja 10% en 6 años. La Tercera, p. 40.
Standards for Educational and Psychological Testing, 1999. American Educational Research Association, American Psychological Association and National Council on Measurement in Education. American Educational Research Association, Washington, DC.
Sternberg, R., 2006. How can we simultaneously enhance both academic excellence and diversity? College & University 82, 3-9.
The Chronicle of Higher Education, June 7, 2010. China Begins to Reform its Controversial College-entrance Exam. http://chronicle.com/article/ChinaBegins-to-Reform-Its/65804/ (accessed 20.07.11).
Visión Universitaria, Julio 2002. Seminario prueba de admisión a las universidades chilenas: habilidades generales o conocimientos. Pontificia Universidad Católica de Chile, Santiago, Chile.
Wildavsky, B., November 11, 2010. Why Brazil's Standardized Entrance Test Deserves to be Salvaged. The Chronicle of Higher Education. http://chronicle.com/blogs/worldwise/why-brazils-standardized-entrance-test-deserves-to-besalvaged/27577 (accessed 20.07.11).
World Bank, 2000. Higher Education in Developing Countries: Peril and Promise. World Bank, Washington, DC.
World Bank, 2001. Implementation Completion Report, CL-Secondary Education: World Bank Report No. 22979.
Zienteck, L., Thompson, B., 2006. Commonality analysis: partitioning variance to facilitate better understanding of data. Journal of Early Intervention 28 (4), 299-307.
Zwick, R., 2004. Is the SAT a wealth test? The link between educational achievement and socioeconomic status. In: Zwick, R. (Ed.), Rethinking the SAT: The Future of Standardized Testing in University Admissions. Routledge Falmer, New York, pp. 203-216.
