Little research has compared students' performance in low-stakes testing situations with their performance in high-stakes situations. Low-stakes tests have little or no impact on a test-taker's grade in a class or on the course of their academic career; high-stakes tests, by contrast, usually do.
Previous studies have made statistical comparisons only with a between-subjects design, meaning that a different sample of students was used for each condition (high-stakes and low-stakes). This design allows more individual differences between groups and, consequently, yields less statistical power. A comparison using a repeated-measures design, in which the same sample of participants is tested under each condition, could be valuable: it may reveal that students' performance on low-stakes tests does not actually reflect their proficiency in the tested subject because they exert less effort than they do in high-stakes testing.
Yigal Attali (2016), influenced by past studies showing that low-stakes scores fall below high-stakes scores, performed such a comparison. He used data from an existing study of the effect of extra time on GRE scores by Bridgeman, Cline, and Hessinger (2004) to determine the effect of effort in low-stakes assessments.
Attali measured the participants' operational scores (the scores received on the actual GRE), their research scores (the scores received on the research section, an extra section of the GRE that participants had been asked to complete), and the amount of effort each participant exerted during the research section, operationalized as the time taken to complete it. The author justified defining effort as time taken by noting that self-report measures are often biased, since subjects are reluctant to admit they did not try their hardest. He argues that the longer a test-taker spends on a test, the fewer short, rapid-guess responses they will give, indicating greater effort.
Results of statistical analyses showed that the average research score was considerably lower than the average operational score, for both the verbal and quantitative sections. Average time spent on the research section was only 63% of the time spent on the operational section. It was also found that 2% of the research group scored the lowest possible score on their section, compared to only 0.2% of the operational group. This indicates that many test-takers did not perform to their potential because they exerted less effort on the research section. Attali also checked for effects of motivational filtering, that is, filtering out the scores of students who clearly gave minimal effort on the research section. He accomplished this by examining how the results changed as minimal-effort attempts were removed at increasing thresholds. Removing the bottom 18% of test-takers (in terms of time taken to complete the research section) greatly decreased the effect of the stakes and increased the correlation between the research scores and operational scores.
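The filtering step above can be sketched in code. The snippet below is only an illustration of the general idea, not Attali's actual analysis or data: it simulates scores in which fast finishers produce noisier research scores, then drops the bottom 18% of test-takers by time spent and recomputes the research-operational correlation. All variable names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated (not real) data: operational scores, time spent on the
# research section as a fraction of operational time, and research
# scores that are noisier and lower for low-effort (fast) test-takers.
operational = rng.normal(150, 8, n)
time_spent = rng.uniform(0.2, 1.0, n)
effort_noise = rng.normal(0, 12, n) * (1 - time_spent)
research = operational - 5 * (1 - time_spent) + effort_noise

def filtered_correlation(research, operational, time_spent, drop_frac):
    """Drop the bottom `drop_frac` of test-takers by time spent, then
    return the correlation between research and operational scores."""
    cutoff = np.quantile(time_spent, drop_frac)
    keep = time_spent >= cutoff
    return np.corrcoef(research[keep], operational[keep])[0, 1]

r_all = filtered_correlation(research, operational, time_spent, 0.0)
r_filtered = filtered_correlation(research, operational, time_spent, 0.18)
# In this simulation, filtering out minimal-effort attempts raises
# the research-operational correlation, mirroring the reported pattern.
```

Because the simulation builds in extra noise for fast finishers, removing them strengthens the observed correlation, which is the mechanism the motivational-filtering analysis relies on.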
In summary, low-stakes and high-stakes test scores can be strongly correlated once test-takers whose effort is unrepresentative of their ability are filtered out. Organizations that seek to draw conclusions about students' proficiency from low-stakes tests should keep this in mind.
About the Author:
Connor is a Psychology major and incoming senior at California State University, Long Beach. He worked in The Changing Brain Lab during Summer 2017 and currently works as an ABA therapist. Any questions can be emailed to firstname.lastname@example.org.