
In a study published Wednesday, a pair of Dartmouth researchers found that a popular risk assessment algorithm was no better at predicting a criminal offender’s likelihood of reoffending than an internet survey of humans with little or no relevant experience.

The study compared the crime-predicting powers of an algorithm called COMPAS, already used by multiple states, to those of workers on Amazon’s Mechanical Turk, a sort of micro TaskRabbit where people are paid to complete small assignments. Using an online poll, the researchers asked “turks” to predict recidivism based on a few scant facts about offenders. Each of the 400 survey takers was given the sex, age, crime charge, criminal degree, and prior convictions in juvenile, felony and misdemeanor courts of 50 offenders, and had to assess each offender’s likelihood of reoffending. The Dartmouth researchers had information on whether the offenders in question actually did reoffend.

In the end, the authors of the study found that the risk assessment algorithm was no more accurate than people without criminal justice experience. From Wired:

Overall, the turks predicted recidivism with 67 percent accuracy, compared to Compas’ 65 percent. Even without access to a defendant’s race, they also incorrectly predicted that black defendants would reoffend more often than they incorrectly predicted white defendants would reoffend, known as a false positive rate. That indicates that even when racial data isn’t available, certain data points—like number of convictions—can become proxies for race, a central issue with eradicating bias in these algorithms.

The disparity in false positives is telling. Even though the survey takers did not know a given defendant’s race, they wrongly flagged black defendants as likely to reoffend more often than they wrongly flagged white defendants. While it’s wildly unethical to explicitly include race as a factor in likelihood of reoffending, race nonetheless colors each data point. Racial segregation, for example, impacts where offenders live and go to school. If a school is underserved (as are a number of schools in minority neighborhoods), that impacts students’ education level and thus their income and, more broadly speaking, their opportunities in life. There’s no variable for race specifically, but it broadly affects each factor that goes into calculating recidivism.
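For readers who want to see how accuracy and a false positive rate differ, here is a minimal Python sketch. It is not the researchers’ code or the COMPAS method. The record fields (`group`, `predicted`, `reoffended`), the function names, and the eight toy records are all hypothetical, invented only to show the arithmetic.

```python
from collections import defaultdict


def accuracy(records):
    """Share of predictions that matched what actually happened."""
    correct = sum(1 for r in records if r["predicted"] == r["reoffended"])
    return correct / len(records)


def false_positive_rate(records):
    """Among people who did NOT reoffend, the share wrongly predicted to."""
    did_not_reoffend = [r for r in records if not r["reoffended"]]
    if not did_not_reoffend:
        return None  # undefined when everyone in the group reoffended
    wrongly_flagged = sum(1 for r in did_not_reoffend if r["predicted"])
    return wrongly_flagged / len(did_not_reoffend)


def split_by_group(records):
    """Bucket records by their (hypothetical) group label."""
    groups = defaultdict(list)
    for r in records:
        groups[r["group"]].append(r)
    return dict(groups)


# Hypothetical toy records, invented for illustration; NOT data from the study.
TOY_RECORDS = [
    {"group": "A", "predicted": True,  "reoffended": False},
    {"group": "A", "predicted": True,  "reoffended": True},
    {"group": "A", "predicted": False, "reoffended": False},
    {"group": "A", "predicted": False, "reoffended": True},
    {"group": "B", "predicted": False, "reoffended": False},
    {"group": "B", "predicted": False, "reoffended": False},
    {"group": "B", "predicted": False, "reoffended": True},
    {"group": "B", "predicted": False, "reoffended": True},
]

if __name__ == "__main__":
    for name, rows in sorted(split_by_group(TOY_RECORDS).items()):
        print(name, accuracy(rows), false_positive_rate(rows))
```

In this made-up data, predictions for both groups are right half the time, yet half of the people in group A who did not reoffend were wrongly flagged, compared with none in group B. An overall accuracy figure alone would hide that gap.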


Whether using humans or machines, there’s no real way to extricate race from any of the indicators of crime. The problem arises when the reality of bias is concealed behind an algorithmic veneer of objectivity. We don’t expect supposedly impartial machines to repeat human biases and, as a result, those biases become invisible.

“Underlying the whole conversation about algorithms was this assumption that algorithmic prediction was inherently superior to human prediction,” Julia Dressel, the paper’s co-author, told Wired.

In a statement, the company that makes COMPAS claimed the study only confirmed “the valid performance of the COMPAS risk model,” writing, “The findings of ‘virtually equal predictive accuracy’ in this study, instead of being a criticism of the COMPAS assessment, actually adds to a growing number of independent studies that have confirmed that COMPAS achieves good predictability and matches the increasingly accepted AUC standard of 0.70 for well-designed risk assessment tools used in criminal justice.”


So what actually correlates with recidivism? It’s startlingly simple: age and prior convictions. Older people were less likely to get into trouble again; younger people, more.