The Problem With Professional Test Subjects


PBS has an article about the rising use of Mechanical Turk — Amazon's marketplace for low-paying, one-shot jobs — for academic research surveys. In the old days, psychology students were the primary subjects. Not anymore.


Research shows that the median "Turker" has completed 300 academic studies. Jenny Marder at PBS writes about the problems this causes:

First, there's the question of dropout rates. Turkers are more likely to drop out mid-study, and that can skew the results. Then there's the question of environmental control. Read: There is none. In the lab, it's easy to monitor survey takers; not so online. Who's to say they're not watching reality television while working, or drinking a few beers on the job? To guard against this, researchers test a worker's focus by planting "attention checks" in their surveys. "Have you ever eaten a sandwich on Mars," a question might read. Or "Have you ever had a fatal heart attack?" But the attention check questions are often recycled, and experienced workers spot them immediately. ("Whenever I see the word vacuum, I know it's an attention check," Marshall has said.)

But it's the absence of gut intuition from experienced workers that concerns Rand the most.

A person's gut response to a question is an important measurement in many social psychology studies. It's common to compare the automatic, intuitive part of the decision-making brain with the part that's rational and deliberate. But a psychologist testing for this among professional survey takers may very well be on a fool's errand.

... It could be argued that the qualities that make these subjects natural and fallible, the very things that make a human human, get swallowed up by experience.

They do point out that some kinds of tests won't be affected by experience, because the effect being tested is too strong to be undone by knowledge. Still, given Mechanical Turk's rising prevalence in research, it's important to know which kinds of surveys can and can't be run on it.

Read the rest of the article — which is fascinating — at PBS.

[via Metafilter]



I use Mechanical Turk for some of my research, and the problems you list aren't really the problems with MTurk, from a research point of view.

On attention: several studies have demonstrated that there is no reason to believe people pay less attention during an MTurk study than in the lab or in a classroom. So yes, absolutely, people not paying attention is a problem - but it's not a new problem, or one unique to MTurk. As an aside, we also don't recycle attention checks, and we see a slightly lower rate of attention check failure on MTurk than anywhere else.

You can also use software (although Amazon doesn't really like it) that tracks how long people take to respond, and thus exclude responses that come in too fast or too slow.
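The timing filter described above can be sketched in a few lines. This is a hypothetical illustration, not the actual software the comment refers to: the function name, record layout, and cutoff values are all made up for the example.

```python
# Hypothetical response-time filter for survey data: drop respondents
# who finished implausibly fast (likely not reading) or implausibly
# slow (likely distracted). Thresholds are illustrative only.

def filter_by_response_time(records, min_seconds=60, max_seconds=1800):
    """Split records into (kept, excluded) by completion time.

    records: list of (worker_id, seconds_to_complete) tuples.
    """
    kept, excluded = [], []
    for worker_id, seconds in records:
        if min_seconds <= seconds <= max_seconds:
            kept.append((worker_id, seconds))
        else:
            excluded.append((worker_id, seconds))
    return kept, excluded

# Example: two plausible completion times, one too fast, one too slow.
responses = [("w1", 45), ("w2", 300), ("w3", 2400), ("w4", 600)]
kept, excluded = filter_by_response_time(responses)
# kept    -> [("w2", 300), ("w4", 600)]
# excluded -> [("w1", 45), ("w3", 2400)]
```

In practice the cutoffs would be tuned per survey (e.g., a multiple of the median completion time) rather than hard-coded.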

The problem really is if you're doing a task (besides a survey) that requires either deception or that this be the first time they've done a task like this. You pretty much can't do those on MTurk, because everybody's done many surveys and many tasks.

MTurk isn't perfect, but it's a very, very good way to do a fast and cheap first pass look at data. It is also wildly more diverse than the average college campus, which is very important for almost all research.

So my take is that it's much better than psychology students, but it's not a panacea.