Programs meant to distinguish chatbot text from human writing have more than a few problems. Here’s a new one to add to the list: AI detectors often incorrectly categorize writing by non-native English speakers as bot-produced. AI detectors wrongly classified writing from non-native English speakers as AI-generated more than half the time, according to a study published Monday in the journal Patterns.
In a world where generative AI is popping up everywhere (and I mean everywhere), the ability to separate AI-generated slop from words written by an actual human is increasingly important. Job applicants, students, and others who are routinely evaluated based on their ability to write should be able to submit work without fear that it will be misattributed to a computer program. Simultaneously, teachers, professors, and hiring managers should ideally be able to know when someone is presenting their efforts and themselves honestly.
But thanks to ever-larger language models trained on enormous data sets, it’s becoming more and more difficult to tell a person’s work apart from a chatbot’s automated, algorithmically determined output (at least until you fact-check it). In the same way that image, voice, and video deepfakes are becoming disconcertingly difficult to spot, AI text is getting trickier to identify.
Multiple companies have begun to try to address the problem by developing AI-detection software, meant to be able to parse out a person from ‘puter. Even OpenAI, the company largely responsible for the current boom in generative AI, has tried its hand at creating an AI-detection tool. But spoiler alert: Most of these AI-detection tools don’t work very well, or have limited use cases, despite developer claims of unverifiable metrics like “99% accurate.”
On top of not being all that great, in a general sense, the tools might also reproduce human biases—just as generative AI itself does.
In the new study, the researchers assessed 91 TOEFL (Test of English as a Foreign Language) essays written by non-native speakers, using seven “widely used” GPT detectors. For comparison, they also ran 99 U.S. eighth graders’ essays through the same set of AI-detection tools. Despite the detectors correctly classifying more than 90% of the eighth-grade essays as human-written, the categorization tools didn’t fare nearly as well with the TOEFL work.
Across all seven GPT detectors, the average false detection rate for the essays written by non-native English speakers was 61.3%. At least one of the detectors erroneously labeled nearly 98% of the TOEFL essays as AI-generated. All of the detectors unanimously identified the same ~20% chunk of the TOEFL work as AI-produced, despite having been human-written.
Most AI detectors work by assessing text on a measure called “perplexity,” the study authors explained. Perplexity is essentially a metric of how unexpected each word is in the context of a string of text. If the words are easy to predict given the words that precede them, the text’s perplexity is low, and the chances are theoretically higher that AI is responsible, since large language models use probabilistic algorithms to pick the likeliest next word and pump out a convincingly organized word salad. It’s auto-complete on steroids.
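To make the idea concrete, here is a minimal sketch of the perplexity calculation itself. The function name and the toy probability lists are illustrative assumptions of mine, not taken from the study or from any particular detector; real detectors get per-word probabilities from a language model rather than from hand-picked numbers.

```python
import math

def perplexity(token_probs):
    # Perplexity is the exponential of the average negative
    # log-probability of each token. Easy-to-predict text
    # (high per-word probabilities) yields LOW perplexity,
    # which detectors read as a sign of machine generation.
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical per-word probabilities for two strings of text:
predictable = [0.9, 0.8, 0.85, 0.9]   # each word strongly expected
surprising  = [0.2, 0.1, 0.05, 0.3]   # each word unexpected

print(perplexity(predictable) < perplexity(surprising))  # True
```

A text where every word is a coin flip between two choices scores a perplexity of exactly 2, which is one way to read the number: roughly how many equally likely options the model was choosing among at each step.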
Yet non-native speakers of any language tend to write in that language with a relatively limited vocabulary and predictable range of grammar, which can lead to more predictable sentences and paragraphs. The researchers found that, by simply reducing word repetition in the TOEFL sample essays, they were able to significantly reduce the number of false positives that came up in the AI-detection software. Conversely, simplifying the language in the eighth-grade essays led to more of them being mistaken for AI creations.
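The mechanism is visible even in a crude model. The sketch below (my own illustration, not the researchers’ method, and using a simple unigram self-surprisal rather than a real language model) shows how a text that reuses the same words over and over ends up looking more predictable than one with varied vocabulary:

```python
from collections import Counter
import math

def avg_surprisal(words):
    # Unigram self-surprisal: words that repeat within a text get
    # high probability under the text's own word frequencies, so
    # repetitive writing scores as more predictable -- the very
    # signal a perplexity-based detector treats as "machine-like".
    counts = Counter(words)
    total = len(words)
    return -sum(math.log2(counts[w] / total) for w in words) / total

# Toy sentences, both 12 words long:
repetitive = "the test was hard the test was long the test was fine".split()
varied = ("the exam felt difficult lengthy yet ultimately "
          "quite manageable overall today honestly").split()

print(avg_surprisal(repetitive) < avg_surprisal(varied))  # True
```

Reducing repetition (as the researchers did when editing the TOEFL essays) pushes a text toward the “varied” end, which is why the false-positive rate dropped.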
As the new research points out, this could spell significant trouble for non-native English speakers, who already face discrimination in the job market and academic environments. On the broader internet too, such consistent AI-detector screw-ups could amplify existing inequities.
“Within social media, GPT detectors could spuriously flag non-native authors’ content as AI plagiarism, paving the way for undue harassment of specific non-native communities,” the authors write. “Internet search engines, such as Google, that implement mechanisms to devalue AI-generated content may inadvertently restrict the visibility of non-native communities, potentially silencing diverse perspectives.”
Until AI detection markedly improves, the authors conclude, “we strongly caution against the use of GPT detectors in evaluative or educational settings, particularly when assessing the work of non-native English speakers.” Yet it’s difficult to see how AI detection (which often runs on a comparable AI model) could ever truly learn to outsmart itself.