We may earn a commission from links on this page

OpenAI’s New AI-Detector Isn’t Great at Detecting AI

The company unleashed ChatGPT, and a bevy of questions about copyright, academic honesty, and misinformation came along with it.

Maybe one day, the robot takeover will lead to less silly robot stock images.
Illustration: Tatiana Shepeleva (Shutterstock)

OpenAI, the artificial intelligence company behind viral text-generator ChatGPT, has released a new AI tool intended to help manage the mess wrought by its previous creation. Unfortunately, it’s not very good. 

The company announced a free web-based AI-detection widget on Tuesday. The application is intended to classify text samples based on how likely they are to have been generated by artificial intelligence vs. written by an actual person. Given a sample of text, it spits out one of five possible assessments: “Very unlikely to have been AI-generated,” “unlikely,” “unclear,” “possible,” or “likely.” 

However, in OpenAI’s own tests, the tool correctly identified AI-generated text as “likely AI-written” only about a quarter of the time. Moreover, about one in ten times, the classifier falsely labeled human-written text as computer-generated, the company noted in a blog post.
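To get a feel for what those two numbers mean together, here is a rough back-of-the-envelope sketch in Python. It takes the detection rate as 25% ("about a quarter") and the false-positive rate as the 9% OpenAI self-reported. The share of texts that are actually AI-written is not something OpenAI published, so the values tried below are pure assumptions, and `flagged_precision` is a made-up helper name.

```python
def flagged_precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Fraction of flagged texts that really are AI-written (Bayes' rule)."""
    true_flags = tpr * ai_share            # AI-written and flagged
    false_flags = fpr * (1.0 - ai_share)   # human-written but flagged
    return true_flags / (true_flags + false_flags)


TPR = 0.25  # "about a quarter" of AI-written text gets flagged
FPR = 0.09  # self-reported false-positive rate

# ASSUMPTION: hypothetical shares of AI-written text in the pile being checked.
for ai_share in (0.05, 0.20, 0.50):
    precision = flagged_precision(TPR, FPR, ai_share)
    print(f"{ai_share:.0%} AI-written -> {precision:.0%} of flags are correct")
```

Under these assumptions, if one in five submissions were AI-written, only about 41% of the texts the tool flagged would actually be machine-made. The rest would be false alarms.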

According to OpenAI, even these meh results are an improvement on the company’s previous stab at AI-text detection. And the tech startup acknowledged that, thanks to its own invention, better detection is needed.

OpenAI admits that ChatGPT has thrown a wrench into classrooms, newsrooms, and beyond, where the tool and others like it have stoked fears of rampant cheating, misleading info, and copyright violations. In response, the company now says it wants to help. “We recognize that identifying AI-written text has been an important point of discussion among educators, and equally important is recognizing the limits and impacts of AI generated text classifiers in the classroom,” the company said in its Tuesday blog post. “While this resource is focused on educators, we expect our classifier and associated classifier tools to have an impact on journalists, mis/dis-information researchers, and other groups.”

But in its current form, this new detection tool probably still isn’t accurate enough to meaningfully address growing concern over AI-enabled plagiarism, academic dishonesty, and the propagation of misinformation. “Our classifier is not fully reliable,” the company wrote. “It should not be used as a primary decision-making tool.”

In other words, if you suspect a news article or classroom assignment is AI-generated, whatever OpenAI’s classifier tells you may or may not be true.

In Gizmodo’s own tests, the classifier didn’t yield particularly impressive results. Across multiple samples of AI-generated text, the detector gave me lukewarm answers. “Possibly AI-generated,” it said about a fake news article I had generated in ChatGPT moments earlier.

This text was generated by OpenAI’s own ChatGPT, but the company’s new classifier isn’t so sure.
Screenshot: OpenAI / Gizmodo

I got the same result using a chunk of AI-produced text from ChatGPT’s stab at writing an article about itself.

Once again: AI-generated text, and an unsure AI-detector.
Screenshot: OpenAI / Gizmodo

In response to a clip from a CNET article produced via “assist[ance] by an AI engine,” OpenAI’s detector told me it was “unlikely” to have been AI-generated.

This clip, from a CNET article produced with the help of AI, fooled OpenAI’s detector most thoroughly.
Screenshot: OpenAI / Gizmodo

However, to the tool’s credit, in 10 or so tries, I didn’t get a false positive on any text from recently published Gizmodo articles. The only response the classifier yielded on the Gizmodo posts I tested was “very unlikely AI-generated.” OpenAI noted that it purposely adjusted the confidence threshold in the web version of its new tool to “keep the false positive rate very low.” So that adjustment may be working out well, though the 9% false-positive rate that OpenAI self-reported is still pretty high.
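The trade-off behind that threshold adjustment can be sketched in a few lines of Python. To be clear, this is not OpenAI's code or data: the scores below are invented for illustration, the thresholds are arbitrary, and `rates` is a hypothetical helper. The only point is that a stricter cutoff for flagging text lowers false positives while also letting more AI-written text slip through.

```python
def rates(human_scores, ai_scores, threshold):
    """Return (false-positive rate, detection rate) at a given score cutoff."""
    fpr = sum(s >= threshold for s in human_scores) / len(human_scores)
    tpr = sum(s >= threshold for s in ai_scores) / len(ai_scores)
    return fpr, tpr


# ASSUMPTION: made-up "probability of AI" scores, not real classifier output.
HUMAN = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]
AI = [0.30, 0.45, 0.50, 0.65, 0.75, 0.80, 0.88, 0.92, 0.97, 0.99]

# ASSUMPTION: arbitrary cutoffs chosen only to show the trend.
for threshold in (0.5, 0.9, 0.98):
    fpr, tpr = rates(HUMAN, AI, threshold)
    print(f"threshold {threshold}: false positives {fpr:.0%}, detected {tpr:.0%}")
```

With these toy numbers, moving the cutoff from 0.5 to 0.9 drops false positives from 50% to 10%, but detection falls from 80% to 30%. That is the same direction of trade-off as a tool that rarely accuses humans yet catches only about a quarter of AI text.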

The tool has additional limitations, according to the company: it works passably only with English, AI-written text can easily be edited to bypass the classifier, and only lengthy text samples yield results with any reliability.

Theoretically though, the AI-detector should get better with more use, because it itself is AI-based. The classifier is a language model trained on pairs of AI-generated/human-written text samples on the same topic. And, by opening up this stage of the classifier to the public, OpenAI is hoping to get feedback on it and “share improved methods in the future.”

Gizmodo contacted OpenAI with questions about its new tool, and was directed back to the blog post.

The company isn’t the first to try its hand at AI detection. A college student, Edward Tian, recently released his own program. And if you write about AI publicly like this Gizmodo author, then you’ll know that press releases touting the hottest new AI-detection software abound. But across the board, existing tools don’t seem to hold up so well against the forward march of AI-production capabilities. Like humans, automated AI detectors keep getting things wrong, as in one recent pre-print study where an automatic detector failed to clock AI-generated text more than one-third of the time. 

Ultimately, it’s hard to imagine how AI could learn to outsmart itself, especially as the results of AI-generation become increasingly convincing. In trying to develop reliable AI-detection, OpenAI has entered a race with itself. The better an AI text-generator, the harder it should be to suss out the resulting sentences’ AI origins. And since OpenAI is presumably trying to improve ChatGPT at the same time as it’s trying to improve its detection tool, it seems like an impossible race to win.