Australia uses five different tests to evaluate potential immigrants’ mastery of English, but only one of those tests uses computer-assisted voice recognition. Louise Kennedy, a native English speaker from Ireland, was unfortunate enough to receive the automated test when she was applying for residency in Australia. She failed.

According to The Guardian, Kennedy has been in Australia for two years on a skilled worker visa. She’s been working as an equine vet and has two college degrees. She liked the country and this year, she applied for permanent residency on the basis that her profession is short-staffed in Australia. Then the machines apparently got in the way.

The Pearson Test of English (PTE) Academic utilizes an automated question system that asks applicants a series of questions on a monitor and records their vocal responses. The recordings are then analyzed by a system and a score is generated. Despite the fact that Kennedy received high marks on the reading and writing portions of the exam, she scored a 74 on the oral section. Australia requires a 79 or higher, so she’s out of luck and now searching for new options.

“There’s obviously a flaw in their computer software when a person with perfect oral fluency cannot get enough points,” she tells The Guardian.

As governments begin to employ more speech analysis and automated screening technology in their immigration and travel procedures, experts have warned that the tech may not be ready yet. In March, news broke that Germany would be using voice recognition technology to identify refugees’ country of origin.

University of Essex linguistics Professor Monika Schmid told Deutsche Welle in March that analyzing speech can be very difficult, and that factors like a subject’s age, use of slang, and people’s ability to adapt the way they speak are all complex hurdles for software to overcome. In Kennedy’s case, though, she was being tested on her basic fluency in English rather than on what country she’s from. Still, the system didn’t recognize that she speaks the language.

Pearson PLC, the maker of the PTE, “categorically denied” that its system is to blame for Kennedy’s failure. In an exchange with The Guardian, Sasha Hampson, the head of English for Pearson Asia Pacific, explained that the test is used worldwide in many different circumstances. PTE only issues a score, not a pass or fail grade. Hampson argued that Australia simply has very high standards for its English requirements.

It’s true that Australia has been very strict in its immigration policies recently, and Prime Minister Malcolm Turnbull has faced criticism for his approach towards immigrants and refugees. When the two discussed a refugee program on a phone call in January, Donald Trump admiringly told Turnbull, “you are worse than I am,” on immigration issues. So, the idea that Australia is creating an unnecessarily high standard for its testing is believable.

At the same time, that still raises questions about automated tech’s role as a scapegoat. Facebook announced last week that it will rank websites with fast load times higher in its newsfeed. The decision was framed as a way to improve user experience, but it could also have the effect of turning down the volume on “fake news” websites, an issue that has plagued Facebook since the 2016 election. Automation gives Facebook the ability to deny that it’s making any editorial decisions. Likewise, if a government wanted fewer people to pass immigration tests, an imperfect automated system would be a great way to do so.

While speech recognition technology has advanced in recent years, services that utilize it—like smart assistants and in-car navigation systems—often fail to understand users, sometimes due to their accent (this happens in Ireland, Scotland, and elsewhere).

There’s no reason to believe for certain that the PTE is a flawed test, aside from the fact that this particular woman speaks English and couldn’t pass. We’ve reached out to Pearson and asked for statistics on failure rates versus non-automated tests and we’ll update this post when we receive a reply.

Update: A Pearson spokesperson directs readers to this FAQ page and highlights a few points. “Each of the major tests of English reports a reliability estimate, from 0 to 1 with 1 being the highest,” they claim. “At 0.97 PTE Academic has the highest reliability estimates for both the overall score and the communicative skills scores of all the major academic English tests.” They go on to explain that if the system indicates low confidence in its score, the test is referred to two human evaluators and a third examiner adjudicates their marks. Fewer than one percent of tests are flagged, according to Pearson. Individual pass/fail statistics from Australia were not provided. They did say that cases can be re-evaluated based on “individual merit.”

Whether this system is flawed or not is almost beside the point. The fact is we’re in a rush to turn over decision-making to machines and the results can have real world consequences.

[The Guardian]