When it comes to tracking incremental advances in AI’s capabilities, humans have an odd tendency to think in terms of board games we probably haven’t played since childhood. Though there’s no shortage of examples, even recent ones, highlighting AI’s ability to utterly own the cardboard gaming space, those tests only go so far in illustrating the tech’s effectiveness at solving real-world problems.
A potentially far better “challenge” would be to put an AI side by side with humans in a programming competition. Alphabet-owned DeepMind did just that with its AlphaCode model. The results? Well, AlphaCode performed well but not exceptionally. The model’s overall performance, according to a paper published in Science and shared with Gizmodo, corresponds to that of a “novice programmer” with a few months to a year of training. Some of those findings were made public by DeepMind earlier this year.
In the test, AlphaCode was able to achieve “approximately human-level performance” and solve previously unseen natural-language problems in a competition by predicting segments of code and creating millions of potential solutions. After generating that plethora of solutions, AlphaCode filtered them down to a maximum of 10, all of which the researchers say were generated “without any built-in knowledge about the structure of computer code.”
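The generate-then-filter idea described above can be illustrated with a toy sketch. This is not DeepMind's implementation: the article does not say how the filtering was done, so the sketch assumes, purely for illustration, that candidates are kept only if they reproduce a problem's example input/output pairs. The names `toy_candidates`, `passes_examples`, and `filter_candidates` are hypothetical, and the "model" here is a stand-in generator rather than a neural network.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical types for illustration: a candidate "program" maps an int to an int,
# and an example is an (input, expected_output) pair taken from the problem statement.
Candidate = Callable[[int], int]
Example = Tuple[int, int]

MAX_SUBMISSIONS = 10  # the cap on submitted solutions reported in the article


def toy_candidates(count: int) -> Iterator[Candidate]:
    """Stand-in for model sampling: candidate k computes x * 2 + (k % 3).

    Only candidates with k % 3 == 0 correctly double their input.
    """
    for k in range(count):
        yield lambda x, k=k: x * 2 + (k % 3)


def passes_examples(candidate: Candidate, examples: Iterable[Example]) -> bool:
    """Return True if the candidate reproduces every example (assumed filter criterion)."""
    for given, expected in examples:
        try:
            if candidate(given) != expected:
                return False
        except Exception:
            # A candidate that crashes is treated as a failure.
            return False
    return True


def filter_candidates(
    candidates: Iterable[Candidate],
    examples: List[Example],
    limit: int = MAX_SUBMISSIONS,
) -> List[Candidate]:
    """Keep at most `limit` candidates that pass all examples, in generation order."""
    kept: List[Candidate] = []
    for candidate in candidates:
        if passes_examples(candidate, examples):
            kept.append(candidate)
            if len(kept) == limit:
                break
    return kept


if __name__ == "__main__":
    examples = [(1, 2), (5, 10)]  # toy problem: "double the input"
    submissions = filter_candidates(toy_candidates(1000), examples)
    print(len(submissions))
```

The point of the sketch is the shape of the pipeline: a large, cheap pool of guesses is narrowed by an automatic check, so only a handful of candidates ever count as submissions.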
AlphaCode achieved an average ranking in the top 54.3% in simulated evaluations of recent coding competitions on the Codeforces competitive coding platform when limited to generating 10 solutions per problem. Of the problems it solved, however, 66% were solved with its first submission.
That might not sound all that impressive, particularly when compared to seemingly stronger model performances against humans in complex board games, though the researchers note that succeeding at coding competitions is uniquely difficult. To succeed, AlphaCode had to first understand complex coding problems written in natural language and then “reason” about unforeseen problems rather than simply memorizing code snippets. AlphaCode was able to solve problems it hadn’t seen before, and the researchers claim they found no evidence that their model simply copied core logic from the training data. Combined, the researchers say, those factors make AlphaCode’s performance a “big step forward.”
“Ultimately, AlphaCode performs remarkably well on previously unseen coding challenges, regardless of the degree to which it ‘truly’ understands the task,” J. Zico Kolter, a professor affiliated with Carnegie Mellon University and the Bosch Center for AI, wrote in a recent Perspective article commenting on the study.
AlphaCode isn’t the only AI model being developed with coding in mind. Most notably, OpenAI has adapted its GPT-3 natural language model to create an autocomplete function that can predict lines of code. GitHub also has its own popular AI programming tool called Copilot. Neither of those programs, however, has shown as much prowess competing against humans in solving complex competitive problems.
Though we’re still in the relatively early days of AI-assisted code generation, the DeepMind researchers are confident AlphaCode’s recent successes will lead to useful applications for human programmers down the line. In addition to increasing general productivity, the researchers say AlphaCode could also “make programming more accessible to a new generation of developers.” At the highest level, the researchers say AlphaCode could one day lead to a cultural shift in programming where humans mainly exist to formulate problems, which AIs are then tasked with solving.
At the same time, some detractors in the AI space have called into question the efficacy of the core training methods underpinning many advanced AI models. Just last month, a programmer named Matthew Butterick filed a first-of-its-kind lawsuit against Microsoft-owned GitHub, arguing its Copilot AI assistant tool blatantly ignores or removes licenses presented by software engineers during its learning and testing phases. That liberal use of other programmers’ code, Butterick argues, amounts to “software piracy on an unprecedented scale.” The results of that lawsuit could play an important role in determining the ease with which AI developers, particularly those training their models on code previously written by humans, can improve and advance their models.