On Thursday, OpenAI announced the release of GPT-5.5, the latest update to its flagship model. It is exactly as much of an upgrade as the jump from 5.4 to 5.5 would suggest.
The company called the model its “most intuitive to use” and a “next step toward a new way of getting work done on a computer,” which seems like the kind of thing you say when you don’t have major improvements to show right now. In a blog post, it did say that the model runs faster than previous iterations and shows “gains” in “agentic coding, computer use, knowledge work, and early scientific research—areas where progress depends on reasoning across context and taking action over time.” For now, we’ll have to take their word on that.
On X, OpenAI CEO Sam Altman expressed his excitement for GPT-5.5, stating, “I personally like it.” Wow. Altman also praised the team behind it, saying, “Really excellent work by the inference team to serve this model so efficiently,” in reference to the model’s reportedly improved performance. “To a significant degree, we have to become an AI inference company now,” he said. Altman has also been reposting praise, including a post from Magic Path CEO Pietro Schirano that said GPT-5.5 gave him “my first taste of AGI.”
As is always the case when companies push a new model update, OpenAI has included a slew of benchmarks that suggest its output is better than ever. It notes that it tops rival Anthropic’s Claude Opus 4.7, that company’s newest publicly available model, in several cybersecurity standards and computer-use benchmarks that test an AI agent’s ability to operate autonomously. However, the company is still lagging behind Anthropic in coding tests. Menlo Ventures partner Deedy Das said OpenAI’s model doesn’t reach state-of-the-art status when it comes to coding capabilities.
And of course, Anthropic would probably argue that Claude Opus 4.7 isn’t even the model that should be the standard. When the company dropped that model last week, it loaded up basically every benchmark with evidence that its Claude Mythos Preview, the too-powerful-to-be-made-public model that it’s only giving out limited access to, blows away all the alternatives, including Opus.
Regardless, GPT-5.5 seems like an incremental improvement over GPT-5.4. And it’s becoming unclear just how useful benchmarks really are for evaluating these tools. Increasingly, companies train to the test, and when the models are pushed in unexpected ways, they still break. How much anyone outside of the industry really cares about these test scores also seems questionable.
But you definitely can’t say that OpenAI isn’t iterating. Over the past week, the company announced a new image generation model, “workspace agents” that can complete tasks autonomously, and a model for detecting and redacting personally identifiable information in text, and it dropped an update to its coding agent, Codex. Sam Altman’s army is definitely pumping out updates, even if they all start to blend together.