OpenAI’s new GPT-4.1 models excel at coding

OpenAI announced today that it is releasing a new family of AI models focused on coding, stepping up its efforts to fend off increasingly fierce competition from companies like Google and Anthropic. The models are available to developers through OpenAI’s application programming interface (API).
OpenAI is releasing the models in three sizes: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. Kevin Weil, OpenAI’s chief product officer, said in a livestream that the new models are better than GPT-4o, OpenAI’s most widely used model, and in some respects better than its largest and most powerful model, GPT-4.5.
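For developers, the practical difference among the three sizes comes down to a model identifier and a cost/latency trade-off. Below is a minimal sketch of calling one of the new models through the API using the official OpenAI Python SDK; the model identifiers shown ("gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano") and the prompt are assumptions based on the naming in the announcement, not code supplied by OpenAI.

```python
# Minimal sketch: calling GPT-4.1 through OpenAI's API with the official Python SDK.
# Assumes OPENAI_API_KEY is set in the environment; the model IDs below follow the
# naming in the announcement and are assumptions, not verified identifiers.
from openai import OpenAI

client = OpenAI()

MODELS = ["gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano"]  # full, mini, and nano sizes

response = client.chat.completions.create(
    model=MODELS[0],  # pick a size based on cost and latency needs
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
)
print(response.choices[0].message.content)
```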
GPT-4.1 scores 55% on SWE-bench, a widely used benchmark for measuring the coding ability of models, several percentage points higher than OpenAI’s other models. Weil said the new models are good at coding, good at following sophisticated instructions, and well suited to building agents.
The ability of AI models to write and edit code has improved considerably in recent months, enabling more automated approaches to prototyping software and boosting the capabilities of so-called AI agents. Competitors including Anthropic and Google have introduced models that are particularly good at writing code.
The arrival of GPT-4.1 had been widely rumored for weeks. Sources said OpenAI apparently tested the model on some popular leaderboards under the pseudonym Quasar Alpha, and some users of the stealth model reported impressive coding capabilities. One Reddit user wrote that Quasar solved problems that other code generators had left incomplete.
All of the new models can analyze eight times as much code at once, improving their ability to navigate large codebases and fix bugs. The new models also follow user instructions more faithfully, reducing the need to rephrase commands to get the desired result. OpenAI showed a demonstration of GPT-4.1 building several applications, including a flashcard app for language learning.
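In practice, a larger context window means an entire source file, or several, can go into a single request for bug-fixing. A minimal sketch under the same assumptions as above (the official Python SDK and the "gpt-4.1" identifier); the file path and prompt are hypothetical, chosen only to illustrate the workflow:

```python
# Minimal sketch: passing an entire (hypothetical) source file to GPT-4.1 in one
# request and asking it to find and fix bugs. Assumes the official OpenAI Python SDK
# and OPENAI_API_KEY in the environment; the path and prompt are illustrative.
from pathlib import Path
from openai import OpenAI

client = OpenAI()
source = Path("app/main.py").read_text()  # hypothetical project file

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {
            "role": "user",
            "content": "Find and fix any bugs in this file, then return the corrected code:\n\n" + source,
        },
    ],
)
print(response.choices[0].message.content)
```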
“Developers are very concerned about coding, and we have been improving the ability of models to write functional code,” Michelle Pokrass, who works on post-training at OpenAI, said in a livestream on Monday. She said the team has been working on getting the models to follow formats, explore repositories, run unit tests, and write code that compiles.
OpenAI says GPT-4.1 is 40% faster than GPT-4o, the company’s most widely used model among developers. OpenAI also says the cost of user input queries in this latest version has been reduced by 80%.
In today’s livestream, Varun Mohan, CEO of Windsurf, a popular AI coding tool, said the company has been testing GPT-4.1 and found, according to its own benchmarks, that the new model outperforms GPT-4o. “We found that GPT-4.1’s degenerate behavior cases have been greatly reduced,” Mohan said, noting that the new model wastes less time reading irrelevant files.
Over the past few years, OpenAI has turned the frenzy around ChatGPT, the remarkably capable chatbot first unveiled in late 2022, into sales of more advanced chatbots and AI models. Altman said in a TED interview last week that OpenAI has 500 million weekly active users and that usage is growing “very fast.”