OpenAI did not make DeepSeek break out in a cold sweat.

Apr 18, 2025

"OpenAI's innovation seems to have reached a bottleneck," said an industry insider.



On April 16, local time, OpenAI released the full version of the o3 model, which had been in development for a long time, alongside o4-mini. Unlike previous launches, which rolled features out in a slow, incremental "toothpaste-squeezing" fashion, this livestream unveiled both models at once.

OpenAI's announcement highlights that o3 and o4-mini can combine and invoke the various tools within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation. This marks the start of OpenAI's push to strengthen its agent capabilities.

o3 has broken records in benchmark tests, excelling in programming, mathematics, science, and visual perception. Across the Codeforces, SWE-bench, and MMMU benchmarks, its visual-task accuracy reaches 87.5%, while its MathVista score is lower at 75.4%.

In fact, benchmark scores and leaderboard rankings are routine by now. What distinguishes o3 and o4-mini is that uploaded images now enter the chain of thought: images are no longer merely viewed, they also play a role in the reasoning process.


Judging from a post by Jiahui Yu, an OpenAI team member and USTC alumnus, "thinking with images" has apparently been on the R&D roadmap since OpenAI released the o-series models last September. Before that, o1 Vision was quietly previewed, but it performed unremarkably and drew little attention until it resurfaced in o3 and o4-mini.

According to external evaluators, o3, as the successor to the o1 series, has a 20% lower error rate on complex problems than its predecessor, making it well suited to difficult queries in biology, mathematics, and engineering.

One physician abroad posted after evaluating it that this is indeed a major improvement: when he put challenging clinical and medical questions to o3, the answers were accurate and comprehensive, meeting the expectations one would have of a true domain expert.

For those who want to try it themselves, OpenAI has stated that ChatGPT Plus, Pro, and Team users can now directly access o3, o4-mini, and o4-mini-high, while the earlier o1, o3-mini, and o3-mini-high have quietly exited the stage. Some netizens have jokingly called this "internal horse racing": when a new product arrives, everything before it steps aside.

With that, the previously announced full GPT-4.1 lineup, o3, and o4-mini have all been unveiled. According to Sam Altman, o3 and o4-mini may be ChatGPT's last standalone AI reasoning models before GPT-5, and no other new models are likely to be released in the interim. He also said a professional o3-pro version is expected within a few weeks.



OpenAI claims that o3 and o4-mini are its strongest and smartest models, and some developers and users have indeed felt the progress in using them. Yet the degree of innovation does not seem to meet expectations.

"OpenAI's pace is no longer confident; it even seems at a loss," one industry insider sighed after seeing OpenAI's new products.

The o3 and o4-mini models are new and perform well, but they lack the bold innovation of OpenAI's earlier breakthroughs.

Two days earlier, the GPT-4.1 series had launched. Several insiders told Huxiu, "I haven't seen any major breakthroughs yet," adding, "I don't have high hopes for o3."


Perhaps this "disappointment" should have come earlier.

Last December, o3 was unveiled at the close of OpenAI's series of livestreams. Sam Altman called it "a very, very smart model" that left o1 far behind. On ARC-AGI, a test designed to evaluate an AI system's ability to adapt to new tasks and demonstrate fluid intelligence, it scored 87.5%, surpassing the human average (85%) for the first time. The result shocked the industry and was hailed as a new breakthrough on the road to AGI. But to industry developers, things looked rather different.

"It's like how college entrance exam scores can't represent ability on the job," one open-source developer commented sharply. Moreover, the industry has shifted toward demanding data requirements and full agent adaptation, which means the era of private deployment and hybrid-model reasoning has arrived. And OpenAI's attitude toward open source is well known.

At the beginning of the year especially, when DeepSeek-R1 debuted with ultra-low training cost and performance comparable to o1, it was unquestionably a loud slap in OpenAI's face, and DeepSeek's thoroughgoing open-source release seemed to deliver an even louder one.


Those two slaps not only dimmed OpenAI's luster but also threw off its footing and rhythm. Confusing model naming, insufficient functional innovation, an ambiguous stance on open-source efforts, and high staff turnover are steadily eroding its competitive advantage; it is no longer regarded, as it was a year ago, as the correct and leading path to AGI...

OpenAI has also said that the GPT-4.1 series, o3, and o4-mini are the last model releases before the official launch of GPT-5, and regards them as a key step toward the "GPT-5 moment": an appetizer for GPT-5, emphasizing quantity and completeness. But in the climb of technology, it is not a given that quantitative change produces qualitative change, and this quantity hardly seems sufficient anyway.

"GPT-5 is probably just several GPT-4.1s stacked together," one industry insider joked. Rumor has it that GPT-5 may be released in May. Whether OpenAI can return to its peak will only be revealed then.


