OpenAI launches GPT-5.4 mini and nano, two models designed for speed and cost optimization that rival Anthropic’s Claude Sonnet 4.6 on benchmarks at a fraction of the price.
It’s a battle in the great generative AI war that publishers are fighting: a fast mode at Anthropic, a partnership with Cerebras at OpenAI, a speed-optimized model at Nvidia (Nemotron 3 Super)… As models saturate code and reasoning benchmarks, the big names in AI are now fighting on new ground: generation speed. OpenAI’s latest feat of arms: the launch, this March 17, of GPT-5.4 mini and nano, two ultra-optimized versions that come close on benchmarks to GPT-5.4, the publisher’s flagship model. So should you opt for these newcomers, designed to offer the best speed/cost/performance ratio?
Sonnet 4.6 and GPT-5.4 mini on par
In summary: GPT-5.4 mini is an optimized version of GPT-5.4 with a reasoning level slightly below the flagship’s, while GPT-5.4 nano is the lightest version, also equipped with native reasoning capabilities. Both models are multimodal and have a 400,000-token context window, more than enough for most workloads. Both support tool calling, web search and MCP. On the other hand, only GPT-5.4 mini can use computer use to control graphical interfaces. Finally, both models expose a setting to define the level of reasoning (none, low, medium, high, very high).
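As a sketch, a request to one of these models might look like the following. The model identifier `gpt-5.4-mini` and the exact string values accepted by the reasoning setting are assumptions extrapolated from the announcement; the payload shape follows the pattern of OpenAI's existing Chat Completions API.

```python
# Hypothetical request payload for GPT-5.4 mini. The model ID and the
# effort values are assumptions based on the announcement, not confirmed
# API identifiers; the overall shape mirrors OpenAI's Chat Completions API.
REASONING_LEVELS = {"none", "low", "medium", "high", "very_high"}

payload = {
    "model": "gpt-5.4-mini",       # assumed model identifier
    "reasoning_effort": "low",     # one of the five levels listed above
    "messages": [
        {"role": "user", "content": "Summarize this log file in three bullet points."},
    ],
}

assert payload["reasoning_effort"] in REASONING_LEVELS
```

Swapping in `gpt-5.4-nano` as the model would target the lighter tier; per the announcement, only the mini variant additionally supports computer use.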
The two models announced today by OpenAI are a direct response to competition from Anthropic. GPT-5.4 mini is a head-on competitor to Sonnet 4.6, with the same positioning: power AND speed. GPT-5.4 mini is almost on par with Claude Sonnet 4.6 in terminal coding and multimodal vision, but falls slightly short in tool use. The main interest lies in latency: OpenAI announces a generation speed more than twice that of GPT-5 mini. At a lower cost, the speed of GPT-5.4 mini can make the difference in many use cases, and it becomes possible to consider replacing Claude Sonnet 4.6 with it.
Aggressive pricing
On pricing, OpenAI’s offensive is just as aggressive. GPT-5.4 mini is priced at $0.75 per million input tokens and $4.50 per million output tokens, respectively 4 times and 3.3 times cheaper than Anthropic’s Claude Sonnet 4.6. GPT-5.4 nano goes even further: at $0.20 for input and $1.25 for output, it is 15 times less expensive than Sonnet 4.6 on input and 12 times less on output. All with a 400,000-token context window, compared with 200,000 for the Anthropic model.
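The arithmetic behind these comparisons is easy to reproduce. A minimal sketch, using the per-million-token prices quoted above (the model names here are labels for illustration, not official API identifiers, and the Sonnet 4.6 prices are the ones implied by the 4x/3.3x ratios in the announcement):

```python
# Per-million-token prices (USD) quoted in the announcement.
PRICES = {
    "gpt-5.4-mini":      {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano":      {"input": 0.20, "output": 1.25},
    "claude-sonnet-4.6": {"input": 3.00, "output": 15.00},  # implied by the stated ratios
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request with the given token counts."""
    p = PRICES[model]
    return input_tokens / 1e6 * p["input"] + output_tokens / 1e6 * p["output"]

# Example: a request with 10,000 input tokens and 2,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 2_000):.4f}")
# gpt-5.4-mini:      $0.0165
# gpt-5.4-nano:      $0.0045
# claude-sonnet-4.6: $0.0600
```

On this mixed workload, nano comes out roughly 13 times cheaper than Sonnet 4.6, between the 15x (input) and 12x (output) headline ratios, as expected.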
Typical use cases for GPT-5.4 mini and nano
GPT-5.4 mini finds its place as soon as the cost/speed/performance ratio becomes the decisive criterion: sub-agents executing tasks in parallel, rapid code generation, automated workflows at moderate volume. Its halved latency and a price four times lower than Sonnet 4.6’s make it a perfect model for pipelines where responsiveness matters more than deep reasoning. GPT-5.4 nano targets the notch below: simple RAG, question answering, classification, data extraction, in short light tasks at very high frequency where every penny counts. Sonnet 4.6 remains the relevant choice for teams already invested in the Anthropic ecosystem and for tasks where precision takes precedence over speed. Finally, for advanced orchestration of autonomous agents, the flagship models (Opus 4.6 on the Anthropic side, GPT-5.4 on the OpenAI side) remain the most legitimate.
With GPT-5.4 mini and nano, OpenAI hits where it counts. By positioning itself just behind Anthropic on benchmarks while offering prices up to 15 times lower and halved latency, the publisher is directly targeting the two metrics most scrutinized by IT departments today: cost per request and response speed. Also notable is the profusion of benchmark data accompanying the announcement, which contrasts with the discretion OpenAI has shown on the subject in recent months. A sign that the model keeps its promises, and that OpenAI intends to make that known.




