Performance, speed, price: the power move of Grok-4 Fast

Launched by xAI last September, Grok-4 Fast matches the performance of most proprietary reasoning models at a highly attractive price.

Another strong showing from xAI. The company founded by Elon Musk in 2023 may not trumpet ever-larger, ever-more-powerful models every month, but its engineers are no less busy. Launched in September, Grok-4 Fast perfectly illustrates this approach: the model delivers leading benchmark results at a cost businesses will find relatively affordable.

Peak performance in reasoning

Grok-4 Fast was pre-trained on a large dataset including, in addition to data from the open web, qualitative data purchased by xAI, feedback from Grok users and synthetic data generated internally. To improve its performance, the model was then post-trained using an arsenal of large-scale reinforcement learning (RL) techniques. Its reasoning has thus been optimized to be as efficient as other models while massively reducing the use of reasoning tokens (40% less compared to Grok-4). Finally, the researchers also trained the RL model using external tools.

The result is a very good, natively efficient reasoning model. In reasoning benchmarks, Grok-4 Fast posts competitive scores against rival proprietary models: it reaches 85.7% on GPQA Diamond (doctoral-level physics, biology and chemistry questions), on par with GPT-5 High, and 92.0% on AIME 2025 (a pre-university mathematics exam). On HMMT 2025 (an advanced university mathematics competition), the model peaks at 93.3%, outperforming Grok-4, which tops out at 90.0%. And on LiveCodeBench (code generation on recent problems, evaluated between January and May), Grok-4 Fast scores 80.0%, ahead of its predecessor but behind GPT-5 High's 86.8%. In short: top-tier performance in general reasoning, mathematics and coding, an excellent profile for a general-purpose model.

Finally, in terms of speed, the model sits near the top of the pack. Grok-4 Fast is served by xAI at 227 tokens per second, behind Gemini 2.5 Flash (268 tokens per second) and gpt-oss-120B (3,128 tokens per second from Cerebras).

Maximum intelligence at a low price

The performance is all the more remarkable given the price xAI charges. By optimizing the model, and no doubt its infrastructure, the San Francisco-based start-up managed to cut the price by 98% compared to Grok-4 for comparable performance. According to Artificial Analysis, the model offers the best intelligence-to-cost ratio on the market among proprietary models.

Concretely, Grok-4 Fast charges $0.20 per million input tokens and $0.50 per million output tokens for contexts under 128,000 tokens, respectively 6 and 20 times cheaper than GPT-5. The gap widens further against Anthropic's Claude Sonnet 4.5, priced at $3.00 per million input tokens and $15.00 per million output tokens, 15 and 30 times more expensive respectively. Even in its extended configuration (beyond 128,000 tokens), where Grok-4 Fast doubles its prices to $0.50 for input and $1.00 for output, the xAI model keeps its price advantage.

| Model                   | Input price (per million tokens) | Output price (per million tokens) |
|-------------------------|----------------------------------|-----------------------------------|
| Grok-4 Fast (≤128K)     | $0.20                            | $0.50                             |
| Grok-4 Fast (>128K)     | $0.50                            | $1.00                             |
| GPT-5 (Standard)        | $1.25                            | $10.00                            |
| Claude Sonnet 4.5       | $3.00                            | $15.00                            |
| Gemini 2.5 Pro (≤200K)  | $1.25                            | $10.00                            |
| Gemini 2.5 Pro (>200K)  | $2.50                            | $15.00                            |
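As a quick sanity check on the list prices above, the per-request cost under xAI's two tiers can be computed directly. This is a minimal sketch using only the figures quoted in this article; the exact rule for which tier applies (input context vs. total tokens) is an assumption here, and actual billing may differ:

```python
# Per-million-token list prices quoted in the table above (USD).
GROK4_FAST = {"small": (0.20, 0.50),   # context <= 128k tokens
              "large": (0.50, 1.00)}   # context > 128k tokens
GPT5 = (1.25, 10.00)

def request_cost(prices, input_tokens, output_tokens):
    """Cost in USD of one request at (input, output) per-million rates."""
    p_in, p_out = prices
    return input_tokens / 1e6 * p_in + output_tokens / 1e6 * p_out

def grok4_fast_cost(input_tokens, output_tokens):
    """Pick the pricing tier from total context size (assumed rule)."""
    tier = "small" if input_tokens + output_tokens <= 128_000 else "large"
    return request_cost(GROK4_FAST[tier], input_tokens, output_tokens)

# A typical request: 10k tokens in, 1k tokens out.
grok = grok4_fast_cost(10_000, 1_000)     # $0.0025
gpt5 = request_cost(GPT5, 10_000, 1_000)  # $0.0225
print(f"Grok-4 Fast: ${grok:.4f}, GPT-5: ${gpt5:.4f}, ratio: {gpt5/grok:.0f}x")
```

Note that the effective ratio depends on the input/output mix of the workload: output-heavy requests approach the 20x gap, input-heavy ones the 6x gap.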

For agents, for code, or as an enterprise co-pilot…

The combination of excellent benchmark results, attractive pricing and competitive speed makes Grok-4 Fast an ideal candidate for many enterprise use cases. Starting with agentic workflows: the model was specifically trained with reinforcement learning to call external tools, allowing it to efficiently browse the web and synthesize information in real time. It thus makes an excellent orchestrator at a reasonable price.
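xAI exposes Grok through an OpenAI-compatible chat API, so wiring the model up as a tool-calling orchestrator amounts to sending a standard function-calling payload. The sketch below only builds that payload; the endpoint, model identifier and `web_search` tool schema are illustrative assumptions, not details taken from this article:

```python
import json

# Hypothetical web-search tool, declared in the OpenAI-compatible
# function-calling schema.
search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return the top results.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def build_request(user_message):
    """Assemble the JSON body for POST https://api.x.ai/v1/chat/completions."""
    return {
        "model": "grok-4-fast",  # assumed model identifier
        "messages": [{"role": "user", "content": user_message}],
        "tools": [search_tool],
        "tool_choice": "auto",   # let the model decide when to search
    }

payload = build_request("Summarize today's AI news.")
print(json.dumps(payload, indent=2))
```

If the model decides to search, the response contains a `tool_calls` entry; the orchestrating code then executes the tool and feeds the result back as a `tool` role message.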

On the coding side, developers have every incentive to adopt it for agentic development: the model is reasonably priced and its 2-million-token context window is well suited to large code bases. Finally, the model stands out as a rational choice for large-scale enterprise co-pilots, where the volume of daily requests can quickly drive up costs, particularly for companies relying on OpenAI.
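To see why this matters at co-pilot scale, a back-of-the-envelope monthly estimate is enough. The workload figures below (head count, requests per day, tokens per request) are illustrative assumptions; only the per-million-token rates come from the prices quoted in this article:

```python
# Illustrative workload: 5,000 employees, 20 requests/day each,
# ~2k input and ~500 output tokens per request, 22 working days.
REQUESTS = 5_000 * 20 * 22
IN_TOK, OUT_TOK = 2_000, 500

def monthly_cost(price_in, price_out):
    """Monthly spend in USD at per-million-token rates."""
    return REQUESTS * (IN_TOK / 1e6 * price_in + OUT_TOK / 1e6 * price_out)

grok = monthly_cost(0.20, 0.50)   # Grok-4 Fast, <=128k tier
gpt5 = monthly_cost(1.25, 10.00)  # GPT-5 list prices quoted above
print(f"Grok-4 Fast: ${grok:,.0f}/month vs GPT-5: ${gpt5:,.0f}/month")
```

Under these assumptions the gap is roughly an order of magnitude, which compounds quickly once a co-pilot is rolled out company-wide.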

The path xAI has traveled since its founding in 2023 commands admiration, especially in a sector dominated by players established for years (OpenAI, Anthropic, Google). In just two years, the company has built a range of competitive models. By capitalizing on an aggressive engineering culture focused on delivery and cost reduction, xAI now positions itself as a more than credible alternative.

Jake Thompson
Growing up in Seattle, I've always been intrigued by the ever-evolving digital landscape and its impacts on our world. With a background in computer science and business from MIT, I've spent the last decade working with tech companies and writing about technological advancements. I'm passionate about uncovering how innovation and digitalization are reshaping industries, and I feel privileged to share these insights through MeshedSociety.com.