The success of AI depends on fast, efficient storage: it underpins both the compute and energy budgets of an AI project, and is essential to avoid bottlenecks and fully exploit the power of GPUs.
Artificial intelligence is disrupting almost every industry, including its own. In early 2025, the Chinese language model DeepSeek R1 briefly eclipsed ChatGPT in public discussion, fueling speculation about a shifting balance of power in AI and contributing to volatility in technology markets. Nations around the world are announcing their ambition to become AI powerhouses, while hyperscalers are expected to invest $1 trillion in AI-optimized infrastructure by 2028.
Companies are also investing massively. Yet, according to Gartner, nearly a third of projects do not generate the expected business value. This AI gold rush clearly cannot be ignored; however, participating in it requires considerable investment. How, then, can we maximize the chances of success for AI projects, and what should be considered for the underlying infrastructure?
The Compute and Storage Requirements of Generative AI
Generative AI workloads fall into two broad categories: training, where a model learns from a dataset, and inference, where it applies what it has learned to new information. But even before training, essential steps such as data collection, preparation, and curation are necessary. The data may come from archives, images, or structured databases, often subject to varying governance rules.
What remains constant is that AI is extremely resource intensive. The power and energy consumption of GPUs during training is well known, and frequent checkpointing puts further pressure on the infrastructure. Checkpoints enable model recovery, rollback, and compliance, but they also increase storage capacity and energy requirements.
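To make the storage pressure concrete, here is a minimal sketch of a training loop with periodic checkpointing and a retention policy. The function names, interval, and retention count are illustrative assumptions, not a real framework's API; in practice the "state" would be gigabytes of model weights, which is exactly why checkpoint frequency and retention drive storage capacity needs.

```python
import os
import pickle
import tempfile

def train_with_checkpoints(steps, checkpoint_every, keep_last, ckpt_dir):
    """Toy training loop: save a checkpoint every N steps, keep the last K."""
    state = {"step": 0, "weights": [0.0] * 4}  # stand-in for model weights
    saved = []
    for step in range(1, steps + 1):
        state["step"] = step
        state["weights"] = [w + 0.1 for w in state["weights"]]  # fake update
        if step % checkpoint_every == 0:
            path = os.path.join(ckpt_dir, f"ckpt_{step:06d}.pkl")
            with open(path, "wb") as f:
                pickle.dump(state, f)  # in reality, a multi-GB write burst
            saved.append(path)
            # Retention policy: evict the oldest checkpoints beyond keep_last.
            while len(saved) > keep_last:
                os.remove(saved.pop(0))
    return saved

ckpt_dir = tempfile.mkdtemp()
kept = train_with_checkpoints(steps=100, checkpoint_every=10, keep_last=3,
                              ckpt_dir=ckpt_dir)
print([os.path.basename(p) for p in kept])
# → ['ckpt_000080.pkl', 'ckpt_000090.pkl', 'ckpt_000100.pkl']
```

Shortening the interval or lengthening the retention window both improve recoverability, but each change multiplies the write traffic and capacity the storage layer must absorb.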
Retrieval-augmented generation (RAG), which integrates internal datasets with language models, adds a layer of complexity. It relies on vectorized data (datasets translated into high-dimensional embeddings that allow similarity comparisons) and can multiply the size of datasets by ten. Even after training, inference requires storage on an ongoing basis to record results and analyzed data.
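The similarity comparison at the heart of RAG can be sketched in a few lines. The toy 3-dimensional "embeddings" and document names below are invented for illustration; production systems use embedding models with hundreds or thousands of dimensions and a dedicated vector index, which is where the order-of-magnitude storage blow-up comes from.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=2):
    """Return the k documents whose vectors are most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Hypothetical mini vector index (document -> embedding).
index = {
    "doc_gpu":     [0.9, 0.1, 0.0],
    "doc_storage": [0.1, 0.9, 0.1],
    "doc_energy":  [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]  # embedding of a question about GPUs
print(top_k(query, index, k=2))
# → ['doc_gpu', 'doc_storage']
```

The retrieved documents are then injected into the model's prompt; every query therefore turns into a read-heavy similarity scan over the vector store, which is why RAG workloads lean so hard on storage latency.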
Power, scale, and trade-offs
The growing energy footprint of generative AI is another critical factor. Some estimates indicate that AI processing consumes more than thirty times the energy of traditional software, and that the energy demand of data centers could more than double by 2030. At the rack level, consumption has risen from under 10 kW to 100 kW, or even more in some clusters, driven mainly by high-performance GPUs. Every watt used for storage is a watt not available for compute: we therefore need fast, efficient storage capable of feeding the GPUs without inflating the energy bill.
Storage can also deliver performance gains through caching. By retaining frequently used data, queries, and conversation context, caches reduce repetitive GPU processing. This approach improves responsiveness, particularly for tasks such as RAG, trading, or chatbots. Caching can accelerate inference by up to twenty times, maximizing GPU efficiency while reducing cost and power consumption.
Storage must keep pace
The role of storage in an AI infrastructure is to provide fast, low-latency access to large datasets. Poor performance creates bottlenecks, limiting the value of expensive hardware. AI workloads often require hundreds of terabytes, or even petabytes, plus fast read capabilities for training, inference, or integrating new sources. High-density QLC flash stands out as an ideal fit thanks to its combination of speed, capacity, reliability, and energy efficiency. It allows large amounts of data to be stored on flash at a cost approaching that of hard disk, while providing the responsiveness AI applications need.
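A back-of-the-envelope calculation makes the bottleneck concrete. The figures below are illustrative assumptions, not vendor benchmarks: a synchronous 500 GB checkpoint flushed over a ~1 GB/s disk array stalls training for minutes, while a ~10 GB/s flash tier cuts that stall by an order of magnitude.

```python
def stall_seconds(checkpoint_gb, bandwidth_gb_per_s):
    """Seconds a synchronous checkpoint write stalls training,
    assuming the full model state is flushed at the given bandwidth."""
    return checkpoint_gb / bandwidth_gb_per_s

# Hypothetical 500 GB checkpoint over two storage tiers (illustrative only).
for name, bw in [("disk array ~1 GB/s", 1.0), ("flash tier ~10 GB/s", 10.0)]:
    print(f"{name}: {stall_seconds(500, bw):.0f} s per checkpoint")
# → disk array ~1 GB/s: 500 s per checkpoint
# → flash tier ~10 GB/s: 50 s per checkpoint
```

Multiply that stall by the checkpoint frequency over a weeks-long training run, and the cost of idle GPUs quickly dwarfs the price difference between storage tiers.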
Strategic infrastructure for AI success
Some vendors now offer storage systems designed for AI workloads, certified to work with Nvidia architectures. These integrated solutions, combined with optimized RAG pipelines and AI microservices, simplify deployment and ensure consistent performance.
Deploying generative AI at scale requires more than powerful GPUs. It rests on a robust, efficient, and responsive infrastructure. Storage is the cornerstone of that foundation: from data preparation to inference, AI projects depend on fast, scalable, and energy-efficient solutions. Without them, even the best-funded initiatives risk hitting their own limits.