How Netflix built a tailor-made code AI

During a December 2025 webinar co-organized with Anthropic, Netflix teams revealed how the streaming giant managed to automate a large part of its operations with AI.

“Tudum.” Netflix now wants to apply this iconic sound to its code: one trigger, and everything is executed. The streaming giant has developed a complete stack to automate its software production using AI agents. During a private webinar organized with Anthropic in December 2025, the Los Gatos company gave a detailed account of how it integrated generative AI into its development teams. Disappointed by generic coding assistants, its engineers designed a “contextual intelligence” system capable of connecting LLMs directly to company standards.

The disappointment of classic coding assistants

Before industrializing AI at scale, Netflix first struggled with traditional AI assistants. The first coding assistants integrated into IDEs met with low overall team satisfaction and, consequently, almost non-existent adoption. Generic LLMs produced technically valid code that was completely disconnected from Netflix's reality. “The complaint came up systematically: the code generated is correct, but it is not really connected to our standards and our internal practices”, summarized Adam Berry, staff engineer at Netflix, in his presentation.

Lack of context, often cited as the main reason for abandoning AI in production, was to blame. Faced with this failure, Netflix made a radical decision to rethink the approach from scratch: first build an in-house Gen AI platform capable of injecting organizational context directly into AI agents, and only then deploy a coding assistant.

A Gen AI platform to give context to code agents

Rather than simply plugging ChatGPT or Claude into their technical stack, the Netflix teams developed a complete Gen AI platform run by a dedicated team. “Our goal is to provide exceptional tools for Netflix teams to build things powered by AI,” says Zee Waheed, PM of the GenAI platform at Netflix. The aim is to be “very dogmatic about the capabilities to be provided, but intentionally flexible about the components”, so that the underlying building blocks can be swapped without breaking the applications already in production.
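The "fixed capabilities, swappable components" idea can be illustrated with a minimal Python sketch. It is not Netflix's code: the names `TextGenerator`, `VendorAModel`, `VendorBModel` and `summarize_ticket` are hypothetical, and the two model classes are stand-ins rather than real clients.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """Hypothetical stable capability that applications code against."""

    def generate(self, prompt: str) -> str: ...


class VendorAModel:
    """Stand-in for one underlying component (not a real client)."""

    def generate(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class VendorBModel:
    """Stand-in for a replacement component with the same capability."""

    def generate(self, prompt: str) -> str:
        return f"[vendor-b] {prompt}"


def summarize_ticket(model: TextGenerator, ticket: str) -> str:
    """Application code: depends only on the capability, never on a vendor."""
    return model.generate(f"Summarize: {ticket}")


if __name__ == "__main__":
    # Swapping the component does not require touching summarize_ticket.
    for model in (VendorAModel(), VendorBModel()):
        print(summarize_ticket(model, "build fails on main"))
```

Under this assumption, replacing one model with another is a change of the object passed in, not of the application that uses it.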

The architecture rests on four pillars. First, the classic technical foundations: rate limiting, workflow management and resilience. Second, an evaluation and observability system built with Braintrust, which continuously measures whether agents do what is asked of them. Third, an ecosystem of standardized tools exposed via MCP (Model Context Protocol), which acts as a common interface to databases, build tools and internal documentation. Fourth, a RAG system, managed by a dedicated team, which gives each agent the right organizational context at the right time, without manual intervention from developers.

On top of these pillars, the teams developed a “developer profile” system which automatically configures agents according to the work context. At startup, the tool queries a backend which computes a personalized profile based on the team, the associated code base and security rules. The configuration of MCP tools, plugins and commands is then adjusted to the user. Result: the right tools, relevant documentation and specific commands are made available automatically, whether the developer is on their laptop or in a remote environment.
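The profile computation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual backend: the lookup tables, tool names, repository names and the `compute_profile` function are all hypothetical, and the "security rules" are reduced here to a simple per-environment deny list.

```python
from dataclasses import dataclass

# Hypothetical lookup tables standing in for the profile backend's data.
TOOLS_BY_TEAM = {
    "playback": ("docs-search", "build-status"),
    "billing": ("docs-search", "ledger-db"),
}
COMMANDS_BY_REPOSITORY = {
    "playback-api": ("/run-canary",),
    "billing-service": ("/reconcile",),
}
# Assumed security rule: some tools are blocked in some environments.
BLOCKED_TOOLS_BY_ENVIRONMENT = {
    "laptop": frozenset({"ledger-db"}),
    "remote": frozenset(),
}


@dataclass(frozen=True)
class DeveloperProfile:
    team: str
    repository: str
    environment: str
    mcp_tools: tuple
    commands: tuple


def compute_profile(team: str, repository: str, environment: str) -> DeveloperProfile:
    """Combine team, code base and security rules into one agent configuration."""
    blocked = BLOCKED_TOOLS_BY_ENVIRONMENT.get(environment, frozenset())
    tools = tuple(t for t in TOOLS_BY_TEAM.get(team, ()) if t not in blocked)
    commands = COMMANDS_BY_REPOSITORY.get(repository, ())
    return DeveloperProfile(team, repository, environment, tools, commands)


if __name__ == "__main__":
    print(compute_profile("billing", "billing-service", "laptop"))
    print(compute_profile("billing", "billing-service", "remote"))
```

In this sketch the same developer gets a narrower tool set on a laptop than in a remote environment, which is one plausible way such rules could apply.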

Claude Code as Senior Code Officer

Claude Code has become the standardized agent for the entire company, plugged directly into the Gen AI platform. In practice, as soon as it starts, Claude Code automatically connects to the different pillars: it inherits the MCP tools configured for the developer’s team, accesses the RAG system to retrieve the relevant Netflix documentation, and reports its performance to the Braintrust evaluation system.

Netflix does not judge the success of its agents solely by raw ROI, like other tech giants (Oracle in particular). Beyond a satisfaction rate of more than 90% and growth in use (around +10% of tokens per user per month), Netflix tracks change failure rate, incident resolution time and pull request throughput. “We are very attentive to the feeling of our developers. If we can keep our teams satisfied, these gains compound over time,” Adam Berry added during the webinar. Without providing an exact figure, Netflix reports a gradual decline in support costs, despite the increase in users on the platform.
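For readers unfamiliar with these delivery metrics, here is a minimal sketch of how they are commonly computed. The function names and the sample values in the usage section are made up for illustration; they are not Netflix figures or Netflix definitions.

```python
from datetime import datetime, timedelta


def change_failure_rate(change_failed: list) -> float:
    """Fraction of changes that caused a failure (one boolean per change)."""
    if not change_failed:
        return 0.0
    return sum(change_failed) / len(change_failed)


def mean_resolution_time(incidents: list) -> timedelta:
    """Average of (resolved - opened) over a list of (opened, resolved) pairs."""
    if not incidents:
        return timedelta(0)
    total = sum((resolved - opened for opened, resolved in incidents), timedelta(0))
    return total / len(incidents)


def pr_throughput_per_week(merged_prs: int, days: int) -> float:
    """Merged pull requests per week over an observation window of `days`."""
    if days <= 0:
        raise ValueError("days must be positive")
    return merged_prs / (days / 7)


if __name__ == "__main__":
    # Illustrative values only.
    print(change_failure_rate([True, False, False, False]))
    print(mean_resolution_time([
        (datetime(2025, 1, 1, 10), datetime(2025, 1, 1, 11)),
        (datetime(2025, 1, 2, 12), datetime(2025, 1, 2, 15)),
    ]))
    print(pr_throughput_per_week(30, 14))
```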

Fully autonomous agents in production

This modern stack has allowed Netflix to automate previously time-consuming operations. One example is the update of Nebula, the tool that compiles and packages the company’s thousands of Java applications. “Let’s face it, no one wants to spend time fixing warnings, updating plugins and validating builds,” summarizes Eric Wendelin, staff engineer at Netflix. Netflix therefore created three agents that work in sequence: one analyzes the code, one corrects errors, and one updates the documentation. Each application normally requires several hours of manual labor. By automating these migrations across thousands of applications, Netflix says it saves “weeks or months of engineering time”.
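The hand-off between the three agents can be pictured as a simple sequential pipeline. The sketch below is only an illustration of that pattern: `MigrationState`, `analyze`, `fix`, `document` and `run_pipeline` are hypothetical names, and each "agent" is a plain function standing in for an LLM-driven step.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationState:
    """Hypothetical state passed from one agent to the next."""
    app: str
    warnings: list
    fixed: list = field(default_factory=list)
    notes: list = field(default_factory=list)


def analyze(state: MigrationState) -> MigrationState:
    """Stand-in for the agent that analyzes the code."""
    state.notes.append(f"found {len(state.warnings)} warning(s)")
    return state


def fix(state: MigrationState) -> MigrationState:
    """Stand-in for the agent that corrects errors."""
    state.fixed.extend(state.warnings)
    state.warnings = []
    return state


def document(state: MigrationState) -> MigrationState:
    """Stand-in for the agent that updates the documentation."""
    state.notes.append(f"documented {len(state.fixed)} fix(es)")
    return state


def run_pipeline(state: MigrationState, agents: list) -> MigrationState:
    """Run the agents in order, each one receiving the previous one's output."""
    for agent in agents:
        state = agent(state)
    return state


if __name__ == "__main__":
    start = MigrationState(app="app-1", warnings=["deprecated plugin"])
    print(run_pipeline(start, [analyze, fix, document]))
```

Running the same pipeline over a list of applications is what would turn several hours of manual work per application into a batch job, under the assumptions above.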

The Netflix experience confirms what we noted in a previous article: context engineering remains the determining factor. The more autonomous the agent, the more critical context injection becomes. You must build the infrastructure before even choosing the model. The ROI comes from the architecture, not from the LLM.

Jake Thompson
Growing up in Seattle, I've always been intrigued by the ever-evolving digital landscape and its impacts on our world. With a background in computer science and business from MIT, I've spent the last decade working with tech companies and writing about technological advancements. I'm passionate about uncovering how innovation and digitalization are reshaping industries, and I feel privileged to share these insights through MeshedSociety.com.
