A study conducted by the University of Chicago demonstrates how code agents generate real productivity gains within organizations that adopt them.
Do AI agents really deliver on their promise of productivity? Until now, the market has remained cautious. But a new study led by Suproteem Sarkar, a researcher at the University of Chicago, provides the first large-scale quantitative evidence. In organizations that have adopted code agents, software production increases by an average of 39%. A clear acceleration, whose extent varies depending on the tasks entrusted to the agent.
The University of Chicago study is based on the analysis of companies using the Cursor code agent. The researcher examined code merge data from October 2024 to May 2025, comparing 24 companies already active before the agent’s launch with 8 control organizations that did not use the platform during this period.
The results are unambiguous: introducing the agent immediately increases the number of weekly merges by 26%, and this effect rises to 39% once the agent becomes the default generation mode. In other words, the more teams integrate the agent into their routine, the greater the acceleration in software production. More interestingly, companies saw no change in their revert rate (merged changes being rolled back, often because of poor or buggy code). Concretely, the productivity gain is not accompanied, as one might have expected, by a drop in the quality of the generated code.
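The study's two headline metrics, weekly merge counts and the revert rate, can be sketched with a toy computation. The data below and the record layout are purely illustrative assumptions, not the study's actual dataset or schema:

```python
from collections import Counter
from datetime import date

# Hypothetical merge records: (merge date, was the merge later reverted?)
merges = [
    (date(2025, 1, 6), False),
    (date(2025, 1, 7), True),
    (date(2025, 1, 8), False),
    (date(2025, 1, 15), False),
]

# Weekly merge counts, keyed by ISO week number
weekly = Counter(d.isocalendar().week for d, _ in merges)

# Revert rate: share of merged changes that were later rolled back
revert_rate = sum(reverted for _, reverted in merges) / len(merges)

print(dict(weekly))   # merges per ISO week: {2: 3, 3: 1}
print(revert_rate)    # 0.25
```

A before/after comparison of `weekly` around the agent's adoption date, with `revert_rate` held constant, is the shape of the result the study reports.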
The 4 areas where code agents excel
Fast feature implementations
Agents excel when feedback on execution is immediate: rapid prototypes, minor fixes, localized adjustments. According to the study data, these tasks account for a large share of the interactions recorded in Cursor, with a high proportion of requests oriented toward direct implementation. What they have in common is that the result can be checked instantly: the agent generates the code, the user tests it, then adjusts if necessary. The analysis also shows that modifications resulting from these short tasks have a high acceptance rate.
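The generate-test-adjust loop described above can be sketched minimally. `generate_patch` and `run_tests` are hypothetical stand-ins with canned behavior, not Cursor APIs:

```python
# Minimal sketch of the generate -> test -> adjust loop, with stubbed-out
# agent and test runner so the control flow is visible.
def generate_patch(task: str, feedback: str = "") -> str:
    """Hypothetical agent call; here it 'fixes' the patch once feedback exists."""
    return "good patch" if feedback else "buggy patch"

def run_tests(patch: str) -> bool:
    """Hypothetical test runner providing the immediate feedback signal."""
    return patch == "good patch"

def implement(task: str, max_rounds: int = 3) -> str:
    patch, feedback = generate_patch(task), ""
    for _ in range(max_rounds):
        if run_tests(patch):
            return patch  # immediate feedback: accept as soon as tests pass
        feedback = "tests failed"
        patch = generate_patch(task, feedback)
    return patch

print(implement("small fix"))  # good patch
```

The key property is the tight loop: each iteration produces a result the user (or a test suite) can verify at once, which is exactly the condition the study says agents thrive under.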
Planning tasks
Beyond direct implementation, the study notes a recurring use: asking for a “plan”. Experienced users tend to ask the agent for suggested steps, a logical order of changes, or implementation options. The agent then reasons through the different possibilities and proposes several solutions adapted to the context. Notably, the study data shows that a two-step workflow, planning and then generating code, significantly improves the acceptance rate of modifications.
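The plan-then-generate workflow can be sketched as two chained prompts. `ask_agent` below is a hypothetical stand-in for any code agent call (not Cursor's API), returning canned text so the two-step structure is runnable:

```python
def ask_agent(prompt: str) -> str:
    """Hypothetical agent call; returns canned responses for illustration."""
    if prompt.startswith("PLAN"):
        return "1. Add validation helper\n2. Call it from the endpoint\n3. Add a test"
    return "def validate(payload): ...  # generated from the approved plan"

def plan_then_implement(task: str) -> str:
    # Step 1: ask for a plan only, so the approach can be reviewed first
    plan = ask_agent(f"PLAN: propose ordered steps for: {task}")
    # (In practice the developer reviews and edits the plan at this point.)
    # Step 2: generate code constrained to the approved plan
    return ask_agent(f"IMPLEMENT: follow this plan exactly:\n{plan}\nTask: {task}")

print(plan_then_implement("validate request payloads"))
```

The design point is that the plan acts as a reviewable checkpoint between intent and code, which is the mechanism the study credits for the higher acceptance rate.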
Code and bug explanations
Nearly 25% of interactions with Cursor involve code explanation or error analysis. Agents show an effective ability to summarize the logic of a file, interpret unexpected behavior, or analyze a stack trace. More precisely, 12.1% of messages ask for explanations of how an existing codebase works, while 11.9% directly concern fixing bugs from logs. The productivity gain is high because these tasks are normally very time-consuming: you have to dig through the code, open several files, and understand the logic. Agents analyze the global context and trace sequences of actions very well.
Work guided by senior developers
Unlike classic autocompletion, which is often more useful to junior developers, agents benefit experienced profiles more. The study identifies an “experience gradient”: each additional standard deviation of seniority increases the acceptance rate of generated code by 5 to 6%. Senior developers write instructions richer in context, use planning more, and evaluate the agent’s proposals more precisely. The agent then acts as an accelerator, amplifying the capabilities of those who already master the structure of the code.
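The experience gradient can be read as a simple linear relationship. The baseline rate and the exact functional form below are assumptions for illustration; only the 5-6% per standard deviation figure comes from the study:

```python
# Illustrative only: the study reports roughly 5-6% higher acceptance per
# extra standard deviation of seniority; baseline and linearity are assumed.
BASE_ACCEPTANCE = 0.40   # hypothetical baseline acceptance rate
GRADIENT = 0.055         # midpoint of the reported 5-6% per SD

def expected_acceptance(seniority_z: float) -> float:
    """Acceptance rate as a linear function of standardized seniority."""
    return BASE_ACCEPTANCE + GRADIENT * seniority_z

print(expected_acceptance(0.0))            # average-seniority developer: 0.4
print(round(expected_acceptance(2.0), 2))  # two SDs above average: 0.51
```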
A new way of coding
Beyond the 39% increase in software production, the study above all highlights a structural shift in developers’ work: less manual writing and much more code validation.
But the researcher notes two essential limitations. On the one hand, these results depend strongly on the field studied: software development provides immediate feedback (tests, errors, merges), ideal terrain for objectively measuring productivity and quality. On the other hand, the Cursor environment is a special case: the agent benefits from full access to the codebase, rich context, and integrated verification tools. In other words, the observed performance cannot be generalized to any agent or any context.