Nova Act, autonomous agents, reliable LLMs: AWS refines its AI for professionals


On the second day of re:Invent, AWS continues its flood of announcements on AI for business. This new salvo confirms the group’s positioning: a preference for concrete, directly usable applications.

AI must bring tangible added value to organizations, and this is not yet the case everywhere, Matt Garman, CEO of AWS, pointed out on December 2 at the 2025 edition of re:Invent in Las Vegas, where JDN is on site. To accelerate adoption, and above all the impact of these technologies, AWS is banking on resolutely operational AI, multiplying services designed to “get things done” and meet immediate business needs. This Wednesday, December 3, the group is unveiling several tools intended to concretely improve the results obtained with models, for engineers and less technical profiles alike. JDN sums up the key announcements.

Amazon Nova Act, the agent applied to interfaces

With Nova Act, AWS applies agentic AI at the most concrete level: the interface. The service makes it possible to automate existing workflows without changing the underlying software, simply by letting an agent operate the UI in place of a human user. The approach is designed for companies that want to quickly automate tasks that are still manual (screen navigation, form entry, etc.). AWS claims reliability above 90%, thanks to an action model based on Nova and trained to identify visual elements and sequence steps so that it can carry out tasks fully autonomously. “The majority of companies’ work is done in a browser. With Nova Act, any software accessible via a web interface becomes automatable,” explains Ben Schreiner, head of AI and modern data strategy at AWS.

Nova Act integrates with IDEs already widely used by developers: VS Code, Cursor, Kiro, etc. It becomes possible to deploy hundreds of agents in parallel with integrated supervision and observability. “You can extend a workflow across multiple enterprise applications,” says Ben Schreiner. “Wherever humans were still doing data entry because the systems were isolated, Nova Act makes it possible to automate these tasks,” adds the specialist.

Three tools to make LLMs more reliable in production

AWS points out that the models currently deployed in production are not yet reliable enough, and that companies are still short of tangible gains in accuracy. To address this sticking point, AWS is introducing three new tools intended to improve model robustness and accuracy. Notably, the group addresses the needs of highly technical profiles and of more generalist developers at the same time, in order to cover the entire model design and optimization cycle.

The first tool, Reinforcement Fine-Tuning in Bedrock, is clearly aimed at non-specialist developers. The principle of reinforcement is to improve a model by giving it feedback on what constitutes a good result via a reward function (a function that scores the quality of a response). The logs, that is, the history of responses produced by the model, then serve as the basis for automatically adjusting its behavior. AWS cites gains of up to 66% depending on the use case.
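To make the idea of a reward function concrete, here is a minimal sketch in Python. It does not call the Bedrock API: the function name `reward`, the scoring weights and the sample logs are all hypothetical, chosen only to illustrate how a response can be turned into a numeric score.

```python
# Hypothetical reward function: scores a model response between 0.0 and 1.0.
# Nothing here is the Bedrock API; names and weights are illustrative assumptions.

def reward(response: str, expected_answer: str, max_words: int = 50) -> float:
    """Return a score in [0, 1]: correctness dominates, brevity is a bonus."""
    score = 0.0
    # 0.8 points if the expected answer appears in the response.
    if expected_answer.lower() in response.lower():
        score += 0.8
    # 0.2 points if the response stays within the word budget.
    if len(response.split()) <= max_words:
        score += 0.2
    return score


# Logged responses (invented examples) paired with the answer a reviewer expected.
logs = [
    ("The invoice total is 120 EUR.", "120 EUR"),
    ("I am not sure, maybe around 100 EUR.", "120 EUR"),
]

# Each logged response receives a score that a training loop could use as feedback.
scores = [reward(resp, expected) for resp, expected in logs]
print(scores)
```

In this toy example the correct, concise response scores higher than the vague one, which is the signal a reinforcement procedure would then use to adjust the model.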

The second tool, Model Customization in Amazon SageMaker AI, is aimed at more technical profiles. The goal is to modify a model’s weights and behavior in depth. SageMaker automates key steps: generation of synthetic data to enrich training, definition of the optimization workflow, and orchestration of training. Teams can start from an open source model (Llama, Qwen, DeepSeek, etc.) or a Nova model.

Third, and more technical still, AWS introduces Checkpointless Training on Amazon SageMaker HyperPod. Instead of relying on periodic checkpoints (regular saves of the model during training), SageMaker records training state continuously. If a GPU cluster fails, training resumes almost immediately, without having to recompute hours of lost work. The objective is clear: make long training runs more reliable and reduce the cost of interruptions for technical teams.
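The cost of periodic checkpoints can be illustrated with simple arithmetic. The sketch below is not a model of SageMaker HyperPod: the step counts are invented, and continuous recording is approximated as a checkpoint interval of one step, which is a simplifying assumption.

```python
# Illustrative arithmetic only: how much work is lost when training fails.
# The step counts are invented; this does not model SageMaker HyperPod itself.

def steps_lost(failure_step: int, checkpoint_interval: int) -> int:
    """Steps to recompute after a failure, given periodic checkpoints.

    With a checkpoint every `checkpoint_interval` steps, training restarts
    from the last checkpoint at or before `failure_step`.
    """
    return failure_step % checkpoint_interval


failure_step = 12_750

# Periodic checkpoints every 1,000 steps: restart from step 12,000.
print(steps_lost(failure_step, checkpoint_interval=1_000))  # 750

# Assumption: state captured at every step (interval of 1), nothing to recompute.
print(steps_lost(failure_step, checkpoint_interval=1))  # 0
```

The wider the gap between saves, the more work a failure can destroy, which is why shrinking that gap matters for long training runs.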

Persistent memory and guardrails for agents

Finally, AWS adds persistent memory to Bedrock AgentCore, its agent orchestration platform. Until now, most agents operated with limited context, unable to retain information beyond a session. Memory persistence allows interactions, preferences or intermediate states to be stored and reused in future executions. “The objective is to allow an agent to manage more complex workflows, to adapt its actions over time thanks to the persistence of information,” explains Ben Schreiner.
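As a rough illustration of what memory persistence means in practice, the following sketch stores a preference in one session and reads it back in another. It is not the Bedrock AgentCore API: the `FileMemory` class, its methods and the JSON file are hypothetical stand-ins for whatever storage the platform actually uses.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Hypothetical key-value memory that survives across agent sessions."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load what earlier sessions stored, if anything.
        self.data = json.loads(path.read_text()) if path.exists() else {}

    def remember(self, key: str, value: str) -> None:
        self.data[key] = value
        # Write through to disk so the next session can read it.
        self.path.write_text(json.dumps(self.data))

    def recall(self, key: str, default: str = "") -> str:
        return self.data.get(key, default)


store = Path(tempfile.mkdtemp()) / "agent_memory.json"

# Session 1: the agent learns a user preference.
session_one = FileMemory(store)
session_one.remember("preferred_currency", "EUR")

# Session 2: a fresh object, standing in for a later run, still sees it.
session_two = FileMemory(store)
print(session_two.recall("preferred_currency"))  # EUR
```

Without the write to disk, the second session would start empty, which is the limited-context behavior described above.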

This development comes with two building blocks that the AWS teams consider just as foundational: policies and evaluations, designed to prevent autonomous agents from acting outside the planned scope. “Humans must define the guardrails: what AI can or cannot do on our behalf,” notes the head of AI and modern data strategy at AWS. Policies impose these limits before execution, and evaluations verify that the agent continues to behave as expected, because “it’s not like code that always executes the same way.” For Ben Schreiner, these tools meet imminent needs: “Even if customers don’t have these problems yet, they will very soon.”
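The split between policies (checked before execution) and evaluations (checked afterwards) can be sketched as follows. This is an illustration under stated assumptions, not AgentCore's actual interface: the `Action` type, the allowed-action list, the refund ceiling and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    amount: float = 0.0


# Hypothetical policy: which actions are allowed, and a spending ceiling.
ALLOWED_ACTIONS = {"read_invoice", "issue_refund"}
MAX_REFUND = 100.0


def policy_allows(action: Action) -> bool:
    """Checked BEFORE execution: block anything outside the planned scope."""
    if action.name not in ALLOWED_ACTIONS:
        return False
    if action.name == "issue_refund" and action.amount > MAX_REFUND:
        return False
    return True


def evaluate(executed: list[Action], expected_names: list[str]) -> float:
    """Checked AFTER execution: fraction of expected steps the agent performed."""
    done = {a.name for a in executed}
    hits = sum(1 for name in expected_names if name in done)
    return hits / len(expected_names)


proposed = [
    Action("read_invoice"),
    Action("issue_refund", amount=250.0),  # exceeds the ceiling
    Action("delete_account"),              # never allowed
]

# Only actions that pass the policy are executed.
executed = [a for a in proposed if policy_allows(a)]
print([a.name for a in executed])  # ['read_invoice']
print(evaluate(executed, ["read_invoice", "issue_refund"]))  # 0.5
```

Here the policy blocks two out-of-scope actions up front, while the evaluation reveals afterwards that the agent completed only half of the expected workflow, two different signals that a team would monitor separately.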

Interface automation, model reliability tools, agent governance: AWS is clearly seeking to fill the gaps between the promises of AI and its real impact in businesses. It remains to be seen whether these bricks will become a lever… or an additional wall to overcome.

Jake Thompson
Growing up in Seattle, I've always been intrigued by the ever-evolving digital landscape and its impacts on our world. With a background in computer science and business from MIT, I've spent the last decade working with tech companies and writing about technological advancements. I'm passionate about uncovering how innovation and digitalization are reshaping industries, and I feel privileged to share these insights through MeshedSociety.com.
