OpenAI AgentKit: how we haven’t (yet) managed to create our agent

We got our hands on AgentKit, the new suite of tools unveiled by OpenAI, to assess its simplicity, its effectiveness and its current limitations. Above all, the test points to strong potential for organizations.

Launched by OpenAI on October 6, 2025, AgentKit is a suite of tools for creating, deploying and optimizing AI agents. Its first component, Agent Builder, lets users build workflows visually, in the manner of n8n or Zapier: the user connects “bricks” by drag and drop. The interface is meant to offer fast, integrated debugging, and a preview function makes it possible to check the result immediately. AgentKit also comes with tools such as MCP connectors and guardrail modules to secure and supervise the agent.

The second component of AgentKit, ChatKit, embeds customizable chat interfaces in applications and websites. Finally, Connector Registry centralizes the management of the data used in agent workflows.

Using Agent Builder to read emails

To test AgentKit, we create a workflow in Agent Builder. The agent must understand natural-language queries, use a connector to interact with the Gmail API, search for matching messages and present the results to the user.

The first step is to get access to Agent Builder, which is in beta and being rolled out gradually. A standard ChatGPT account is not enough: you need an account on the OpenAI developer platform, with API access and valid billing information.

Since keeping things simple is recommended, we create only two nodes for this workflow. We click on “Create”, in the middle of the interface, and drag the “Start” node from the left-hand panel into the workflow. In the control panel on the right, we choose the input_as_text variable.

We connect this node to an “Agent” module. We name it “Gmail Search Agent”.

In the agent’s instructions, we ask it to turn the raw data returned by Gmail into readable text.
Our prompt, translated here from the French original:

"You are an intelligent Gmail assistant. You have access to the "search_emails" tool to search Gmail. When the user asks you to search for emails:
 1. Convert the request into a Gmail query
 2. Call search_emails with ONLY query and max_results
 3. Do NOT send the tag and next_page_token fields
 4. Present the results clearly
 GMAIL SYNTAX:
 - from:name (sender)
 - after:YYYY/MM/DD (date)
 - newer_than:Xd (X days)
 - is:unread (unread)
 - in:inbox (inbox)
 EXAMPLE CALLS:
 "latest email" → search_emails(query="in:inbox", max_results=1)
 "emails from bruno" → search_emails(query="from:bruno", max_results=10)
 "today's emails" → search_emails(query="after:2024/10/13", max_results=10)
 IMPORTANT:
 - Use ONLY the query and max_results parameters
 - Do NOT include tag or next_page_token
 RESULT PRESENTATION:
 **Email X**
 • Sender: (name) <(email)>
 • Subject: (subject)
 • Date: (date)
 • Excerpt: (excerpt)
 Be concise and professional."
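
To make the rules in this prompt concrete, here is a minimal JavaScript sketch of what the agent is being asked to do: build a Gmail search string from a few filters, and pass on only the query and max_results parameters. The function names are hypothetical and are not part of AgentKit or the Gmail connector. The sketch also computes the date for the after: operator rather than hard-coding it, as the after:2024/10/13 example does.

```javascript
// Hypothetical helpers for illustration; none of these names come from AgentKit.

// Format a Date as YYYY/MM/DD, the form used with Gmail's after: operator.
function formatGmailDate(date) {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, "0");
  const day = String(date.getDate()).padStart(2, "0");
  return `${year}/${month}/${day}`;
}

// Build a Gmail search string from a few optional filters.
function buildGmailQuery({ from, after, newerThanDays, unread, inbox } = {}) {
  const parts = [];
  if (inbox) parts.push("in:inbox");
  if (from) parts.push(`from:${from}`);
  if (after) parts.push(`after:${formatGmailDate(after)}`);
  if (newerThanDays) parts.push(`newer_than:${newerThanDays}d`);
  if (unread) parts.push("is:unread");
  return parts.join(" ");
}

// Keep only the two parameters the prompt allows; tag and
// next_page_token (and anything else) are dropped.
function sanitizeSearchArgs(args) {
  const { query, max_results } = args;
  return { query, max_results };
}

console.log(buildGmailQuery({ from: "bruno" })); // from:bruno
console.log(buildGmailQuery({ inbox: true, unread: true, newerThanDays: 3 })); // in:inbox newer_than:3d is:unread
console.log(sanitizeSearchArgs({ query: "in:inbox", max_results: 1, tag: "x", next_page_token: null }));
```

In the workflow itself the model performs this conversion from the instructions alone; the sketch only spells out the expected result.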

In the “Tools” section of the node, we choose “Gmail”. This connector gives the agent access to our emails. It needs an access token to authenticate, so take care to enter it correctly. To obtain one, we go to Google’s OAuth 2.0 Playground page. At the bottom, in the “Input your own scopes” field, we type: https://www.googleapis.com/auth/gmail.readonly. We click on “Authorize APIs” at the bottom left. After granting the authorizations, we go to “Exchange authorization code for tokens”, where the access token is displayed.
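
Before pasting the token into Agent Builder, it can be checked directly against the Gmail REST API. The sketch below is a minimal example, assuming Node 18 or later (for the global fetch) and a token stored in a hypothetical GMAIL_ACCESS_TOKEN environment variable. It calls the messages list endpoint, which accepts the same search syntax through its q parameter.

```javascript
// Build the Gmail REST URL that lists messages matching a search query.
function buildGmailListUrl(query, maxResults = 10) {
  const url = new URL("https://gmail.googleapis.com/gmail/v1/users/me/messages");
  url.searchParams.set("q", query);
  url.searchParams.set("maxResults", String(maxResults));
  return url.toString();
}

// List matching messages using a read-only OAuth access token.
async function listMessages(accessToken, query, maxResults = 10) {
  const response = await fetch(buildGmailListUrl(query, maxResults), {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!response.ok) {
    // A 401 here usually means the token is wrong or has expired.
    throw new Error(`Gmail API returned ${response.status}`);
  }
  const body = await response.json();
  return body.messages ?? []; // each entry carries an id and a threadId
}

// Usage (needs a real token, so it is left commented out):
// listMessages(process.env.GMAIL_ACCESS_TOKEN, "in:inbox", 3).then(console.log);
```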

In the Gmail module, we check the options we need, which include read_email and search_emails.

That’s it, the workflow is ready!

To test our scenario, we go to “Preview”, at the top right of the interface. Access to this feature requires a fairly involved authentication step. Once this is done, we ask the agent in the chat to find the last three emails received around a given time of day. The agent analyzes the request and calls the Gmail tool.

The answer is correct, and the emails returned are presented in a structured way. The agent even offers to open one of them, and does so when asked. It also finds the links inside the email and displays them in the chat. It can draft a reply as well, although the draft stays in the chat rather than being placed in the email itself.

With the test successful, we publish the workflow. Note that publishing does not make the workflow public. We can continue to edit it in draft mode and publish new versions afterwards.

A workflow to deploy

To deploy the agent, OpenAI offers two main approaches. The Agents SDK is aimed primarily at developers: the principle is to download the agent’s code and build on it yourself.

Somewhat simpler to set up, but still quite technical, ChatKit lets you embed a chat widget in a website. Although the tool is still unstable, we try to install and configure a development environment anyway. This means creating a backend server and a React frontend, making the two communicate, and loading the ChatKit component.

We first scaffold a skeleton website with a tool called “Vite”, which generates the website project for us. We then install the project’s basic dependencies. A “my-web-agent” folder on our computer now contains several subfolders (“src”, “public”…). This is the basic structure of our site.

Next, we set up a small server whose purpose is to protect our API key by keeping it on the backend. In the terminal, from the `my-web-agent` folder, we install the tools our server needs. We then create the file that will contain our server code, and a .env file in the same `my-web-agent` folder, where we put our OpenAI API key: OPENAI_API_KEY=sk-xxxx. After that, still in the terminal, we install ChatKit for React in order to set up the frontend.
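
As an illustration of what such a server does, here is a minimal sketch that uses only Node’s built-in modules. It rests on several assumptions that should be checked against OpenAI’s current ChatKit documentation: the session endpoint and beta header shown below, a hypothetical WORKFLOW_ID environment variable holding the published workflow’s ID, a hypothetical /api/chatkit/session route, and Node 18 or later for the global fetch. It is not the code from our test.

```javascript
// server.js sketch. The endpoint, header and route names are assumptions.

// Describe the request that asks OpenAI for a ChatKit session.
function buildSessionRequest(apiKey, workflowId, userId) {
  return {
    url: "https://api.openai.com/v1/chatkit/sessions",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
        "OpenAI-Beta": "chatkit_beta=v1", // assumed beta header
      },
      body: JSON.stringify({ workflow: { id: workflowId }, user: userId }),
    },
  };
}

// Create a session and return only its short-lived client secret.
async function createClientSecret(apiKey, workflowId, userId) {
  const { url, options } = buildSessionRequest(apiKey, workflowId, userId);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Session request failed: ${response.status}`);
  const session = await response.json();
  return session.client_secret;
}

// Minimal HTTP server: the browser calls POST /api/chatkit/session,
// and the API key never leaves this process.
async function startServer(port = 3001) {
  const http = await import("node:http");
  const server = http.createServer(async (req, res) => {
    if (req.method !== "POST" || req.url !== "/api/chatkit/session") {
      res.writeHead(404).end();
      return;
    }
    try {
      const secret = await createClientSecret(
        process.env.OPENAI_API_KEY,
        process.env.WORKFLOW_ID,
        "demo-user" // hypothetical user identifier
      );
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ client_secret: secret }));
    } catch (error) {
      res.writeHead(500, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: error.message }));
    }
  });
  server.listen(port, () => console.log(`Backend listening on port ${port}`));
  return server;
}

// startServer(); // uncomment to start the server when running the file
```

The sketch assumes the variables from the .env file are already loaded into the environment, for example with the dotenv package or Node’s `--env-file` flag. If the frontend is served from a different port, a development proxy or CORS headers would also be needed.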

We now launch the server and the website. In the terminal, from our `my-web-agent` folder, we run the command `node server.js`. A session is created.

In a second terminal, still in the `my-web-agent` folder, we run the command `npm run dev`. A localhost address appears.
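
On the frontend side, communicating with the backend comes down to requesting a client secret and handing it to the chat component. The sketch below shows only that request, against a hypothetical /api/chatkit/session route that is assumed to return a JSON object with a client_secret field. The exact hook and option names that ChatKit for React expects are not shown and should be taken from its documentation.

```javascript
// Ask the backend for a ChatKit client secret. fetchImpl is injectable so
// the function can be exercised without a running server.
async function getClientSecret(endpoint = "/api/chatkit/session", fetchImpl = fetch) {
  const response = await fetchImpl(endpoint, { method: "POST" });
  if (!response.ok) {
    throw new Error(`Backend returned ${response.status}`);
  }
  const { client_secret } = await response.json();
  if (!client_secret) {
    throw new Error("Backend response has no client_secret");
  }
  return client_secret;
}

// Dry run with a fake fetch standing in for the backend.
const fakeFetch = async () => ({
  ok: true,
  status: 200,
  json: async () => ({ client_secret: "secret-for-demo" }),
});
getClientSecret("/api/chatkit/session", fakeFetch).then((secret) => console.log(secret));
```

If the chat fails to load, calling this function on its own is one way to tell whether the problem lies in the backend route or in the widget.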

Opening the localhost address displays a “My AI agent” message. Unfortunately, the page does not yet give us access to our workflow’s chat, possibly because AgentKit is still unstable. For now, the best way to test and use the agent is to work in Agent Builder directly, until ChatKit is fully documented and accessible.

AgentKit turns out to be a solid and promising SDK for building AI agents directly within OpenAI’s platform. Agent Builder itself works remarkably well. Putting an agent into production, however, remains a real ordeal: ChatKit’s incomplete documentation and the deployment problems reported by many users on GitHub make this step tedious and frustrating. This is to be expected of such a young SDK, especially as the OpenAI teams are working on several fronts at once. The tool should improve considerably in the coming months.

Finally, it is worth remembering that AgentKit does not aim to replace n8n or Zapier. Its main goal is rather to democratize the creation of production AI agents for medium-scale use cases, by offering a more accessible experience than developing by hand against raw APIs.

Jake Thompson
Growing up in Seattle, I've always been intrigued by the ever-evolving digital landscape and its impacts on our world. With a background in computer science and business from MIT, I've spent the last decade working with tech companies and writing about technological advancements. I'm passionate about uncovering how innovation and digitalization are reshaping industries, and I feel privileged to share these insights through MeshedSociety.com.
