Gemini vs ChatGPT: which is better for image generation?

With typographic precision, rendering speed and cost control on offer, both platforms promise to transform the visual workflows of organizations.

For a long time, Midjourney reigned supreme over AI image creation. But that was without reckoning with OpenAI and Google, which shook up the market by integrating their image generation models directly into their ChatGPT and Gemini chatbots. In this article, we set out to give you an overview of how GPT-4o Image Generation (OpenAI) and Imagen 4 (Google) behave in a wide range of situations, through very concrete examples. The prompts used are shown in quotation marks.

First request: a macro photo

"Realistic macro photo of a new electronic chip on a motherboard, with a soft bokeh in the background and precise studio lighting."

For close-ups, each AI has its own interpretation of the request. But overall, the results are entirely convincing in both cases, and we would struggle to choose between them!

An advertising image created from scratch

"Advertising image of an ice-cold energy drink bottle on a metallic surface, with water droplets and dynamic lighting to emphasize freshness. Sports setting in a blurred background."

The best is the enemy of the good, and Google's AI proves it here by putting a basketball in the hands of the… footballer. ChatGPT's visual, for its part, is a little more restrained but responds perfectly to the request.

A photorealistic portrait

"Photorealistic portrait of a smiling professional of about 35, representing diversity, working on a laptop in a bright, modern office. Natural lighting."

No hiccups on either side here, even if we have a clear preference for ChatGPT's visual, which could easily be mistaken for a real photo.

A “flat” conceptual representation

"Conceptual representation of a minimalist user interface for a project management application, on a tablet screen, with intuitive icons and a soft color palette. Flat design style."

While both AIs broadly follow the instructions, the image generated by ChatGPT is immediately usable, whereas Gemini offers a visual that is admittedly more stylized but seems less ready to use in some cases.

An infographic

"Simplified infographic illustrating the stages of a recycling process, from waste sorting to the finished product. Vector style, bright colors and clear icons."

Infographic generation is undoubtedly one of the uses likely to concern the most users within a company.

And at this game, ChatGPT is much more reliable than its competitor.

So much for this prompt battle. Overall, even if Gemini acquits itself honorably, ChatGPT often does better, staying more faithful to the user's request and making fewer errors. At present the two are not evenly matched, and the OpenAI chatbot is the clear winner, all the more so in a professional environment.

Image modification with ChatGPT

One of ChatGPT's big strengths is the ability to iterate on an image it has created. There is even a field provided for this purpose in the interface when you click on an image to enlarge it. You simply type your modification requests in this field to directly influence the result. Imagen 4 does not offer this possibility.

1. Cut out an image with ChatGPT

The ability to cut out an image in a few seconds, without any Photoshop skills, is a prospect that should delight the vast majority of employees who work with images. Take the example of this small miniature of the Blue Mosque. You will notice that it is not a simple cut-out, since the object is not on a plain, well-defined background. With ChatGPT, you just have to submit the image and say what you want to cut out, and the AI gets to work.

After a few moments, the cut-out image appears in the conversation without further ado. It can of course be exported as a PNG for immediate use. The tool did not merely cut out the object; it also seems to have improved the sharpness. But it goes further, since it actually recreated the object, almost identically. A few small differences show up on close inspection.

2. Change it

Once the image has been cut out, nothing forces you to stop while things are going so well. You can keep modifying it as you wish so that it best suits your needs. Here, we asked the AI to turn all the walls red. We will let you judge the result.

3. Place it in the decor of your choice

Once you are satisfied with your cut-out image, it is very easy to ask the AI to add it to the landscape of your choice. However, if you are particularly satisfied with the image obtained in the previous step, we advise you to make this addition yourself in Photoshop. Why bother when you already have ChatGPT on hand? If you look carefully at the details of the two objects, you will see that they are not quite the same. The AI recreates the image every time, which regularly leads to small differences.
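For readers who would rather script this step than open Photoshop, here is a minimal Python sketch of the idea behind pasting a cut-out by hand: standard "over" alpha compositing, which copies the opaque pixels of the cut-out unchanged instead of regenerating them. The function names (`blend_over`, `paste_cutout`) and the representation of an image as rows of RGBA tuples are our own illustrative assumptions, and the background is assumed to be fully opaque. In practice you would load and save real files with an imaging library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (R, G, B, A), each value 0-255
Image = List[List[Pixel]]          # rows of pixels (hypothetical in-memory format)


def blend_over(top: Pixel, bottom: Pixel) -> Pixel:
    """Blend one cut-out pixel over one background pixel (background assumed opaque)."""
    alpha = top[3] / 255
    mixed = [
        round(t * alpha + b * (1 - alpha))
        for t, b in zip(top[:3], bottom[:3])
    ]
    return (mixed[0], mixed[1], mixed[2], 255)


def paste_cutout(background: Image, cutout: Image, left: int, top: int) -> Image:
    """Return a copy of the background with the cut-out pasted at (left, top)."""
    result = [list(row) for row in background]  # leave the original untouched
    for y, row in enumerate(cutout):
        for x, pixel in enumerate(row):
            by, bx = top + y, left + x
            # Ignore any part of the cut-out that falls outside the background.
            if 0 <= by < len(result) and 0 <= bx < len(result[by]):
                result[by][bx] = blend_over(pixel, result[by][bx])
    return result


# Tiny demo: a 3x2 blue background and a 2x1 cut-out (one red pixel, one transparent).
blue_background = [[(0, 0, 255, 255)] * 3 for _ in range(2)]
small_cutout = [[(200, 10, 10, 255), (0, 0, 0, 0)]]
composited = paste_cutout(blue_background, small_cutout, left=1, top=0)
```

Fully opaque pixels of the cut-out land in the result exactly as they were, and fully transparent ones leave the background visible, which is why the object keeps every detail from the previous step.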

4. Modify a reference photo

Here is a particularly interesting function for those with absolutely no photographic talent. You just need to upload your own photo to ChatGPT and ask it to improve it. In a professional setting, this can be even more practical, since it is possible, for example, to make a scouting session more effective by asking the AI to include or modify elements.

Humans remain a problem

If all this looks like magic, there are still limits, which can be more or less annoying depending on what you want to do. For instance, the AI still has a hard time integrating a real photo of a human.

It must be understood that in every case, the system does not really cut out a given element. It recreates it completely. And while the illusion is often perfect for objects or settings, it is less convincing for humans.

Jake Thompson
Growing up in Seattle, I've always been intrigued by the ever-evolving digital landscape and its impacts on our world. With a background in computer science and business from MIT, I've spent the last decade working with tech companies and writing about technological advancements. I'm passionate about uncovering how innovation and digitalization are reshaping industries, and I feel privileged to share these insights through MeshedSociety.com.