Google claims text-to-image AI delivers “unprecedented photorealism”

Google has presented an artificial intelligence system, Imagen, that can create images based on entered text. The idea is that users type in any descriptive text and the AI turns it into an image. Created by the Brain Team at Google Research, the system offers, according to the company, "an unprecedented degree of photorealism and a deep level of language understanding."

This is not the first time we have seen such AI models. OpenAI's DALL-E (and its successor, DALL-E 2) generated headlines because it can turn text into visual elements. The Google version, however, aims to create more realistic images.

To compare Imagen with other text-to-image models (including DALL-E 2, VQGAN+CLIP, and Latent Diffusion Models), the researchers created a benchmark called DrawBench, a list of 200 text prompts that were fed to each model. Human raters were then asked to evaluate each image. "Human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment," Google says.

It is worth noting that the examples shown on the website are curated. They may therefore be the best of the best images the model has created, and may not accurately reflect most of what it generates.

Like DALL-E, Imagen is not available to the public. Google believes it is not yet suitable for general use, for a number of reasons. First, text-to-image models tend to be trained on large datasets scraped from the web without curation, which introduces a number of problems.

"While this approach has enabled rapid algorithmic advances in recent years, such datasets often reflect social stereotypes, oppressive viewpoints, and derogatory or otherwise harmful associations to marginalized identity groups," the researchers write. "While a subset of our training data was filtered to remove noise and undesirable content, such as pornographic imagery and toxic language, we also utilized the LAION-400M dataset, which is known to contain a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes."

As a result, they say, Imagen has inherited the "social biases and limitations of large language models" and may depict "harmful stereotypes and representations." The team says preliminary findings indicate that the AI encodes social biases, including a tendency to create images of people with lighter skin tones and to place them in stereotyped gender roles. The researchers also note the potential for misuse if Imagen were made available to the public as is.

Still, the team may eventually let the public enter text into a version of the model to generate their own images. "In future work we will explore a framework for responsible externalization that balances the value of external auditing with the risks of unrestricted open-access," the researchers write.

That said, you can try Imagen on a limited basis. On the Imagen website, you can build a description using pre-selected phrases. Users can choose whether the image should be a photo or an oil painting, the type of animal shown, the clothes it wears, the action it takes, and the setting. So if you've ever wanted to see an oil-painting interpretation of a fluffy panda wearing sunglasses and a black leather jacket while skateboarding on the beach, here's your chance.

Google Research

