Creative engineers are combining deep learning systems to produce a groundswell of generated imagery.
What’s new: Researchers, hackers, and artists are producing new works by pairing CLIP, a pretrained image classifier, with a generative adversarial network (GAN). UC Berkeley researcher Charlie Snell captured the ferment in a blog post.
How it works: Users typically give CLIP a text list of the classes they want to recognize; given an image, it returns the most likely class in the list. Digital artists, on the other hand, feed CLIP a verbal description of an image they want to produce and use its ability to match text with images to guide a GAN.
- The community has developed a set of Google Colab notebooks that link CLIP with various GANs. A user types a phrase, sets some parameters, and chooses which GAN to use for image generation.
- Once the GAN has generated an image, CLIP scores it based on how closely it matches the original phrase. The Colab code then iteratively adjusts the GAN's latent input (the vector from which it generates an image, not its hyperparameters or trained weights) so its output earns a higher score from CLIP. It repeats the cycle of generation and adjustment until CLIP’s score exceeds a threshold set by the user.
- Different GANs yield images with different visual characteristics. For instance, pairing CLIP with BigGAN produces output that tends to look like an impressionist painting. Pairing CLIP with VQGAN produces more abstract images with a cubist look.
- Appending a phrase like “rendered in Unreal Engine” (a reference to a popular video game engine) to the prompt can drastically improve the quality of the generated output.
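The generate, score, and adjust cycle described above can be sketched in a few lines of Python. This is a toy illustration, not the code in the community notebooks: `toy_generator` and `toy_clip_score` are hypothetical stand-ins for a real GAN and for CLIP, the vectors are random numbers rather than images or text embeddings, and the sketch assumes that what gets adjusted is the latent vector fed to the generator. Real implementations would typically compute gradients by backpropagation rather than the finite differences used here.

```python
import math
import random


def toy_generator(latent, weights):
    """Hypothetical stand-in for a GAN: maps a latent vector to 'image features'."""
    return [math.tanh(sum(w * z for w, z in zip(row, latent))) for row in weights]


def toy_clip_score(features, text_embedding):
    """Hypothetical stand-in for CLIP: cosine similarity of image and text vectors."""
    dot = sum(f * t for f, t in zip(features, text_embedding))
    norm = math.sqrt(sum(f * f for f in features)) * math.sqrt(
        sum(t * t for t in text_embedding)
    )
    return dot / norm if norm else 0.0


def guide(text_embedding, weights, latent, threshold=0.95, max_steps=200, lr=0.5, eps=1e-4):
    """Nudge the latent vector uphill on the score until the threshold or step limit."""
    score = toy_clip_score(toy_generator(latent, weights), text_embedding)
    history = [score]
    for _ in range(max_steps):
        if score >= threshold:
            break
        # Finite-difference estimate of d(score)/d(latent), one coordinate at a time.
        grad = []
        for i in range(len(latent)):
            bumped = latent[:i] + [latent[i] + eps] + latent[i + 1:]
            bumped_score = toy_clip_score(toy_generator(bumped, weights), text_embedding)
            grad.append((bumped_score - score) / eps)
        candidate = [z + lr * g for z, g in zip(latent, grad)]
        candidate_score = toy_clip_score(toy_generator(candidate, weights), text_embedding)
        if candidate_score > score:
            latent, score = candidate, candidate_score  # keep the improvement
            history.append(score)
        else:
            lr *= 0.5  # step was too large; try a smaller one next time
    return latent, history


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
text_embedding = [rng.uniform(-1, 1) for _ in range(4)]
start = [rng.gauss(0, 1) for _ in range(8)]
final_latent, history = guide(text_embedding, weights, start)
print(f"score: {history[0]:.3f} -> {history[-1]:.3f} in {len(history) - 1} accepted steps")
```

Only the latent vector changes between iterations; the generator's weights stay fixed, which mirrors the idea of steering a pretrained GAN rather than retraining it. The `threshold` and `max_steps` arguments play the role of the user-set stopping criteria.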
Behind the news: OpenAI has its own image generator, DALL·E. Reportedly, its output is less abstract and fanciful.
Why it matters: CLIP was built to classify, not co-create, while GANs were developed to produce variations on familiar images. The boomlet in generated art shows how the creative impulse can unlock potential that engineers may not have imagined.
We’re thinking: It’s great to see human artists collaborating with neural networks. It’s even better to see neural networks collaborating with one another!