Ground truth video of a road on the left and predicted video with MaskViT on the right
Generative Modeling

Seeing What Comes Next: Transformers predict future video frames.

If a robot can predict what it’s likely to see next, it may have a better basis for choosing an appropriate action — but it has to predict quickly. Transformers, for all their utility in computer vision, aren’t well suited to this because of their steep computational and memory requirements...
Different screenshots of Create with Alexa feature displayed on a tablet

How Alexa Says Goodnight: Amazon Echo uses generative AI to create bedtime stories.

Too exhausted (or unimaginative) to tell your child a bedtime story? Amazon’s smart displays can spin bespoke tales on demand. A feature called Create with Alexa generates children’s stories complete with illustrations, music, and sound effects on the Amazon Echo Show device.
List of ChatGPT's examples, capabilities and limitations

More Plausible Text, Familiar Failings: ChatGPT hasn’t overcome the weaknesses of other large language models

Members of the AI community tested the limits of the ChatGPT chatbot, unleashing an avalanche of tweets that made for sometimes-great, sometimes-troubling entertainment.
Illustration of the multiresolution hash encoding in 2D

Novel Views of 3D Scenes — Pronto: Using NeRF Algorithms to Quickly Generate New 3D Views

Given a number of images of the same scene, a neural network can synthesize images from novel vantage points, but it can take hours to train. A new approach cuts training time to a few minutes.
Screen capture of question in DeviantArt about consent of the use of artwork by AI datasets

Creatives Fight Back: Generative AI from DeviantArt Creates Controversy

Artists are rebelling against AI-driven imitation. DeviantArt, an online community and marketplace where artists display and sell digital art, launched DreamUp, a text-to-image generator that aims to help artists thwart attempts to imitate their styles or works.
Network architecture of Reasoner

What the Missing Frames Showed: Machine Learning Describes Masked Video Events

Neural networks can describe in words what’s happening in pictures and videos — but can they make sensible guesses about things that happened before or will happen afterward? Researchers probed this ability.
Different logos from companies like OpenAI, Stability.ai, Jasper and the dollar sign

Generating Investment: Generative AI Startups Raise Hundreds of Millions in Funding

The generative gold rush is on. Venture capitalists are betting hundreds of millions of dollars on startups that use AI to generate images, text, and more, Wired reported.
AI-generated image of Joe Rogan interviewing Steve Jobs

All Synthetic, All the Time: Joe Rogan Meets Steve Jobs in an AI-Generated Podcast

For the debut episode of a new podcast series, Play.ht synthesized a 19-minute interview between the rock-star podcaster and the late Apple CEO.
Example of a video produced from a story-like description

Long-Form Videos from Text Stories: Google's Phenaki Generates Long-Form Video from Text

Only a week ago, researchers unveiled a system that generates a few seconds of video based on a text prompt. New work enables a text-to-video system to produce an entire visual narrative from several sentences of text.
Illustration of the Dialogue Transformer Language Model (DLM)

The Sound of Conversation: AI Learns to Mimic Conversational Pauses and Interruptions

In spoken conversation, people naturally take turns amid interjections and other patterns that aren’t strictly verbal. A new approach generated natural-sounding audio dialogs without training on text transcriptions that mark when one party should stop speaking and the other should chime in.
Panda on a swing

Text to Video Without Text-Video Training Data: Make-A-Video, an AI System from Meta, Generates Video from Text

Text-to-image generators like DALL·E 2, Midjourney, and Stable Diffusion are winning art contests and worrying artists. A new approach brings the magic of text-to-image generation to video.
Captures from PromptBase

Prompting DALL·E for Fun and Profit: A marketplace for phrases that produce art in DALL·E, Midjourney, and Stable Diffusion

An online marketplace enables people to buy text prompts designed to produce consistent output from the new generation of text-to-image generators.
Animated graphs showing how an ensemble of fine-tuned models can provide better performance.

Ensemble Models Simplified: New Machine Learning Research Simplifies Ensembles

A CLIP model whose weights were the mean of an ensemble of fine-tuned models performed as well as the ensemble and better than its best-performing constituent.
Different images generated by DALL·E

Text-to-Image Goes Viral: Inside Craiyon, Formerly Known as DALL-E Mini

A homebrew re-creation of OpenAI’s DALL·E model is the latest internet sensation. Craiyon has been generating around 50,000 user-prompted images daily, thanks to its ability to produce visual mashups like Darth Vader ice fishing and photorealistic Pokemon characters.
Parsing network diagram

Speaking Your Language: Startup Papercup Offers AI-Powered Voice Translation

A startup that automatically translates video voiceovers into different languages is ready for its big break. London-based Papercup offers a voice translation service that combines algorithmic translation and voice synthesis with human-in-the-loop quality control.
