Generative AI

80 Posts

Expressive Synthetic Talking Heads: Microsoft's VASA-1 delivers more lifelike talking-head videos

Previous systems that produce a talking-head video from a photo and a spoken-word audio clip animate the lips and other parts of the face separately.

Mini but Mighty: OpenAI's GPT-4o Mini offers big performance at a small price

A slimmed-down version of OpenAI’s multimodal flagship packs a low-price punch.

Hallucination Detector: Oxford scientists propose effective method to detect AI hallucinations

Large language models can produce output that’s convincing but false. Researchers proposed a way to identify such hallucinations. 

Image Generators in the Arena: Text-to-image generators face off in arena leaderboard by Artificial Analysis

An arena-style contest pits the world’s best text-to-image generators against each other.

How Open Are Open Models?: Radboud University study ranks AI models on openness

The word “open” can mean many things with respect to AI. A new paper outlines the variations and ranks popular models for openness.

Copyright Claim Fails in GitHub Case: Judge dismisses key arguments in AI copyright lawsuit against GitHub, Microsoft, and OpenAI

A judge rejected key claims in a lawsuit by developers against GitHub, Microsoft, and OpenAI, the first decision in a series of court actions related to generative AI.

Like LoRA, But for Pretraining: GaLore, a memory-saving method for pretraining and fine-tuning LLMs

Low-rank adaptation (LoRA) reduces memory requirements when fine-tuning large language models, but it isn’t as conducive to pretraining.

Claude Advances the LLM Interface: Claude 3.5 Sonnet’s Artifacts feature makes it easier to build and code on-site

Claude 3.5 Sonnet lets users work on generated outputs as though they were independent files — a step forward in large language model user interfaces.

Model Merging Evolves: Researchers developed automated system for efficient model merging

The technique of model merging combines separate models into a single, more capable model without further training, but it requires expertise and manual effort. Researchers automated the process.

Conversing With the Departed: Lifelike avatars of deceased loved ones, a new market in video generation

Advances in video generation have spawned a market for lifelike avatars of deceased loved ones.

Chatbot for Minority Languages: Startup Two AI launches SUTRA, a multilingual model for South Asian markets

An AI startup that aims to crack markets in southern Asia launched a multilingual competitor to GPT-4.

For Faster Diffusion, Think a GAN: Adversarial Diffusion Distillation, a method to accelerate diffusion models

Generative adversarial networks (GANs) produce images quickly, but they’re of relatively low quality. Diffusion image generators typically take more time, but they produce higher-quality output. Researchers aimed to achieve the best of both worlds.

From Clip to Composition: Udio expands text-to-music generator, now extends existing recordings

Is your song’s verse in need of a chorus? A popular text-to-music generator can extend existing recordings while maintaining their musical character.

More New Open Models: New models from Nvidia, Alibaba, and Stability AI expand open options

A trio of powerful open and semi-open models gives developers new options for both text and image generation.

Audio Generation Clear of Copyrights: Stability AI releases enhanced text-to-audio generator Stable Audio Open

Sonically minded developers gained a high-profile text-to-audio generator. 
