Machine Learning Research

331 Posts

Expressive Synthetic Talking Heads: Microsoft's VASA-1 delivers more lifelike talking-head videos

Previous systems that produce a talking-head video from a photo and a spoken-word audio clip animate the lips and other parts of the face separately.

Hallucination Detector: Oxford scientists propose effective method to detect AI hallucinations

Large language models can produce output that’s convincing but false. Researchers proposed a way to identify such hallucinations. 

How Open Are Open Models?: Radboud University study ranks AI models on openness

The word “open” can mean many things with respect to AI. A new paper outlines the variations and ranks popular models for openness.

Efficient Subject Consistency For Stable Diffusion

Published in mid-2022, DreamBooth enables Stable Diffusion to depict variations on a given subject: say, a particular dog and the same dog with angel wings or wearing a chef’s hat.

Like LoRA, But for Pretraining: GaLore, a memory-saving method for pretraining and fine-tuning LLMs

Low-rank adaptation (LoRA) reduces memory requirements when fine-tuning large language models, but it isn’t as conducive to pretraining.

Model Merging Evolves: Researchers developed automated system for efficient model merging

The technique of model merging combines separate models into a single, more capable model without further training, but it requires expertise and manual effort. Researchers automated the process.

24 Hours on an Old Consumer GPU

BERT, a large language model released in 2018 and built upon the then-new transformer architecture, marked a paradigm shift in AI.

Benchmarks for Agentic Behaviors: New LLM benchmarks for Tool Use and Planning in workplace tasks

Tool use and planning are key behaviors in agentic workflows that enable large language models (LLMs) to execute complex sequences of steps. New benchmarks measure these capabilities in common workplace tasks. 

For Faster Diffusion, Think a GAN: Adversarial Diffusion Distillation, a method to accelerate diffusion models

Generative adversarial networks (GANs) produce images quickly, but they’re of relatively low quality. Diffusion image generators typically take more time, but they produce higher-quality output. Researchers aimed to achieve the best of both worlds.

The LLM Will See You Now: AMIE, a chatbot that outperforms doctors in diagnostic conversations

A critical step in diagnosing illnesses is a conversation between doctor and patient to assemble a medical history, discuss approaches to managing symptoms, and so on.

Better Teachers Make Better Students: Microsoft’s Orca 2 strengthens the native reasoning abilities of smaller models

A relatively small student LLM that learns to mimic a larger teacher model can perform nearly as well as the teacher while using much less computation. It can come even closer if the teacher also teaches reasoning techniques.

Richer Context for RAG: RAPTOR, a recursive summarizer, captures more relevant context for LLM inputs

Text excerpts used in retrieval augmented generation (RAG) tend to be short. Researchers used summarization to pack more relevant context into the same amount of text.

Interpreting Image Edit Instructions: Meta’s Emu Edit improves text-to-image generation with task classification

The latest text-to-image generators can alter images in response to a text prompt, but their outputs often don’t accurately reflect the text. They do better if, in addition to a prompt, they’re told the general type of alteration they’re expected to make.

Brain-Controlled Robots Get More Versatile: NOIR, a system to control robots via electroencephalogram for everyday tasks

Brain-to-computer interfaces that enable users to control robots with their thoughts typically execute a single type of task such as reaching and grasping. Researchers designed a system that responds to a variety of intentions.

Streamlined Inference: Deja Vu, a method that boosts LLM speed by activating only essential neural parts

It’s not necessary to activate all parts of a large language model to process a given input. Activating only the necessary parts saves computation.