transformers

112 Posts

Illustration of the Dialogue Transformer Language Model (DLM)

The Sound of Conversation: AI Learns to Mimic Conversational Pauses and Interruptions

In spoken conversation, people naturally take turns amid interjections and other patterns that aren’t strictly verbal. A new approach generated natural-sounding audio dialogs without training on text transcriptions that mark when one party should stop speaking and the other should chime in.
2 min read
Panda on a swing

Text to Video Without Text-Video Data: AI System Make-A-Video Generates Video From Text

Text-to-image generators like DALL·E 2, Midjourney, and Stable Diffusion are winning art contests and worrying artists. A new approach brings the magic of text-to-image generation to video.
2 min read
Animation showing 3 main types of data augmentation and random cropping of a picture

Cookbook for Vision Transformers: A Formula for Training Vision Transformers

Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but procedures for training them are still tailored to CNNs. New research investigated how various training ingredients affect ViT performance.
2 min read
Robot with an arm, camera, and gripper handing over a plastic bottle to a person

Parsing Commands Into Actions: NLP Helps Google Robot Understand Spoken Instructions

A new method enables robots to respond helpfully to verbal commands by pairing a natural language model with a repertoire of existing skills.
2 min read
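How such a pairing can work, sketched loosely in the spirit of Google's SayCan system: a language model scores each existing skill for relevance to the instruction, a value function scores it for feasibility in the current state, and the robot executes the highest-scoring skill. The scoring functions below are hypothetical stand-ins, not the published implementation.

```python
# Hypothetical scoring functions: lm_score(instruction, skill) reflects
# how relevant the language model finds a skill's description, and
# feasibility(skill, state) reflects whether the robot can do it now.
def choose_skill(instruction, state, skills, lm_score, feasibility):
    return max(
        skills,
        key=lambda s: lm_score(instruction, s) * feasibility(s, state),
    )
```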
Different Nvidia cloud-computing services

Chipmaker Boosts AI as a Service: Nvidia Launches Cloud Service for NLP Models

Nvidia, known for chips that accelerate AI workloads, is providing access to large language models. The company announced early access to NeMo LLM and BioNeMo, cloud-computing services that enable developers to generate text and biological sequences, respectively.
2 min read
Information related to Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC)

Update Any Language Model: New Method to Update Pretrained Language Models

The ability to update language models is essential to incorporate new information and correct undesirable behaviors. Previous methods are unwieldy and often fail as the amount of new data increases. New work offers a workaround.
3 min read
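A rough sketch of the retrieval-based editing idea: rather than rewriting the base model's weights, store edits in an external memory, route in-scope queries to a small counterfactual model conditioned on the relevant edit, and leave everything else to the frozen base model. All three components below are hypothetical stand-ins for SERAC's parts.

```python
# in_scope is a scope classifier, counterfactual is a small model
# conditioned on a stored edit, and base_model is the frozen original.
def edited_model(query, edits, in_scope, counterfactual, base_model):
    for edit in edits:                      # edits live in external memory
        if in_scope(query, edit):           # does this edit cover the query?
            return counterfactual(query, edit)
    return base_model(query)                # base model stays untouched
```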
Illustration shows different self-attention mechanisms used by Transformer-based AI models.

Attention to Rows and Columns: Altering Transformers' Self-Attention Mechanism for Greater Efficiency

A new approach alters transformers' self-attention mechanism to balance computational efficiency with performance on vision tasks.
2 min read
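One common way to restrict self-attention to rows and columns, sketched minimally (the paper's exact mechanism may differ): attend along each row of the feature map, then along each column, cutting cost from quadratic in the number of pixels to roughly linear per axis. Shapes and the use of torch.nn.MultiheadAttention are my assumptions.

```python
import torch
import torch.nn as nn

class AxialAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                   # x: (B, H, W, C)
        B, H, W, C = x.shape
        rows = x.reshape(B * H, W, C)       # each row: a sequence of W tokens
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(B, H, W, C)
        cols = x.transpose(1, 2).reshape(B * W, H, C)  # sequences of H tokens
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(B, W, H, C).transpose(1, 2)

x = torch.randn(2, 8, 8, 64)
print(AxialAttention(64)(x).shape)          # torch.Size([2, 8, 8, 64])
```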

Object-Detection Transformers Simplified: New Research Improves Object Detection With Vision Transformers

ViTDet, a new system from Facebook, adds an object detector to a plain pretrained transformer.
2 min read
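The core observation, loosely sketched: a plain ViT emits a single-scale feature map, and the multi-scale pyramid a detector expects can be built from it with simple up- and down-sampling. The strides and operations below are illustrative, not ViTDet's exact choices (the paper uses deconvolution and pooling).

```python
import torch.nn.functional as F

def simple_pyramid(feat):                # feat: (B, C, H, W), the ViT's
    return {                             # single stride-16 feature map
        "1/8":  F.interpolate(feat, scale_factor=2.0, mode="nearest"),
        "1/16": feat,
        "1/32": F.max_pool2d(feat, kernel_size=2),
    }
```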
A flowchart shows how a jury learning method reduces annotator bias in machine learning models.

Choose the Right Annotators

A new machine learning method attempts to account for biases that may be held by certain subsets of labelers.
2 min read
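A toy sketch of the jury-learning idea: instead of predicting a single "ground truth" label, predict each annotator's label and aggregate the votes of a jury whose composition you control. Both predict_label and the annotator pool are hypothetical stand-ins.

```python
import random

def jury_verdict(example, annotator_pool, predict_label, jury_size=12):
    jury = random.sample(annotator_pool, jury_size)     # chosen composition
    votes = [predict_label(example, juror) for juror in jury]
    return max(set(votes), key=votes.count)             # majority vote
```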
BLOOM logo

Large Language Models Unbound

A worldwide collaboration produced the biggest open-source language model to date. BLOOM is a family of language models built by the BigScience Research Workshop, a collective of over 1,000 researchers from 250 institutions around the globe.
2 min read
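BLOOM's weights are openly available, and a common way to try a smaller member of the family is through the Hugging Face transformers library. A minimal sketch; verify model IDs against the hub before relying on them:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("Machine translation is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```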

Humanized Training for Robot Arms: New Research Improves Robot Performance and Adaptability

Robots trained via reinforcement learning usually study videos of robots performing the task at hand. A new approach used videos of humans to pre-train robotic arms.
2 min read
A series of graphs show the carbon emissions associated with training AI models.

Cutting the Carbon Cost of Training: A New Tool Helps AI Developers Lower Their Greenhouse Gas Emissions

You can reduce your model’s carbon emissions by being choosy about when and where you train it.
2 min read
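The arithmetic behind the advice is simple: the same run draws the same energy, but emissions scale with the carbon intensity of the grid supplying it. A back-of-the-envelope sketch with illustrative, made-up numbers:

```python
ENERGY_KWH = 1_500                       # hypothetical training energy
GRID_KG_CO2_PER_KWH = {                  # illustrative regional intensities
    "us-east": 0.42,
    "eu-north": 0.05,
    "asia-southeast": 0.60,
}

for region, intensity in GRID_KG_CO2_PER_KWH.items():
    print(f"{region}: {ENERGY_KWH * intensity:,.0f} kg CO2")
print("lowest-carbon choice:",
      min(GRID_KG_CO2_PER_KWH, key=GRID_KG_CO2_PER_KWH.get))
```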
Different images generated by DALL·E

Text-to-Image Goes Viral

A homebrew re-creation of OpenAI’s DALL·E model is the latest internet sensation. Craiyon has been generating around 50,000 user-prompted images daily, thanks to its ability to produce visual mashups like Darth Vader ice fishing and photorealistic Pokémon characters.
1 min read
Graph of transformer layer counts versus year, highlighting DeepNet

Pile on the Layers!

Adding layers to a neural network puts the “deep” in deep learning, but it also increases the chance that the network will get stuck during training. A new approach effectively trains transformers with an order of magnitude more layers than previous methods.
2 min read
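The key trick in DeepNet is DeepNorm, a residual-scaling rule of the form x = LayerNorm(α·x + sublayer(x)), where α grows with depth to keep updates stable. A minimal sketch; the constant follows the paper's decoder-only setting, so treat exact values as assumptions:

```python
import torch
import torch.nn as nn

class DeepNormBlock(nn.Module):
    def __init__(self, dim, num_layers):
        super().__init__()
        self.alpha = (2 * num_layers) ** 0.25   # (2N)^(1/4) for decoders
        self.norm = nn.LayerNorm(dim)
        self.sublayer = nn.Linear(dim, dim)     # stands in for attention/FFN

    def forward(self, x):
        # Up-weight the residual branch so each sublayer's update stays
        # small relative to the signal flowing through the stack.
        return self.norm(self.alpha * x + self.sublayer(x))
```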

Toward Next-Gen Language Models

A new benchmark aims to raise the bar for large language models. Researchers at 132 institutions worldwide introduced the Beyond the Imitation Game benchmark (BIG-bench), which includes tasks that humans perform well but current state-of-the-art models don’t.
2 min read
