
Language Models Defy Logic: Large NLP models struggle with logical reasoning.

Who would disagree that, if all people are mortal and Socrates is a person, Socrates must be mortal? GPT-3, for one. Recent work shows that bigger language models are not necessarily better when it comes to logical reasoning.
Alon Halevy next to a big computer screen

Alon Halevy - Your Personal Data Timeline: Data timelines will protect your privacy and make AI better.

The important question of how companies and organizations use our data has received a lot of attention in the technology and policy communities. An equally important question that deserves more focus in 2023 is how...
Diagram explaining Atlas, a retrieval-augmented language model that exhibits strong few-shot performance on knowledge tasks

Memorize Less; Retrieve More: How small language models can perform specialized tasks.

Large language models are trained only to predict the next word based on previous ones. Yet, given a modest fine-tuning set, they acquire enough information to learn how to perform tasks such as answering questions.
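The retrieval idea can be illustrated with a toy retriever. Here, word-overlap scoring is a simplified stand-in for the learned dense retriever a system like Atlas uses; all names and data are illustrative:

```python
def retrieve(query, passages, k=1):
    """Toy retriever: rank passages by word overlap with the query.

    A real retrieval-augmented model scores passages with learned
    dense embeddings rather than surface-word overlap.
    """
    q = set(query.lower().split())
    return sorted(passages, key=lambda p: len(q & set(p.lower().split())), reverse=True)[:k]

passages = [
    "The capital of France is Paris.",
    "Water boils at 100 degrees Celsius.",
]
top = retrieve("What is the capital of France?", passages)
```

The retrieved passages are then fed to the language model alongside the question, letting a smaller model look facts up instead of memorizing them.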
Image of body-part names in Hokkien, a map of Hokkien-speaking regions worldwide, and the S2ST model architecture

Translating a Mostly Oral Language: How Meta Trained an NLP Model to Translate Hokkien

Most speech-to-speech translation systems use text as an intermediate mode. So how do you build an automated translator for a language that has no standard written form? A new approach trained neural networks to translate a primarily oral language.
Technical components of No Language Left Behind and how they fit together

Massively Multilingual Translation: NLP Model Translates 200 Different Languages

Sentence pairs that have equivalent meanings in different languages — typically used to train machine translation systems — have been available in sufficient quantities for only around 100 languages. New work doubled that number and produced a more capable model.
Illustration of the Dialogue Transformer Language Model (DLM)

The Sound of Conversation: AI Learns to Mimic Conversational Pauses and Interruptions

In spoken conversation, people naturally take turns amid interjections and other patterns that aren’t strictly verbal. A new approach generated natural-sounding audio dialogs without training on text transcriptions that mark when one party should stop speaking and the other should chime in.
Panda on a swing

Text to Video Without Text-Video Training Data: Make-A-Video, an AI System from Meta, Generates Video from Text

Text-to-image generators like DALL·E 2, Midjourney, and Stable Diffusion are winning art contests and worrying artists. A new approach brings the magic of text-to-image generation to video.
Animation showing 3 main types of data augmentation and random cropping of a picture

Cookbook for Vision Transformers: A Formula for Training Vision Transformers

Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but procedures for training them are still tailored for CNNs. New research investigated how various training ingredients affect ViT performance.
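One such training ingredient, random cropping, can be sketched in a few lines. This is a toy pure-Python version for illustration; production pipelines use library implementations:

```python
import random

def random_crop(image, size):
    """Return a random size-by-size patch from an image given as a list of rows."""
    h, w = len(image), len(image[0])
    top = random.randint(0, h - size)
    left = random.randint(0, w - size)
    return [row[left:left + size] for row in image[top:top + size]]

# A 4x4 "image" of pixel values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patch = random_crop(image, 2)
```

Each call yields a different patch, so the model never sees exactly the same input twice.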
Animated graphs showing how an ensemble of fine-tuned models can provide better performance.

Ensemble Models Simplified: New Machine Learning Research Simplifies Ensembles

A CLIP model whose weights were the mean of an ensemble of fine-tuned models performed as well as the ensemble and better than its best-performing constituent.
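The weight-averaging trick is simple to sketch if each fine-tuned model is treated as a dictionary of parameters (a minimal illustration, not the paper's implementation):

```python
def average_weights(models):
    """Average parameter dicts from several fine-tuned models into one model."""
    n = len(models)
    return {k: sum(m[k] for m in models) / n for k in models[0]}

# Two toy "models" that share the same architecture (same parameter names).
soup = average_weights([{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}])
```

Unlike a conventional ensemble, the averaged model costs no more to run at inference than any single constituent.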
Two randomly cropped pictures

Tradeoffs for Higher Accuracy: Data Augmentation Plus Weight Decay Can Boost Some AI Models

Vision models can be improved by training them on several altered versions of the same image and also by encouraging their weights to be close to zero. Recent research showed that both can have adverse effects that may be difficult to detect.
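Weight decay, the second of those techniques, amounts to an extra term in each gradient update that pulls weights toward zero. A minimal SGD sketch (illustrative values, not from the research):

```python
def sgd_step(weights, grads, lr=0.1, weight_decay=0.01):
    """One SGD update; the weight_decay term shrinks each weight toward zero."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

updated = sgd_step([1.0, -2.0], [0.5, 0.5])
```

The decay term is small per step, but compounded over many updates it meaningfully constrains the model, which is why its side effects can be subtle.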
House-for-sale ad

U.S. Acts Against Algorithmic Bias: Meta Removes Bias from its Ad Algorithms

Regulators are forcing Meta (formerly Facebook) to display certain advertisements more evenly across its membership. The United States government compelled Meta to revise its ad-placement system to deliver ads for housing to members regardless of their age, gender, or ethnicity.
Metaverse illustration with Meta AI product names

Meta Decentralizes AI Effort: Meta Restructures its AI Research Teams

The future of Big AI may lie with product-development teams. Meta reorganized its AI division. Henceforth, AI teams will report to departments that develop key products.
Graph of average accuracy versus parameter count, averaged across 14 NLP tasks

GPT-Free: Meta Releases Open Source Large Language Models OPT

Itching to get your hands on a fully trained large language model? The wait is over. Meta introduced the OPT family of transformer-based language models with nearly unfettered access to source code and trained weights.
Deep Symbolic Regression

From Sequences to Symbols: Transformers Extend AI's Mathematical Capabilities

Given a sequence of numbers, neural networks have proven adept at discovering a mathematical expression that generates it. New work uses transformers to extend that success to a further class of expressions.
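The task itself is easy to state in code. This toy brute-force search over hand-written candidates illustrates what the transformer learns to do directly from the sequence (the candidates and names here are invented for illustration):

```python
def find_expression(sequence, candidates):
    """Return the name of the first candidate expression that generates the sequence."""
    for name, f in candidates:
        if all(f(n) == x for n, x in enumerate(sequence)):
            return name
    return None

candidates = [("n^2", lambda n: n * n), ("2n+1", lambda n: 2 * n + 1)]
match = find_expression([1, 3, 5, 7], candidates)  # "2n+1"
```

Where this sketch checks a fixed list, the transformer generates the symbolic expression token by token, which lets it cover a far larger space of formulas.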
AI Research SuperCluster (RSC)

New Supercomputer on the Block: All About Meta's AI Research SuperCluster

Facebook’s parent company is staking its future on a new compute cluster. Meta unveiled AI Research SuperCluster (RSC), which is designed to accelerate training of large models for applications like computer vision, natural language processing, and speech recognition.
