Indigenous Knowledge Graph

Native Processing: Intelligent Voices of Wisdom Teaches Native Culture to AI

A group of media and technology experts is working to give AI a better understanding of indigenous peoples. IVOW is a consultancy that aims to reduce machine learning bias against cultures that are underrepresented in training data by producing knowledge graphs and other resources.
Illustration of how different data split strategies partition the labelled data

Fine-Tune Your Fine-Tuning: New method optimizes training for few-shot NLP models.

Let’s say you have a pretrained language model and a small amount of data with which to fine-tune it to answer yes-or-no questions. Should you fine-tune it to classify yes/no, or to fill in missing words? Both are viable approaches, and they are likely to yield different results.
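The two formulations cast the same training example differently. A minimal sketch of the contrast (the prompt pattern, separator, and verbalizer below are illustrative assumptions, not the paper's exact setup):

```python
# Two ways to cast a yes/no example for fine-tuning a pretrained LM.

def as_classification(passage: str, question: str) -> dict:
    """Head-based setup: feed the text pair to the encoder and train a
    new classification head to predict a label index (0=no, 1=yes)."""
    return {"input": f"{passage} [SEP] {question}", "num_labels": 2}

def as_cloze(passage: str, question: str) -> dict:
    """Prompt-based setup: rewrite the example as fill-in-the-blank, so
    the model's existing masked-word head answers; a verbalizer maps
    vocabulary words back to labels."""
    return {
        "input": f"{passage} Question: {question} Answer: [MASK].",
        "verbalizer": {"no": 0, "yes": 1},
    }

example = as_cloze("The sky appears blue in daylight.", "Is the sky blue?")
```

The cloze form reuses knowledge the model already has about the words "yes" and "no," while the classification form trains a randomly initialized head from scratch, which is one reason the two can behave differently with little data.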
Diagram with info about AlphaCode

Competitive Coder: AI code-writing system competes alongside humans.

Programming is hard. Programming competitions are harder. Yet transformers proved themselves up to the task.
AI Research SuperCluster (RSC)

New Supercomputer on the Block: All about Meta's AI Research Supercluster.

Facebook’s parent company is staking its future on a new compute cluster. Meta unveiled AI Research SuperCluster (RSC), which is designed to accelerate training of large models for applications like computer vision, natural language processing, and speech recognition.
InstructGPT methods

A Kinder, Gentler Language Model: Inside InstructGPT, OpenAI's GPT-3 successor.

OpenAI unveiled a more reliable successor to its GPT-3 natural language model. InstructGPT is a version of GPT-3 fine-tuned to minimize harmful, untruthful, and biased output. It's available via an application programming interface.
Multimodal deep learning model

AI Versus the Garbage Heap: How Amazon uses AI to cut waste.

Amazon reported long-term success using machine learning to shrink its environmental footprint. The online retailer developed a system that fuses product descriptions, images, and structured data to decide how an item should be packed for shipping.
Schematic of 8-bit optimizers via block-wise dynamic quantization

More Learning With Less Memory: Training large language models using less memory.

Researchers discovered a new way to reduce memory requirements when training large machine learning models. Tim Dettmers and colleagues at the University of Washington released 8-bit optimizers that store gradient statistics as 8-bit values while maintaining the same accuracy.
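The core idea can be sketched in a few lines: split each tensor of optimizer state into blocks, normalize each block by its own maximum, and store the normalized values as 8-bit integers. This toy version uses linear quantization and a tiny block size for clarity; the actual 8-bit optimizers use a nonlinear code and much larger blocks, so treat this as an assumption-laden illustration rather than the paper's method.

```python
import numpy as np

BLOCK = 4  # illustrative; real block-wise quantization uses larger blocks, e.g. 2048

def quantize(x: np.ndarray):
    """Quantize a float tensor to int8 codes plus one scale per block."""
    blocks = x.reshape(-1, BLOCK)
    scale = np.abs(blocks).max(axis=1, keepdims=True)  # per-block normalizer
    scale[scale == 0] = 1.0                            # avoid division by zero
    q = np.round(blocks / scale * 127).astype(np.int8) # map [-1, 1] to [-127, 127]
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover approximate floats from int8 codes and per-block scales."""
    return (q.astype(np.float32) / 127.0) * scale

state = np.array([0.1, -0.5, 0.02, 0.3, 2.0, -1.0, 0.5, 0.25], dtype=np.float32)
q, s = quantize(state)
recovered = dequantize(q, s).reshape(-1)
```

Normalizing per block rather than per tensor keeps outliers from crushing the precision of everything else: a single large gradient statistic only degrades its own block.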
Yoav Shoham

Yoav Shoham: Language models that reason.

I believe that natural language processing in 2022 will re-embrace symbolic reasoning, harmonizing it with the statistical operation of modern neural networks. Let me explain what I mean by this.
Abeba Birhane

Abeba Birhane: Clean up web datasets.

From language to vision models, deep neural networks are marked by improved performance, higher efficiency, and better generalization. Yet these systems also perpetuate bias and injustice.
A living room made out of cups of coffee: the people, the seats, the chimney, the lamp, all gather around a cozy fire.

One Architecture to Do Them All: Transformer, the AI architecture that can do it all.

The transformer architecture extended its reach to a variety of new domains. Originally developed for natural language processing, transformers are becoming the Swiss Army knife of deep learning.
An illustration shows a cozy cabin where all the furniture is made out of coffee mugs.

Transformers Take Over: Transformers Applied to Vision, Language, Video, and More

In 2021, transformers were harnessed to discover drugs, recognize speech, and paint pictures — and much more.
Illustration of a woman riding a sled

Multimodal AI Takes Off: Multimodal models such as CLIP and DALL-E are taking over AI.

While models like GPT-3 and EfficientNet, which work on text and images respectively, are responsible for some of deep learning’s highest-profile successes, approaches that find relationships between text and images made impressive strides.
Illustration of giant Christmas tree in a town plaza

Trillions of Parameters: Are AI models with trillions of parameters the new normal?

The trend toward ever-larger models crossed the threshold from immense to ginormous. Google kicked off 2021 with Switch Transformer, the first published work to exceed a trillion parameters, weighing in at 1.6 trillion.
Two images showing RETRO Architecture and Gopher (280B) vs State of the Art

Large Language Models Shrink: Gopher and RETRO prove lean language models can push boundaries.

DeepMind released three papers that push the boundaries — and examine the issues — of large language models.
A conversation between a human and an open-domain chatbot.

Long-Haul Chatbot: Facebook chatbot is able to carry on long conversations.

Facebook released a chatbot that summarizes dialog on the fly and uses the summary to generate further repartee.
