Illustration of a robot with a captain costume

Neural Networks: Find the Function — A Basic Introduction to Neural Networks

Let’s get this out of the way: A brain is not a cluster of graphics processing units, and if it were, it would run software far more complex than the typical artificial neural network. Yet neural networks were inspired by the brain’s architecture.
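The headline's point: a neural network is just a function with adjustable parameters, and training searches for parameter values that make the function fit the data. A minimal sketch in NumPy (the layer sizes and random weights below are arbitrary, for illustration only):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# A tiny two-layer network: y = W2 @ relu(W1 @ x + b1) + b2.
# Training would adjust W1, b1, W2, b2 until the function fits the data;
# here the weights are random, so the output is meaningless but well-defined.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

def network(x):
    return W2 @ relu(W1 @ x + b1) + b2

print(network(np.array([0.5, -1.0, 2.0])))  # 3 inputs in, 1 output out
```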
Gato’s performance on simulated control tasks | Image captions generated by Gato

One Model, Hundreds of Tasks: Multimodal Transformer Performs Over 600 Different Tasks

Researchers took a step toward achieving a longstanding goal: one model that performs a whole lot of very different tasks. Scott Reed, Konrad Żołna, Emilio Parisotto, and a team at DeepMind announced Gato.
Graph Average across 14 NLP Tasks parameters versus Average Accuracy

GPT-Free: Meta Releases Open Source Large Language Models OPT

Itching to get your hands on a fully trained large language model? The wait is over. Meta introduced the OPT family of transformer-based language models with nearly unfettered access to source code and trained weights.
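For instance, the released checkpoints can be loaded through Hugging Face Transformers. A minimal sketch, assuming the facebook/opt-1.3b checkpoint (one of several released sizes) is available on the Hugging Face hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the released weights; larger OPT variants use the same interface.
tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

prompt = "Large language models are"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```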
GLaM model architecture

Efficiency Experts: Mixture of Experts Makes Language Models More Efficient

The emerging generation of trillion-parameter language models takes significant computation to train. Activating only a portion of the network at a time can cut the requirement dramatically and still achieve exceptional results.
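The mechanism is a mixture-of-experts layer: a small gating network scores a pool of expert sub-networks and sends each token to only the highest-scoring expert(s), so most parameters stay idle on any given token. A toy NumPy sketch of top-1 routing (GLaM itself activates the top two experts per token; every name and shape here is illustrative):

```python
import numpy as np

def moe_layer(x, experts, gate_weights):
    """Route each token to its single best-scoring expert (top-1 gating).
    Per-token compute stays constant no matter how many experts, and
    therefore parameters, the layer holds."""
    scores = x @ gate_weights            # (tokens, num_experts)
    choice = scores.argmax(axis=-1)      # best expert index per token
    out = np.empty_like(x)
    for i, expert in enumerate(experts):
        mask = choice == i
        out[mask] = expert(x[mask])      # run only the chosen expert
    return out

rng = np.random.default_rng(0)
# Four "experts", each a simple linear map; real MoE experts are
# feed-forward blocks inside a transformer layer.
experts = [lambda h, W=rng.normal(size=(8, 8)): h @ W for _ in range(4)]
tokens = rng.normal(size=(16, 8))
gate_weights = rng.normal(size=(8, 4))
print(moe_layer(tokens, experts, gate_weights).shape)  # (16, 8)
```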
AI-generated portraits

Your Salesbot Connection: How Marketers Use AI to Generate New Leads

Marketers are using fake social media personas — enhanced by AI-generated portraits — to expand their reach without busting their budgets.
Indigenous Knowledge Graph

Native Processing: Intelligent Voices of Wisdom Teaches Native Culture to AI

A group of media and technology experts is working to give AI a better understanding of Indigenous peoples. IVOW is a consultancy that aims to reduce machine learning bias against cultures that are underrepresented in training data by producing knowledge graphs and other resources.
Illustration of how different data split strategies partition the labelled data

Fine-Tune Your Fine-Tuning: New method optimizes training for few-shot NLP models.

Let’s say you have a pretrained language model and a small amount of data for fine-tuning it to answer yes-or-no questions. Should you fine-tune it to classify yes/no or to fill in missing words? Both approaches are viable, and they’re likely to yield different results.
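A minimal sketch of the two formulations, using Hugging Face Transformers with roberta-base as a stand-in checkpoint (the study's exact models, prompts, and data splits differ):

```python
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

name = "roberta-base"  # stand-in for whatever pretrained model you have
tok = AutoTokenizer.from_pretrained(name)

# Option 1: classification. Add a yes/no head and fine-tune it directly.
clf = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
clf_inputs = tok("Is the sky blue?", return_tensors="pt")

# Option 2: fill in missing words. Keep the pretraining objective and
# compare the model's scores for "yes" vs. "no" at a masked position.
mlm = AutoModelForMaskedLM.from_pretrained(name)
mlm_inputs = tok(f"Is the sky blue? Answer: {tok.mask_token}.",
                 return_tensors="pt")
```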
Diagram with info about AlphaCode

Competitive Coder: AI code-writing system can compete alongside humans.

Programming is hard. Programming competitions are harder. Yet transformers proved themselves up to the task.
AI Research SuperCluster (RSC)

New Supercomputer on the Block: All about Meta's AI Research SuperCluster

Facebook’s parent company is staking its future on a new compute cluster. Meta unveiled AI Research SuperCluster (RSC), which is designed to accelerate training of large models for applications like computer vision, natural language processing, and speech recognition.
InstructGPT methods

A Kinder, Gentler Language Model: Inside InstructGPT, OpenAI's GPT-3 successor.

OpenAI unveiled a more reliable successor to its GPT-3 natural language model. InstructGPT is a version of GPT-3 fine-tuned to minimize harmful, untruthful, and biased output. It's available via an application programming interface.
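At the time, InstructGPT-class models were served through OpenAI's completions API. An illustrative call using the then-current openai Python library (the model name is an assumption about which engines belonged to the InstructGPT series, and the client interface has since changed, so check current docs):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; use your own key

response = openai.Completion.create(
    model="text-davinci-002",  # assumed InstructGPT-series model name
    prompt="Explain why the sky is blue to a six-year-old.",
    max_tokens=100,
)
print(response.choices[0].text)
```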
Multimodal deep learning model

AI Versus the Garbage Heap: How Amazon uses AI to cut waste.

Amazon reported long-term success using machine learning to shrink its environmental footprint. The online retailer developed a system that fuses product descriptions, images, and structured data to decide how an item should be packed for shipping.
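Amazon hasn't released the system's code, but the description suggests a late-fusion design: embed each input type separately, concatenate the embeddings, and classify the package type. A hypothetical PyTorch sketch (every module name, dimension, and package class below is invented for illustration):

```python
import torch
import torch.nn as nn

class PackagingClassifier(nn.Module):
    """Hypothetical late-fusion model: text, image, and tabular features
    are embedded upstream, concatenated here, and mapped to a package type."""
    def __init__(self, text_dim=768, img_dim=512, tab_dim=32, n_types=4):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + img_dim + tab_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_types),  # e.g., box, padded mailer, bag, none
        )

    def forward(self, text_emb, img_emb, tab_feats):
        fused = torch.cat([text_emb, img_emb, tab_feats], dim=-1)
        return self.head(fused)

model = PackagingClassifier()
logits = model(torch.randn(2, 768), torch.randn(2, 512), torch.randn(2, 32))
print(logits.shape)  # (2, 4): one score per package type
```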
Schematic of 8-bit optimizers via block-wise dynamic quantization

More Learning With Less Memory: Training large language models using less memory.

Researchers discovered a new way to reduce memory requirements when training large machine learning models. Tim Dettmers and colleagues at the University of Washington released 8-bit optimizers that store gradient statistics as 8-bit values while maintaining the same accuracy.
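The core trick is to keep optimizer state (such as Adam's momentum terms) in small blocks, each stored as 8-bit codes plus one scale factor, instead of 32-bit floats. A toy NumPy version of block-wise quantization (the released implementation uses a more elaborate dynamic code, so treat this as the idea, not the method):

```python
import numpy as np

def quantize_blockwise(x, block=64):
    """Split a tensor into blocks; store int8 codes plus one scale per block.
    Small blocks keep outliers from degrading the precision of neighbors."""
    x = x.reshape(-1, block)
    scale = np.maximum(np.abs(x).max(axis=1, keepdims=True), 1e-12) / 127.0
    codes = np.round(x / scale).astype(np.int8)
    return codes, scale

def dequantize_blockwise(codes, scale):
    return (codes.astype(np.float32) * scale).ravel()

state = np.random.randn(1024).astype(np.float32)  # stand-in optimizer state
codes, scale = quantize_blockwise(state)
error = np.abs(dequantize_blockwise(codes, scale) - state).max()
print(f"max round-trip error: {error:.4f}")  # small relative to the values
```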
Yoav Shoham

Yoav Shoham: Language models that reason

I believe that natural language processing in 2022 will re-embrace symbolic reasoning, harmonizing it with the statistical operation of modern neural networks. Let me explain what I mean by this.
Abeba Birhane

Abeba Birhane: Clean up web datasets

From language to vision models, deep neural networks are marked by improved performance, higher efficiency, and better generalization. Yet these systems also perpetuate bias and injustice.
A living room made out of cups of coffee: the people, the seats, the chimney, the lamp, all gather around a cozy fire.

One Architecture to Do Them All: Transformer, the AI architecture that can do it all.

The transformer architecture extended its reach to a variety of new domains.
What happened: Originally developed for natural language processing, transformers are becoming the Swiss Army Knife of deep learning.
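The block that travels so well is self-attention, which treats any input (words, image patches, audio frames) as a sequence of vectors and lets every position weigh every other. A minimal single-head sketch in NumPy:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one head.
    X holds one vector per position; the same code runs whether those
    positions are word tokens, image patches, or audio frames."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                        # 5 positions, 16 dims
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (5, 16)
```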
