Graphs comparing SGD + Momentum, Adam and AdaBelief

Striding Toward the Minimum

When you’re training a deep learning model, it can take days for an optimization algorithm to minimize the loss function. A new approach could save time.
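The optimizers in the graphs above differ mainly in how they adapt the step size. As a rough illustration, here is a minimal scalar sketch of an AdaBelief-style update, which works like Adam but scales steps by how far each gradient deviates from its running mean rather than by its raw magnitude. This is a paraphrase of the published update rule, not the authors' code; the function name and hyperparameters are illustrative.

```python
import numpy as np

def adabelief_step(w, grad, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBelief-style update. Like Adam, but the second moment s
    tracks the squared deviation of the gradient from its running mean m,
    so steps grow when gradients match the optimizer's 'belief'."""
    m = beta1 * m + (1 - beta1) * grad                    # running mean of gradients
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps   # deviation from the mean
    m_hat = m / (1 - beta1 ** t)                          # bias correction, as in Adam
    s_hat = s / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(s_hat) + eps)
    return w, m, s

# Toy problem: minimize f(w) = w^2 starting from w = 5.0
w, m, s = 5.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * w
    w, m, s = adabelief_step(w, grad, m, s, t)
```

On this deterministic toy problem the iterate drifts toward the minimum at zero; the interesting behavior claimed for such optimizers shows up on noisy, high-dimensional losses, which a scalar sketch can only hint at.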
AI-driven balloon reaching high altitude

How to Drive a Balloon

Helium balloons that beam internet service to hard-to-serve areas are using AI to navigate amid high-altitude winds. Loon, the Alphabet division that provides wireless internet via polyethylene blimps, uses machine learning to steer them through shifting winds.
Illustration of two witches with half a pumpkin each and the moon behind them

The AI Community Splinters

Will national rivalries fragment international cooperation in machine learning? Countries competing for AI dominance will lash out at competitors.
Illustration of a neighborhood haunted by an evil pumpkin and a black cat

Giant Models Bankrupt Research

What if AI requires so much computation that it becomes unaffordable? The fear: Training ever more capable models will become too pricey for all but the richest corporations and government agencies. Rising costs will shut out smaller research labs.
Information and components of a battery

Getting a Charge From AI

Machine learning is helping to design energy cells that charge faster and last longer. Battery developers are using ML algorithms to devise new chemicals, components, and charging techniques faster than traditional methods allow.
Graphs related to different attention mechanisms

More Efficient Transformers

As transformer networks move to the fore in applications from language to vision, the time it takes them to crunch longer sequences becomes a more pressing issue. A new method lightens the computational load using sparse attention.
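Full self-attention compares every position in a sequence with every other, so its cost grows quadratically with sequence length; sparse attention restricts which pairs are compared. As a generic illustration (not the specific method in the article), here is a NumPy sketch of local-window attention, one common sparsity pattern in which each position attends only to its neighbors:

```python
import numpy as np

def local_attention(q, k, v, window=2):
    """Attention in which each position attends only to positions within
    `window` steps of itself, rather than to the whole sequence."""
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                         # (n, n) similarity scores
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) > window   # True outside the window
    scores[mask] = -np.inf                                # block out-of-window pairs
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # softmax over allowed pairs
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(6, 4)) for _ in range(3))
out = local_attention(q, k, v, window=2)
```

Note that this sketch still materializes the full n-by-n score matrix for clarity; a real sparse-attention kernel computes only the in-window scores, which is where the savings come from.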
Graphs with data related to Microsoft's library DeepSpeed

Toward 1 Trillion Parameters: DeepSpeed PyTorch Library Supports Large AI and NLP Models

An open source library could spawn trillion-parameter neural networks and help small-time developers build big-league models. Microsoft upgraded DeepSpeed, a library that accelerates the PyTorch deep learning framework.
Data and information related to dropout

Dropout With a Difference

The technique known as dropout discourages neural networks from overfitting by deterring them from reliance on particular features. A new approach reorganizes the process to run efficiently on the chips that typically run neural network calculations.
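For reference, the baseline that such variants reorganize is standard "inverted" dropout: during training, each activation is zeroed with some probability and the survivors are rescaled so that inference requires no change. A minimal NumPy sketch (the textbook technique, not the new approach described above):

```python
import numpy as np

def dropout(x, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero a random fraction `rate` of activations
    during training and scale the rest by 1/(1-rate), so the expected
    activation is unchanged and inference can skip this step entirely."""
    if not training or rate == 0.0:
        return x
    rng = rng or np.random.default_rng()
    keep = rng.random(x.shape) >= rate   # Boolean mask of surviving units
    return x * keep / (1.0 - rate)
```

Because each forward pass draws a fresh random mask, the network cannot rely on any single feature being present, which is the regularizing effect the teaser refers to.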
Graphs and data related to transformer networks

The Transformation Continues

Transformer networks are gaining popularity as a high-accuracy alternative to recurrent neural networks. But they can run slowly when they’re applied to long sequences.
Data related to experience replay

Experience Counts

If the world changes every second and you take a picture every 10 seconds, you won’t have enough pictures to observe the changes clearly, and storing a series of pictures won’t help. On the other hand, if you take a picture every tenth of a second, then storing a history will help model the world.
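The mechanism behind this intuition is experience replay: storing past transitions in a buffer and training on random minibatches drawn from it. A minimal sketch of a replay buffer, using only the standard library (an illustration of the general technique, not any particular paper's implementation):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of past (state, action, reward, next_state, done)
    transitions. Sampling random minibatches breaks the correlation
    between consecutive experiences and reuses old data."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)   # oldest transitions fall off

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=1000)
buf.add(state=0, action=1, reward=1.0, next_state=1, done=False)
```

In the picture-taking analogy, the buffer is the stored series of pictures; replay only helps when they are taken often enough to capture how the world changes.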
Graphs and data related to semi-supervised learning

All Examples Are Not Equal

Semi-supervised learning — a set of training techniques that use a small number of labeled examples and a large number of unlabeled examples — typically treats all unlabeled examples the same way. But some examples are more useful for learning than others.
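One common way to treat unlabeled examples unequally is to weight each pseudo-labeled example by the model's confidence, keeping only predictions that clear a threshold. A generic NumPy sketch of that idea (FixMatch-style confidence thresholding, offered as background rather than the specific method in the article; the function name is illustrative):

```python
import numpy as np

def pseudo_label_weights(probs, threshold=0.9):
    """Given per-example class probabilities for unlabeled data, assign
    each example its most likely class as a pseudo-label, and a weight
    of 1.0 if the model's confidence clears the threshold, else 0.0."""
    confidence = probs.max(axis=1)    # top class probability per example
    labels = probs.argmax(axis=1)     # pseudo-label = most likely class
    weights = (confidence >= threshold).astype(float)
    return labels, weights

probs = np.array([[0.95, 0.05],   # confident prediction -> used in training
                  [0.60, 0.40]])  # uncertain prediction -> ignored
labels, weights = pseudo_label_weights(probs)
```

The weights multiply each example's loss term, so uncertain unlabeled examples contribute nothing to training until the model becomes confident about them.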
Excerpt from study about models that learn to predict task-specific distance metrics

Misleading Metrics

A growing body of literature shows that some steps in AI’s forward march may actually move sideways. A new study questions advances in metric learning.
Information related to the Once-for-All (OFA) method

Build Once, Run Anywhere

From server to smartphone, devices with less processing speed and memory require smaller networks. Instead of building and training separate models to run on a variety of hardware, a new approach trains a single network that can be adapted to any device.
Talking bubbles inside talking bubbles

Bigger Is Better

Natural language processing lately has come to resemble an arms race, as the big AI companies build models that encompass ever larger numbers of parameters. Microsoft recently held the record — but not for long.
Hamster running in a hamster ball

Running Fast, Standing Still

Machine learning researchers report better and better results, but some of that progress may be illusory. Some models that appear to set a new state of the art haven’t been compared properly to their predecessors, Science News reports, citing several published surveys.
