Contentedge screen video capture
transformers

Winning The Google Game

AI startups are helping writers tailor articles that appear near the top of Google’s search results. At least 14 companies sell access to software that uses GPT-3, the language model from OpenAI, to generate headlines, product descriptions, blog posts, and video scripts.
2 min read
Illustration of a robot in a captain's costume
transformers

Neural Networks: Find the Function

Let’s get this out of the way: A brain is not a cluster of graphics processing units, and if it were, it would run software far more complex than the typical artificial neural network. Yet neural networks were inspired by the brain’s architecture.
3 min read
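
The function-finding idea fits in a few lines of code. Below is a minimal sketch (ours, not from the article): a two-layer network trained by plain gradient descent to approximate y = sin(x).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-np.pi, np.pi, (256, 1))       # training inputs
y = np.sin(x)                                  # the function to find

# A 1 -> 32 -> 1 network with tanh hidden units.
W1 = rng.normal(0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    h = np.tanh(x @ W1 + b1)                   # hidden activations
    pred = h @ W2 + b2                         # network output
    err = pred - y                             # gradient of 0.5*MSE w.r.t. pred (times N)
    # Backpropagation: chain rule through both layers.
    dW2 = h.T @ err / len(x); db2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)
    dW1 = x.T @ dh / len(x); db1 = dh.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print("final MSE:", float((err ** 2).mean()))
```
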
Gato’s performance on simulated control tasks | Image captions generated by Gato
transformers

One Model, Hundreds of Tasks

Researchers took a step toward achieving a longstanding goal: One model that performs a whole lot of very different tasks. Scott Reed, Konrad Żołna, Emilio Parisotto and a team at DeepMind announced Gato.
2 min read
Architecture of CXV
transformers

Upgrade for Vision Transformers

Vision Transformer and models like it use a lot of computation and memory when processing images. New work modifies these architectures to run more efficiently while adopting helpful properties from convolutions.
2 min read
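
Details differ across papers, but a common way to give a vision transformer convolutional properties is to replace its linear patch embedding with an overlapping strided convolution, which buys locality and cheap downsampling. A generic PyTorch sketch (ours; the class name, kernel sizes, and dimensions are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class ConvViTBlock(nn.Module):
    """Strided-conv token embedding feeding a standard transformer layer."""
    def __init__(self, in_ch=3, dim=192, heads=3):
        super().__init__()
        # Overlapping 7x7 stride-4 patches instead of non-overlapping
        # linear patches: convolutional locality plus 4x downsampling.
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=7, stride=4, padding=2)
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True)

    def forward(self, images):                     # images: (B, 3, H, W)
        feats = self.embed(images)                 # (B, dim, H/4, W/4)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H*W/16, dim)
        return self.encoder(tokens)

x = torch.randn(2, 3, 64, 64)
print(ConvViTBlock()(x).shape)                     # torch.Size([2, 256, 192])
```
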
Graph: parameters versus average accuracy across 14 NLP tasks
transformers

GPT-Free

Itching to get your hands on a fully trained large language model? The wait is over. Meta introduced the OPT family of transformer-based language models, offering nearly unfettered access to source code and trained weights.
2 min read
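
In practice, the released checkpoints load in a few lines. A sketch using the Hugging Face hub (we assume the published facebook/opt-1.3b checkpoint; the largest, 175-billion-parameter weights required a separate access request):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

inputs = tokenizer("Open-sourcing large language models", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
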
Shifted Patch Tokenization (SPT) | Locality Self-Attention (LSA)
transformers

Less Data for Vision Transformers

Vision Transformer (ViT) outperformed convolutional neural networks in image classification, but it required more training data. New work enabled ViT and its variants to outperform other architectures with less training data.
2 min read
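
One of the two tweaks, Locality Self-Attention, is simple enough to sketch: mask each token's attention to itself and make the softmax temperature learnable, which sharpens attention over the remaining tokens. A simplified single-head PyTorch rendering (ours, not the authors' code):

```python
import torch
import torch.nn as nn

class LocalitySelfAttention(nn.Module):
    """Self-attention with a diagonal mask and a learnable temperature."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        # Learnable temperature, initialized to the usual sqrt(d).
        self.temperature = nn.Parameter(torch.tensor(dim ** 0.5))

    def forward(self, x):                            # x: (B, N, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / self.temperature
        # Diagonal mask: no token may attend to itself.
        mask = torch.eye(x.shape[1], dtype=torch.bool)
        scores = scores.masked_fill(mask, float("-inf"))
        return torch.softmax(scores, dim=-1) @ v

x = torch.randn(2, 65, 192)                          # 64 patches + class token
print(LocalitySelfAttention(192)(x).shape)           # torch.Size([2, 65, 192])
```
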
GLaM model architecture
transformers

Efficiency Experts

The emerging generation of trillion-parameter language models takes significant computation to train. Activating only a portion of the network at a time can cut the requirement dramatically and still achieve exceptional results.
3 min read
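
The trick behind GLaM-style efficiency is a mixture-of-experts layer: a gating network routes each token to a couple of expert feed-forward networks, so most parameters stay idle on any given token. A toy top-2 sketch (ours; GLaM's actual routing and load balancing are more involved):

```python
import torch
import torch.nn as nn

class Top2MoE(nn.Module):
    """Toy mixture-of-experts layer with top-2 token routing."""
    def __init__(self, dim=64, n_experts=8):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)      # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(n_experts))

    def forward(self, x):                          # x: (tokens, dim)
        weights, idx = self.gate(x).softmax(-1).topk(2, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(2):                      # only the chosen experts run
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(Top2MoE()(tokens).shape)                     # torch.Size([10, 64])
```
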
AI generated images with different descriptions
transformers

More Realistic Pictures From Text

OpenAI’s DALL·E got an upgrade that takes in text descriptions and produces images in styles from hand-drawn to photorealistic. The new version is a rewrite from the ground up. It uses the earlier CLIP zero-shot image classifier to represent text descriptions.
2 min read
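
The first stage, representing a description with CLIP, is easy to try with the openly released checkpoint on the Hugging Face hub (a sketch; OpenAI's production pipeline differs, and the decoder that turns such embeddings into images isn't public):

```python
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

inputs = processor(text=["a hand-drawn fox reading a newspaper"],
                   return_tensors="pt", padding=True)
text_embedding = model.get_text_features(**inputs)  # one 512-dim vector
print(text_embedding.shape)                         # torch.Size([1, 512])
```
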
Jurassic-X's software infrastructure
transformers

Neural Nets + Rules = Truer Text

A new approach aims to cure text generators of their tendency to produce nonsense. AI21 Labs launched Jurassic-X, a natural language processing system that combines neural networks and rule-based programs.
2 min read
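
AI21 hasn't released Jurassic-X's code, but the neuro-symbolic recipe can be caricatured in a few lines: rule-based modules take the questions they can answer exactly, and the neural generator handles everything else. A hypothetical sketch (all function names and routing rules invented for illustration):

```python
import re

def calculator(question: str) -> str:
    a, op, b = re.search(r"(\d+)\s*([+\-*/])\s*(\d+)", question).groups()
    return str(eval(f"{a}{op}{b}"))      # exact arithmetic, never hallucinated

def neural_lm(question: str) -> str:
    return "<free-form answer from a language model>"  # stand-in for the net

def answer(question: str) -> str:
    # Rule-based experts get first refusal; the neural net takes the rest.
    if re.search(r"\d+\s*[+\-*/]\s*\d+", question):
        return calculator(question)
    return neural_lm(question)

print(answer("What is 1234 * 5678?"))    # 7006652, computed rather than generated
print(answer("Why combine rules with neural networks?"))
```
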
Deep Symbolic Regression
transformers

From Sequences to Symbols

Given a sequence of numbers, neural networks have proven adept at discovering a mathematical expression that generates it. New work uses transformers to extend that success to a wider class of expressions.
2 min read
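
A toy view of the training setup (ours, heavily simplified): generate pairs that map the first terms of a sequence to the expression that produced them, then train a sequence-to-sequence transformer to decode the expression. The pair generator alone looks like this:

```python
import random

OPS = {"add": lambda n, c: n + c, "mul": lambda n, c: n * c}

def sample_pair():
    """One training example: first terms of a sequence -> generating expression."""
    op = random.choice(list(OPS))
    c = random.randint(2, 5)
    seq = [OPS[op](n, c) for n in range(1, 7)]   # terms for n = 1..6
    return seq, f"({op} n {c})"                  # expression in prefix notation

for _ in range(3):
    seq, expr = sample_pair()
    print(seq, "->", expr)
```
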
Grokking: A dramatic example of generalization far after overfitting on an algorithmic dataset
transformers

Learning After Overfitting

When a model trains too much, it can overfit, or memorize, the training data, which reduces its ability to analyze similar-but-different inputs. But what if training continues? New work found that overfitting isn’t the end of the line.
2 min read
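
Observing grokking is mostly a matter of patience. A sketch of the recipe on the paper's modular-arithmetic task (ours and simplified; hyperparameters are illustrative, and the full run is compute-hungry): keep optimizing long after training accuracy saturates and watch validation accuracy.

```python
import torch
import torch.nn as nn

P = 97  # task from the paper: predict (a + b) mod P from the pair (a, b)
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train, val = perm[:4000], perm[4000:]  # small training split invites memorization

model = nn.Sequential(nn.Embedding(P, 64), nn.Flatten(),
                      nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, P))
# Regularization matters: weight decay was found to hasten grokking.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)

for step in range(100_001):  # far beyond the point of perfect training accuracy
    loss = nn.functional.cross_entropy(model(pairs[train]), labels[train])
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 5_000 == 0:
        with torch.no_grad():
            acc = (model(pairs[val]).argmax(-1) == labels[val]).float().mean()
        print(f"step {step}: train loss {loss.item():.3f}, val acc {acc.item():.2f}")
```
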
Nvidia Chip
transformers

Transformer Accelerator

Is your colossal text generator bogged down in training? Nvidia announced a chip designed to accelerate the transformer architecture, the basis of large language models such as GPT-3.
2 min read
Diagram with info about AlphaCode
transformers

Competitive Coder

Programming is hard. Programming competitions are harder. Yet transformers proved themselves up to the task.
2 min read
Performance on different downstream (DS) tasks
transformers

The Limits of Pretraining

The higher the accuracy of a pretrained model, the better its performance after fine-tuning, right? Not necessarily. Researchers conducted a meta-analysis of image-recognition experiments and performed some of their own.
2 min read
Diagram of automated decision systems
transformers

Roadblocks to Regulation

Most U.S. state agencies use AI without limits or oversight. An investigative report probed reasons why efforts to rein them in have made little headway. Since 2018, nearly every proposed bill aimed at studying or controlling how state agencies use automated decision systems has failed to pass.
2 min read
