Gopher

3 Posts


Right-Sizing Models for the Dataset: Finding the Best Data-To-Parameter Ratio for NLP Models

The route to improving transformer-based language models like GPT-3 and Gopher, which are trained on immense quantities of text scraped from the web, has been to increase their size. But research shows that, given a fixed compute budget, bigger doesn’t necessarily mean better.
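The rule of thumb behind this finding (from DeepMind's compute-optimal "Chinchilla" analysis) can be sketched in a few lines. The approximations C ≈ 6·N·D (training FLOPs for N parameters on D tokens) and D ≈ 20·N (the roughly compute-optimal data-to-parameter ratio) come from that paper; the Gopher budget figure below is an approximation used here for illustration.

```python
import math

def compute_optimal_split(compute_flops: float, tokens_per_param: float = 20.0):
    """Given a FLOP budget C, return (params, tokens) under the
    rule-of-thumb approximations C = 6*N*D and D = r*N."""
    # C = 6 * N * (r * N)  =>  N = sqrt(C / (6 * r))
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Rough budget used to train the 280B-parameter Gopher (~5.76e23 FLOPs,
# an assumed figure for illustration). Under the same budget, the
# compute-optimal model is far smaller but trained on far more data.
params, tokens = compute_optimal_split(5.76e23)
print(f"{params/1e9:.0f}B parameters, {tokens/1e9:.0f}B tokens")
# → 69B parameters, 1386B tokens
```

The result matches the paper's headline finding: at Gopher's budget, a model a quarter the size trained on several times more tokens is the better spend.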

Yale Song: Foundation models for vision.

Large models pretrained on immense quantities of text have proven to provide strong foundations for solving specialized language tasks. My biggest hope for AI in 2022 is...

Large Language Models Shrink: Gopher and RETRO prove lean language models can push boundaries.

DeepMind released three papers that push the boundaries — and examine the issues — of large language models.
