Switch Transformer

6 Posts

Graph of average accuracy across 14 NLP tasks versus number of parameters

GPT-Free: Meta Releases Open Source Large Language Models OPT

Itching to get your hands on a fully trained large language model? The wait is over. Meta introduced the OPT family of transformer-based language models with nearly unfettered access to source code and trained weights.
Illustration of a giant Christmas tree in a town plaza

Trillions of Parameters: Are AI models with trillions of parameters the new normal?

The trend toward ever-larger models crossed the threshold from immense to ginormous. Google kicked off 2021 with Switch Transformer, the first published work to exceed a trillion parameters, weighing in at 1.6 trillion.
A graph shows the cost in dollars of training large natural language processing models.

Who Can Afford to Train AI?: Cost of AI is Too Expensive for Many Small Companies

The cost of training top-performing machine learning models has grown beyond the reach of smaller companies.
Animation that shows how the Google Search algorithm works with multimodal AI

Search Goes Multimodal: Google Upgrades its Search Algorithm with Multimodal AI

Google will upgrade its search engine with a new model that tracks the relationships between words, images, and, in time, videos — the first fruit of its latest research into multimodal machine learning and multilingual language modeling.
Animation showing an AI's metaphorical transition to green energy

Greener Machine Learning: Techniques for Reducing the Carbon Footprint of NLP Models

A new study suggests tactics for machine learning engineers to cut their carbon emissions. Led by David Patterson, researchers at Google and UC Berkeley found that AI developers can shrink a model’s carbon footprint a thousand-fold by streamlining architecture...
Graphs showing Switch Transformer data

Bigger, Faster Transformers: Google's Switch Transformer uses MoE for Efficient NLP

Performance in language tasks rises with the size of the model — yet, as a model’s parameter count rises, so does the time it takes to render output. New work pumps up the number of parameters without slowing down the network.
