Transformer-XL

2 Posts

Data related to Nvidia's Pay Attention When Required (PAR) approach
Transformer-XL

Selective Attention

Large transformer networks work wonders with natural language, but they require enormous amounts of computation. New research slashes processor cycles without compromising performance.
1 min read
Graph related to Language Model Analysis (LAMA)
Transformer-XL

What Language Models Know

Watson set a high bar for language understanding in 2011, when it famously trounced human competitors on the televised trivia show Jeopardy! Building IBM's special-purpose AI required an investment of around $1 billion. Research suggests that today's best language models can accomplish similar tasks right off the shelf.
2 min read
