Transformer

140 Posts

Masked Pretraining for CNNs: ConvNeXt V2, the new model family that boosts ConvNet performance

Vision transformers have bested convolutional neural networks (CNNs) in a number of key vision tasks. Have CNNs hit their limit? New research suggests otherwise.
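Below is a minimal sketch of the core idea, masked-image pretraining with a convolutional encoder: hide random patches, reconstruct the image, and compute the loss only on the hidden pixels. The layer sizes and masking ratio are illustrative assumptions, not ConvNeXt V2's actual architecture.

```python
# Masked-image pretraining sketch for a ConvNet (illustrative, not the paper's model).
import torch
import torch.nn as nn

PATCH = 16  # mask the image in 16x16 patches

encoder = nn.Sequential(  # stand-in for a ConvNeXt-style backbone
    nn.Conv2d(3, 64, kernel_size=4, stride=4),
    nn.GELU(),
    nn.Conv2d(64, 128, kernel_size=2, stride=2),
    nn.GELU(),
)
decoder = nn.ConvTranspose2d(128, 3, kernel_size=8, stride=8)  # lightweight pixel decoder

def random_patch_mask(n, h, w, ratio=0.6):
    """Boolean mask over patches: True = masked (hidden from the encoder)."""
    scores = torch.rand(n, h // PATCH, w // PATCH)
    mask = scores < ratio
    # Upsample the patch-level mask to pixel resolution.
    return mask.repeat_interleave(PATCH, 1).repeat_interleave(PATCH, 2).unsqueeze(1)

images = torch.randn(8, 3, 224, 224)            # dummy batch
mask = random_patch_mask(8, 224, 224)           # (8, 1, 224, 224)
visible = images * (~mask)                      # zero out the masked patches
recon = decoder(encoder(visible))               # reconstruct the full image
loss = ((recon - images) ** 2 * mask).sum() / mask.sum() / 3  # MSE on masked pixels only
loss.backward()
```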
Diffusion Transformed: A new class of diffusion models based on the transformer architecture

A tweak to diffusion models, which are responsible for most of the recent excitement about AI-generated images, enables them to produce more realistic output.
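The tweak is architectural: the denoising network that is usually a U-Net becomes a transformer over latent patch tokens. Here is a toy sketch of that backbone; the sizes and the additive timestep conditioning are simplifying assumptions, not the paper's exact design.

```python
# A toy transformer denoising backbone in the spirit of Diffusion Transformers.
import torch
import torch.nn as nn

class TinyDiT(nn.Module):
    def __init__(self, patches=64, dim=256, heads=4, layers=4):
        super().__init__()
        self.patch_embed = nn.Linear(16, dim)        # flattened latent patches -> tokens
        self.pos = nn.Parameter(torch.zeros(1, patches, dim))
        self.time_embed = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))
        block = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(dim, 16)               # predict the noise per patch

    def forward(self, noisy_patches, t):
        # noisy_patches: (B, 64, 16); t: (B, 1) diffusion timestep
        x = self.patch_embed(noisy_patches) + self.pos
        x = x + self.time_embed(t).unsqueeze(1)      # add timestep conditioning to every token
        return self.head(self.blocks(x))             # predicted noise, same shape as input

model = TinyDiT()
eps_hat = model(torch.randn(2, 64, 16), torch.rand(2, 1))
```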
Where Is Meta’s Generative Play?: Why Meta still lacks a flagship generative AI service

While Microsoft and Google scramble to supercharge their businesses with text generation, Meta has yet to launch a flagship generative AI service. Reporters went looking for reasons why.
What the Brain Sees: How a text-to-image model generates images from brain scans

A pretrained text-to-image generator enabled researchers to see — roughly — what other people looked at based on brain scans. Yu Takagi and Shinji Nishimoto developed a method that uses Stable Diffusion to reconstruct images viewed by test subjects...
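The method's core is surprisingly simple: linear maps from fMRI activity to Stable Diffusion's latent image code and conditioning embedding, with the frozen diffusion model doing the rendering. A minimal sketch with illustrative array sizes (the diffusion decoding step is summarized in a comment):

```python
# Sketch of linear brain-to-latent mapping; sizes are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

n_train, n_voxels = 500, 4000
fmri = np.random.randn(n_train, n_voxels)          # scans recorded while viewing images
z_latent = np.random.randn(n_train, 4 * 64 * 64)   # SD latents of the viewed images
c_text = np.random.randn(n_train, 77 * 768)        # text embeddings of image captions

to_z = Ridge(alpha=1.0).fit(fmri, z_latent)        # linear map: brain -> image latent
to_c = Ridge(alpha=1.0).fit(fmri, c_text)          # linear map: brain -> conditioning

new_scan = np.random.randn(1, n_voxels)            # scan from a held-out viewing
z_hat, c_hat = to_z.predict(new_scan), to_c.predict(new_scan)
# A frozen Stable Diffusion model would now run img2img from z_hat, guided by
# c_hat, to produce a rough reconstruction of what the subject saw.
```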
Falcon Ascends: Falcon, the new open source commercial LLM, explained

A team in the United Arab Emirates, a seven-state federation on the Arabian Peninsula, built the latest top-performing open source large language model.
Text-to-Image Editing Evolves: InstructPix2Pix for text-to-image editing, explained

Text-to-image generators like DALL·E 2, Stable Diffusion, and Adobe’s new Generative Fill feature can revise images in a targeted way — say, change the fruit in a bowl from oranges to bananas — if you enter a few words that describe the change plus an indication of the areas to be changed.
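InstructPix2Pix drops the need to indicate the areas to change: a plain-language instruction alone specifies the edit. A hedged usage sketch via the Hugging Face diffusers pipeline (checkpoint name and arguments reflect the public release; verify against the current diffusers docs, and note the file names here are hypothetical):

```python
# Instruction-based image editing with the released InstructPix2Pix checkpoint.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")  # assumes a CUDA GPU is available

image = Image.open("fruit_bowl.jpg").convert("RGB")  # hypothetical input photo
edited = pipe(
    "change the oranges to bananas",   # the edit, stated in plain language
    image=image,
    num_inference_steps=20,
    image_guidance_scale=1.5,          # how closely to preserve the original image
).images[0]
edited.save("fruit_bowl_bananas.jpg")
```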
Goodbye Prompt Engineering, Hello Prompt Generation: Automatic Prompt Engineer (APE) research summary.

When you’re looking for answers from a large language model, some prompts are better than others. So how can you come up with the best one? A new model automates the process.
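The APE loop in miniature: an LLM proposes candidate instructions from input/output demonstrations, each candidate is scored on held-out examples, and the highest scorer wins. The `llm()` function below is a hypothetical stand-in for any text-completion API.

```python
# Sketch of the APE propose-and-score loop; llm() is a placeholder stub.
def llm(prompt: str) -> str:
    """Stub: replace with a call to your LLM provider's completion API."""
    return "add 2 to the last number"   # placeholder completion

demos = [("2, 4, 6", "8"), ("1, 3, 5", "7")]              # input -> output examples
held_out = [("10, 20, 30", "40"), ("3, 6, 9", "12")]

# 1) Propose: ask the model to infer the instruction behind the demonstrations.
propose = "I gave a friend an instruction. Based on these input/output pairs,\n"
propose += "".join(f"Input: {x}\nOutput: {y}\n" for x, y in demos)
propose += "the instruction was:"
candidates = [llm(propose) for _ in range(10)]

# 2) Score: how often does each candidate instruction yield the right output?
def score(instruction: str) -> float:
    hits = sum(
        llm(f"{instruction}\nInput: {x}\nOutput:").strip() == y
        for x, y in held_out
    )
    return hits / len(held_out)

best = max(candidates, key=score)
```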
Collaborative Text Generator: A language model that collaborates with human writers

Text from current language models can be useful as a rough draft, but that leaves the polishing to human writers. PEER, a language model from Meta, learned to generate and respond to editorial directions.
Efficient Reinforcement Learning: IRIS used reinforcement learning to master Atari games with little gameplay.

Both transformers and reinforcement learning models are notoriously data-hungry. They may be less so when they work together. Vincent Micheli and colleagues at the University of Geneva trained a transformer-based system to simulate Atari games using a small amount of gameplay.
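The recipe, in structural form: collect a little real gameplay, fit a transformer world model on it, then train the policy mostly inside the model's imagined rollouts. In the sketch below all components are random placeholder stubs, shown only to make the data flow concrete.

```python
# Structural sketch of the IRIS recipe (stubs, not the actual models).
import random

real_frames = []                                   # small buffer of real gameplay

def world_model_predict(state, action):
    """Stub for the transformer world model: predicts next state and reward."""
    return state + 1, random.random()              # placeholder dynamics

def policy(state):
    """Stub policy; in IRIS this is trained on imagined trajectories."""
    return random.choice([0, 1, 2])                # e.g., Atari joystick actions

# 1) Collect a small amount of real experience (roughly two hours per game).
state = 0
for _ in range(100):
    real_frames.append((state, policy(state)))
    state += 1                                     # stand-in for env.step()

# 2) Fit the world model on real_frames (omitted), then...
# 3) Train the policy on imagined rollouts, which are cheap to generate at scale.
for _ in range(1000):
    s = random.choice(real_frames)[0]              # start from a real state
    for _ in range(20):                            # imagined horizon
        a = policy(s)
        s, r = world_model_predict(s, a)           # no real environment needed
        # ...accumulate r and update the policy (actor-critic step omitted)
```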
Vision and Language Tightly Bound: Training on a single loss function improves multimodal AI.

Recent multimodal models process both text and images as sequences of tokens, but they learn to represent these distinct data types using separate loss functions. Recent work unifies the loss function as well.
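One way to unify the objective is masked-token prediction over mixed sequences: mask positions in a combined text-and-image token stream and train a single cross-entropy loss to recover them, regardless of modality. A minimal sketch with illustrative vocabulary sizes and model dimensions:

```python
# One shared masked-prediction loss over text and image tokens (illustrative sizes).
import torch
import torch.nn as nn

VOCAB = 1000 + 512          # first 1000 ids: text tokens; next 512: image patch tokens
MASK_ID = VOCAB             # one extra id reserved for [MASK]

embed = nn.Embedding(VOCAB + 1, 256)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(256, 4, 1024, batch_first=True), num_layers=2
)
head = nn.Linear(256, VOCAB)

tokens = torch.randint(0, VOCAB, (8, 64))          # mixed text+image token ids
mask = torch.rand(8, 64) < 0.15                    # mask 15% of positions
inputs = tokens.masked_fill(mask, MASK_ID)

logits = head(encoder(embed(inputs)))
# One loss for both modalities: predict the original id at every masked position.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
```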
Runaway LLaMA: How Meta's LLaMA NLP model leaked

Meta’s effort to make a large language model available to researchers ended with its escape into the wild. Soon after Meta started accepting applications for developer access to LLaMA, a family of trained large language models...
GPT-4 Has Landed: Everything you need to know about GPT-4.

Get ready for the next wave of language-model mania. OpenAI introduced the latest in its GPT series of large language models to widespread excitement. The company showed statistics and examples designed to demonstrate...
He Who Types the Prompt Calls the Tune: Google introduces an AI that generates music from text.

As AI-generated text and images capture the world’s attention, music is catching up. Andrea Agostinelli, Timo I. Denk, and colleagues at Google and Sorbonne Université introduced MusicLM, a system that generates music from text descriptions. It conditions generation on tokens computed by MuLan, a model that embeds music and text in a shared space.
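The pipeline as the paper describes it, in structural form: the prompt becomes MuLan tokens, which condition a cascade that first generates coarse "semantic" tokens capturing musical structure, then fine-grained acoustic tokens. All functions below are placeholders for the learned components.

```python
# Structural sketch of MusicLM-style generation (stubs, not the actual models).
def mulan_text_tokens(prompt: str) -> list[int]:
    """Stub: MuLan embeds the prompt into the joint music-text space."""
    return [hash(w) % 1024 for w in prompt.split()]    # placeholder ids

def generate_semantic(cond: list[int]) -> list[int]:
    """Stub: autoregressive model of long-range musical structure."""
    return cond * 4                                     # placeholder

def generate_acoustic(cond: list[int], semantic: list[int]) -> list[int]:
    """Stub: second stage fills in fine acoustic detail for a neural codec."""
    return semantic * 2                                 # placeholder

tokens = mulan_text_tokens("a calming violin melody backed by a distorted guitar")
audio_tokens = generate_acoustic(tokens, generate_semantic(tokens))
# A neural audio codec (SoundStream in the paper) would decode audio_tokens
# into a waveform.
```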
Language Models Defy Logic: Large NLP models struggle with logical reasoning.

Who would disagree that, if all people are mortal and Socrates is a person, Socrates must be mortal? GPT-3, for one. Recent work on FOLIO, a benchmark of reasoning problems expressed in first-order logic, shows that bigger language models are not necessarily better when it comes to logical reasoning.
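In first-order-logic notation, the kind FOLIO-style benchmarks use to formalize such problems, the syllogism above is the classic entailment:

```latex
% "All people are mortal" and "Socrates is a person" together entail
% "Socrates is mortal".
\forall x\, \big( \mathrm{Person}(x) \rightarrow \mathrm{Mortal}(x) \big),\;
\mathrm{Person}(\mathrm{socrates})
\;\vdash\;
\mathrm{Mortal}(\mathrm{socrates})
```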
Google’s Rule-Respecting Chatbot: Research helps AI chatbots be more truthful and less hateful.

Amid speculation about the threat posed by OpenAI’s ChatGPT chatbot to Google’s search business, a DeepMind paper on the Sparrow chatbot shows how the search giant might address the tendency of such models to produce offensive, incoherent, or untruthful dialog.
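Sparrow's reward combines a learned preference score with penalties from classifiers that detect violations of written rules. A minimal sketch of that combination; the scoring functions are stubs for models DeepMind trained from human judgments, and the rules shown are examples from the paper's style rather than the full list.

```python
# Rule-conditioned reward sketch (stubs, not DeepMind's trained models).
RULES = ["Do not make threats", "Do not pretend to be human"]

def preference_score(dialog: str) -> float:
    return 0.0    # stub: reward model trained on human preference comparisons

def violates(rule: str, dialog: str) -> bool:
    return False  # stub: per-rule classifier trained on adversarial probes

def reward(dialog: str) -> float:
    penalty = sum(violates(r, dialog) for r in RULES)
    return preference_score(dialog) - penalty  # RL maximizes this combined signal
```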