Carnegie Mellon University

15 Posts

A Transformer Alternative Emerges: Mamba, a new approach that may outperform transformers

An architectural innovation improves upon transformers — up to 2 billion parameters, at least...
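
The blurb doesn't go into the architecture, but as background only, here is a minimal sketch of the plain linear state-space recurrence that Mamba-style models build on; Mamba's selective, input-dependent parameters and parallel scan are omitted, and all matrices and sizes below are toy placeholders.

```python
# Minimal sketch of a discrete linear state-space recurrence, the building block
# that Mamba-style models extend with input-dependent ("selective") parameters
# and a hardware-friendly parallel scan. All matrices and sizes are toy placeholders.
import numpy as np

def ssm_scan(x, A, B, C):
    """x: (seq_len, d_in) -> y: (seq_len, d_out)."""
    h = np.zeros(A.shape[0])            # hidden state carried across time steps
    ys = []
    for x_t in x:
        h = A @ h + B @ x_t             # state update
        ys.append(C @ h)                # readout
    return np.stack(ys)

seq = np.random.randn(16, 4)            # toy sequence: length 16, 4 features
A = 0.9 * np.eye(8)                     # stable toy dynamics
B = 0.1 * np.random.randn(8, 4)
C = 0.1 * np.random.randn(2, 8)
print(ssm_scan(seq, A, B, C).shape)     # (16, 2)
```
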
Better, Faster Network Pruning: Researchers devise pruning method that boosts AI speed

Pruning weights from a neural network makes it smaller and faster, but it can take a lot of computation to choose weights that can be removed without degrading the network’s performance.
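
The teaser doesn't describe the researchers' criterion, so as generic background, here is a minimal sketch of one-shot magnitude pruning in PyTorch, the simple baseline such methods are usually compared against; the layer size and sparsity level are arbitrary.

```python
# Generic one-shot magnitude pruning (illustrative background, not the
# researchers' method): zero the smallest-magnitude weights of a linear layer.
import torch

def magnitude_prune(layer: torch.nn.Linear, sparsity: float) -> torch.Tensor:
    w = layer.weight.data
    k = int(w.numel() * sparsity)                      # number of weights to drop
    threshold = w.abs().flatten().kthvalue(k).values   # k-th smallest magnitude
    mask = (w.abs() > threshold).float()
    layer.weight.data.mul_(mask)                       # zero out pruned weights
    return mask                                        # reapply after each update to keep sparsity

layer = torch.nn.Linear(512, 512)
mask = magnitude_prune(layer, sparsity=0.9)
print(f"remaining weights: {int(mask.sum())} / {mask.numel()}")
```
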
Inference time procedure for GILL

Text or Images, Input or Output: GILL, an innovative approach to multimodal model training

GPT-4V is a large multimodal model that generates text from images and, with help from DALL-E 3, generates images from text. However, OpenAI hasn’t fully explained how it built the system. A separate group of researchers described their own method.
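
As a rough, assumption-laden sketch of the general recipe behind GILL-style systems (not the authors' exact architecture): keep the language model and the image generator frozen and train only a small network that maps the language model's hidden states into the generator's conditioning space. The dimensions, layers, and regression loss below are illustrative.

```python
# Hedged sketch: train a small mapper from a frozen LLM's hidden states into the
# conditioning space of a frozen image generator. Sizes and loss are assumptions.
import torch
import torch.nn as nn

llm_dim, img_cond_dim = 4096, 768         # illustrative sizes

mapper = nn.Sequential(                   # the only trained component in this sketch
    nn.Linear(llm_dim, 1024), nn.GELU(), nn.Linear(1024, img_cond_dim),
)
opt = torch.optim.AdamW(mapper.parameters(), lr=1e-4)

# Toy stand-ins for (frozen LLM hidden state, target image-conditioning embedding)
# pairs that would come from captioned images in a real pipeline.
hidden = torch.randn(32, llm_dim)
target = torch.randn(32, img_cond_dim)

loss = nn.functional.mse_loss(mapper(hidden), target)
loss.backward()
opt.step()
```
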
Flowcharts show how a new contrastive learning approach uses metadata to improve AI image classifiers

Learning From Metadata: Descriptive Text Improves Performance for AI Image Classification Systems

Images in the wild may not come with labels, but they often include metadata. A new training method takes advantage of this information to improve contrastive learning.
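
As an illustration of the general idea, not necessarily the paper's loss, here is a supervised-contrastive-style objective in which two images count as a positive pair when their metadata matches; the temperature and the exact grouping rule are assumptions.

```python
# Hedged sketch: contrastive loss that treats images with matching metadata as positives.
import torch
import torch.nn.functional as F

def metadata_contrastive_loss(embeddings, metadata_ids, temperature=0.1):
    z = F.normalize(embeddings, dim=1)                     # (N, d) unit vectors
    sim = z @ z.t() / temperature                          # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float("-inf"))        # ignore self-pairs
    positives = (metadata_ids[:, None] == metadata_ids[None, :]) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-probability of positives for each anchor that has at least one
    per_anchor = log_prob.masked_fill(~positives, 0.0).sum(1) / positives.sum(1).clamp(min=1)
    return -per_anchor[positives.any(1)].mean()

loss = metadata_contrastive_loss(
    torch.randn(8, 128), torch.tensor([0, 0, 1, 1, 2, 2, 3, 3]))
print(loss)
```
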
A series of graphs show the carbon emissions associated with training AI models.

Cutting the Carbon Cost of Training: A New Tool Helps NLP Models Lower Their Greenhouse Gas Emissions

You can reduce your model’s carbon emissions by being choosy about when and where you train it.
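
The core idea, choosing when and where to train, can be sketched as a simple search over (hypothetical) hourly carbon-intensity forecasts per cloud region; the numbers and interface below are made up for illustration.

```python
# Hedged sketch of carbon-aware scheduling: pick the region and start hour that
# minimize estimated emissions for a training run. All values are illustrative.
def pick_greenest_slot(forecasts, run_hours):
    """forecasts: {region: [gCO2/kWh per hour]} -> (region, start_hour)."""
    best = None
    for region, intensity in forecasts.items():
        for start in range(len(intensity) - run_hours + 1):
            cost = sum(intensity[start:start + run_hours])
            if best is None or cost < best[0]:
                best = (cost, region, start)
    return best[1], best[2]

forecasts = {"us-west": [220, 180, 90, 85, 200], "eu-north": [40, 45, 50, 60, 70]}
print(pick_greenest_slot(forecasts, run_hours=3))   # -> ('eu-north', 0)
```
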
Animation showing probability of children who may benefit from intervention

Child-Welfare Agencies Drop AI: Oregon and Pennsylvania Halt Use of AI Tool for At-Risk Kids

Officials in charge of protecting children stopped using a machine learning model designed to help them make decisions in difficult cases. The U.S. state of Oregon halted its use of an algorithm intended to identify children who may benefit from intervention.
Illustration of how different data split strategies partition the labelled data

Fine-Tune Your Fine-Tuning: New method optimizes training for few-shot NLP models.

Let’s say you have a pretrained language model and a small amount of data to fine-tune it to answer yes-or-no questions. Should you fine-tune it to classify yes/no answers or to fill in missing words? Both approaches are viable, and they’re likely to yield different results.
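
To make the two framings concrete, here is a hedged sketch using Hugging Face transformers: a classification head versus a cloze-style prompt scored by the masked-language-model head. The checkpoint, prompt template, and "yes"/"no" verbalizer are illustrative choices, not the method the article covers.

```python
# Hedged sketch of the two fine-tuning framings for a yes/no task.
import torch
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

question = "Is Paris the capital of France?"
tok = AutoTokenizer.from_pretrained("roberta-base")

# Framing 1: add a classification head (labels: 0 = no, 1 = yes) and fine-tune it.
clf = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
clf_logits = clf(**tok(question, return_tensors="pt")).logits        # shape (1, 2)

# Framing 2: pose the question as a cloze and compare the masked-LM scores
# the pretrained head already assigns to " yes" versus " no".
mlm = AutoModelForMaskedLM.from_pretrained("roberta-base")
enc = tok(f"{question} Answer: {tok.mask_token}.", return_tensors="pt")
mask_pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
vocab_scores = mlm(**enc).logits[0, mask_pos]
yes_id = tok.encode(" yes", add_special_tokens=False)[0]
no_id = tok.encode(" no", add_special_tokens=False)[0]
print("cloze says yes" if vocab_scores[yes_id] > vocab_scores[no_id] else "cloze says no")
```
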
A four-legged robot walking over difficult and changing terrain

Walking the Dog: Training a robot to walk over unsteady terrain with RL.

A reinforcement learning system enabled a four-legged robot to amble over unfamiliar, rapidly changing terrain.
Neural networks generating novel views of a 3D scene based on existing pictures

3D Scene Synthesis for the Real World: Generating 3D scenes with radiance fields and image data

Researchers have used neural networks to generate novel views of a 3D scene based on existing pictures plus the positions and angles of the cameras that took them. In practice, though, you may not know the precise camera positions and angles.
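
One common way to cope with imprecise poses, sketched below under heavy simplification, is to make each camera's pose a learnable parameter and optimize it jointly with the radiance field; the tiny MLP and the stand-in "renderer" here are placeholders, not the paper's method.

```python
# Hedged sketch: jointly optimize per-camera pose corrections and a radiance field.
import torch
import torch.nn as nn

n_images = 10
field = nn.Sequential(nn.Linear(3, 256), nn.ReLU(), nn.Linear(256, 4))   # (x,y,z) -> RGB + density
pose_corrections = nn.Parameter(torch.zeros(n_images, 6))                # per-camera pose offsets
opt = torch.optim.Adam([{"params": field.parameters()},
                        {"params": [pose_corrections]}], lr=1e-3)

def render_pixels(field, pose_correction, rays):
    """Placeholder renderer: a real one would warp rays by the corrected camera
    pose and integrate color/density along each ray."""
    return field(rays + pose_correction[:3])[:, :3]       # crude stand-in for pose warping

rays = torch.randn(1024, 3)                               # toy ray samples for image 0
target_rgb = torch.rand(1024, 3)                          # toy ground-truth pixel colors
loss = ((render_pixels(field, pose_corrections[0], rays) - target_rgb) ** 2).mean()
loss.backward()                                           # gradients reach both pose and field
opt.step()
```
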
Collage of self portraits

Unsupervised Prejudice: Image classification models learned bias from ImageNet.

Social biases are well documented in decisions made by supervised models trained on ImageNet’s labels. But they also crept into the output of unsupervised models pretrained on the same dataset.
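
Bias in pretrained representations is often quantified with embedding association tests. As a generic illustration with toy vectors and the standard differential-association effect size, not the study's exact protocol:

```python
# Hedged sketch of a WEAT-style embedding association test on toy embeddings.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def association(w, A, B):
    """Mean similarity of embedding w to attribute set A minus attribute set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X, Y, A, B):
    """Differential association of target sets X, Y with attribute sets A, B."""
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    pooled = np.std(x_assoc + y_assoc, ddof=1)
    return (np.mean(x_assoc) - np.mean(y_assoc)) / pooled

rng = np.random.default_rng(0)
X, Y, A, B = (rng.normal(size=(5, 64)) for _ in range(4))
print(effect_size(X, Y, A, B))
```
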
Data related to Covid-19 symptoms prediction

Cats Cured of Covid: Why some deep learning models thought cats had Covid

Neural networks are famously bad at interpreting input that falls outside the training set’s distribution, so it’s not surprising that some models are certain that cat pictures show symptoms of Covid-19. A new approach won’t mistakenly condemn your feline to a quarantine.
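
A minimal way to catch such failures, shown below as a generic sketch rather than the researchers' method, is to abstain when the classifier's maximum softmax probability falls below a threshold.

```python
# Hedged sketch of a generic out-of-distribution check via maximum softmax probability.
import torch
import torch.nn.functional as F

def flag_out_of_distribution(logits: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """Return True for inputs the classifier should abstain on."""
    confidence = F.softmax(logits, dim=1).max(dim=1).values
    return confidence < threshold            # low max-probability -> likely OOD

# Toy logits for 4 inputs and 3 classes (e.g., Covid / pneumonia / healthy).
logits = torch.tensor([[8.0, 0.1, 0.2],      # confidently in-distribution
                       [1.0, 0.9, 1.1],      # uncertain -> abstain (maybe a cat)
                       [0.2, 7.5, 0.3],
                       [0.4, 0.5, 0.6]])
print(flag_out_of_distribution(logits))      # tensor([False,  True, False,  True])
```
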
Data and graphs related to teacher networks

Flexible Teachers, Smarter Students: Meta Pseudo Labels improves knowledge distillation.

Human teachers can teach more effectively by adjusting their methods in response to student feedback. It turns out that teacher networks can do the same.
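
A heavily simplified sketch of the feedback loop: the teacher pseudo-labels unlabeled data for the student, then is nudged toward pseudo-labels that actually improved the student on labeled data. The REINFORCE-style scalar reward below is a simplification of the paper's gradient-based derivation; models and data are toys.

```python
# Hedged, simplified sketch of a Meta Pseudo Labels-style update loop.
import torch
import torch.nn.functional as F

def mpl_step(teacher, student, s_opt, t_opt, x_unlabeled, x_labeled, y_labeled):
    # 1) Teacher produces pseudo-labels; student trains on them.
    pseudo = teacher(x_unlabeled).argmax(dim=1)
    before = F.cross_entropy(student(x_labeled), y_labeled).item()
    s_opt.zero_grad()
    F.cross_entropy(student(x_unlabeled), pseudo).backward()
    s_opt.step()
    # 2) Reward = improvement of the student on labeled data after that update.
    after = F.cross_entropy(student(x_labeled), y_labeled).item()
    reward = before - after
    # 3) Teacher is pushed toward pseudo-labels that helped the student.
    t_opt.zero_grad()
    (-reward * F.cross_entropy(teacher(x_unlabeled), pseudo)).backward()
    t_opt.step()

teacher, student = torch.nn.Linear(32, 10), torch.nn.Linear(32, 10)
s_opt = torch.optim.SGD(student.parameters(), lr=0.1)
t_opt = torch.optim.SGD(teacher.parameters(), lr=0.1)
mpl_step(teacher, student, s_opt, t_opt,
         torch.randn(64, 32), torch.randn(16, 32), torch.randint(0, 10, (16,)))
```
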
Graph related to Mixture of Softmaxes (MoS)

Upgrading Softmax: Mixtape is a faster way to avoid the softmax bottleneck.

Softmax commonly computes probabilities in a classifier’s output layer. But softmax isn’t always accurate in complex tasks — say, in a natural-language task, when the dimensionality of word vectors is much smaller than the number of words in the vocabulary.
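
The bottleneck comes from forcing a single softmax over one low-dimensional projection. Mixture of Softmaxes (MoS), which Mixtape accelerates, instead mixes several softmaxes with context-dependent weights; the sketch below shows the MoS computation with toy sizes and omits Mixtape's gating tricks.

```python
# Sketch of Mixture of Softmaxes (MoS): a context-weighted mixture of K softmaxes.
import torch
import torch.nn as nn

class MixtureOfSoftmaxes(nn.Module):
    def __init__(self, d_model=256, vocab_size=10000, n_components=4):
        super().__init__()
        self.projections = nn.Linear(d_model, d_model * n_components)  # one context per component
        self.prior = nn.Linear(d_model, n_components)                  # mixture weights
        self.decoder = nn.Linear(d_model, vocab_size)
        self.n_components = n_components

    def forward(self, h):                                   # h: (batch, d_model)
        contexts = torch.tanh(self.projections(h)).view(h.size(0), self.n_components, -1)
        component_probs = torch.softmax(self.decoder(contexts), dim=-1)   # (B, K, V)
        weights = torch.softmax(self.prior(h), dim=-1).unsqueeze(-1)      # (B, K, 1)
        return (weights * component_probs).sum(dim=1)       # higher-rank output distribution

probs = MixtureOfSoftmaxes()(torch.randn(2, 256))
print(probs.shape, probs.sum(dim=1))                        # (2, 10000), each row sums to 1
```
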
Illustration of a fireplace with "Happy holidays" cards in English, Spanish and French

Natural Language Processing Models Get Literate: Why 2019 was a breakthrough year for NLP

Earlier language models powered by Word2Vec and GloVe embeddings yielded confused chatbots, grammar tools with middle-school reading comprehension, and not-half-bad translations. The latest generation is so good, some people consider it dangerous.
Graph related to Noisy Student performance on ImageNet

Self-Training for Sharper Vision: The noisy student method for computer vision, explained

The previous state-of-the-art image classifier was trained on the ImageNet dataset plus 3.5 billion supplemental images from a different database. A new method achieved higher accuracy with one-tenth as many supplemental examples — and they were unlabeled, to boot.
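
In outline, the noisy student method loops: train a teacher on labeled data, pseudo-label the unlabeled images, train an equal-or-larger student on both with strong noise (augmentation, dropout), then promote the student to teacher. The pseudocode-style sketch below uses placeholder callables you would supply from a real training pipeline.

```python
# Hedged, pseudocode-style outline of the noisy student self-training loop.
# build_model, train, pseudo_label, and add_noise are placeholders for a real pipeline.
def noisy_student(labeled, unlabeled, build_model, train, pseudo_label, add_noise, rounds=3):
    teacher = train(build_model(size="base"), labeled)
    for _ in range(rounds):
        pseudo = pseudo_label(teacher, unlabeled)          # soft or hard pseudo-labels
        student = build_model(size="larger")               # equal to or larger than the teacher
        student = train(student, labeled + pseudo,         # combined data...
                        noise=add_noise)                   # ...with strong noise on the student
        teacher = student                                  # student becomes the next teacher
    return teacher
```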
