Tag-Retrieve-Compose-Synthesize (TReCS)
Machine Learning Research

Pictures From Words and Gestures: AI model generates images as users describe scenes while moving a mouse.

A new system combines verbal descriptions and crude lines to visualize complex scenes. Google researchers led by Jing Yu Koh proposed Tag-Retrieve-Compose-Synthesize (TReCS), a system that generates photorealistic images as users describe what they want to see while mousing around on a blank screen.
Taxonomy of deep learning architectures using self-attention for visual recognition and images from the COCO dataset
Machine Learning Research

Vision Models Get Some Attention: Researchers add self-attention to convolutional neural nets.

Self-attention is a key element in state-of-the-art language models, but it struggles to process images because its memory requirement rises rapidly with the size of the input. New research addresses the issue with a simple twist on a convolutional neural network.
Sequence showing a training step that uses different perspectives of the same patient to enhance unsupervised pretraining
Machine Learning Research

Same Patient, Different Views: Contrastive pretraining improves medical imaging AI.

When you lack labeled training data, pretraining a model on unlabeled data can compensate. New research pretrained a model three times to boost performance on a medical imaging task.
Images generated by a network designed to visualize what goes on in people’s brains while they watch Doctor Who
Machine Learning Research

What the Brain Sees: AI uses brain activity to create images.

What’s creepier than images from the sci-fi TV series Doctor Who? Images generated by a network designed to visualize what goes on in people’s brains while they watch Doctor Who.
Graphs and data related to ReLabel, a technique that labels any random crop of any image.
Machine Learning Research

Good Labels for Cropped Images: AI technique adds text labels to random image crops.

In training an image recognition model, it’s not uncommon to augment the data by cropping original images randomly. But if an image contains several objects, a cropped version may no longer match its label. Researchers developed a way to make sure random crops are labeled properly.
Neural Body, a procedure that generates novel views of a single human character, working
Machine Learning Research

Seeing People From a New Angle: Neural Body is an AI tool for generating 3D images of people.

Researchers at institutions including the University of Hong Kong and Cornell University created Neural Body, a procedure that generates novel views of a single human character based on shots from only a few angles.
Graphs and data related to ImageNet performance
Machine Learning Research

ImageNet Performance, No Panacea: ImageNet pretraining won't always improve computer vision.

It’s commonly assumed that models pretrained to achieve high performance on ImageNet will perform better on other visual tasks after fine-tuning. But is it always true? A new study reached surprising conclusions.
Graph showing information about different transformer models
Machine Learning Research

Transformer Variants Head to Head: A benchmark for comparing different AI transformers.

The transformer architecture has inspired a plethora of variations. Yet researchers have used a patchwork of metrics to evaluate their performance, making them hard to compare. New work aims to level the playing field.
Graph showing system that examines X-ray images to predict which Covid-19 patients are at greatest risk of decline
Machine Learning Research

Covid-19 Triage: Computer vision for X-rays helps triage Covid-19 patients.

The pandemic has pushed hospitals to their limits. A new machine learning system could help doctors make sure the most severe cases get timely, appropriate care.
System Oscar+ working
Machine Learning Research

Sharper Eyes For Vision+Language: AI research shows improved image and text matching.

Models that interpret the interplay of words and images tend to be trained on richer bodies of text than images. Recent research worked toward giving such models a more balanced knowledge of the two domains.
Different graphs showing switch transformer data
Machine Learning Research

Bigger, Faster Transformers: Increasing parameters without slowing down transformers.

Performance in language tasks rises with the size of the model — yet, as a model’s parameter count rises, so does the time it takes to render output. New work pumps up the number of parameters without slowing down the network.
Different data related to the phenomenon called underspecification
Machine Learning Research

Facing Failure to Generalize: Why some AI models exhibit underspecification.

The same models trained on the same data may show the same performance in the lab, and yet respond very differently to data they haven’t seen before. New work finds this inconsistency to be pervasive.
Series of images showing improvements in a multilingual language translator
Machine Learning Research

Better Zero-Shot Translations: A method for improving transformer NLP translation.

Train a multilingual language translator to translate between Spanish and English and between English and German, and it may be able to translate directly between Spanish and German as well. New work proposes a simple path to better machine translation between languages.
Graphs and data related to visualized tokens (or vokens)
Machine Learning Research

Better Language Through Vision: Study improved BERT performance using visual tokens.

For children, associating a word with a picture that illustrates it helps them learn the word’s meaning. Research aims to do something similar for machine learning models. Researchers improved a BERT model’s performance on some language tasks by training it on a large dataset of image-word pairs.
Series of images and graphs related to cancer detection
Machine Learning Research

Shortcut to Cancer Treatment: AI determines breast cancer treatment from H&E stains.

Doctors who treat breast cancer typically use a quick, inexpensive tumor-tissue stain test to diagnose the illness and a slower, more costly one to determine treatment. A new neural network could help doctors to go straight from diagnosis to treatment.
