What’s creepier than images from the sci-fi TV series Doctor Who? Images generated by a network designed to visualize what goes on in people’s brains while they watch Doctor Who.
What’s new: Lynn Le, Luca Ambrogioni, and colleagues at Radboud University and Max Planck Institute for Human Cognitive and Brain Sciences developed Brain2Pix, a system that reconstructs what people saw from scans of their brain activity.
Key insight: The brain uses neighboring neurons to represent neighboring visual features. Convolutional neural networks excel at finding and using spatial patterns to perform tasks such as image generation. Thus, a convolutional neural network can use the spatial relationships between active neurons in a brain scan to reconstruct the corresponding visual image.
How it works: The authors used a picture-to-picture generative adversarial network (GAN) to try to produce an image of what a person was looking at based on functional magnetic resonance imaging (fMRI): 3D scans that depict blood flow in the brain, which indicates neuron activity. They trained the GAN on Doctor Who fMRI, a collection of video frames from 30 episodes of Doctor Who and corresponding fMRIs captured as an individual watched the show.
- The authors converted each 3D scan into 2D images, each of which represented a distinct section of the brain, using a neuroscientific technique known as a receptive field estimator.
- They trained the GAN’s discriminator to classify whether an image came from Doctor Who or the GAN’s generator. They trained the generator with a loss function that encouraged it to translate the 2D images of neuron activity into an image that would fool the discriminator.
- The generator used two additional loss terms. The first term aimed to minimize the difference between the pixel values of a video frame and its generated counterpart. The second term aimed to minimize the difference between representations, extracted by a pretrained VGG-16, of a video frame and its generated counterpart.
- The generator used a convolutional architecture inspired by U-Net in which skip connections passed the first layer’s output to the last layer, the second layer’s output to the penultimate layer, and so on. This arrangement helped later layers in the network to preserve spatial patterns in the brain scans.
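The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names, the per-voxel receptive-field coordinates (`rf_rows`, `rf_cols`), the choice of absolute error for the pixel term and squared error for the feature term, and the loss weights are all assumptions made for the example. The feature arrays stand in for the VGG-16 representations, which are not computed here.

```python
import numpy as np


def voxels_to_image(activity, rf_rows, rf_cols, size):
    """Project voxel activity onto a 2D grid (hypothetical simplification).

    Each voxel is assumed to have one receptive-field center (row, col).
    Voxels that land on the same pixel are averaged; empty pixels stay 0.
    """
    total = np.zeros((size, size))
    count = np.zeros((size, size))
    # np.add.at accumulates correctly even when indices repeat.
    np.add.at(total, (rf_rows, rf_cols), activity)
    np.add.at(count, (rf_rows, rf_cols), 1)
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


def generator_loss(disc_prob_fake, generated, target,
                   feat_generated, feat_target,
                   w_adv=1.0, w_pix=1.0, w_feat=1.0):
    """Combine the three generator terms described above.

    disc_prob_fake: discriminator's probability that the generated image is real.
    feat_*: representations from a pretrained network (assumed precomputed).
    The weights w_* are illustrative, not values from the paper.
    """
    eps = 1e-12  # avoids log(0)
    adversarial = -np.mean(np.log(disc_prob_fake + eps))   # fool the discriminator
    pixel = np.mean(np.abs(generated - target))            # pixel-value difference
    feature = np.mean((feat_generated - feat_target) ** 2)  # representation difference
    return w_adv * adversarial + w_pix * pixel + w_feat * feature


# Usage: three voxels, two of which share a receptive-field center.
image = voxels_to_image(np.array([1.0, 3.0, 5.0]),
                        np.array([0, 0, 1]), np.array([0, 0, 1]), size=2)
```

In this toy example, the two voxels that map to pixel (0, 0) are averaged, so nearby neurons end up as nearby pixels, which is what lets a convolutional generator exploit their spatial arrangement.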
Results: The researchers used an AlexNet to extract representations of Brain2Pix images and Doctor Who frames and compared the distance between them. Brain2Pix achieved an average distance of 4.6252, an improvement over the previous state-of-the-art method’s average of 5.3511.
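A comparison of this kind might look like the sketch below. It assumes the representations have already been extracted (one row per image) and that the distance is Euclidean; the article does not specify the distance measure, and `mean_feature_distance` is a hypothetical helper name.

```python
import numpy as np


def mean_feature_distance(feats_generated, feats_real):
    """Average per-image Euclidean distance between paired feature vectors.

    Both inputs have shape (num_images, feature_dim); row i of each
    corresponds to the same video frame. Lower means closer.
    """
    a = np.asarray(feats_generated, dtype=float)
    b = np.asarray(feats_real, dtype=float)
    return float(np.mean(np.linalg.norm(a - b, axis=1)))


# Usage with toy 2-dimensional "features" for two image pairs.
score = mean_feature_distance([[0.0, 0.0], [0.0, 0.0]],
                              [[3.0, 4.0], [0.0, 0.0]])
```

Under this measure, a lower average means the generated images sit closer to the real frames in the feature extractor's representation space.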
Why it matters: The previous state-of-the-art method applied 3D convolutions directly to the raw fMRIs, yet the new approach fared better. For some problems, engineering features (in this case, converting fMRIs into 2D images) may be the best way to improve performance.
We’re thinking: We wouldn’t mind sitting in an fMRI machine for hours on end if we were binge-watching Doctor Who.