Predicting a molecule’s aroma is hard because slight changes in structure lead to huge shifts in perception. Good thing deep learning is developing a sense of smell.
What’s new: Benjamin Sanchez-Lengeling and a team from Google Brain, Arizona State University, and the University of Toronto developed a model that predicts a chemical’s smell from an embedding of its molecular structure.
Key insight: A molecule is composed of atoms with bonds between them. Representing atoms as nodes and bonds as edges yields a graph ripe for processing by a graph neural network, or GNN.
How it works: The researchers gathered about 5,030 molecules and 138 odor descriptions, such as “fruity” or “medicinal,” from the GoodScents and Leffingwell PMP 2001 fragrance databases. They treated each description as a class in a classification task. Their model included a GNN, a component that converts graphs into vectors, and a fully connected layer that performs classification.
- The GNN takes a graph representation of a molecule as its input and learns a more information-rich graph representation. The network learns a vector that describes each node. The network’s layers update these vectors based on the values of neighboring nodes. The model converts the enriched output graph to a vector by summing the values of each node’s neighbors.
- A sequence of feed-forward layers then classifies the molecule’s odor.
- The network’s penultimate layer encodes the molecule-scent embedding, which can be used for other tasks as well. For instance, the authors applied it to the DREAM Olfaction Prediction Challenge to predict an odor’s strength (“how fruity is this smell?”) on a scale of 1 to 100.
Results: The GNN achieved a 5 percent higher F1 score than random-forest or nearest-neighbor methods trained on hand-crafted features. On the DREAM Olfaction Prediction Challenge, the authors matched the original winner’s 2015 score, even though their embedding wasn’t designed for this particular task.
Why it matters: Chemists often struggle to predict properties of molecules based on their structure. This work suggests that deep learning can aid in the effort. Beyond predicting smells, the molecule-scent embedding is suited to transfer learning for other scent-related tasks and possibly generative methods that might, say, predict molecules having a particular scent.
We’re thinking: One of the biggest challenges to building an artificial nose is not in the software, but in the hardware: How to build a sensor that can detect minute numbers of scent molecules in the air. This research could help design new fragrances, but further work in chemical sensing technology is also needed. Whoever cracks this problem will come up smelling like roses.