Doctors who treat breast cancer typically use a quick, inexpensive tumor-tissue stain test to diagnose the illness and a slower, more costly one to determine treatment. A new neural network could help doctors to go straight from diagnosis to treatment.
What’s new: The stain in the test for treatment highlights a key visual clue to the choice of therapy that’s otherwise invisible to human pathologists. Nikhil Naik at Salesforce and colleagues at University of Southern California built ReceptorNet to detect that clue in the diagnostic test.
Key insight: The presence of estrogen receptor proteins is a sign that hormone therapy may work. In the diagnostic test, known as hematoxylin and eosin (H&E), these proteins are invisible to the human eye. An attention mechanism, which identifies the most important parts of an input (in this case, a portion of an H&E slide) in determining the output (a label that the proteins are present), can aggregate image areas where they occur so a vision network can classify the slide.
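To make the aggregation idea concrete, here is a minimal NumPy sketch of one common form of attention pooling. The article doesn't say which form ReceptorNet uses, so the scoring function, the parameter names `v` and `w`, and the toy sizes are all assumptions for illustration, and the random parameters stand in for learned ones.

```python
import numpy as np

def attention_pool(tile_features, v, w):
    """Aggregate per-tile feature vectors into one slide-level vector.

    tile_features: (n_tiles, d) array, one row per tile.
    v: (d, h) projection and w: (h,) scoring vector. Both are hypothetical
       parameters; in a trained model they would be learned.
    """
    scores = np.tanh(tile_features @ v) @ w          # one scalar score per tile
    scores = scores - scores.max()                   # stabilize the softmax
    weights = np.exp(scores) / np.exp(scores).sum()  # attention weights sum to 1
    slide_vector = weights @ tile_features           # weighted average of tiles
    return slide_vector, weights

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))  # 50 tiles with 8-dim features (toy sizes)
v = rng.normal(size=(8, 4))
w = rng.normal(size=4)
slide_vector, weights = attention_pool(features, v, w)
```

Tiles with higher weights contribute more to the slide-level vector, which a classifier can then label without ever seeing per-tile labels.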
How it works: ReceptorNet comprises a ResNet-50 pretrained on ImageNet followed by an attention layer and a fully connected layer. The researchers trained and tested ReceptorNet on images of H&E slides, and augmentations of them, from the Australian Breast Cancer Tissue Bank and The Cancer Genome Atlas datasets.
- The authors isolated each image’s foreground using Otsu’s method, which chooses a grayscale threshold that splits pixels into foreground and background so that the variance within each group is as small as possible. They magnified the foregrounds 20 times and divided them into tiles of 256×256 pixels.
- During training, they fed ReceptorNet 50 randomly sampled tiles per slide. The ResNet extracted representations of each tile and passed them en masse to the attention layer, which weighted their importance. The fully connected layer used the weighted representations to classify slides according to whether they contain estrogen receptors.
- To combat overfitting, the authors used mean pixel regularization, randomly replacing 75 percent of tiles with a single-color image of the dataset’s mean pixel value.
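The preprocessing and regularization steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: it assumes tissue is darker than the slide background, keeps a tile when at least half of it is foreground (the `min_fg` cutoff is hypothetical), and interprets "75 percent of tiles" as replacing each tile independently with probability 0.75.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold that maximizes between-class variance.

    gray: 2-D uint8 array. Maximizing between-class variance is equivalent
    to minimizing the variance within each class.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    omega = np.cumsum(prob)            # fraction of pixels with value <= t
    mu = np.cumsum(prob * levels)      # cumulative mean up to t
    denom = omega * (1.0 - omega)
    between = np.zeros(256)
    valid = denom > 0                  # skip thresholds that leave a class empty
    between[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

def foreground_tiles(gray, tile=256, min_fg=0.5):
    """Split an image into non-overlapping tiles and keep mostly-tissue ones."""
    mask = gray <= otsu_threshold(gray)  # assumption: tissue is darker
    tiles = []
    for r in range(0, gray.shape[0] - tile + 1, tile):
        for c in range(0, gray.shape[1] - tile + 1, tile):
            if mask[r:r + tile, c:c + tile].mean() >= min_fg:
                tiles.append(gray[r:r + tile, c:c + tile])
    return tiles

def mean_pixel_regularize(tiles, mean_pixel, p=0.75, rng=None):
    """Replace each tile, with probability p, by a flat mean-valued image."""
    rng = rng or np.random.default_rng()
    return [np.full_like(t, mean_pixel) if rng.random() < p else t
            for t in tiles]

# Toy slide: dark "tissue" on the left half, bright background on the right.
slide = np.full((512, 512), 230, dtype=np.uint8)
slide[:, :256] = 80
tiles = foreground_tiles(slide)
train_tiles = mean_pixel_regularize(tiles, mean_pixel=128,
                                    rng=np.random.default_rng(0))
```

On real slides, color images and the magnification step would need handling that this grayscale toy example leaves out.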
Results: ReceptorNet achieved 0.92 area under the curve (AUC), a measure of true versus false positives where 1 is a perfect score. The authors experimented with alternatives to the attention layer that didn’t perform as well, which suggests that attention was key.
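For readers unfamiliar with the metric, AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The short sketch below computes it that way; the labels and scores are made-up toy values, not data from the study.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score; ties count half.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: 5 of the 6 positive-negative pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0, 1], [0.9, 0.8, 0.7, 0.3, 0.6])
```

The pairwise version is quadratic in the number of examples, which is fine for illustration but slower than sort-based implementations on large test sets.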
Yes, but: The authors had access only to H&E images, so they couldn’t compare ReceptorNet’s performance against the test that’s typically used to guide treatment.
Why it matters: ReceptorNet had a label for each tissue slide but not for each tile derived from it. The success of attention in aggregating and evaluating the representations extracted from each tile bodes well for this approach in reading medical images.
We’re thinking: Where else could computer vision augment or replace slow, expensive medical tests?