Controversy erupted over the need for transparency in research into AI for medicine.

What’s new: Google Health introduced a system that purportedly identified breast cancer more accurately than human radiologists. But the search giant’s healthcare division failed to disclose details that would have enabled others to reproduce its results, dozens of critics wrote in a letter to Nature (also published on Arxiv).

The critique: Researchers at Harvard, University of Toronto, Vector Institute, and elsewhere argue that AI systems used to diagnose life-threatening conditions should meet high standards of transparency. The Google research fell short on several counts:

  • The authors didn’t release the trained model for others to verify their results.
  • Although they mentioned the framework and libraries used, they omitted training details such as learning rate, type of optimizer, number of training epochs, and data augmentation techniques. That’s like listing the ingredients in a cake recipe without disclosing the amounts, Benjamin Haibe-Kains of the University of Toronto, who co-authored the critique, told The Batch.
  • One dataset used in the study, Optimam, is readily available. However, the authors also used patient data that remains private. In lieu of that dataset, the critics argue, the authors should have disclosed labels and model predictions that would allow for  independent statistical analysis.
  • Other details were also missing, leading to questions such as whether the model trained on a given patient’s data multiple times in a single training epoch.

The response: In a rebuttal published in Nature, the Google researchers said that keeping the model under wraps was part of “a sustainable venture to promote a vibrant ecosystem that supports future innovation.” The training details omitted are “of scant scientific value and limited utility to researchers outside our organization,” they added. They held back the proprietary dataset to protect patient privacy.

Behind the news: AI researchers are struggling to balance trade secrets, open science, and privacy. The U.S. Food and Drug Administration hosted a workshop earlier this year aimed at developing best practices for validating AI systems that interpret medical images.

Why it matters: Transparency makes it possible for scientists to verify and build on their colleagues’ findings, find flaws they may have missed, and ultimately build trust in the systems they deploy. Without sufficient information, the community can’t make rapid, reliable progress.

We’re thinking: There are valid reasons to withhold some details. For instance, some datasets come with limitations on distribution to protect privacy. However, outside of circumstances like that, our view is that researchers owe it to each other to make research findings as reproducible as possible.


Subscribe to The Batch

Stay updated with weekly AI News and Insights delivered to your inbox