There has been a lot of excitement about the idea of using deep learning to diagnose diabetic retinopathy: That is by taking a photo of the retina and using AI to detect signs of disease. I was fascinated by a new paper by Emma Beede and others that studied the use of Google’s diabetic retinopathy detecting system in 11 clinics in Thailand and found that, despite all the research progress, this technology isn’t working well in production yet.
We as a community need to get much better at bridging proofs of concept (PoCs) and production deployments. Even Google’s excellent AI team ran into many practical issues:
- Images collected in the sometimes poorly-equipped clinics were not of the same uniform high quality as those in the training and test sets used in the original research. For example, in some clinics, pictures were taken with the ceiling lights on, resulting in lower image quality.
- There were unexpected corner cases. For example, the paper describes how a nurse who was unable to take a single, clean image of the retina instead took two partial images and wanted to diagnose from a composite of them.
- Many details relating to the patients’ and clinicians’ workflow still needed to be sorted out. Some patients were dissuaded from using the system because of concerns that its results might require them to immediately go to a hospital — a time consuming and expensive proposition for many. More generally, decisions about when/whether/who/how to refer a patient based on the AI output are consequential and need to be sorted out.
The paper was published by SIGCHI, a leading conference in human-computer interaction (HCI). I’m encouraged to see the HCI community embrace AI and help us with these problems, and I applaud the research team for publishing these insights. I exchanged some emails with the authors, and believe there’s a promising path for AI diagnosis of diabetic retinopathy. To get there, we’ll need to address challenges in robustness, small data, and change management.
Many teams are working to meet these challenges, but no one has perfect answers right now. As AI matures, I hope we can turn building production systems into a systematic engineering discipline, so we can deploy working AI systems as reliably as we can deploy a website today.