Misuse of machine learning by scientific researchers is causing a spate of irreproducible results.
What’s new: A recent workshop highlighted the impact of poorly designed models in medicine, security, software engineering, and other disciplines, Wired reported.
Flawed machine learning: Speakers at the Princeton University event described common pitfalls that undermine reproducibility:
- Data leakage, including lack of a test set, training on the test set, selecting features based on their performance on the test set, and testing on datasets that contain duplicate examples
- Drawing erroneous conclusions from insufficient data
- Applying machine learning when it’s not the best tool for the job
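The first pitfall is easy to commit by accident. A minimal sketch in plain Python, using made-up numbers, shows one common form of leakage: computing a preprocessing statistic on the full dataset before splitting off a test set, so information about test examples bleeds into training.

```python
def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

# Hypothetical dataset; the outlier 100.0 ends up in the test split.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
train, test = data[:4], data[4:]

# Leaky: the normalization constant "sees" the test example,
# so the training data is centered using test-set information.
leaky_mu = mean(data)

# Correct: compute statistics from the training split only,
# then reuse them when transforming the test split.
clean_mu = mean(train)

leaky_train = [x - leaky_mu for x in train]
clean_train = [x - clean_mu for x in train]
```

Here the leaky mean (22.0) is wildly different from the train-only mean (2.5), which distorts every training example; the same discipline applies to any fitted preprocessing step, not just centering.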
Behind the news: The workshop followed a recent meta-analysis by Princeton researchers that identified 329 scientific papers in which poorly implemented machine learning yielded questionable results.
Why it matters: Experienced machine learning practitioners are well aware of the pitfalls detailed by the workshop, but researchers in other disciplines may not be. When they apply machine learning naively, they can produce invalid results that borrow an aura of credibility from machine learning’s track record of success. Such results degrade science and erode skeptical scientists’ trust in learning algorithms. Inquiries like this one will be necessary at least until machine learning is far more widely practiced and understood.
We’re thinking: Many AI practitioners are eager to contribute to meaningful projects. Partnering with scientists in other fields is a great way to gain experience developing effective models and to educate experts in other domains about the uses and limitations of machine learning.