A few weeks ago, the White House required that, by the end of 2025, research papers funded by the U.S. government be made available online promptly and freely. Data that underlies those publications must also be made available.
I’m thrilled! Paywalled journals that block free access to scientific research are the bane of the academic community.
The AI world is fortunate to have shifted years ago to free online distribution of research papers, primarily through the arXiv site. I have no doubt that this has contributed to the rapid rise of AI, and I’m confident that the new U.S. policy, by promoting a similar shift in other disciplines, will accelerate global scientific progress.
In the year 2000 (before modern deep learning, and when dinosaurs still roamed the planet), AI researchers were up in arms against paywalled journals. Machine Learning Journal, a prominent journal of the time, refused to open up access. With widespread support from the AI community, MIT computer scientist Leslie Kaelbling started the free Journal of Machine Learning Research, and many researchers promptly began publishing there instead. This move led to the rapid decline of Machine Learning Journal. The Journal of Machine Learning Research remains a respected institution today, edited by David Blei and Francis Bach (both of whom are my former officemates at UC Berkeley).
Before the modern internet, journal publishers played an important role by printing and disseminating hard copies of papers. It was only fair that they could charge fees to recoup their costs and make a modest profit. But in today’s research environment, for-profit journals rely mainly on academics to review papers for free, and they harvest the journals’ reputations (as reflected in metrics such as impact factor) to extract a profit.
Today, there are peer-reviewed journal papers, peer-reviewed conference papers, and non-peer-reviewed papers posted online directly by the authors. Journal articles tend to be longer and undergo peer review and careful revisions. In contrast, conference papers (such as NeurIPS, ICML, and ICLR articles) tend to be shorter and less carefully edited, and thus they can be published more quickly. And papers published on arXiv aren’t peer reviewed, so they can be published and reach interested readers immediately.
The benefits of rapid publication and distribution have caused a lot of the action to shift away from journals and toward conferences and arXiv. While the volume of research is overwhelming (that’s why The Batch tries to summarize the AI research that matters), the velocity at which ideas circulate has contributed to AI’s rise.
By the time the new White House guidance takes effect, a quarter century will have passed since machine learning researchers took a key step toward unlocking journal access. When I apply AI to healthcare, climate change, and other topics, I occasionally bump into an annoyingly paywalled article from these other disciplines. I look forward to seeing these walls come down.
Don’t underestimate the impact of freeing up knowledge. I wish all these changes had taken place a quarter century ago, but I’m glad we’re getting there and look forward to the acceleration of research in all disciplines!