Language Models in Lab Coats The chatbot search engines for scientific research

Published

May 10, 2023

Reading time

2 min read

Specialized chatbots are providing answers to scientific questions.

What’s new: A new breed of search engines including Consensus, Elicit, and Scite use large language models to enable scientific researchers to find and summarize significant publications, Nature reported.

How it works: The models answer text questions by retrieving information from databases of peer-reviewed scientific research.

Consensus uses unnamed language models that were trained on tens of thousands of scientific research papers annotated by PhD students. Upon receiving a query, the tool searches Semantic Scholar (a search engine for academic literature built by the Allen Institute for Artificial Intelligence) for papers, which it ranks according to relevance, quality, citation count, and publishing date. At the user’s request, it uses GPT-4 to generate a single-paragraph summary of the top papers. You can try it here.
Given a question, Elicit queries Semantic Scholar's dataset for the top 400 results. GPT-3 Babbage and monot5-base-msmarco-10k re-rank and select the top eight results. FLAN-T5, GPT-3 Davinci, and other models summarize the papers. It can also generate a summary of high-ranking critiques of the top-ranked paper. Free access is available here.
Scite queries a proprietary dataset of over 1.2 billion citation statements extracted from scientific papers using the Elasticsearch search engine. Scite re-ranks the top 200 results using a cross-encoder trained on the MS MARCO dataset of Bing queries and answers. A RoBERTa model trained on a question-and-answer dataset extracts relevant text. Basic search is free, but detailed citations require a subscription ($20 monthly, $144 annually).

Yes, but: These tools may struggle with sensitive or fast-moving fields. For example, in response to the question, “Do vaccines cause autism?”, pediatrician Meghan Azad at the University of Manitoba found that Consensus returned a paper that focused on public opinion rather than scientific research. Clémentine Fourrier, who evaluates language models at HuggingFace, said that searching for machine learning papers via Elicit often brought up obsolete results.

Why it matters: Search engines that rank and summarize relevant research can save untold hours for scientists, students, and seekers of knowledge in general. With continued improvement, they stand to accelerate the pace of progress.

We’re thinking: These systems show promise and point in an exciting direction. When search was young, search engines that covered the web (like Google) competed with vertical search engines that covered niches such as retail (Amazon) or travel (Expedia). A similar competition is shaping up between general-purpose chatbots and vertical chatbots.

Subscribe to The Batch