
A grassroots research collective aims to make a GPT-3 clone that’s available to everyone.

What’s new: EleutherAI, a loose-knit group of independent researchers, is developing GPT-Neo, an open source, free-to-use version of OpenAI’s gargantuan language model. The model could be finished as early as August, team member Connor Leahy told The Batch.

How it works: The goal is to match the speed and performance of the full-fledged, 175-billion-parameter version of GPT-3, with extra attention to weeding out social biases. The team has completed a 1-billion-parameter version, and architectural experiments are ongoing.

  • CoreWeave, a cloud computing provider, gives the project free access to infrastructure. It plans to eventually host instances for paying customers.
  • The training corpus comprises 825GB of text. In addition to established text datasets, it includes IRC chat logs, YouTube subtitles, and abstracts from PubMed, a medical research archive.
  • The team measured word pairings and used sentiment analysis to rate the data for gender, religious, and racial bias. Examples that showed unacceptably high levels of bias were removed.
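
To make the bias-filtering step concrete, here is a minimal sketch of how word pairings and sentiment could be combined to score and drop documents. It is not EleutherAI's pipeline: the lexicons, the `window` size, the `threshold`, and the function names `bias_score` and `filter_corpus` are all assumptions made up for illustration, and this toy version flags only negative skew.

```python
import re

# Hypothetical toy lexicons (assumptions for illustration only);
# a real pipeline would use much larger, curated resources.
GROUP_TERMS = {"women", "men", "muslims", "christians"}
NEGATIVE_WORDS = {"stupid", "evil", "lazy", "dangerous"}
POSITIVE_WORDS = {"kind", "smart", "brave", "honest"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bias_score(text, window=3):
    """Mean sentiment of lexicon words found within `window` tokens of a
    group term. Ranges from -1.0 (all negative) to 1.0 (all positive);
    returns 0.0 when no sentiment word appears near a group term."""
    tokens = tokenize(text)
    total = 0
    hits = 0
    for i, token in enumerate(tokens):
        if token not in GROUP_TERMS:
            continue
        start = max(0, i - window)
        neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
        for word in neighbors:
            if word in NEGATIVE_WORDS:
                total -= 1
                hits += 1
            elif word in POSITIVE_WORDS:
                total += 1
                hits += 1
    return total / hits if hits else 0.0


def filter_corpus(docs, threshold=-0.5):
    """Keep documents whose score is above the (assumed) threshold."""
    return [doc for doc in docs if bias_score(doc) > threshold]


docs = [
    "Women are stupid and lazy.",
    "The men were kind and brave.",
    "The weather is nice today.",
]
kept = filter_corpus(docs)
print(kept)
```

In this toy run, the first document is dropped because a negative word sits next to a group term, while the other two are kept. A production filter would need far richer signals than a fixed word list.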

Behind the news: In 2019, when OpenAI introduced GPT-2, the company initially refused to release the full model, citing fears that it would set off a flood of disinformation. That motivated outside researchers, including Leahy, to try to replicate the model. Similarly, OpenAI’s decision to keep GPT-3 under wraps inspired EleutherAI’s drive to create GPT-Neo.

Why it matters: GPT-3 has made headlines worldwide, but few coders have had a chance to use it. Microsoft has an exclusive license to the full model, while others can sign up for access to a test version of the API. Widespread access could spur growth in AI-powered productivity and commerce.

We’re thinking: If talk is cheap, AI-generated talk might as well be free!

