Ancient Scrolls Recovered: Researchers use AI to decipher scrolls charred by Mount Vesuvius

Published: Feb 14, 2024


Three researchers decoded scrolls that had gone unread since they were turned into charcoal by the eruption of Mount Vesuvius in the year 79.

What’s new: Youssef Nader, Luke Farritor, and Julian Schilliger used neural networks to win the $700,000 grand prize in the Vesuvius Challenge, a competition to decipher charred papyrus scrolls found in the ruins of a villa at the Roman town of Herculaneum in southern Italy.

How it works: The volcanic eruption covered Herculaneum in ash. It also carbonized the papyrus scrolls, which originally would have unrolled to lengths of up to 30 feet.

Competitors were given extremely high-resolution, three-dimensional X-ray scans of four intact scrolls. Like CT scans, each scan comprised a series of 2D cross sections. Researchers who have been working to decipher the scrolls developed an application that virtually unwrapped the 3D scans into 2D images of the scroll surfaces and segmented them into individual papyrus sheets.
Examining the resulting images by eye, a member of a different team noticed faint patterns of cracks and lines that suggested Greek letters. He uploaded his findings, which prompted Farritor to take up the search.
Having identified traces of ink in one of the scrolls, Farritor trained a ResNet to recognize 64x64-pixel patches of the sheet images that showed similar traces. The initial model revealed more ink traces, which were added to the training set; the retrained model found more, which joined the training set, and so on. The model enabled Farritor to render 10 legible letters, winning an intermediate prize.
Building on Farritor’s approach, the team trained three models on fragments of other scrolls to recognize patches that showed signs of ink. They selected the 3D architectures TimeSformer, Resnet3D-101, and I3D to capture ink residue that rose above the carbonized papyrus surface. The clearest images came from TimeSformer. The team manually compared TimeSformer’s images with those produced by the other two models to check that TimeSformer had not flagged ink where none was present.
Working on one of the four scrolls (the other three proved more difficult to scan, unwrap, and segment), the team made legible 85 percent of the presumed characters in four 140-character passages, satisfying the grand-prize criteria. They also rendered 11 additional passages for a total of more than 2,000 characters, or roughly 5 percent of the scroll. The rendered text appears to express Epicurean philosophy that praises the virtues of pleasure.
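
The iterative labeling loop described above, in which each retrained model finds more ink traces that then join the training set, can be illustrated with a small sketch. This is not Farritor's code. A simple brightness threshold stands in for the ResNet, the patches are 4x4 rather than 64x64 so the toy image stays small, and every name here (extract_patches, bootstrap_ink_labels, margin) is hypothetical and chosen for illustration only.

```python
from statistics import mean

PATCH = 4  # toy patch size (assumption); the article describes 64x64-pixel patches


def extract_patches(image, size):
    """Split a 2D image (list of rows) into non-overlapping size x size patches."""
    patches = []
    for top in range(0, len(image) - size + 1, size):
        for left in range(0, len(image[0]) - size + 1, size):
            patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches


def brightness(patch):
    """One feature per patch: its mean pixel value (stand-in for learned features)."""
    return mean(pixel for row in patch for pixel in row)


def fit_threshold(ink_feats, blank_feats):
    """'Train' the stand-in model: midpoint between the two class means."""
    return (mean(ink_feats) + mean(blank_feats)) / 2


def bootstrap_ink_labels(features, seed_ink, seed_blank, margin=0.2, max_rounds=10):
    """Retrain, add confidently predicted ink patches to the training set, repeat."""
    ink, blank = set(seed_ink), set(seed_blank)
    history = []  # number of ink-labeled patches after each productive round
    for _ in range(max_rounds):
        threshold = fit_threshold([features[i] for i in ink],
                                  [features[i] for i in blank])
        confident = {i for i, f in enumerate(features)
                     if i not in ink and i not in blank and f - threshold > margin}
        if not confident:
            break  # the model found nothing new, so stop iterating
        ink |= confident
        history.append(len(ink))
    return ink, history


# Toy "sheet": a 2 x 4 grid of uniform patches with made-up intensities.
levels = [[0.9, 0.1, 0.75, 0.68],
          [0.1, 0.12, 0.58, 0.3]]
image = [[levels[r // PATCH][c // PATCH] for c in range(4 * PATCH)]
         for r in range(2 * PATCH)]
features = [brightness(p) for p in extract_patches(image, PATCH)]

# Start from one hand-labeled ink patch (index 0) and one blank patch (index 1).
ink_found, history = bootstrap_ink_labels(features, seed_ink=[0], seed_blank=[1])
print(sorted(ink_found))  # [0, 2, 3]
print(history)            # [2, 3]
```

In this toy run, the first round accepts only the patch at index 2. Adding it shifts the decision threshold enough that the fainter patch at index 3 clears the margin in the second round, mirroring how each retrained model could surface traces the previous one missed. A real pipeline would need safeguards against mislabeled patches reinforcing themselves, which the sketch omits.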

Behind the news: The Vesuvius Challenge launched in March 2023 with funding provided by former GitHub CEO Nat Friedman.

Smaller prizes were awarded to researchers who deciphered single words and shorter passages. Notably, these early prizewinners included Nader and Farritor, who then teamed with Schilliger.
In its next round, the competition is offering $100,000 to the first team to decipher 90 percent of all four scrolls that have been imaged so far.
The library at Herculaneum includes 800 scrolls already recovered and potentially thousands more still to be excavated. Reading them all would make this library one of the largest collections of texts recovered from the ancient world.

Why it matters: The winning team’s achievement testifies to the ability of deep learning to help solve difficult problems. And their work may have broader significance: Recovering the entire Herculaneum library could provide insights into literature, philosophy, history, science, and art at the time of Caesar.

We’re thinking: University of Kentucky computer scientist Brent Seales, who helped design the contest and pioneered the use of medical imaging and machine learning to read ancient texts, reckons that over 1,000 teams worked on the problem, amounting to 10 person-years and two compute-years. It’s a great example of the power of global collaboration and open resources, central facets of the AI community, to find solutions to hard problems.
