Machine learning promises to streamline handling of tomorrow’s bureaucratic drudgery — and, it turns out, that of 2,500 years ago.
What’s new: Computer vision is helping researchers at the University of Chicago translate a massive collection of ancient records inscribed on clay tablets.
How it works: Persian scribes around 500 BCE produced thousands of documents now collected in the Persepolis Fortification Archive.
Researchers have been translating the cuneiform characters for decades. Now they hope to speed up the job with help from DeepScribe, a model built by computer scientist Sanjay Krishnan.
- The university began capturing digital images of the tablets in 2002. Students hand-labeled 100,000 symbols.
- DeepScribe was trained using 6,000 annotated images. It deciphered the test set with 80 percent accuracy.
- The researchers hope to build a generalized version that can decipher other ancient languages.
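For readers who want to see what a figure like 80 percent accuracy means in practice, here is a minimal sketch. It assumes accuracy is counted per symbol, as the share of held-out symbols whose predicted label matches the students' hand label. The article does not say how DeepScribe was actually scored, and the `accuracy` function and the sample labels below are hypothetical.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the fraction of predictions that match the hand labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    # Count positions where the model's label equals the hand label.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical placeholder labels for five held-out symbols
# (illustrative only, not data from the archive).
hand_labels = ["A", "B", "C", "A", "D"]
model_labels = ["A", "B", "C", "B", "D"]

score = accuracy(model_labels, hand_labels)
print(f"{score:.0%} of test symbols labeled correctly")
```

Scoring against symbols the model never saw during training is what makes the number a fair estimate of how it would perform on unlabeled tablets.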
Behind the news: The archive mostly contains records of government purchases, sales, and transport of food, helping scholars develop a detailed understanding of life in the First Persian Empire. University of Chicago archaeologists found the tablets in 1933 near the palace sites of early Persian kings. They returned the artifacts to Iran in 2019.
Why it matters: DeepScribe’s current accuracy is good enough to automate translation of repetitive words and phrases, freeing up human attention for more specialized work like translating place names or deciphering particular words in context. The researchers also believe the model could be useful for filling in gaps on tablets where text has worn away or is indecipherable.
We’re thinking: These tablets hold an important lesson for all of us during tax season: Never throw away your receipts.