Large Language Models for Chinese A brief overview of the WuDao family

Published

Apr 14, 2021

Reading time

2 min read

Researchers unveiled competition for the reigning large language model GPT-3.

What’s new: Four models collectively called Wu Dao were described by Beijing Academy of Artificial Intelligence, a research collective funded by the Chinese government, according to Synced Review.

Power quartet: Wu Dao’s constituent models were developed by over 100 scientists at leading Chinese universities and tech companies. In January, researchers associated with the project told Wired that it could help citizens navigate China’s bureaucracy, including the Beijing Motor Vehicles Administration.

Wen Yuan is a 2.6 billion parameter language model that matched or exceeded GPT-3’s performance in Chinese- and English-language tasks. The group plans to scale the model up to 100 billion parameters later this year, according to a report in AI Technology Review.
Wen Lan associates images and video with text. The learning algorithm enabled the researchers to train the model on 50 million image/text pairs containing a high percentage of negative examples: data labeled according to what it is not. The model outperformed CLIP on a text-image retrieval task and beat the previous top scorer on AIC-ICC image captioning by 5 percentage points.
Wen Hui, with 11.3 billion parameters, is a text-generation model pretrained for general language skills. The team has spun out applications that write poetry, generate videos, and generate images from text prompts (shown above).
Wen Su, based on BERT, is trained to predict the shapes of biomolecules including proteins and DNA from human blood cells and drug-resistant bacteria.
The project also includes FastMoE, a method for training models with more than 1 trillion parameters, along with a 2 terabyte Chinese-language database.

Behind the news: The Beijing Academy of Artificial Intelligence was founded in 2018 to help the Chinese government achieve its goal of becoming the global center of AI. Its other projects include research into the cognitive roots of neural networks, a proposal for standardized AI notation, and a program to develop AI-specific computer chips.

Why it matters: This effort reflects China’s growing confidence and capability in AI. Such ambitious projects could also curb China’s AI brain drain, as many of its most talented engineers wind up leaving for work overseas.

We’re thinking: AI has long been dominated by organizations clustered in a few geographic hotspots. China’s effort to shift AI’s center of gravity away from the west could have far-reaching repercussions into the types of systems that get built and how they’re deployed.

Subscribe to The Batch