Screen captures from videos generated by VidPress

Will reading soon become obsolete? A new system converts text articles into videos.

What’s new: VidPress, a prototype project from Chinese tech giant Baidu, currently generates more than 1,000 narrated video summaries of news stories daily.

How it works: VidPress synthesizes a two-minute video in around two and a half minutes, a task that typically takes a human editor 15 minutes.

  • VidPress identifies an article’s most important ideas using Baidu’s Ernie language model and organizes them into a script, pulling language directly from the article or crafting its own.
  • A text-to-speech tool converts the script into audio.
  • A decision tree predicts segments where viewers would expect to see new visuals.
  • The system collects related images and video clips from news sites, Baidu’s own media libraries, and search engines.
  • Using face, object, and optical character recognition models, it scores how well each clip or image relates to each segment. Then it slots the highest-ranking clips and images into the relevant places in the timeline.
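
The final ranking-and-slotting step can be pictured with a minimal sketch. The article does not describe how VidPress scores relevance, so everything here is an assumption for illustration: the `Segment` and `Media` types, the tag sets (standing in for labels a face, object, or OCR model might emit), and the Jaccard-overlap score are all hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Segment:
    start: float        # segment start time in seconds
    end: float          # segment end time in seconds
    keywords: set[str]  # hypothetical: key terms from the script for this segment


@dataclass
class Media:
    name: str
    tags: set[str]      # hypothetical: labels from recognition models


def relevance(segment: Segment, media: Media) -> float:
    """Jaccard overlap of keywords and tags (a stand-in, not VidPress's score)."""
    union = segment.keywords | media.tags
    if not union:
        return 0.0
    return len(segment.keywords & media.tags) / len(union)


def build_timeline(segments: list[Segment], candidates: list[Media]):
    """Greedily give each segment its highest-scoring unused clip or image."""
    timeline = []
    used: set[str] = set()
    for seg in segments:
        available = [m for m in candidates if m.name not in used]
        if not available:
            break  # ran out of media; remaining segments stay empty
        best = max(available, key=lambda m: relevance(seg, m))
        used.add(best.name)
        timeline.append((seg.start, seg.end, best.name))
    return timeline


segments = [
    Segment(0.0, 8.0, {"singer", "stage"}),
    Segment(8.0, 15.0, {"court", "lawsuit"}),
]
candidates = [
    Media("concert.mp4", {"singer", "stage", "crowd"}),
    Media("courthouse.jpg", {"court", "building"}),
    Media("logo.png", {"logo"}),
]
timeline = build_timeline(segments, candidates)
for start, end, name in timeline:
    print(f"{start:>5.1f}s to {end:>5.1f}s  {name}")
```

A greedy pass like this is the simplest way to avoid reusing a clip. A production system would likely also weigh clip duration and visual variety, which the article does not detail.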

Results: Sixty-five percent of viewers who watched VidPress videos on Haokan, Baidu’s short-video service, viewed them all the way through, compared with a 50 percent watch-through rate for similar videos made by humans. The system’s most popular production, which describes a feud between Chinese pop stars Jiang Dawei and Zhu Zhiwen, has been viewed more than 850,000 times.

Behind the news: Baidu isn’t the only outfit to use AI to expedite video production, though its approach may be the most sophisticated.

  • Taiwan’s GliaStudio has been creating video summaries since 2015. Its platform pulls text from the original article and video clips from stock footage.
  • Earlier this year, Reuters announced a prototype that inserts a GAN-generated announcer into recaps of sports footage.
  • Trash is an app aimed at cultural influencers and musicians that combines video and audio to produce custom music videos.

Why it matters: Baidu’s Haokan service previously outsourced all of its productions. Now VidPress produces around 75 percent of its in-house videos, presumably saving the company time and money.

We’re thinking: VidPress is fast, but what the internet really needs is a zillion-x speedup in the production of cat videos.
