Photoshopped picture of George Bush, Joe Biden, Donald Trump and Barack Obama playing UNO

Tired of rap battles composed by ChatGPT? Get ready for the next wave of AI-generated fun and profit.

What’s new: Cloned voices are taking center stage in productions by upstart creators and monied corporations alike.

How it works: Companies including ElevenLabs, Resemble AI, Respeecher, and Play.ht recently launched free services that clone a speaker’s voice from brief samples. Such offerings unleashed a chorus of generated voices.

  • YouTube creators attracted hundreds of thousands of viewers to videos that purportedly capture the voices of recent U.S. presidents arguing over a card game, playing Minecraft, and debating Pokemon.
  • Athene AI Show, a fictional talk show that streams nonstop on Twitch, accepts interview questions provided by viewers in the chat channel. Generated voices of celebrities or fictional characters answer in a generated conversation with the host (an Internet personality named Athene). The channel has over 16,000 followers.
  • Musician David Guetta, using unspecified text- and voice-generation models available on the web, synthesized lines in the style of Eminem “as a joke.” He played the result during a live performance and “people went nuts!”
  • Music-streaming service Spotify launched an “AI DJ” that generates bespoke playlists for users punctuated by commentary in the cloned voice of Xavier Jernigan, the company’s Head of Cultural Partnerships. Sonantic AI, a startup that Spotify acquired last year, supplied the synthesized voice, which intones a combination of human-written words and text generated by an unspecified model from OpenAI.

Yes, but: The democratization of voice cloning opens doors to criminals and pranksters.

  • Scammers defrauded victims by mimicking the voices of the victims’ relatives asking for money.
  • A Vice reporter used ElevenLabs to clone his own voice. The facsimile was convincing enough to enable him to access his bank account.
  • 4Chan users generated hate speech in synthesized celebrity voices using ElevenLabs’ technology.
  • ElevenLabs responded to the deluge of fake voices by verifying user identities, identifying clones, and banning accounts that abuse its services.

Why it matters: Voice cloning has entered the cultural mainstream, facilitated by online platforms that offer AI services free of charge. Generated images, text, and now voices have rapidly become convincing and accessible enough to serve as expressive tools for media producers of all sorts.

We’re thinking: With new capabilities come new challenges. Many social and security practices will need to be revised for an era when a person’s voice is no longer a reliable mark of their identity.

