Toward Machines That LOL: Scientists Teach a Speech Recognition Robot to Laugh

Published Sep 21, 2022

Even if we manage to stop robots from taking over the world, they may still have the last laugh.

What’s new: Researchers at Kyoto University developed a series of neural networks that enable a robot engaged in spoken conversation to chortle along with its human interlocutor.

How it works: The authors built a system of three models that, depending on a user’s spoken input, emitted either a hearty hoot, a conversational chuckle, or no laugh at all. They trained all three models on recordings of speed-dating dialogs between humans and Erica, an android teleoperated by an actress, which they deemed to be rich in social laughter.

The first model detected a conversant’s laughter. Given an utterance represented as a sequence of mel filter bank coefficients (features that describe the frequencies that make up a short audio segment), a recurrent neural network built from bidirectional gated recurrent units (BiGRUs) learned to determine whether the utterance ended in a laugh.
The second model decided when the conversant’s outburst called for a sympathetic cackle. If the utterance didn’t end in a laugh, the system didn’t generate a laughing response. If it did, the authors fed the mean and variance of the mel filter bank features, plus features that described the utterance’s lowest frequency and volume, into a logistic regression model, which learned whether or not to join in.
The third model chose the type of laugh to use. The authors fed the same features into another logistic regression model. It learned whether to play a recording of giggles or guffaws.
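
The three-stage cascade described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the laughter detector is reduced to a boolean input, the names `pool_features`, `logistic`, and `choose_response` are hypothetical, the 0.5 threshold is an assumption, and the way pitch and volume are summarized (a simple mean here) is a guess, since the article does not specify it.

```python
import math
from statistics import fmean, pvariance


def pool_features(mel_frames, pitch, power):
    """Summarize one utterance as a fixed-length feature vector.

    mel_frames: list of frames, each a list of mel filter bank values.
    pitch, power: per-frame values (assumed summary: their means).
    """
    bands = list(zip(*mel_frames))             # transpose to per-band series
    means = [fmean(band) for band in bands]
    variances = [pvariance(band) for band in bands]
    return means + variances + [fmean(pitch), fmean(power)]


def logistic(features, weights, bias):
    """Logistic regression score: sigmoid of a weighted sum."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def choose_response(ends_in_laugh, features, share_model, type_model,
                    threshold=0.5):
    """Return 'none', 'social', or 'hearty' for one user utterance.

    ends_in_laugh stands in for the output of the first (BiGRU) model.
    share_model and type_model are (weights, bias) pairs; in practice
    they would be learned, here they are placeholders.
    """
    if not ends_in_laugh:                       # stage 1: no laugh detected
        return "none"
    if logistic(features, *share_model) < threshold:
        return "none"                           # stage 2: decline to join in
    if logistic(features, *type_model) >= threshold:
        return "hearty"                         # stage 3: pick the laugh type
    return "social"
```

Arranging the models as a cascade means the later, cheaper classifiers run only when the detector has already found a laugh, which matches the article's description that no laughing response is generated otherwise.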

Results: The authors’ system and two baselines responded to brief monologues that included laughter, and more than 30 crowdsourced workers rated each response for naturalness and human-likeness on a scale of 1 to 7. The authors’ system achieved an average of 4.01 for naturalness and 4.36 for human-likeness. One baseline, which never laughed, scored an average of 3.89 for naturalness and 3.99 for human-likeness. The other, which always reacted to laughter in the monologue with a social laugh, scored an average of 3.83 for naturalness and 4.16 for human-likeness.
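
To put the reported averages side by side, the short sketch below computes how far the authors' system sits above each baseline on the 7-point scale. The scores are the ones quoted above; the dictionary keys and the `margins` helper are hypothetical names, and the calculation says nothing about statistical significance.

```python
# (naturalness, human-likeness) averages as reported in the article
scores = {
    "authors' system": (4.01, 4.36),
    "never laughs": (3.89, 3.99),
    "always social laugh": (3.83, 4.16),
}


def margins(scores, reference):
    """Difference between the reference system and every other entry."""
    ref = scores[reference]
    return {
        name: tuple(round(r - s, 2) for r, s in zip(ref, vals))
        for name, vals in scores.items()
        if name != reference
    }


for name, (nat, human) in margins(scores, "authors' system").items():
    print(f"vs. {name}: +{nat:.2f} naturalness, +{human:.2f} human-likeness")
```

The gaps range from 0.12 to 0.37 points, so the improvement is consistent across both measures but modest in size.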

Behind the news: The authors recorded the training corpus, speed-dating dialogs with Erica, as part of a larger effort to elicit human-machine conversations that delved more deeply into human issues than typical text dialogs with chatbots. Built by researchers at Kyoto and Osaka Universities and Kyoto’s Advanced Telecommunications Research Institute, the feminine-styled automaton has rapped, anchored TV news, and been cast to play the lead role in a science-fiction film scheduled for release in 2025.

Why it matters: Automating laughter is no joke! Mastering when and how to laugh would be valuable in many systems that aim to integrate seamlessly with human conversation. Titters, snickers, and howls play a key role in bonding, agreement, affection, and other crucial human interactions. Laughter’s role varies in different communities, yet it can cross cultures and bring people together.

We’re thinking: We’re glad the robots are laughing with us, not at us!
