Neuroscientists translated brain signals directly into artificial speech, synthesizing full sentences based purely on neural impulses.
What happened: Researchers tapped the brains of five epilepsy patients who had been implanted with electrodes to map the source of seizures, according to a paper published in Nature (preprint here). During a lull in the procedure, they had the patients read English-language texts aloud. They recorded the fluctuating voltage as the brain controlled the muscles involved in speaking. Later, they fed the voltage measurements into a synthesizer. You can hear the synthesized speech in this video.
How it works: A pair of three-layer bidirectional LSTM recurrent neural networks drove the synthesis.
- One model used the neural activity to predict motions of the lips, jaw, tongue, and larynx.
- The other used those predictions to identify corresponding consonant, vowel, and other sounds.
- The second model’s output fed a synthesizer that rendered speech sounds.
- Listeners on Amazon Mechanical Turk transcribed the synthesized sentences, achieving 83% median accuracy.
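The two-stage pipeline above can be sketched as a pair of stacked recurrent networks, one feeding the other. This is an illustrative mock-up, not the authors' code: the layer count and bidirectionality come from the article, but the feature dimensions, hidden size, and variable names are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

class StackedBLSTM(nn.Module):
    """Three-layer bidirectional LSTM with a linear readout per time step."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden_dim, num_layers=3,
                            bidirectional=True, batch_first=True)
        # Bidirectional output concatenates forward and backward states.
        self.readout = nn.Linear(2 * hidden_dim, out_dim)

    def forward(self, x):
        hidden, _ = self.lstm(x)
        return self.readout(hidden)

# Hypothetical dimensions, for illustration only.
N_ELECTRODES = 256  # electrode channels recorded from the cortex (assumed)
N_KINEMATIC = 33    # motions of lips, jaw, tongue, and larynx (assumed)
N_ACOUSTIC = 32     # sound features passed to the synthesizer (assumed)

# Model 1: neural activity -> predicted vocal-tract motions.
brain_to_motion = StackedBLSTM(N_ELECTRODES, 100, N_KINEMATIC)
# Model 2: predicted motions -> acoustic features for the synthesizer.
motion_to_sound = StackedBLSTM(N_KINEMATIC, 100, N_ACOUSTIC)

# One sentence's worth of recordings: batch of 1, 200 time steps.
ecog = torch.randn(1, 200, N_ELECTRODES)
kinematics = brain_to_motion(ecog)
acoustics = motion_to_sound(kinematics)
print(acoustics.shape)  # torch.Size([1, 200, 32])
```

Chaining the models this way mirrors the paper's key design choice: decoding an intermediate articulatory representation first, rather than mapping brain signals straight to sound.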
Why it matters: The technique could help people who have lost the ability to control their vocal tract due to disorders such as Lou Gehrig’s disease (ALS), stroke, or injury. People with such conditions — think of the late physicist Stephen Hawking — can communicate only very slowly through systems that track eye movements or facial muscles to spell out words one letter at a time. Translating brain impulses directly would allow them to communicate with the ease of natural speech.
Reality check: The researchers did not read minds. Gopala K. Anumanchipalli, Josh Chartier, and Edward F. Chang read brain signals controlling the patients’ muscles. Still, their approach conceivably could be refined to translate brain signals associated with thought.
What’s next: The team plans to test the technology in people who can’t move their face and tongue. It also aims to adapt the system for languages other than English.