The next generation of hearing aids might match a listener's brainwaves with the "soundprint" of a speaker's voice, using that information to automatically turn up the volume on that speaker, according to a new study in the May 15 issue of Science Advances.
Such a device would go a long way toward solving what scientists call the "cocktail party problem" of how a person picks out one voice from many in a crowded, noisy environment. It's a challenge for today's assistive hearing devices, which are forced to solve the problem by boosting the volume of all sounds, said Nima Mesgarani, the study's senior author.
"If you talk to people with a hearing impairment, their biggest complaint is always hearing speech in crowded places," he noted.
The science behind the brain-controlled hearing aid. Credit: Columbia University's Zuckerman Institute.
Humans aren't the only ones coping with the cocktail party problem, Mesgarani added. Smart speakers and digital assistants like Amazon's Alexa and Google Home Assistant can also struggle to identify and respond to voice commands in a noisy room.
Mesgarani, a neuroengineer at Columbia University's Mortimer B. Zuckerman Mind Brain Behavior Institute, and his colleagues tackled the cocktail problem in two ways. First, they developed a method to separate the voices of different speakers in a "multi-talker" audio environment. They then compared the spectrograms — a kind of audio "fingerprint" — of those voices against a spectrogram that was reconstructed from neural activity in a listener's brain.
The experimental hearing device boosts the volume of the speaker spectrogram that best matches the listener spectrogram, elevating that voice above the other voices in the room.
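At its core, that matching step can be thought of as a similarity search: compare the spectrogram decoded from neural activity against each candidate speaker's spectrogram, then amplify whichever speaker scores highest. The sketch below is a minimal illustration of that idea, not the authors' implementation; the function names, the Pearson-correlation similarity measure, and the +12 dB boost are all assumptions made for the example.

```python
import numpy as np

def correlate(a, b):
    # Pearson correlation between two flattened spectrograms
    # (a simple stand-in for whatever similarity measure a real device uses).
    a, b = a.ravel(), b.ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def boost_attended(speaker_specs, neural_spec, gain_db=12.0):
    # Score each separated speaker's spectrogram against the spectrogram
    # reconstructed from the listener's brain activity, then return per-speaker
    # linear gains that boost the best-matching (attended) speaker.
    scores = [correlate(s, neural_spec) for s in speaker_specs]
    attended = int(np.argmax(scores))
    gains = np.ones(len(speaker_specs))
    gains[attended] = 10 ** (gain_db / 20.0)  # +12 dB ≈ 4x amplitude
    return attended, gains
```

In use, the gains would be applied to each separated voice before remixing, so the attended speaker rises above the rest of the room.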
"So far, we've only tested it in an indoor environment," said Mesgarani. "But we want to ensure that it can work just as well on a busy city street or a noisy restaurant, so that wherever wearers go, they can fully experience the world and people around them."
Mesgarani and his team have been working on the technology for several years. Initially, the system could only separate specific voices that it had been trained to recognize previously. For instance, a system trained on the voices of four friends could turn up the volume on one of those friends at a restaurant, but it wouldn't be able to raise the volume on a new voice — like a waiter appearing tableside — that was added to the mix.
In the new study, they turned to a type of artificial intelligence processing called a deep neural network to distinguish and separate unknown voices. The platform was able to distinguish between as many as three voices and could adapt to new voices when they appeared, as they might in a real-life conversation.
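Speaker-separation networks of this kind are commonly trained to predict a time-frequency mask for each voice: for every spectrogram bin, the fraction of energy belonging to that speaker. The snippet below is a minimal sketch of that masking idea using "ideal" masks computed from known sources, purely for illustration; it is not the study's neural network, and the function names are assumptions.

```python
import numpy as np

def ideal_ratio_masks(source_specs):
    # The target a separation network learns to predict: each speaker's
    # share of the energy in every time-frequency bin of the spectrogram.
    total = np.sum(source_specs, axis=0) + 1e-12
    return [s / total for s in source_specs]

def separate(mixture_spec, masks):
    # Apply the per-speaker masks to the mixture spectrogram to recover
    # an estimate of each individual voice.
    return [mixture_spec * m for m in masks]
```

A trained network would predict these masks directly from the mixture, with no access to the clean sources, which is what lets it handle voices it has never heard before.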
The researchers used brainwave data to guide the hearing device after their previous studies showed that a listener's brainwaves track only the voice of the speaker at the center of the listener's attention. For the current study, they collected brainwaves using electrodes implanted directly in the brains of three people undergoing surgery to treat epilepsy.
Future devices would likely collect these neural data through a sensor placed over the ear or against the scalp, said Mesgarani.
The team is now refining their experimental hearing device, creating a microphone that can pick up and separate multiple voices in real time during a conversation, along with a non-invasive way to collect listener brainwaves. Mesgarani said a fully operational device could be ready in five years.
This article has been republished from materials provided by the AAAS. Note: material may have been edited for length and content. For further information, please contact the cited source.
Reference: Han, C., O’Sullivan, J., Luo, Y., Herrero, J., Mehta, A. D., & Mesgarani, N. (2019). Speaker-independent auditory attention decoding without access to clean speech sources. Science Advances, 5(5), eaav6134. https://doi.org/10.1126/sciadv.aav6134