By Pesach Benson, TPS
Israeli scientists on Thursday unveiled research on how the brain processes speech and transforms sounds into conversation, with findings that could lead to advances in speech recognition technology, communication tools for people with speech disorders, and personalized assistive devices.
To gain insights into the neural pathways that allow people to speak and understand each other, scientists from the Hebrew University of Jerusalem, Google Research, Princeton University’s Hasson Lab, and the NYU Langone Comprehensive Epilepsy Center used electrocorticography (ECoG), a technique that records electrical activity directly from the brain’s surface, to capture real-time data from participants during open-ended conversations.
The data was analyzed using a speech-to-text model called Whisper, which breaks down language into three levels: sounds, speech patterns, and word meanings.
Led by Dr. Ariel Goldstein of Hebrew University’s Department of Cognitive and Brain Sciences and published in the peer-reviewed journal Nature Human Behaviour, the study offers a framework for studying how the brain processes speech in real-world conversations.
The Whisper model predicted brain activity during conversation, even when applied to new data. It showed that different regions of the brain are activated at various stages of speech processing.
For example, areas associated with hearing and speaking were aligned with the sounds and speech patterns of language, while regions responsible for higher-level understanding were connected to the meanings of words. This mapping of brain activity highlights how the brain supports communication.
“Our findings help us understand how the brain processes conversations in real-life settings. By connecting different layers of language, we’re uncovering the mechanics behind something we all do naturally—talking and understanding each other,” Goldstein said.
“Before we speak, our brain moves from thinking about words to forming sounds, while after we listen, it works backward to make sense of what was said,” he added.
The research could lead to more accurate speech recognition systems, enhancing virtual assistants like Siri or Alexa, as well as transcription tools.
It also offers the hope of better communication aids for individuals with speech disorders, such as aphasia or dysarthria, and could potentially lead to brain-computer interfaces for people with severe speech disabilities.
For people with hearing loss, speech impairments, or neurodegenerative diseases like ALS and Parkinson’s, the findings hold the potential for personalized assistive devices.
The findings could also lead to more targeted therapies for people recovering from brain injuries or strokes who are trying to regain speech and comprehension abilities.