Say it slowly: ‘Audio dilation’ could enhance hearing comprehension

By Michael Epstein

PhD candidate John Novak prepares to demonstrate his audio dilation software at the University of Illinois at Chicago’s Electronic Visualization Lab in November. (Michael Epstein/MEDILL)

John Novak thinks that we all need to slow down and listen if we want to really hear a conversation or even our favorite song.

Novak, a Ph.D. candidate studying computer science at the University of Illinois at Chicago’s Electronic Visualization Lab, has spent the last four years developing software he calls “audio dilation,” which can reduce the speed of audio streaming through a smartphone or laptop, whether music or a phone call, in real time with little to no effect on its clarity.

“If you really analyzed this [dilated musical clip] and really pulled out the mathematics sample by sample,” Novak explained, “yes, there would be distortions in it, but to the human ear it’s a pretty good reproduction of what would happen if you told a bunch of musicians, ‘just play this slower.’”

While the average person might see slowing down the sound as a waste of precious time, Novak believes that giving listeners more time to take things in allows them to more fully comprehend information. Novak specifically theorized that his software could make it easier to translate foreign languages on the fly and could help the hearing impaired.

Novak suggested that his software could be helpful when used in conjunction with hearing aids. For people who have partial hearing loss and can’t hear certain frequencies, slowing down sounds could help the brain recognize what’s missing, essentially filling in the blanks of a person’s hearing.

While Novak’s software does exactly what it’s supposed to, the question remains whether slowing down speech actually has any impact on listening comprehension.

Slowing down an audio track generally lowers its pitch, often in a way that obscures the message. To prevent those side effects, Novak’s software breaks sounds down into their component sine waves and re-assembles them in the desired pattern on the fly.

Novak said his confidence has been bolstered by support for his published work on the subject.

“The broader academic community thinks there’s some merit to this,” Novak said. “This is part of the academic community’s function: trying to separate the chaff from the good stuff.”

To prove the neurological benefits, Novak has crafted a series of experiments using audio dilation to test whether listening to “dilated” sound makes a discernible difference in the brain. The first set of experiments would test the software’s effects on memory and would begin early next year.

“If you can figure out the right way to manipulate the original sine waves and put them back together,” Novak explained, “you can do lots of interesting things.”

Novak did not invent this technique. He based his program on a “Phase Vocoder” algorithm, a program that also separates and re-arranges sine waves, by Columbia University professor Daniel Ellis. Novak turned that algorithm into a usable piece of software, one that stands apart from its predecessors because his adaptation operates on streaming audio instead of pre-recorded files.
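For readers curious about what a phase vocoder does, the idea can be sketched in a few dozen lines of Python. This is a minimal illustration only, not Novak’s software or Ellis’s code: it assumes NumPy is available, works on a complete recording rather than a live stream, and the function names (stft, istft, time_stretch) and parameters (n_fft, hop, rate) are hypothetical choices made for this example. It cuts the sound into short overlapping frames, measures the strength and phase of each sine-wave component, and rebuilds the frames at a slower pace while keeping each component’s frequency the same.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Split a signal into overlapping windowed frames and take their spectra."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, n_fft // 2 + 1)

def istft(spec, n_fft=1024, hop=256):
    """Overlap-add the frames back into one signal (the edges fade in and out)."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window
    # Steady-state gain of the overlapping squared windows.
    return out / (np.sum(window ** 2) / hop)

def time_stretch(x, rate, n_fft=1024, hop=256):
    """Change duration without changing pitch; rate < 1 slows the sound down."""
    spec = stft(x, n_fft, hop)
    n_frames, n_bins = spec.shape
    # Phase advance each bin's center frequency would make over one hop.
    omega = 2 * np.pi * hop * np.arange(n_bins) / n_fft
    steps = np.arange(0, n_frames - 1, rate)  # fractional read positions
    phase = np.angle(spec[0])
    out = np.empty((len(steps), n_bins), dtype=complex)
    for k, t in enumerate(steps):
        i = int(t)
        frac = t - i
        # Blend the magnitudes of the two nearest analysis frames.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        out[k] = mag * np.exp(1j * phase)
        # Measured phase change minus the expected one, wrapped to [-pi, pi].
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        # Advance the running phase by each component's true per-hop rotation.
        phase += omega + dphi
    return istft(out, n_fft, hop)

# Usage: slow a one-second 440 Hz test tone to roughly half speed.
sample_rate = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
slowed = time_stretch(tone, rate=0.5)
```

In this sketch the slowed tone comes out nearly twice as long as the original while staying at the same pitch, which is the property that separates this approach from simply playing a recording back at half speed. A streaming version like the one Novak describes would have to process frames as they arrive, which this example does not attempt.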

The original version of his software was part of a group project for his “Human Augmentics” course at UIC in 2011. Novak said the term “audio dilation” also came from his professors. Past research on the topic refers to the process as “audio stretching.”

Though the software was originally conceived as part of a group effort, Novak’s research since the class has taken the project beyond the theoretical phase, according to Jason Archer, one of Novak’s classmates who worked on the original version of the project.

“We never got very conclusive results,” Archer said, “but we were also never able to run [the] experiments we wanted to run.”

Novak and Archer were both particularly interested in the potential benefit to “multi-tasking” listeners, who need to pay attention to a recording while engaging in another activity, such as driving or writing. According to Novak, slowing down sounds could reduce the cognitive load of listening, while allowing users to devote more mental energy to the second activity.

There are limitations, Novak said. Slowing down one side of a conversation means forcing someone to wait. And the software would not be viable with video messaging applications, such as Skype or Apple FaceTime, where the distracting dissonance between a voice playing from speakers at reduced speed and the movements of a person’s mouth at full speed would likely overshadow the neurological benefit.

Novak said he plans to leave audio dilation behind once he completes his graduate research. That decision, he said, partially stems from the fact that proving the software is beneficial is only the first step on a long road to making a commercially viable product.

“For me, it’s almost too far in the future to think about,” Novak said.

While he won’t stick with it until the end, Novak said he would like to leave the project at a point where it will offer appealing prospects for more research.

“Before I leave here,” Novak said, “I’d like to be at a point where I can say ‘not only does this have an effect, but here are the effects that it has. Here is how it can benefit people.’ That way it can be a platform for research after I’m done and gone.”