
Artificial intelligence could soon diagnose illness based on the sound of your voice

By Carmen Molina Acosta, Lisa Weiner

Published October 10, 2022 at 4:00 AM CDT

Yael Bensoussan, MD, is part of USF Health's Department of Otolaryngology - Head & Neck Surgery. She's leading an effort to collect voice data that can be used to diagnose illnesses.

Voices offer lots of information. Turns out, they can even help diagnose an illness — and researchers are working on an app for that.

The National Institutes of Health is funding a massive research project to collect voice data and develop an AI that could diagnose people based on their speech.

Everything from your vocal cord vibrations to breathing patterns when you speak offers potential information about your health, says laryngologist Dr. Yael Bensoussan, the director of the University of South Florida's Health Voice Center and a leader on the study.

Dr. Olivier Elemento of Weill Cornell Medicine is the other co-principal investigator on the project.

"We asked experts: Well, if you close your eyes when a patient comes in, just by listening to their voice, can you have an idea of the diagnosis they have?" Bensoussan says. "And that's where we got all our information."

Someone who speaks low and slowly might have Parkinson's disease. Slurring is a sign of a stroke. Scientists could even diagnose depression or cancer. The team will start by collecting the voices of people with conditions in five areas: neurological disorders, voice disorders, mood disorders, respiratory disorders and pediatric disorders like autism and speech delays.

The project is part of the NIH's Bridge to AI program, which launched over a year ago with more than $100 million in funding from the federal government, with the goal of creating large-scale health care databases for precision medicine.

"We were really lacking large what we call open source databases," Bensoussan says. "Every institution kind of has their own database of data. But to create these networks and these infrastructures was really important to then allow researchers from other generations to use this data."

This isn't the first time researchers have used AI to study human voices, but it's the first time data will be collected on this level — the project is a collaboration between USF, Cornell and 10 other institutions.

"We saw that everybody was kind of doing very similar work but always at a smaller level," Bensoussan says. "We needed to do something as a team and build a network."

The ultimate goal is an app that could help bridge gaps in access for rural or underserved communities by helping general practitioners refer patients to specialists. Long term, iPhones or Alexa could detect changes in your voice, such as a cough, and advise you to seek medical attention.

To get there, researchers have to start by amassing data, since the AI can only get as good as the database it's learning from. By the end of the four-year project, they hope to collect about 30,000 voices, along with matching data on other biomarkers such as clinical data and genetic information.

"We really want to build something scalable," Bensoussan says, "because if we can only collect data in our acoustic laboratories and people have to come to an academic institution to do that, then it kind of defeats the purpose."

There are a few roadblocks. HIPAA — the law that regulates medical privacy — isn't really clear on whether researchers can share voices.

"Let's say you donate your voice to our project," Bensoussan says. "Who does the voice belong to? What are we allowed to do with it? What are researchers allowed to do with it? Can it be commercialized?"

While other health data can be separated from a patient's identity and used for research, voices are often identifiable. Every institution has different rules on what can be shared, and that opens all sorts of ethical and legal questions a team of bioethicists will explore.

In the meantime, here are three voice samples that can be shared:

Credit to SpeechVive, via YouTube.

The latter two clips come from the Perceptual Voice Qualities Database (PVQD), whose license can be found here. No changes were made to the audio.