Can AI-driven voice analysis help identify mental disorders?
Written by Ingrid K. Williams
Imagine a test as quick and easy as having your temperature taken or your blood pressure measured that could reliably identify an anxiety disorder or predict an impending depressive relapse.
Health care providers have many tools to gauge a patient’s physical condition, yet no reliable biomarkers — objective indicators of medical states observed from outside the patient — for assessing mental health.
But some artificial intelligence researchers now believe that the sound of your voice might be the key to understanding your mental state — and AI is perfectly suited to detect such changes, which are difficult, if not impossible, to perceive otherwise. The result is a set of apps and online tools designed to track your mental status, as well as programs that deliver real-time mental health assessments to telehealth and call-center providers.
Psychologists have long known that certain mental health issues can be detected by listening not only to what a person says but how they say it, said Maria Espinola, a psychologist and assistant professor at the University of Cincinnati College of Medicine.
With depressed patients, Espinola said, “their speech is generally more monotone, flatter and softer. They also have a reduced pitch range and lower volume. They take more pauses. They stop more often.”
Patients with anxiety feel more tension in their bodies, which can also change the way their voice sounds, she said. “They tend to speak faster. They have more difficulty breathing.”
Today, these types of vocal features are being leveraged by machine-learning researchers to predict depression and anxiety, as well as other mental illnesses like schizophrenia and post-traumatic stress disorder. The use of deep-learning algorithms can uncover additional patterns and characteristics, as captured in short voice recordings, that might not be evident even to trained experts.
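To give a rough sense of what such systems measure, here is a minimal, hypothetical sketch of the simplest kind of acoustic features mentioned above: volume, pausing, and vocal flatness. Real research systems extract far richer representations (pitch contours, spectral statistics, learned embeddings); the function names and thresholds below are illustrative assumptions, not any company's actual method.

```python
import math

def frame_energies(samples, frame_size=160):
    """Mean absolute amplitude per non-overlapping frame of the waveform."""
    return [
        sum(abs(s) for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def vocal_features(samples, pause_threshold=0.05):
    """Crude, illustrative proxies for volume, pausing, and monotony."""
    energies = frame_energies(samples)
    mean_energy = sum(energies) / len(energies)
    # Fraction of near-silent frames: a rough proxy for how often
    # the speaker pauses.
    pause_ratio = sum(e < pause_threshold for e in energies) / len(energies)
    # Variance of frame energy: low values suggest flat, monotone delivery.
    variability = sum((e - mean_energy) ** 2 for e in energies) / len(energies)
    return {"volume": mean_energy,
            "pause_ratio": pause_ratio,
            "variability": variability}

# Toy input: alternating speech-like bursts and silence, so half
# of the frames register as pauses.
samples = [0.5 * math.sin(i / 5.0) if (i // 800) % 2 == 0 else 0.0
           for i in range(8000)]
print(vocal_features(samples))
```

A deep-learning model would consume features like these (or the raw audio itself) across thousands of labeled recordings, rather than applying hand-set thresholds.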
“The technology that we’re using now can extract features that can be meaningful that even the human ear can’t pick up on,” said Kate Bentley, an assistant professor at Harvard Medical School and a clinical psychologist at Massachusetts General Hospital.
“There’s a lot of excitement around finding biological or more objective indicators of psychiatric diagnoses that go beyond the more subjective forms of assessment that are traditionally used, like clinician-rated interviews or self-report measures,” she said. Other clues that researchers are tracking include changes in activity levels, sleep patterns and social media data.
These technological advances come at a time when the need for mental health care is particularly acute. According to a report from the National Alliance on Mental Illness, 1 in 5 adults in the United States experienced mental illness in 2020. And the numbers continue to climb.
Although AI technology cannot address the scarcity of qualified mental health care providers — there are not nearly enough to meet the country’s needs, said Bentley — there is hope that it may lower the barriers to receiving a correct diagnosis, assist clinicians in identifying patients who may be hesitant to seek care and facilitate self-monitoring between visits.
“A lot can happen in between appointments, and technology can really offer us the potential to improve monitoring and assessment in a more continuous way,” Bentley said.
To test this new technology, I downloaded the Mental Fitness app from Sonde Health, a health technology company, to see whether my feelings of malaise were a sign of something serious or if I was simply languishing. Described as “a voice-powered mental fitness tracking and journaling product,” the free app invited me to record my first check-in, a 30-second verbal journal entry, which would rank my mental health on a scale of 1 to 100.
A minute later I had my score: a not-great 52. “Pay attention,” the app warned.
The app flagged that the level of liveliness detected in my voice was notably low. Did I sound monotonic simply because I had been trying to speak quietly? Should I heed the app’s suggestions to improve my mental fitness by going for a walk or decluttering my space? (The first question might indicate one of the app’s possible flaws: As a consumer, it can be difficult to know why your vocal levels fluctuate.)
Later, feeling jittery between interviews, I tested another voice-analysis program, this one focused on detecting anxiety levels. The StressWaves Test is a free online tool from Cigna, the health care and insurance conglomerate, developed in collaboration with AI specialist Ellipsis Health to evaluate stress levels using 60-second samples of recorded speech.
“What keeps you awake at night?” was the website’s prompt. After I spent a minute recounting my persistent worries, the program scored my recording and sent me an email pronouncement: “Your stress level is moderate.” Unlike the Sonde app, Cigna’s email offered no helpful self-improvement tips.
Other technologies add a potentially helpful layer of human interaction, like Kintsugi, a company based in Berkeley, California, that raised $20 million in Series A funding recently. Kintsugi is named for the Japanese practice of mending broken pottery with veins of gold.
Founded by Grace Chang and Rima Seiilova-Olson, who bonded over the shared past experience of struggling to access mental health care, Kintsugi develops technology for telehealth and call-center providers that can help them identify patients who might benefit from further support.
Using Kintsugi’s voice analysis program, a nurse might be prompted, for example, to take an extra minute to ask a harried parent with a colicky infant about his own well-being.
One concern with the development of these types of machine-learning technologies is the issue of bias — ensuring the programs work equitably for all patients, regardless of age, gender, ethnicity, nationality and other demographic criteria.
“For machine-learning models to work well, you really need to have a very large and diverse and robust set of data,” Chang said, noting that Kintsugi used voice recordings from around the world, in many different languages, to guard against this problem in particular.
Another major concern in this nascent field is privacy — particularly regarding voice data, which can be used to identify individuals, Bentley said.
And even when patients do agree to be recorded, the question of consent is sometimes twofold. In addition to assessing a patient’s mental health, some voice analysis programs use the recordings to develop and refine their own algorithms.
Another challenge, Bentley said, is consumers’ potential mistrust of machine learning and so-called black box algorithms, which work in ways that even the developers themselves cannot fully explain — particularly which features they use to make predictions.
“There’s creating the algorithm, and there’s understanding the algorithm,” said Dr. Alexander Young, interim director of the Semel Institute for Neuroscience and Human Behavior and the chair of psychiatry at UCLA, echoing the concerns that many researchers have about AI and machine learning in general: that little, if any, human oversight is present during the program’s training phase.
For now, Young remains cautiously optimistic about the potential of voice analysis technologies, especially as tools for patients to monitor themselves.
“I do believe you can model people’s mental health status or approximate their mental health status in a general way,” he said. “People like to be able to self-monitor their statuses, particularly with chronic illnesses.”
But before automated voice analysis technologies enter mainstream use, some are calling for rigorous investigations of their accuracy.
“We really need more validation of not only voice technology, but AI and machine-learning models built on other data streams,” Bentley said. “And we need to achieve that validation from large-scale, well-designed representative studies.”
Until then, AI-driven voice analysis technology remains a promising but unproven tool, one that may eventually be an everyday method to take the temperature of our mental well-being.
This article originally appeared in The New York Times.