
Towards A Large-Scale Audio-Visual Corpus for Research on Amyotrophic Lateral Sclerosis

A. Anvar, D. Suendermann-Oeft, D. Pautler, V. Ramanarayanan, J. Kumm, J. Berry, R. Norel, E. Fraenkel, and I. Navar: Towards A Large-Scale Audio-Visual Corpus for Research on Amyotrophic Lateral Sclerosis. In Proc. of AAN 2021, 73rd Annual Meeting of the American Academy of Neurology, Virtual, April 2021.

 

Objective

This presentation describes the creation of a large, open data platform comprising speech and video recordings of people with ALS and healthy volunteers. Each participant is interviewed by Modality.AI’s virtual agent, which emulates the role of a neurologist or speech pathologist guiding them through speaking exercises [Fig 1]. The collected data is made available to the academic and research community to accelerate the development of biomarkers, diagnostics, therapies, and fundamental scientific understanding of ALS.

Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measures for the Assessment and Monitoring of Amyotrophic Lateral Sclerosis at Scale

M. Neumann, O. Roesler, J. Liscombe, H. Kothare, D. Suendermann-Oeft, D. Pautler, I. Navar, A. Anvar, J. Kumm, R. Norel, E. Fraenkel, A. Sherman, J. Berry, G. Pattee, J. Wang, J. Green, V. Ramanarayanan: Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measures for the Assessment and Monitoring of Amyotrophic Lateral Sclerosis at Scale. Accepted at Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czech Republic, August - September 2021.

 

Abstract

We investigate the utility of audiovisual dialog systems combined with speech and video analytics for real-time remote monitoring of amyotrophic lateral sclerosis (ALS) at scale in uncontrolled environments. We collected audiovisual conversational data from participants who interacted with a cloud-based multimodal dialog system, and automatically extracted a large set of speech and vision metrics based on the rich existing literature of laboratory studies. We report on the efficacy of various audio and video metrics in differentiating people with mild, moderate, and severe ALS, and discuss the implications of these results for the deployment of such technologies in real-world neurological diagnosis and monitoring applications.