A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images

Yongwan Lim; Asterios Toutios; Yannick Bliesener; Ye Tian; Sajan Goud Lingala; Colin Vaz; Tanner Sorensen; Miran Oh; Sarah Harper; Weiyi Chen; Yoonjeong Lee; Johannes Töger; Mairym Lloréns Montesserin; Caitlin Smith; Bianca Godinez; Louis Goldstein; Dani Byrd; Krishna S. Nayak; Shrikanth S. Narayanan

arXiv:2102.07896·eess.SP·July 23, 2021

TL;DR

This paper introduces a comprehensive, publicly available multispeaker RT-MRI dataset capturing raw and reconstructed speech production data, including 2D videos, 3D volumetric images, and synchronized audio, to advance speech science and related fields.

Contribution

It provides the first open dataset with raw multi-coil RT-MRI data of speech production, enabling improved reconstruction, artifact correction, and biomarker extraction methods.

Findings

01  Dataset includes 75 subjects performing speech tasks

02  Provides raw multi-coil RT-MRI data and synchronized audio

03  Includes 3D volumetric and static anatomical MRI images

Abstract

Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. Imaging the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically relevant biomarkers. The present…
