Open-Source Manually Annotated Vocal Tract Database for Automatic Segmentation from 3D MRI Using Deep Learning: Benchmarking 2D and 3D Convolutional and Transformer Networks
Subin Erattakulangara, Karthika Kelat, Katie Burnham, Rachel Balbi, Sarah E. Gerard, David Meyer, Sajan Goud Lingala

TL;DR
This paper introduces an open-source, manually annotated database of 3D MRI vocal tract scans and benchmarks 2D and 3D convolutional and Transformer networks for automatic segmentation, with the goal of reducing the time and errors associated with manual segmentation.
Contribution
It provides a new annotated dataset and compares the performance of various deep learning architectures for vocal tract segmentation from 3D MRI.
Findings
Deep learning models achieve high segmentation accuracy
Transformers outperform CNNs in certain scenarios
Open-source dataset facilitates future research
Abstract
Accurate segmentation of the vocal tract from magnetic resonance imaging (MRI) data is essential for various voice and speech applications. Manual segmentation is time-intensive and susceptible to errors. This study aimed to evaluate the efficacy of deep learning algorithms for automatic vocal tract segmentation from 3D MRI.
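The summary does not say which metric was used to evaluate the models. As an illustration only, one common way to compare an automatic segmentation against a manual annotation is the Dice overlap coefficient. The sketch below assumes binary 3D masks stored as NumPy arrays; the function name `dice_coefficient` and the toy volumes are hypothetical and are not taken from the paper or its dataset.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks of the same shape.

    Returns 2*|A ∩ B| / (|A| + |B|), a value in [0, 1].
    Two empty masks are treated as a perfect match (1.0) by convention here.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0
    intersection = np.logical_and(pred, truth).sum()
    return float(2.0 * intersection / total)


# Toy 3D volumes standing in for a manual and an automatic segmentation
# (hypothetical data, not from the database described above).
manual = np.zeros((4, 4, 4), dtype=bool)
manual[1:3, 1:3, 1:3] = True       # 8 voxels labelled as vocal tract

predicted = np.zeros((4, 4, 4), dtype=bool)
predicted[1:3, 1:3, 1:4] = True    # 12 voxels, 8 of which overlap the manual mask

score = dice_coefficient(predicted, manual)
print(f"Dice: {score:.2f}")
```

A score near 1 indicates close agreement with the manual annotation; a score near 0 indicates little overlap. Whether the study reports this metric, a surface-distance metric, or something else would need to be checked against the full paper.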
