Best Practices and Considerations for Child Speech Corpus Collection and Curation in Educational, Clinical, and Forensic Scenarios
John Hansen, Satwik Dutta, Ellen Grand

TL;DR
This paper provides comprehensive best practices and guidelines for collecting and curating child speech corpora, addressing challenges related to developmental changes and privacy across educational, clinical, and forensic applications.
Contribution
It offers a structured framework for data collection, collaboration, and quality assurance tailored to the unique needs of child speech data in various fields.
Findings
Guidelines for ethical data collection and collaboration.
Strategies for ensuring data quality and privacy.
Practical steps for corpus annotation and management.
Abstract
A child's spoken ability continues to change until adulthood. Until 7-8 years of age, their speech sound development and language structure evolve rapidly. This dynamic shift in spoken communication skills, together with data privacy concerns, makes it challenging to curate technology-ready speech corpora for children. This study aims to bridge this gap and provide researchers and practitioners with best practices and considerations for developing such a corpus based on an intended goal. Although primarily focused on educational goals, applications of child speech data have spread to other fields, including clinical and forensic ones. Motivated by this goal, we describe the WHO, WHAT, WHEN, and WHERE of data collection, inspired by prior collection efforts and our experience/knowledge. We also provide a guide for establishing collaboration and trust and for navigating the human subjects research protocol.…
Taxonomy
Topics: Ethics and Legal Issues in Pediatric Healthcare · Interpreting and Communication in Healthcare
