TL;DR
BlaBla is an open-source Python library that extracts clinically validated linguistic features in many languages, supporting research on neurological and psychiatric disorders.
Contribution
The paper introduces BlaBla, a unified framework for extracting clinically relevant linguistic features across many languages. Its features are validated across 12 diseases, and the library is demonstrated on real clinical data from the AphasiaBank dataset.
Findings
Features clinically validated across 12 diseases
Multi-language support, demonstrated in three languages
Applied to visualizing and classifying language disorders on real clinical data from AphasiaBank
Abstract
We introduce BlaBla, an open-source Python library for extracting linguistic features with proven clinical relevance to neurological and psychiatric diseases across many languages. BlaBla is a unifying framework for accelerating and simplifying clinical linguistic research. The library is built on state-of-the-art NLP frameworks and supports multithreaded/GPU-enabled feature extraction via both native Python calls and a command line interface. We describe BlaBla's architecture and clinical validation of its features across 12 diseases. We further demonstrate the application of BlaBla to a task visualizing and classifying language disorders in three languages on real clinical data from the AphasiaBank dataset. We make the codebase freely available to researchers with the hope of providing a consistent, well-validated foundation for the next generation of clinical linguistic research.
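To make "linguistic feature extraction" concrete, here is a minimal sketch of the general idea in plain Python. It does not use BlaBla's API. The function names `lexical_features` and `extract_batch` and the two features shown (type-token ratio and mean sentence length) are assumptions chosen for illustration, and the thread pool only mirrors the batch-processing shape the abstract mentions, not the library's actual implementation.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def lexical_features(text):
    """Compute two simple lexical features for one transcript (illustrative only)."""
    # Naive sentence split on terminal punctuation; real pipelines use an NLP tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w+ is Unicode-aware in Python 3, so this also handles non-English scripts
    # that separate words with whitespace or punctuation.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return {"type_token_ratio": 0.0, "mean_sentence_length": 0.0}
    return {
        # Distinct word forms divided by total words: a rough lexical-diversity measure.
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Average number of tokens per sentence.
        "mean_sentence_length": len(tokens) / len(sentences),
    }


def extract_batch(transcripts, workers=4):
    """Apply lexical_features to many transcripts using a thread pool (hypothetical helper)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the transcripts.
        return list(pool.map(lexical_features, transcripts))


if __name__ == "__main__":
    for features in extract_batch(["The cat sat. The cat ran.", "Hello there."]):
        print(features)
```

Features of this kind are surface-level. The features the abstract describes are built on NLP frameworks, which typically supply part-of-speech tags and syntactic parses that a regex-based sketch cannot.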
