Lexico-acoustic Neural-based Models for Dialog Act Classification
Daniel Ortega, Ngoc Thang Vu

TL;DR
This paper introduces a neural model that combines lexical and acoustic features to improve dialog act classification accuracy, especially when lexical information is limited or strong lexical cues are absent.
Contribution
According to the authors, earlier neural models for dialog act classification had not explored acoustic information. This paper incorporates acoustic features alongside lexical ones and demonstrates their usefulness.
Findings
Acoustic features improve overall classification accuracy on two benchmark datasets.
Acoustic cues are especially helpful when lexical information is limited or strong lexical cues are not present.
Acoustic features are also valuable when a dialog act has sufficient data.
Abstract
Recent works have proposed neural models for dialog act classification in spoken dialogs. However, they have not explored the role and the usefulness of acoustic information. We propose a neural model that processes both lexical and acoustic features for classification. Our results on two benchmark datasets reveal that acoustic features are helpful in improving the overall accuracy. Finally, a deeper analysis shows that acoustic features are valuable in three cases: when a dialog act has sufficient data, when lexical information is limited and when strong lexical cues are not present.
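The abstract does not say how the lexical and acoustic features are combined, so the following is only a minimal sketch of one common option: concatenating the two feature vectors and passing them through a single softmax layer. The dimensions, the random weights, the example vectors, and the names `classify`, `linear`, and `softmax` are all illustrative assumptions, not the authors' architecture.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def classify(lexical_vec, acoustic_vec, weights, bias):
    """Hypothetical fusion: concatenate both views, then softmax over dialog acts."""
    fused = lexical_vec + acoustic_vec  # list concatenation, not addition
    return softmax(linear(weights, bias, fused))


# Assumed toy sizes: 4 lexical dims, 3 acoustic dims, 5 dialog act classes.
LEX_DIM, AC_DIM, NUM_ACTS = 4, 3, 5
rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(LEX_DIM + AC_DIM)]
           for _ in range(NUM_ACTS)]
bias = [0.0] * NUM_ACTS

# Stand-ins for an utterance encoding and an acoustic feature summary.
lexical_vec = [0.2, -0.1, 0.4, 0.0]
acoustic_vec = [0.5, 0.1, -0.3]

probs = classify(lexical_vec, acoustic_vec, weights, bias)
predicted_act = max(range(NUM_ACTS), key=probs.__getitem__)
```

In a trained model the two input vectors would come from learned encoders over the word sequence and the speech signal. Under this assumed design, the acoustic vector still carries signal when the lexical vector is uninformative, which is the situation the abstract highlights.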
