ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents
Chia-Yu Li, Daniel Ortega, Dirk Väth, Florian Lux, Lindsey Vanderlyn, Maximilian Schmidt, Michael Neumann, Moritz Völkel, Pavel Denisov, Sabrina Jenne, Zorica Kacarevic, Ngoc Thang Vu

TL;DR
ADVISER is an open-source toolkit designed to facilitate the development of multi-modal, multi-domain, socially-engaged conversational agents, supporting diverse user expertise and enabling collaborative research.
Contribution
The paper introduces a flexible, Python-based toolkit that integrates speech, text, vision, and social engagement features for conversational agents and is accessible to both technical and non-technical users.
Findings
Supports multi-modal, multi-domain dialogue development
Enables emotion recognition and engagement prediction
Provides an easy-to-extend platform for research
Abstract
We present ADVISER - an open-source, multi-domain dialog system toolkit that enables the development of multi-modal (incorporating speech, text and vision), socially-engaged (e.g. emotion recognition, engagement level prediction and backchanneling) conversational agents. The final Python-based implementation of our toolkit is flexible, easy to use, and easy to extend not only for technically experienced users, such as machine learning researchers, but also for less technically experienced users, such as linguists or cognitive scientists, thereby providing a flexible platform for collaborative research. Link to open-source code: https://github.com/DigitalPhonetics/adviser
Taxonomy
Topics: Speech and dialogue systems · Multimodal Machine Learning Applications · Topic Modeling
