Whither the Priors for (Vocal) Interactivity?
Roger K. Moore

TL;DR
This paper argues that the lack of foundational principles or priors in vocal human-robot interaction leads to unnatural conversations, highlighting the need for theoretical insights to improve communication effectiveness.
Contribution
The paper identifies the absence of fundamental design principles and priors as a key obstacle to natural vocal interaction with robots, and argues that theoretical frameworks are needed.
Findings
Current spoken language systems require extensive training data.
Interactions are often stilted and one-sided.
Design principles for effective vocal communication are lacking.
Abstract
Voice-based communication is often cited as one of the most 'natural' ways in which humans and robots might interact, and the recent availability of accurate automatic speech recognition and intelligible speech synthesis has enabled researchers to integrate advanced off-the-shelf spoken language technology components into their robot platforms. Despite this, the resulting interactions are anything but 'natural'. It transpires that simply giving a robot a voice doesn't mean that a user will know how (or when) to talk to it, and the resulting 'conversations' tend to be stilted, one-sided and short. On the surface, these difficulties might appear to be fairly trivial consequences of users' unfamiliarity with robots (and vice versa), and that any problems would be mitigated by long-term use by the human, coupled with 'deep learning' by the robot. However, it is argued here that such…
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Language and cultural evolution
