A 'Canny' Approach to Spoken Language Interfaces
Roger K. Moore

TL;DR
This position paper argues that users of voice-enabled devices fail to engage with the devices' full capabilities (a 'habitability gap'). Drawing a parallel with the 'uncanny valley' effect, it proposes aligning the visual, vocal, behavioural and cognitive affordances of future devices.
Contribution
It proposes a design approach for voice interfaces in which a device's visual, vocal, behavioural and cognitive affordances are aligned with one another, so that users can engage with the device's full capabilities.
Findings
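Identifies a 'habitability gap' in voice interfaces: users fail to engage with the full capabilities of devices such as Amazon Echo.
Proposes aligning a device's visual, vocal, behavioural and cognitive affordances, by analogy with the 'uncanny valley' effect.
Suggests that such alignment could close the habitability gap in future voice-enabled devices.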
Abstract
Voice-enabled artefacts such as Amazon Echo are very popular, but there appears to be a 'habitability gap' whereby users fail to engage with the full capabilities of the device. This position paper draws a parallel with the 'uncanny valley' effect, thereby proposing a solution based on aligning the visual, vocal, behavioural and cognitive affordances of future voice-enabled devices.
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Speech and Audio Processing
