Cognitive Principles in Robust Multimodal Interpretation

J. Y. Chai; Z. Prasov; S. Qu

arXiv:1109.6361 · cs.AI · September 30, 2011

TL;DR

This paper introduces a cognitively inspired greedy algorithm for interpreting users' multimodal referring expressions in conversational interfaces. The algorithm resolves references across speech and gesture inputs efficiently and has the potential to make multimodal interpretation more robust.

Contribution

The paper presents a simple and general greedy algorithm for multimodal reference resolution that incorporates the cognitive principles of Conversational Implicature and the Givenness Hierarchy.

Findings

01 Efficiently resolves a variety of user references

02 Demonstrates advantages over previous methods in empirical tests

03 Potential to improve robustness of multimodal interpretation

Abstract

Multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech and gesture. To build effective multimodal interfaces, automated interpretation of users' multimodal inputs is important. Inspired by previous investigations of cognitive status in multimodal human-machine interaction, we have developed a greedy algorithm for interpreting user referring expressions (i.e., multimodal reference resolution). This algorithm incorporates the cognitive principles of Conversational Implicature and Givenness Hierarchy and applies constraints from various sources (e.g., temporal, semantic, and contextual) to resolve references. Our empirical results have shown the advantage of this algorithm in efficiently resolving a variety of user references. Because of its simplicity and generality, this approach has the…
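To make the idea of greedy reference resolution under constraints concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not give its scoring or matching details, so everything below is an assumption made for illustration. In particular, the `Expression` and `Candidate` types, the `score` function, the 2-second temporal window, the neutral weight of 0.5 for context-only candidates, and the `salience` field are all hypothetical, and the Conversational Implicature and Givenness Hierarchy principles are not modeled. The sketch only shows the general pattern: score every (expression, candidate) pair with a semantic and a temporal constraint, then greedily commit the best remaining pair.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expression:
    text: str        # the spoken referring expression
    sem_type: str    # semantic type it asks for (hypothetical field)
    time: float      # time it was spoken, in seconds


@dataclass(frozen=True)
class Candidate:
    name: str
    sem_type: str
    time: Optional[float]  # gesture time, or None if the object comes from context
    salience: float        # assumed selection/context score in [0, 1]


def score(expr, cand, window=2.0):
    """Assumed compatibility score; 0.0 means a constraint is violated."""
    if expr.sem_type != cand.sem_type:   # semantic constraint
        return 0.0
    if cand.time is None:                # context-only candidate: neutral weight (assumption)
        temporal = 0.5
    else:                                # temporal constraint: closer in time scores higher
        temporal = max(0.0, 1.0 - abs(expr.time - cand.time) / window)
    return cand.salience * temporal


def greedy_resolve(expressions, candidates):
    """Greedily pair expressions with candidates, best-scoring pair first."""
    pairs = sorted(
        (
            (-score(e, c), i, j)
            for i, e in enumerate(expressions)
            for j, c in enumerate(candidates)
        )
    )
    result = {e.text: None for e in expressions}  # unresolved expressions stay None
    used_expr, used_cand = set(), set()
    for neg_score, i, j in pairs:
        if neg_score == 0:   # remaining pairs are all incompatible
            break
        if i in used_expr or j in used_cand:
            continue
        result[expressions[i].text] = candidates[j].name
        used_expr.add(i)
        used_cand.add(j)
    return result


expressions = [
    Expression("this house", "house", 1.0),
    Expression("that castle", "castle", 3.0),
]
candidates = [
    Candidate("house_3", "house", 1.2, 0.9),
    Candidate("castle_1", "castle", 3.1, 0.8),
    Candidate("house_7", "house", None, 0.4),
]
resolved = greedy_resolve(expressions, candidates)
print(resolved)
```

In this toy input, "this house" pairs with the gestured object `house_3` instead of the context-only `house_7`, because the gesture is both salient and close in time. A greedy pass like this never revisits a committed pair, which keeps it cheap but means it can miss a globally better assignment.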
