Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz, Mario Giulianelli, Sandro Pezzelle, Arabella Sinclair, Raquel Fernández

TL;DR
This paper presents a generation model for first and subsequent references in visually grounded dialogue. Grounding in both the visual and the conversational context yields more effective referring utterances than a model that ignores the dialogue context.
Contribution
It introduces a generation model that produces referring utterances grounded in both the visual and the conversational context, together with a reference resolution system used to assess the referring effectiveness of the generated utterances.
Findings
The model produces more effective referring utterances than a model not grounded in the dialogue context.
The generated references exhibit human-like linguistic patterns.
Referring effectiveness is assessed with a reference resolution system implemented for that purpose.
Abstract
Dialogue participants often refer to entities or situations repeatedly within a conversation, which contributes to its cohesiveness. Subsequent references exploit the common ground accumulated by the interlocutors and hence have several interesting properties, namely, they tend to be shorter and reuse expressions that were effective in previous mentions. In this paper, we tackle the generation of first and subsequent references in visually grounded dialogue. We propose a generation model that produces referring utterances grounded in both the visual and the conversational context. To assess the referring effectiveness of its output, we also implement a reference resolution system. Our experiments and analyses show that the model produces better, more effective referring utterances than a model not grounded in the dialogue context, and generates subsequent references that exhibit…
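The two properties the abstract attributes to subsequent references, being shorter and reusing earlier wording, can be made concrete with a small measurement. The sketch below is only an illustration under stated assumptions: the function names, the whitespace tokenization, the two ratios, and the example utterances are all hypothetical and are not taken from the paper, its model, or its data.

```python
def tokenize(utterance: str) -> list[str]:
    """Lowercase whitespace tokenization, stripping surrounding punctuation.

    Assumption: a deliberately crude tokenizer, chosen only for illustration.
    """
    tokens = (tok.strip(".,!?;:\"'()") for tok in utterance.lower().split())
    return [tok for tok in tokens if tok]


def reduction_and_reuse(previous: str, subsequent: str) -> tuple[float, float]:
    """Compare a later mention with an earlier one.

    Returns (length_ratio, reuse_ratio), both hypothetical measures:
      length_ratio: tokens in the later mention / tokens in the earlier one
                    (values below 1.0 mean the later mention is shorter)
      reuse_ratio:  share of the later mention's tokens that already
                    appeared in the earlier mention
    """
    prev_tokens = tokenize(previous)
    subs_tokens = tokenize(subsequent)
    if not prev_tokens or not subs_tokens:
        raise ValueError("both utterances must contain at least one token")

    length_ratio = len(subs_tokens) / len(prev_tokens)
    prev_vocab = set(prev_tokens)
    reuse_ratio = sum(tok in prev_vocab for tok in subs_tokens) / len(subs_tokens)
    return length_ratio, reuse_ratio


# Invented example utterances, not drawn from any dataset.
first = "the woman in a red coat holding a big umbrella"
later = "the red coat woman"
ratio, reuse = reduction_and_reuse(first, later)
print(f"length ratio: {ratio:.2f}, reuse ratio: {reuse:.2f}")
```

In this toy pair the later mention is both shorter and built entirely from words of the first mention, which is the pattern the abstract describes for subsequent references.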
