Emergent Communication in Interactive Sketch Question Answering
Zixing Lei, Yiming Zhang, Yuxin Xiong, Siheng Chen

TL;DR
This paper introduces a new multi-round interactive sketch-based question answering task that enhances communication efficiency and interpretability between agents, filling a significant gap in vision-based emergent communication research.
Contribution
It proposes the ISQA task and a novel interactive EC system that balances accuracy, complexity, and interpretability in multi-round sketch communication.
Findings
Multi-round interaction improves communication effectiveness.
The system balances question answering accuracy, drawing complexity, and human interpretability.
Human evaluation confirms the system's interpretability and efficiency.
Abstract
Vision-based emergent communication (EC) aims to learn to communicate through sketches and demystify the evolution of human communication. Ironically, previous works neglect multi-round interaction, which is indispensable in human communication. To fill this gap, we first introduce a novel Interactive Sketch Question Answering (ISQA) task, where two collaborative players interact through sketches to answer a question about an image in a multi-round manner. To accomplish this task, we design a new and efficient interactive EC system, which can achieve an effective balance among three evaluation factors: question answering accuracy, drawing complexity, and human interpretability. Our experimental results, including human evaluation, demonstrate that a multi-round interactive mechanism facilitates targeted and efficient communication between intelligent agents with decent…
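The multi-round exchange described in the abstract can be pictured with a toy sketch. This is not the authors' system: the region-to-strokes image representation, the function name `run_isqa_episode`, the `answer_fn` callback, and the "one stroke per requested region per round" rule are all assumptions made for illustration. It only shows how feedback from the answering player can steer later drawing rounds and keep the stroke count (a stand-in for drawing complexity) low.

```python
def run_isqa_episode(image, question_region, answer_fn, max_rounds=3):
    """Toy multi-round sketch exchange (hypothetical, not the paper's model).

    image: dict mapping a region name to its ordered list of strokes.
    question_region: the region the question is about (known to the receiver only).
    answer_fn: maps the strokes received so far for that region to an answer,
               or None when the receiver cannot answer yet.
    Returns (answer, strokes_sent, rounds_used).
    """
    canvas = {region: [] for region in image}  # what the receiver has seen
    request = list(image)  # round 1: no feedback yet, so every region is requested
    strokes_sent = 0

    for round_idx in range(1, max_rounds + 1):
        # Sender: add one more stroke to each requested region, if any remain.
        for region in request:
            drawn = len(canvas[region])
            if drawn < len(image[region]):
                canvas[region].append(image[region][drawn])
                strokes_sent += 1

        # Receiver: try to answer from the sketch received so far.
        answer = answer_fn(canvas[question_region])
        if answer is not None:
            return answer, strokes_sent, round_idx

        # Receiver feedback: ask for more detail only where the question points.
        request = [question_region]

    return None, strokes_sent, max_rounds


image = {
    "cat": ["outline", "stripes"],
    "sofa": ["outline", "cushions"],
    "lamp": ["outline"],
}
result = run_isqa_episode(
    image,
    question_region="cat",
    answer_fn=lambda strokes: "striped" if "stripes" in strokes else None,
)
print(result)
```

In this toy run the receiver answers after two rounds using four of the five available strokes, because the second round draws only the region the question concerns.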
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
