TL;DR
This paper introduces a novel deep learning model for Pictionary-style word guessing using sketch data, combining elements of visual question answering and game simulation, with promising results in mimicking human guessing behavior.
Contribution
The work presents the first computational model for Pictionary, integrating sketch-based visual data with open-ended guessing, and introduces the Sketch-QA dataset for this task.
Findings
The model generates human-like guesses and mistakes.
Experimental results show promising accuracy and human mimicry.
The Sketch-QA dataset enables new research in sketch-based visual reasoning.
Abstract
The ability of intelligent agents to play games in human-like fashion is popularly considered a benchmark of progress in Artificial Intelligence. Similarly, performance on multi-disciplinary tasks such as Visual Question Answering (VQA) is considered a marker for gauging progress in Computer Vision. In our work, we bring games and VQA together. Specifically, we introduce the first computational model aimed at Pictionary, the popular word-guessing social game. We first introduce Sketch-QA, an elementary version of the Visual Question Answering task. Styled after Pictionary, Sketch-QA uses incrementally accumulated sketch stroke sequences as visual data. Notably, Sketch-QA involves asking a fixed question ("What object is being drawn?") and gathering open-ended guess-words from human guessers. We analyze the resulting dataset and present many interesting findings therein. To mimic…
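The Sketch-QA setup described in the abstract (a sketch revealed stroke by stroke, one fixed question, free-form guess-words) can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the names `GuessSession`, `accumulate`, and `guess_timeline`, the `(x, y)` point format, and the one-guess-slot-per-stroke layout are all hypothetical and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

Point = Tuple[float, float]   # (x, y) pen position; coordinate format is assumed
Stroke = List[Point]          # one pen-down to pen-up trace

# The fixed question quoted in the abstract.
QUESTION = "What object is being drawn?"


@dataclass
class GuessSession:
    """Hypothetical record for one guesser watching one sketch."""
    strokes: List[Stroke]
    # guesses[i] is the guess-word entered after the (i+1)-th stroke, or None.
    guesses: List[Optional[str]] = field(default_factory=list)


def accumulate(strokes: List[Stroke]) -> Iterator[List[Stroke]]:
    """Yield the partial sketch visible after each additional stroke."""
    partial: List[Stroke] = []
    for stroke in strokes:
        partial.append(stroke)
        yield list(partial)  # copy so earlier snapshots are not mutated


def guess_timeline(session: GuessSession) -> List[Tuple[int, str]]:
    """Return (number of strokes shown, guess-word) for every step with a guess."""
    return [
        (n_shown, guess)
        for n_shown, guess in enumerate(session.guesses, start=1)
        if guess is not None
    ]
```

In this layout, a model that mimics a human guesser would take each partial sketch from `accumulate` as input and emit either a guess-word or nothing, which is one plausible way to frame the open-ended guessing task.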
