TalkSketch: Multimodal Generative AI for Real-time Sketch Ideation with Speech
Weiyan Shi, Sunaya Upadhyay, Geraldine Quek, Kenny Tsu Wei Choo

TL;DR
TalkSketch is a multimodal AI system that combines speech and sketching to facilitate real-time, fluid design ideation, addressing limitations of text-based prompts in creative workflows.
Contribution
We introduce TalkSketch, a novel multimodal AI tool that integrates speech and sketching to enhance early-stage design ideation.
Findings
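In a formative study (N=6), text-based prompting disrupted designers' creative flow during ideation.
TalkSketch aims to support a more fluid, continuous ideation process by capturing speech while the designer sketches.
The work highlights the potential of GenAI tools to engage the design process itself rather than focusing on output.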
Abstract
Sketching is a widely used medium for generating and exploring early-stage design concepts. While generative AI (GenAI) chatbots are increasingly used for idea generation, designers often struggle to craft effective prompts and find it difficult to express evolving visual concepts through text alone. In a formative study (N=6), we examined how designers use GenAI during ideation and found that text-based prompting disrupts creative flow. To address these issues, we developed TalkSketch, an embedded multimodal AI sketching system that integrates freehand drawing with real-time speech input. TalkSketch aims to support a more fluid ideation process by capturing verbal descriptions during sketching and generating context-aware AI responses. Our work highlights the potential of GenAI tools to engage the design process itself rather than focusing on output.
Taxonomy
Topics: Design Education and Practice · Innovative Human-Technology Interaction · AI in Service Interactions
