StepWrite: Adaptive Planning for Speech-Driven Text Generation
Hamza El Alaoui, Atieh Taheri, Yi-Hao Peng, Jeffrey P. Bigham

TL;DR
StepWrite is a novel voice-based system that enables structured, adaptive, and hands-free long-form text composition by guiding users through subtasks with context-aware prompts, reducing cognitive load and improving usability.
Contribution
The paper introduces a large language model-driven system that decomposes writing into subtasks and dynamically guides users through them, supporting complex text creation on the move.
Findings
Significantly reduces cognitive load during mobile text composition.
Improves user satisfaction and usability over baseline dictation and voice assistant methods.
Demonstrates effective dynamic prompt generation and fact-checking capabilities.
Abstract
People frequently use speech-to-text systems to compose short texts with voice. However, current voice-based interfaces struggle to support composing more detailed, contextually complex texts, especially in scenarios where users are on the move and cannot visually track progress. Longer-form communication, such as composing structured emails or thoughtful responses, requires persistent context tracking, structured guidance, and adaptability to evolving user intentions--capabilities that conventional dictation tools and voice assistants do not support. We introduce StepWrite, a large language model-driven voice-based interaction system that augments human writing ability by enabling structured, hands-free and eyes-free composition of longer-form texts while on the move. StepWrite decomposes the writing process into manageable subtasks and sequentially guides users with contextually-aware…
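The decompose-and-guide idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Session` class, the `next_prompt` planner, and the `listen` callback are hypothetical names, and the hard-coded rules in `next_prompt` stand in for what the abstract describes as a large language model choosing the next question from context. The `listen` callback is assumed to speak a question aloud and return the transcribed reply.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Session:
    """Tracks what the user has said so far (the persistent context)."""
    goal: str
    answers: Dict[str, str] = field(default_factory=dict)


def next_prompt(session: Session) -> Optional[Tuple[str, str]]:
    """Hypothetical planner: pick the next subtask from the answers so far.

    A real system would presumably query a language model here; these
    hand-written rules only illustrate that later questions can depend
    on earlier replies.
    """
    a = session.answers
    if "recipient" not in a:
        return "recipient", "Who is this message for?"
    if "purpose" not in a:
        return "purpose", f"What do you want to tell {a['recipient']}?"
    # Adaptive step: only ask about timing if the user mentioned a meeting.
    if "meeting" in a["purpose"].lower() and "time" not in a:
        return "time", "When should the meeting take place?"
    return None  # nothing left to ask


def run_session(goal: str, listen: Callable[[str], str]) -> Session:
    """Ask one question at a time until the planner has nothing left."""
    session = Session(goal)
    while True:
        step = next_prompt(session)
        if step is None:
            return session
        key, question = step
        session.answers[key] = listen(question).strip()
```

In use, `listen` would wrap text-to-speech and speech-to-text; for experimentation it can be any function that maps a question string to a reply string, such as a lambda reading from a scripted list.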
