Toward a Unified Framework for Collaborative Design of Human-AI Interaction
Ankur Bhatt, Sven Mayer

TL;DR
This paper proposes a comprehensive framework for human-AI collaboration in multimodal interfaces, emphasizing alignment, explainability, and user agency to enhance trust and control.
Contribution
The paper introduces an integrated framework that combines multimodal alignment, real-time explainability, and user agency to improve human-AI interaction design.
Findings
The framework is demonstrated through collaborative design scenarios.
It addresses safety-critical and time-sensitive applications.
It enhances transparency and user control in AI systems.
Abstract
Human-computer interaction is shifting from screen-based systems to multimodal interfaces in which AI-powered systems increasingly interpret user intent through speech, gesture, and gaze. Yet users rarely understand how these interpretations are made, which compromises trust and control. Existing approaches treat multimodal alignment, explainability, and human agency as separate concerns, leaving critical gaps in transparency and user oversight. We propose a human-AI collaboration framework that integrates these three principles as interdependent design requirements: 1) multimodal alignment for accurate intent interpretation, 2) interaction-centric explainability delivering real-time visual, textual, and audio feedback, and 3) agency-preserving mechanisms enabling users to accept, reject, or modify AI suggestions at any time. We…
