Towards Enhanced Context Awareness with Vision-based Multimodal Interfaces

Yongquan Hu; Wen Hu; Aaron Quigley

arXiv:2408.07488·cs.HC·August 15, 2024

TL;DR

This paper explores how vision-based multimodal interfaces can improve context awareness in human-computer interaction by integrating AI-driven visual modalities like scale, depth, and time for more seamless interactions.

Contribution

It presents three application cases demonstrating enhanced context awareness through multimodal visual data in HCI, focusing on scale, depth, and temporal information.

Findings

01. Enhanced interpretation of user intentions using multimodal visual data

02. Improved environmental understanding through depth and microscopic imaging

03. Seamless virtual interactions with integrated haptic feedback

Abstract

Vision-based Interfaces (VIs) are pivotal in advancing Human-Computer Interaction (HCI), particularly in enhancing context awareness. Rapid advances in multimodal Artificial Intelligence (AI) open significant new opportunities for these interfaces and promise a future of tight coupling between humans and intelligent systems. AI-driven VIs, when integrated with other modalities, offer a robust way to capture and interpret user intentions and complex environmental information, thereby facilitating seamless and efficient interactions. This PhD study explores three application cases of multimodal interfaces to augment context awareness, each focusing on one dimension of the visual modality (scale, depth, and time): a fine-grained analysis of physical surfaces via microscopic images, precise projection of the real world using depth data,…
