@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology
Xin Jiang, Junwei Zheng, Ruiping Liu, Jiahang Li, Jiaming Zhang, Sven Matthiesen, and Rainer Stiefelhagen

TL;DR
This paper introduces @Bench, a benchmark for evaluating vision-language models as assistive technology for people with visual impairments, along with @Model, a multi-task model that addresses all five benchmark tasks simultaneously.
Contribution
The paper presents a novel benchmark (@Bench) for human-centered assistive tasks and a new multi-task model (@Model) that addresses multiple vision-language tasks simultaneously.
Findings
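The benchmark covers five assistive tasks chosen through a pre-design user study with people with visual impairments: Panoptic Segmentation, Depth Estimation, Optical Character Recognition (OCR), Image Captioning, and Visual Question Answering (VQA).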
The proposed model outperforms existing methods across tasks.
Experiments demonstrate the model's effectiveness and generalizability.
Abstract
As Vision-Language Models (VLMs) advance, human-centered Assistive Technologies (ATs) for helping People with Visual Impairments (PVIs) are evolving into generalists, capable of performing multiple tasks simultaneously. However, benchmarking VLMs for ATs remains under-explored. To bridge this gap, we first create a novel AT benchmark (@Bench). Guided by a pre-design user study with PVIs, our benchmark includes the five most crucial vision-language tasks: Panoptic Segmentation, Depth Estimation, Optical Character Recognition (OCR), Image Captioning, and Visual Question Answering (VQA). Besides, we propose a novel AT model (@Model) that addresses all tasks simultaneously and can be expanded to more assistive functions for helping PVIs. Our framework exhibits outstanding performance across tasks by integrating multi-modal information, and it offers PVIs a more comprehensive assistance.…
Taxonomy
Topics: Assistive Technology in Communication and Mobility · Digital Accessibility for Disabilities · Smart Cities and Technologies
