PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents

Yuxiang Chai; Shunye Tang; Han Xiao; Rui Liu; Hongsheng Li

arXiv:2603.08013 · cs.AI · March 10, 2026

TL;DR

This paper introduces PIRA-Bench, a new benchmark for evaluating multimodal large language models in proactive GUI-based assistant tasks, addressing challenges of real-world, noisy, and complex visual input trajectories.

Contribution

The paper presents PIRA-Bench, a novel benchmark for proactive GUI agents, and proposes PIRF, a baseline framework for managing multiple tasks and noisy inputs in visual assistant scenarios.

Findings

01. PIRA-Bench effectively challenges models with complex, noisy visual trajectories.

02. The PIRF baseline demonstrates improved task management and robustness in proactive GUI tasks.

03. The benchmark paves the way for developing more intelligent, anticipatory GUI assistants.

Abstract

Current Graphical User Interface (GUI) agents operate primarily under a reactive paradigm: a user must provide an explicit instruction for the agent to execute a task. However, an intelligent AI assistant should be proactive: capable of anticipating user intentions directly from continuous visual inputs, such as mobile or desktop screenshots, and of offering timely recommendations without explicit user prompting. Transitioning to this proactive paradigm presents significant challenges. Real-world screen activity is rarely linear; it consists of long-horizon trajectories fraught with noisy browsing, meaningless actions, and multithreaded task-switching. To address this gap, we introduce PIRA-Bench (Proactive Intent Recommendation Agent Benchmark), a novel benchmark for evaluating multimodal large language models (MLLMs) on continuous, weakly-supervised visual inputs. Unlike…

Code & Models

Datasets

Yuxiang007/PIRA-Bench-data
dataset · 40 dl

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Social Robot Interaction and HRI