CRAFT-GUI: Curriculum-Reinforced Agent For GUI Tasks
Songqin Nong, Xiaoxuan Tang, Jingxuan Xu, Sheng Zhou, Jianfeng Chen, Tao Jiang, Wenhao Xu

TL;DR
CRAFT-GUI introduces a curriculum reinforcement learning framework that improves GUI task performance by accounting for variation in task difficulty and providing nuanced reward signals, leading to significant benchmark improvements.
Contribution
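The paper presents a novel curriculum learning approach, paired with a new reward function, for reinforcement learning in GUI tasks. It addresses two limitations of prior work: treating all training data as a uniform set regardless of difficulty, and collapsing task-specific feedback into a single coarse reward.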
Findings
Achieves a 5.6% improvement on the Android Control benchmark.
Achieves a 10.3% improvement on internal benchmarks.
Validates the effectiveness of curriculum RL in GUI tasks.
Abstract
As autonomous agents become adept at understanding and interacting with graphical user interface (GUI) environments, a new era of automated task execution is emerging. Recent studies have demonstrated that Reinforcement Learning (RL) can effectively enhance agents' performance in dynamic interactive GUI environments. However, these methods face two key limitations: (1) they overlook the significant variation in difficulty across different GUI tasks by treating the entire training data as a uniform set, which hampers the agent's ability to adapt its learning process; and (2) most approaches collapse task-specific nuances into a single, coarse reward, leaving the agent with a uniform signal that yields inefficient policy updates. To address these limitations, we propose CRAFT-GUI, a curriculum learning framework based on Group Relative Policy Optimization (GRPO) that explicitly accounts…
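The two ingredients named in the abstract, GRPO and a difficulty-aware curriculum, can be illustrated with a minimal sketch. It is not the authors' implementation. It shows only the standard group-relative advantage that GRPO is built on (each rollout's reward normalized against the other rollouts sampled for the same prompt) and a simple easy-to-hard staging of tasks. The function names, the `difficulty` field, the use of the population standard deviation, and the stage-splitting rule are all assumptions made for illustration; how the paper measures difficulty and shapes its reward is not specified in the excerpt above.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group.

    This is the generic GRPO-style advantage: (r - group mean) / group std.
    Using the population std and the eps guard are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def curriculum_stages(tasks, num_stages=3):
    """Sort tasks from easy to hard and split them into at most num_stages chunks.

    The "difficulty" key is a hypothetical field, not taken from the paper.
    """
    if not tasks:
        return []
    ordered = sorted(tasks, key=lambda t: t["difficulty"])
    size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    # Four rollouts for one prompt: two succeeded (reward 1), two failed (reward 0).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))

    # Hypothetical tasks with made-up difficulty scores.
    tasks = [
        {"name": "multi-app workflow", "difficulty": 3},
        {"name": "tap a button", "difficulty": 1},
        {"name": "fill a form", "difficulty": 2},
    ]
    for stage in curriculum_stages(tasks):
        print([t["name"] for t in stage])
```

When every rollout in a group receives the same reward, all advantages are zero and the group contributes no learning signal. This is one reason a single coarse reward can lead to inefficient policy updates, as the abstract argues.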
Taxonomy
Topics: Reinforcement Learning in Robotics · Teaching and Learning Programming · Explainable Artificial Intelligence (XAI)
