CGL: Advancing Continual GUI Learning via Reinforcement Fine-Tuning

Zhenquan Yao; Zitong Huang; Yihan Zeng; Jianhua Han; Hang Xu; Chun-Mei Feng; Jianwei Ma; Wangmeng Zuo

arXiv:2603.02951 · cs.LG · March 10, 2026

Open Access

TL;DR

This paper introduces CGL, a framework that combines supervised fine-tuning and reinforcement learning to improve continual GUI learning, effectively balancing adaptation and retention in GUI agents.

Contribution

It proposes a novel CGL framework with dynamic SFT-RL balancing, gradient surgery, and a new AndroidControl-CL benchmark for evaluating continual GUI learning.

Findings

01 CGL outperforms baseline methods in continual learning scenarios.

02 The gradient surgery strategy reduces gradient interference.

03 The AndroidControl-CL benchmark effectively evaluates GUI learning performance.
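
To make the second finding concrete, here is a minimal sketch of one common form of gradient surgery: a PCGrad-style projection that removes the conflicting component of one gradient. This page does not state the paper's exact rule, so the projection direction (editing the SFT gradient against the RL gradient) and the function names `dot` and `project_conflicting` are assumptions made for illustration only.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two flat gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(g_sft: List[float], g_rl: List[float]) -> List[float]:
    """PCGrad-style projection (an assumed stand-in, not the paper's rule).

    If the SFT gradient points against the RL gradient (negative inner
    product), subtract its component along the RL gradient. Otherwise
    return it unchanged.
    """
    d = dot(g_sft, g_rl)
    if d >= 0.0:
        return list(g_sft)  # no conflict, nothing to remove
    norm_sq = dot(g_rl, g_rl)
    if norm_sq == 0.0:
        return list(g_sft)  # defensive guard against a zero RL gradient
    scale = d / norm_sq
    return [s - scale * r for s, r in zip(g_sft, g_rl)]


# Hypothetical two-parameter example: the gradients disagree on the
# second coordinate, so that component is removed from the SFT gradient.
conflict_fixed = project_conflicting([1.0, -1.0], [0.0, 1.0])
no_conflict = project_conflicting([1.0, 1.0], [0.0, 1.0])
```

After projection, the edited SFT gradient has a non-negative inner product with the RL gradient, which is the sense in which this kind of surgery reduces interference between the two objectives.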

Abstract

Graphical User Interface (GUI) agents have advanced considerably thanks to recent progress in multimodal large language models (MLLMs). However, because GUI applications are updated frequently, adapting to new tasks without forgetting old ones remains an open problem in GUI continual learning. In this work, we reveal that while Supervised Fine-Tuning (SFT) facilitates fast adaptation, it often triggers knowledge overwriting, whereas Reinforcement Learning (RL) demonstrates an inherent resilience that shields prior interaction logic from erasure. Based on this insight, we propose a Continual GUI Learning (CGL) framework that dynamically balances adaptation efficiency and skill retention by enhancing the synergy between SFT and RL. Specifically, we introduce an SFT proportion adjustment mechanism guided by policy entropy to dynamically control…
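
As a rough illustration of what an entropy-guided SFT proportion could look like, the sketch below maps normalized policy entropy to a mixing ratio. Everything beyond "policy entropy guides the SFT proportion" is an assumption: the linear mapping, the direction (higher entropy gives more SFT), the bounds `min_ratio` and `max_ratio`, and the function names are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def policy_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sft_proportion(
    probs: Sequence[float],
    min_ratio: float = 0.1,  # hypothetical lower bound on the SFT share
    max_ratio: float = 0.9,  # hypothetical upper bound on the SFT share
) -> float:
    """Map normalized entropy in [0, 1] linearly to an SFT share.

    Assumed direction: an uncertain (high-entropy) policy receives more
    supervised signal, a confident one leans on RL. The source does not
    confirm this choice.
    """
    k = len(probs)
    if k < 2:
        return min_ratio  # entropy is zero for a single action
    normalized = policy_entropy(probs) / math.log(k)
    normalized = min(1.0, max(0.0, normalized))  # clamp rounding error
    return min_ratio + (max_ratio - min_ratio) * normalized


# Hypothetical distributions over four candidate actions.
uncertain_share = sft_proportion([0.25, 0.25, 0.25, 0.25])
confident_share = sft_proportion([1.0, 0.0, 0.0, 0.0])
```

Under these assumptions, a uniform policy yields the maximum SFT share and a deterministic policy yields the minimum, so the schedule shifts weight toward RL as the policy becomes more confident.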

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling