LiteGUI: Distilling Compact GUI Agents with Reinforcement Learning

Yubin Wu; Zicheng Cai; Liping Ning; Hua Wang; Zhi Chen; Yaohua Tang; Hao Chen

arXiv:2605.07505 · cs.AI · May 11, 2026


TL;DR

This paper introduces an SFT-free training paradigm for lightweight GUI agents that combines knowledge distillation with dual-level exploration to improve performance while avoiding the overfitting associated with supervised fine-tuning.

Contribution

It presents an SFT-free training method with guided on-policy distillation and a dual-level framework, advancing lightweight GUI agent capabilities.

Findings

01. Achieves state-of-the-art results among small-scale models.

02. Outperforms traditional imitation learning with 2B/3B scale agents.

03. Enhances exploration and reduces hallucinations in GUI tasks.

Abstract

Developing lightweight, on-device vision-language GUI agents is essential for efficient cross-platform automated interaction. However, current on-device agents are constrained by limited model capacity, and further performance improvements remain urgently needed. Traditional Supervised Fine-Tuning (SFT) for small-scale models often leads to overfitting, catastrophic forgetting, and policy rigidity, and thus fails to fully address these challenges. In this work, we propose a novel SFT-free training paradigm that significantly enhances the performance of small-scale models. We begin by presenting the first systematic integration of generalized knowledge distillation into the GUI agent domain via Guided On-policy Distillation. By incorporating oracle reference trajectories together with a dynamic retrieval mechanism, our method reduces hallucinations and mitigates the cognitive misalignment…
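For readers unfamiliar with on-policy distillation, the general idea can be sketched as follows: the student samples its own action tokens, and a token-level divergence between teacher and student distributions is averaged over those sampled positions. The sketch below is an illustration under stated assumptions, not the paper's loss. The function names, the `beta` parameter, and the choice of a generalized Jensen-Shannon divergence are assumptions borrowed from the general knowledge-distillation literature, and the oracle-trajectory guidance and dynamic retrieval mentioned in the abstract are omitted because the abstract does not specify them.

```python
import math

def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def generalized_jsd(teacher, student, beta=0.5):
    """Generalized Jensen-Shannon divergence between two token distributions.

    beta is a hypothetical mixing weight: the mixture is
    beta * teacher + (1 - beta) * student, and each distribution is
    compared against that mixture.
    """
    mix = [beta * t + (1 - beta) * s for t, s in zip(teacher, student)]
    return beta * kl(teacher, mix) + (1 - beta) * kl(student, mix)

def on_policy_distill_loss(teacher_dists, student_dists, beta=0.5):
    """Mean per-token divergence over a sequence sampled from the student.

    teacher_dists[i] and student_dists[i] are the two models' next-token
    distributions at position i of the student's own sampled output.
    """
    if len(teacher_dists) != len(student_dists):
        raise ValueError("teacher and student must cover the same positions")
    per_token = [generalized_jsd(t, s, beta)
                 for t, s in zip(teacher_dists, student_dists)]
    return sum(per_token) / len(per_token)

if __name__ == "__main__":
    # Toy two-token vocabulary, two positions in a student-sampled action string.
    teacher = [[0.9, 0.1], [0.2, 0.8]]
    student = [[0.6, 0.4], [0.3, 0.7]]
    print(on_policy_distill_loss(teacher, student))
```

In a real training loop the distributions would come from model logits and the loss would be backpropagated through the student only; the pure-Python version here only shows how the per-token signal is formed.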
