M$^2$-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining

Rui Lv; Juncheng Mo; Tianyi Chu; Chen Rao; Hongyi Jing; Jiajie Teng; Jiafu Chen; Shiqi Zhang; Liangzi Ding; Shuo Fang; Huaizhong Lin; Ziqiang Dang; Chenguang Ma; Lei Zhao

arXiv:2602.05429·cs.AI·February 6, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces M$^2$-Miner, a novel multi-agent framework using MCTS for efficient, low-cost mobile GUI data mining, significantly improving data quality and diversity for training GUI agents.

Contribution

It presents the first automated multi-agent MCTS-based framework for mobile GUI data mining, incorporating strategies for enhanced efficiency, data diversity, and model training.

Findings

01

Achieves state-of-the-art performance on mobile GUI benchmarks.

02

Reduces data mining costs and improves data quality.

03

Enriches intent diversity through intent recycling.

Abstract

Graphical User Interface (GUI) agents are pivotal to advancing intelligent human-computer interaction paradigms. Constructing powerful GUI agents necessitates the large-scale annotation of high-quality user-behavior trajectory data (i.e., intent-trajectory pairs) for training. However, manual annotation methods and current GUI agent data mining approaches typically face three critical challenges: high construction cost, poor data quality, and low data richness. To address these issues, we propose M$^2$-Miner, the first low-cost and automated mobile GUI agent data-mining framework based on Monte Carlo Tree Search (MCTS). For better data mining efficiency and quality, we present a collaborative multi-agent framework, comprising InferAgent, OrchestraAgent, and JudgeAgent for guidance, acceleration, and evaluation. To further enhance the efficiency of mining and enrich intent diversity, we…
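For readers unfamiliar with MCTS, the selection and backpropagation steps of a generic search can be sketched as below. This is the textbook UCB1 rule, not the paper's own formulation: the `Node` class, the function names, and the exploration constant `c = 1.4` are illustrative assumptions, and the roles of InferAgent, OrchestraAgent, and JudgeAgent are not modeled.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One GUI state reached by `action` (hypothetical structure)."""
    action: str
    visits: int = 0
    total_reward: float = 0.0
    children: list["Node"] = field(default_factory=list)


def uct_score(parent_visits: int, child: Node, c: float = 1.4) -> float:
    """Standard UCB1: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, c: float = 1.4) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: uct_score(parent.visits, ch, c))


def backpropagate(path: list[Node], reward: float) -> None:
    """Add one visit and the observed reward to every node on the path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward
```

With `c = 0` the rule reduces to greedy selection on mean reward; larger values of `c` favor rarely visited actions.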

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

The intent recycling strategy re-evaluates sibling paths to extract multiple intent-trajectory pairs from a single search tree, significantly improving data diversity and mining efficiency without additional exploration costs. The progressive model-in-the-loop training implements a three-stage training strategy, allowing agent capabilities to improve progressively in tandem with data complexity, which enhances the mining success rate in unseen scenarios.
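The intent-recycling idea described by the reviewer can be illustrated with a small sketch: every root-to-node path already present in a search tree is offered to a relabeling function, and each path that receives an intent becomes an extra intent-trajectory pair at no further exploration cost. This is only a reading of the review text under stated assumptions; `TreeNode`, `root_to_node_paths`, `recycle_intents`, and the `relabel` callback (standing in for whatever judge the paper uses) are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, Optional


@dataclass
class TreeNode:
    """A node of an already-explored search tree (hypothetical structure)."""
    action: str
    children: list["TreeNode"] = field(default_factory=list)


def root_to_node_paths(root: TreeNode) -> Iterator[list[str]]:
    """Yield the action sequence leading to every non-root node."""
    stack = [(root, [])]
    while stack:
        node, prefix = stack.pop()
        for child in node.children:
            path = prefix + [child.action]
            yield path
            stack.append((child, path))


def recycle_intents(
    root: TreeNode,
    relabel: Callable[[list[str]], Optional[str]],
) -> list[tuple[str, list[str]]]:
    """Collect (intent, trajectory) pairs for every path that relabel accepts.

    `relabel` is an assumed stand-in for a judge model: it returns an intent
    string for a path that completes some task, or None otherwise.
    """
    pairs = []
    for path in root_to_node_paths(root):
        intent = relabel(path)
        if intent is not None:
            pairs.append((intent, path))
    return pairs
```

In this toy form the search tree is traversed once and no new environment steps are taken, which is the property the review attributes to the strategy.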

Weaknesses

- The ablation study should be expanded: include a baseline using the stronger 72B model for InferAgent and JudgeAgent, but without the model-in-the-loop (MITL) strategy. This is necessary to validate the true effectiveness of MITL.
- The paper mentions using 8 A100-80G GPUs for training and "retraining for 2 epochs on the full mined dataset at each stage". These significant computational costs, as well as the API costs for Qwen2.5-VL-72B, seem to be omitted from the 196 total cost claimed in T

Reviewer 02 · Rating 4 · Confidence 3

Strengths

S1. Strong Experiments. It compares with 13 methods and analyzes the effect of agent numbers and online learning strategies, showing solid and comprehensive evaluation.
S2. Practical Significance. The framework is scalable and adaptable, demonstrating potential for real-world GUI automation and broader mobile applications.
S3. Trajectory Recycling is an interesting and computation-efficient design.

Weaknesses

W1. Writing and Presentation Issues. The paper contains several typos and minor writing problems that affect readability:
1. Line 52: “we presents” -> “we present”
2. Line 274: “where i denotes the i-th visit to the node” appears twice.
3. Line 353: “This is crucial when targeting new application scenarios.” is unclear; please specify what scenarios are referred to.
4. Line 480: “significantly improve” -> “improves”.
5. Line 484: “an solid foundation” -> “a solid foundation”
6. Line 485: “Statemen

Reviewer 03 · Rating 6 · Confidence 2

Strengths

1. This paper proposes a fully automated framework for mobile GUI agent data mining. By introducing MCTS and designing a collaborative multi-agent framework, the method improves data mining efficiency while enhancing data quality.
2. The intent recycling strategy further enhances both mining efficiency and intent richness, while the progressive model-in-the-loop training paradigm boosts success rates in both familiar and novel environments.
3. Extensive experiments show that GUI agents trained on th

Weaknesses

1. The paper proposes an automated mobile GUI agent data-mining framework based on Monte Carlo Tree Search (MCTS). Since MCTS is a classic algorithm, is the novelty of the method sufficient?
2. The paper does not sufficiently introduce the background of mobile GUI agent data mining, which makes it difficult to understand.

Taxonomy

Topics: Human Motion and Animation · Time Series Analysis and Forecasting · Recommender Systems and Techniques