TL;DR
DroidRetriever is a multi-LLM system for mobile information seeking that adds transparency, user control, and cross-source integration while reducing user workload.
Contribution
It introduces a transparent, steerable mobile information retrieval system with real-time progress visualization and privacy safeguards, advancing beyond opaque existing solutions.
Findings
Improved coverage and transparency in information-seeking tasks.
Reduced user workload and increased control during tasks.
Demonstrated effectiveness across 35 tasks in 24 apps.
Abstract
Information seeking on mobile devices is often fragmented, trapping users in repetitive cycles of context switching and data re-entry, which increases cognitive load and disrupts workflow. Existing mobile agents provide limited cross-source integration and are largely opaque, presenting progress as a linear feed with few opportunities to intervene, steer, or take control. We present DroidRetriever, a transparent, steerable system for cross-source mobile information seeking. It accepts voice or typed input and the multi-LLM system decomposes the task, navigates to target pages, takes screenshots, and synthesizes a concise report with citation-linked screenshots. We make the process transparent through a progress dashboard combining sub-task progress and real-time exploration maps for seamless takeover. DroidRetriever also pauses on detected privacy or high-risk screens and prompts…
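The workflow described in the abstract (decompose the task, explore target pages, capture screenshots, pause on sensitive screens, synthesize a cited report) can be illustrated with a minimal control-loop sketch. This is not the authors' implementation: every name here (`run_pipeline`, `SubTask`, `Screenshot`, the `decompose`, `explore`, `is_sensitive`, and `ask_user` callables) is hypothetical, the LLM and device calls are replaced by plain stand-in functions, and skipping a sub-task when the user declines is an assumption, since the abstract is truncated at that point.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Screenshot:
    ident: int    # citation number used in the report
    source: str   # app or page label (hypothetical)
    text: str     # text taken from the captured screen


@dataclass
class SubTask:
    description: str
    status: str = "pending"  # pending -> running -> done / skipped
    shots: List[Screenshot] = field(default_factory=list)


def run_pipeline(task, decompose, explore, is_sensitive, ask_user):
    """Hypothetical loop: decompose, explore, pause on risky screens, report."""
    subtasks = [SubTask(d) for d in decompose(task)]
    next_id = 1
    for st in subtasks:
        st.status = "running"
        for source, text in explore(st.description):
            # Assumption: pause and ask the user before capturing a risky screen.
            if is_sensitive(text) and not ask_user(source):
                st.status = "skipped"
                break
            st.shots.append(Screenshot(next_id, source, text))
            next_id += 1
        else:
            st.status = "done"  # reached only if the loop was not broken
    # Each report line cites the screenshot it came from by number.
    lines = [f"{s.text} [{s.ident}]" for st in subtasks for s in st.shots]
    return subtasks, "\n".join(lines)


# Stand-ins for the LLM planner and the on-device explorer.
def demo_decompose(task):
    return ["check shop A", "check payment page"]


def demo_explore(description):
    if "payment" in description:
        return [("Wallet", "card number entry")]
    return [("ShopA", "Headphones cost 59 USD")]


subtasks, report = run_pipeline(
    "find headphone price",
    demo_decompose,
    demo_explore,
    is_sensitive=lambda text: "card number" in text,
    ask_user=lambda source: False,  # the user declines to continue
)
```

The per-sub-task `status` field is the kind of state a progress dashboard could display, and the numbered `Screenshot` records are what lets each report line point back to its evidence.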
