PageGuide: Browser extension to assist users in navigating a webpage and locating information
Tin Nguyen, Thang T. Truong, Runtao Zhou, Trung Bui, Chirag Agarwal, Anh Totti Nguyen

TL;DR
PageGuide is a browser extension that helps users locate information, follow instructions, and manage distractions on webpages by visually grounding AI answers and providing interactive overlays.
Contribution
The paper introduces a browser extension that grounds language model outputs in the webpage's HTML DOM with visual overlays, so users can verify answers and interact with them directly on the page.
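To make "grounding in the DOM" concrete, here is a minimal sketch of one way an answer's supporting quote could be mapped back to a location in a page's markup. It is not PageGuide's implementation: the names `TextNodeCollector` and `ground_quote` are hypothetical, it uses Python's standard `html.parser` rather than in-browser DOM APIs, and it assumes the quote appears verbatim (up to case and whitespace) inside a single text node.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Hypothetical helper: collect visible text nodes with their tag path."""

    SKIPPED = {"script", "style"}  # content that is never shown to the user
    VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tags, outermost first
        self.nodes = []  # list of (tag_path, normalized_text)

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not (set(self.stack) & self.SKIPPED):
            self.nodes.append(("/".join(self.stack), text))


def ground_quote(html, quote):
    """Return every text node containing `quote`, with character offsets.

    Offsets index into the whitespace-normalized node text. Matching is
    case-insensitive; the offsets assume lowercasing preserves string
    length, which holds for ASCII text.
    """
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    needle = " ".join(quote.split()).lower()
    hits = []
    for index, (path, text) in enumerate(collector.nodes):
        start = text.lower().find(needle)
        if start != -1:
            hits.append(
                {"node": index, "path": path, "start": start, "end": start + len(needle)}
            )
    return hits


page = (
    "<html><body><script>var refund = 1;</script>"
    "<div><p>Shipping is free.</p>"
    "<p>Refunds are issued within 30 days.</p></div></body></html>"
)
hits = ground_quote(page, "within 30 days")
```

In a real extension the returned node and offsets would drive a visual overlay on the live page; that rendering step is outside the scope of this sketch.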
Findings
Find: locating relevant evidence improves by 26 percentage points.
Task completion time drops by 70%.
Manual search effort decreases, with 80% less Ctrl+F usage.
Abstract
Users browsing the web daily struggle to quickly locate relevant information in cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants (e.g., ChatGPT, Gemini, Claude) and browser agents (e.g., OpenAI Operator, Browser Use) can answer questions and automate actions, yet they return answers without showing where the information comes from on the page, forcing users to manually verify results and blindly trust every automated step. We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs: (a) Find: locating and highlighting relevant evidence in situ so users can instantly verify answers on the page; (b) Guide: showing step-by-step instructions (e.g., how to change a password) one at a time so users can follow and perform actions by…
