MCP vs RAG vs NLWeb vs HTML: A Comparison of the Effectiveness and Efficiency of Different Agent Interfaces to the Web (Technical Report)
Aaron Steiner, Ralph Peeters, Christian Bizer

TL;DR
This study compares four web agent interfaces (HTML, RAG, MCP, and NLWeb) in a controlled testbed and finds that RAG, MCP, and NLWeb outperform HTML in both effectiveness and efficiency across the evaluated tasks.
Contribution
The paper introduces a unified testbed for comparing web agent interfaces and provides empirical evidence on their relative performance across multiple LLMs.
Findings
RAG, MCP, and NLWeb outperform HTML in effectiveness and efficiency.
The F1 score rises from 0.67 with HTML to 0.75-0.77 with the other three interfaces.
Token usage and runtime per task decrease significantly with RAG, MCP, and NLWeb compared with HTML.
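For readers unfamiliar with the metric, the sketch below shows one common way an F1 score can be computed for a task whose answer is a set of items, such as a product search. It is an illustration only: the set-based formulation, the function name `f1_score`, and the product identifiers are assumptions, and the paper's exact evaluation procedure may differ.

```python
def f1_score(predicted: set[str], expected: set[str]) -> float:
    """Set-based F1: harmonic mean of precision and recall.

    Assumption: answers are compared as sets of item identifiers.
    By convention here, an empty prediction or empty gold set scores 0.0.
    """
    if not predicted or not expected:
        return 0.0
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of returned items that are correct
    recall = true_positives / len(expected)      # share of correct items that were returned
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the agent returns three products, two of which are correct.
expected = {"p1", "p2", "p3"}
predicted = {"p1", "p2", "p4"}
score = f1_score(predicted, expected)  # precision = recall = 2/3, so F1 = 2/3
```

In this example the agent misses one expected product and returns one wrong product, which lands the score close to the 0.67 reported for HTML; that match is a coincidence of the chosen numbers, not a reconstruction of the paper's data.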
Abstract
Large language model agents are increasingly used to automate web tasks such as product search, offer comparison, and checkout. Current research explores different interfaces through which these agents interact with websites, including traditional HTML browsing, retrieval-augmented generation (RAG) over pre-crawled content, communication via Web APIs using the Model Context Protocol (MCP), and natural-language querying through the NLWeb interface. However, no prior work has compared these four architectures within a single controlled environment using identical tasks. To address this gap, we introduce a testbed consisting of four simulated e-shops, each offering its products via HTML, MCP, and NLWeb interfaces. For each interface (HTML, RAG, MCP, and NLWeb) we develop specialized agents that perform the same sets of tasks, ranging from simple product searches and price comparisons to…
