Beyond BeautifulSoup: Benchmarking LLM-Powered Web Scraping for Everyday Users

Arth Bhardwaj; Nirav Diwan; Gang Wang

arXiv:2601.06301 · cs.CR · January 13, 2026

TL;DR

This paper benchmarks how large language models enable everyday users to perform web scraping tasks on complex sites, demonstrating that end-to-end LLM agents can automate data extraction with minimal prompts and effort.

Contribution

It systematically evaluates LLM-based web scraping workflows across diverse security measures, highlighting their accessibility and practical effectiveness for non-expert users.

Findings

01. End-to-end LLM agents can automate complex scraping with minimal prompts.

02. LLM-assisted scripting is effective for static sites and faster in some cases.

03. Users can achieve successful scraping with fewer than five prompt refinements.
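
As a rough illustration of the static-site case in the second finding, the sketch below shows the kind of short parsing script an LLM-assisted scripting workflow might hand back to a user. It is not taken from the paper. The sample HTML, the `item`, `name`, and `price` class names, and the `ItemParser` and `scrape_items` helpers are all hypothetical, and it uses Python's standard-library `html.parser` instead of BeautifulSoup only to stay dependency-free.

```python
from html.parser import HTMLParser

# Hypothetical static page content (an assumption for illustration);
# a real script would fetch the page over HTTP before parsing it.
SAMPLE_HTML = """
<html><body>
  <ul id="products">
    <li class="item"><span class="name">Widget</span> <span class="price">$4.50</span></li>
    <li class="item"><span class="name">Gadget</span> <span class="price">$12.00</span></li>
  </ul>
</body></html>
"""


class ItemParser(HTMLParser):
    """Collect the text of <span class="name"> and <span class="price"> per <li class="item">."""

    def __init__(self):
        super().__init__()
        self.items = []      # one dict per list item
        self._field = None   # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "item":
            self.items.append({})          # start a new record
        elif tag == "span" and cls in ("name", "price") and self.items:
            self._field = cls              # start capturing text for this field

    def handle_data(self, data):
        if self._field is not None:
            record = self.items[-1]
            # Append, since the parser may deliver text in several chunks.
            record[self._field] = record.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None             # stop capturing at the end of the span


def scrape_items(html):
    """Parse an HTML string and return the extracted records."""
    parser = ItemParser()
    parser.feed(html)
    return parser.items


rows = scrape_items(SAMPLE_HTML)
for row in rows:
    print(row["name"], row["price"])
```

A script like this only covers pages whose content is present in the initial HTML. Sites that require login, render content with JavaScript, or sit behind anti-bot or CAPTCHA controls fall outside what it can handle.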

Abstract

Web scraping has historically required technical expertise in HTML parsing, session management, and authentication circumvention, which limited large-scale data extraction to skilled developers. We argue that large language models (LLMs) have democratized web scraping, enabling low-skill users to execute sophisticated operations through simple natural language prompts. While extensive benchmarks evaluate these tools under optimal expert conditions, we show that without extensive manual effort, current LLM-based workflows allow novice users to scrape complex websites that would otherwise be inaccessible. We systematically benchmark what everyday users can do with off-the-shelf LLM tools across 35 sites spanning five security tiers, including authentication, anti-bot, and CAPTCHA controls. We devise and evaluate two distinct workflows: (a) LLM-assisted scripting, where users prompt LLMs…

Taxonomy

Topics: Web Application Security Vulnerabilities · Spam and Phishing Detection · Security and Verification in Computing