Development of an Automated Web Application for Efficient Web Scraping: Design and Implementation

Alok Dutta; Nilanjana Roy; Rhythm Sen; Sougata Dutta; Prabhat Das

arXiv:2510.21831·cs.IR·October 28, 2025

TL;DR

This paper introduces a user-friendly, automated web application that simplifies web scraping for non-technical users by integrating data fetching, extraction, and organization within a secure, scalable platform.

Contribution

It presents a novel, accessible web scraping tool with a streamlined interface, supporting user management and data organization, built using Flask, MongoDB, and popular parsing libraries.

Findings

1. Enhanced accessibility for non-technical users
2. Efficient data extraction and organization
3. Scalable deployment using Flask and MongoDB

Abstract

This paper presents the design and implementation of a user-friendly, automated web application that simplifies and optimizes the web scraping process for non-technical users. The application breaks the complex task of web scraping into three main stages: fetching, extraction, and execution. In the fetching stage, the application accesses target websites over HTTP, using the requests library to retrieve HTML content. The extraction stage uses parsing libraries such as BeautifulSoup, together with regular expressions, to extract relevant data from the HTML. Finally, the execution stage structures the data into accessible formats, such as CSV, so that the scraped content is organized for easy use. To provide a personalized and secure experience, the application includes user registration and login, supported by MongoDB, which stores user data and…
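The three stages named in the abstract (fetching, extraction, execution) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fetch`, `extract_links`, and `to_csv`, the choice of hyperlinks as the extracted data, and the sample HTML are all assumptions made for illustration. The fetching step uses the requests library as the abstract describes, but the extraction step substitutes Python's standard-library `html.parser` for BeautifulSoup so that the example runs without third-party packages.

```python
import csv
import io
from html.parser import HTMLParser


def fetch(url, timeout=10.0):
    """Fetching stage: retrieve a page's HTML over HTTP (hypothetical helper)."""
    import requests  # third-party; imported lazily so the rest runs without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.text


class LinkExtractor(HTMLParser):
    """Extraction stage: collect the text and href of every <a> element.

    Stand-in for BeautifulSoup; the paper's actual extraction rules are not shown.
    """

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append({"text": "".join(self._text).strip(), "href": self._href})
            self._href = None


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.rows


def to_csv(rows, fieldnames=("text", "href")):
    """Execution stage: structure the extracted rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Assumed sample input, used in place of a live fetch(url) call.
SAMPLE_HTML = '<ul><li><a href="/a">First</a></li><li><a href="/b">Second</a></li></ul>'
rows = extract_links(SAMPLE_HTML)
print(to_csv(rows))
```

With a live page, `extract_links(fetch(url))` would replace the sample string. Keeping the three stages as separate functions mirrors the abstract's decomposition and lets each stage be tested without network access.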
