Intelligent Self-Repairable Web Wrappers
Emilio Ferrara, Robert Baumgartner

TL;DR
This paper introduces an innovative approach for Web wrappers that can automatically repair themselves to adapt to structural changes in data sources, enhancing robustness and reducing manual maintenance.
Contribution
It presents a novel self-repairing Web wrapper system that automatically adapts to structural modifications of data sources, improving data extraction reliability.
Findings
Web wrappers can automatically repair themselves after structural changes to their data sources.
The approach reduces manual intervention in Web data extraction.
The approach enhances the robustness of Web data mining systems.
Abstract
The amount of information available on the Web grows at an incredibly high rate. Systems and procedures devised to extract these data from Web sources already exist, and different approaches and techniques have been investigated in recent years. On the one hand, reliable solutions should provide robust Web data mining algorithms that can automatically cope with possible malfunctions or failures. On the other hand, the literature lacks solutions for the maintenance of these systems. Procedures that extract Web data may be tightly coupled to the structure of the data source itself; thus, malfunctions or the acquisition of corrupted data could be caused, for example, by structural modifications made to data sources by their owners. Nowadays, verification of data integrity and maintenance are mostly managed manually, in order to ensure that these systems work…
