The Hidden Web, XML and Semantic Web: A Scientific Data Management Perspective
Fabian Suchanek (INRIA Saclay - Ile de France), Aparna Varde, Richi Nayak (QUT), Pierre Senellart

TL;DR
This paper explores the evolution of the Web beyond HTML, emphasizing the roles of the hidden Web, XML, and the Semantic Web in scientific data management and how they facilitate data organization and retrieval.
Contribution
It provides a detailed explanation of these Web developments and demonstrates their application in managing scientific data, aiming to bridge database research and Web technologies.
Findings
XML is the dominant language for Web data exchange.
The Semantic Web enhances data integration and retrieval.
Real-world scientific examples illustrate practical benefits.
Abstract
The World Wide Web no longer consists just of HTML pages. Our work sheds light on a number of trends on the Internet that go beyond simple Web pages. The hidden Web provides a wealth of data in semi-structured form, accessible through Web forms and Web services. These services, as well as numerous other applications on the Web, commonly use XML, the eXtensible Markup Language. XML has become the lingua franca of the Internet that allows customized markups to be defined for specific domains. On top of XML, the Semantic Web grows as a common structured data source. In this work, we first explain each of these developments in detail. Using real-world examples from scientific domains of great interest today, we then demonstrate how these new developments can assist the managing, harvesting, and organization of data on the Web. On the way, we also illustrate the current research avenues in…
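As a minimal sketch of what "customized markups for specific domains" can look like in practice, the Python snippet below parses a small XML document and filters its records. The tag names (`experiments`, `experiment`, `material`, `temperature`), the sample values, and the helper `hot_experiments` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain-specific markup: the tag and attribute names below
# are invented for illustration and do not come from the paper.
DOC = """
<experiments>
  <experiment id="e1">
    <material>steel</material>
    <temperature unit="C">850</temperature>
  </experiment>
  <experiment id="e2">
    <material>aluminium</material>
    <temperature unit="C">500</temperature>
  </experiment>
</experiments>
"""


def hot_experiments(xml_text, min_temp):
    """Return the ids of experiments whose temperature is at least min_temp."""
    root = ET.fromstring(xml_text.strip())
    ids = []
    for exp in root.findall("experiment"):
        # Element text is always a string, so convert before comparing.
        temp = float(exp.findtext("temperature"))
        if temp >= min_temp:
            ids.append(exp.get("id"))
    return ids


print(hot_experiments(DOC, 600))  # prints ['e1']
```

Because the markup names the meaning of each field, a generic XML parser is enough to query the data without any domain-specific parsing code.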
Taxonomy
Topics: Web Data Mining and Analysis · Semantic Web and Ontologies · Advanced Database Systems and Queries
