Warehousing complex data from the Web
Omar Boussaid (ERIC), Jerome Darmont (ERIC), Fadila Bentayeb (ERIC), Sabine Loudcher (ERIC)

TL;DR
This paper introduces a methodology for integrating, modeling, storing, and analyzing complex Web-derived data using XML in data warehouses, enhancing decision support with OLAP and data mining techniques.
Contribution
It presents a novel XML-based approach for complex data warehousing, including modeling, storage, and analysis, with a focus on performance optimization.
Findings
XML enables effective integration of complex Web data
The methodology supports OLAP and data mining on XML data
Performance considerations are addressed for XML warehouses
Abstract
Data warehousing and OLAP technologies are now moving toward handling complex data, most of which originate from the Web. However, integrating such data into a decision-support process requires representing them in a form that OLAP and/or data mining techniques can process. In this paper, we present a complex data warehousing methodology that exploits XML as a pivot language. Our approach includes the integration of complex data into an operational data store (ODS) in the form of XML documents; their dimensional modeling and storage in an XML data warehouse; and their analysis with combined OLAP and data mining techniques. We also address the crucial issue of performance in XML warehouses.
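As a rough illustration of what OLAP-style analysis over XML-stored facts can look like, the sketch below rolls up a measure along one dimension of a tiny XML document. The element and attribute names (`warehouse`, `fact`, `measure`, `source`, `year`, `doc_count`) and the `roll_up` helper are hypothetical choices made for this example; they are not the schema or the algorithms described in the paper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical XML warehouse fragment. All names here are illustrative
# assumptions, not the schema used in the paper.
WAREHOUSE_XML = """
<warehouse>
  <dimension id="source">
    <member key="s1" label="news site"/>
    <member key="s2" label="forum"/>
  </dimension>
  <facts>
    <fact source="s1" year="2005"><measure name="doc_count">12</measure></fact>
    <fact source="s1" year="2006"><measure name="doc_count">8</measure></fact>
    <fact source="s2" year="2006"><measure name="doc_count">5</measure></fact>
  </facts>
</warehouse>
"""


def roll_up(xml_text, dimension_attr, measure_name):
    """Sum one measure over all facts, grouped by one dimension attribute."""
    root = ET.fromstring(xml_text)
    totals = defaultdict(float)
    for fact in root.iter("fact"):
        key = fact.get(dimension_attr)
        for measure in fact.findall("measure"):
            if measure.get("name") == measure_name:
                totals[key] += float(measure.text)
    return dict(totals)


# Aggregate the same facts along two different dimensions.
by_source = roll_up(WAREHOUSE_XML, "source", "doc_count")
by_year = roll_up(WAREHOUSE_XML, "year", "doc_count")
```

Grouping by a different attribute (`source` versus `year`) corresponds to choosing a different analysis axis over the same facts. A real XML warehouse would more likely express such queries in an XML query language and rely on indexing for performance.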
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Mining Algorithms and Applications
