Object-Relational Database Representations for Text Indexing
Panagiotis Papadakos, Yannis Theoharis, Yannis Marketakis, Nikos Armenatzoglou, Yannis Tzitzikas

TL;DR
This paper introduces novel object-relational database representations for text indexing that significantly reduce storage size and improve query performance by leveraging non-1NF features of existing ORDBMS, demonstrated with experimental results.
Contribution
It presents four database representations for text indexing that exploit the non-1NF features of existing ORDBMS; three of them are more space-efficient and faster in query evaluation than the plain relational representation.
Findings
Three of the four representations are an order of magnitude more space-efficient than the plain relational representation.
The same three representations are also faster in query evaluation than the plain relational representation.
Experimental results are based on a dataset of one million pages.
Abstract
One of the distinctive features of Information Retrieval systems compared to Database Management systems is that they offer better compression for posting lists, resulting in better I/O performance and thus faster query evaluation. In this paper, we introduce database representations of the index that reduce the size (and thus the disk I/Os) of the posting lists. This is not achieved by redesigning the DBMS, but by exploiting the non-1NF features that existing Object-Relational DBM systems (ORDBMS) already offer. Specifically, four different database representations are described and detailed experimental results for one million pages are reported. Three of these representations are one order of magnitude more space-efficient and faster (in query evaluation) than the plain relational representation.
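To make the contrast in the abstract concrete, the sketch below stores the same toy inverted index twice: once in plain 1NF form (one row per term-document pair) and once with the whole posting list nested in a single column (one row per term). It is an illustration only and does not reproduce any of the paper's four representations. SQLite has no collection types, so a packed BLOB stands in for an ORDBMS non-1NF column; the table names, the `pack`/`unpack` helpers, and the corpus are all hypothetical.

```python
import sqlite3
import struct

# Toy corpus: doc_id -> text (hypothetical data, for illustration only).
DOCS = {
    1: "object relational database",
    2: "relational database index",
    3: "text index database",
}

def build_postings(docs):
    """Return {term: sorted list of (doc_id, term_frequency)}."""
    postings = {}
    for doc_id, text in docs.items():
        counts = {}
        for term in text.split():
            counts[term] = counts.get(term, 0) + 1
        for term, tf in counts.items():
            postings.setdefault(term, []).append((doc_id, tf))
    for plist in postings.values():
        plist.sort()
    return postings

def pack(plist):
    """Pack a posting list into bytes: two unsigned 32-bit ints per entry."""
    flat = [n for pair in plist for n in pair]
    return struct.pack(f"<{len(flat)}I", *flat)

def unpack(blob):
    """Inverse of pack: bytes back to a list of (doc_id, tf) pairs."""
    nums = struct.unpack(f"<{len(blob) // 4}I", blob)
    return list(zip(nums[0::2], nums[1::2]))

conn = sqlite3.connect(":memory:")
# Plain relational (1NF) form: the term is repeated in every posting row.
conn.execute("CREATE TABLE flat_index (term TEXT, doc_id INTEGER, tf INTEGER)")
# Nested form: one row per term; the BLOB is an assumed stand-in for a
# non-1NF collection column of an ORDBMS.
conn.execute("CREATE TABLE nested_index (term TEXT PRIMARY KEY, postings BLOB)")

postings = build_postings(DOCS)
for term, plist in postings.items():
    conn.executemany(
        "INSERT INTO flat_index VALUES (?, ?, ?)",
        [(term, doc_id, tf) for doc_id, tf in plist],
    )
    conn.execute("INSERT INTO nested_index VALUES (?, ?)", (term, pack(plist)))

def lookup_flat(term):
    """Fetch a posting list by scanning all rows for the term."""
    cur = conn.execute(
        "SELECT doc_id, tf FROM flat_index WHERE term = ? ORDER BY doc_id",
        (term,),
    )
    return cur.fetchall()

def lookup_nested(term):
    """Fetch a posting list by reading a single row and unpacking it."""
    row = conn.execute(
        "SELECT postings FROM nested_index WHERE term = ?", (term,)
    ).fetchone()
    return unpack(row[0]) if row else []

flat_rows = conn.execute("SELECT COUNT(*) FROM flat_index").fetchone()[0]
nested_rows = conn.execute("SELECT COUNT(*) FROM nested_index").fetchone()[0]
```

In this sketch the nested table holds one row per distinct term, while the flat table holds one row per posting, so the term string and per-row overhead are paid once per term rather than once per posting. Both lookups return the same posting list for a given term; how large the saving is in a real ORDBMS depends on the system and is not something this toy example measures.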
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Peer-to-Peer Network Technologies
