A New Compression Based Index Structure for Efficient Information Retrieval
Md. Abdullah al Mamun, Md. Hanif, Md. Rakib Uddin, Tanvir Ahmed, Md. Mofizul Islam

TL;DR
This paper introduces a novel compression-based index structure for information retrieval systems, significantly reducing index size and improving efficiency through a new run-length encoding technique.
Contribution
The paper proposes a new compression method for index structures in IR systems, enhancing storage efficiency by over 67% compared to existing techniques.
Findings
Achieved 67.34% average compression improvement
Effective run-length coding mechanism for index compression
Enhanced IR system performance through smaller index size
Abstract
Finding desired information in a large data set is a difficult problem. Information retrieval (IR) is concerned with the structure, analysis, organization, storage, searching, and retrieval of information. The index is the main constituent of an IR system. Nowadays, the exponential growth of information makes the index structure large enough to affect the IR system's quality. Compressing the index structure is therefore our main contribution in this paper. We compress the document numbers in inverted file entries using a new coding technique based on run-length encoding. Our coding mechanism uses a specified code that acts over run-length coding. In our experiments, the coding mechanism compresses, on average, 67.34% more than the other techniques.
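The abstract does not spell out the specified code itself, so the following is only a minimal sketch of the general idea of applying run-length encoding to the document numbers of an inverted file entry. It assumes, hypothetically, that document numbers are first converted to gaps between consecutive entries (a common preprocessing step that the abstract does not confirm) and that repeated gaps are then stored as (value, count) pairs. All function names are illustrative and not taken from the paper.

```python
from typing import List, Tuple


def to_gaps(doc_ids: List[int]) -> List[int]:
    """Turn a sorted list of document numbers into gaps between neighbours.

    Assumption: gap conversion is used here for illustration only;
    the paper's abstract does not state that it does this.
    """
    gaps = []
    prev = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def from_gaps(gaps: List[int]) -> List[int]:
    """Rebuild the document numbers by accumulating the gaps."""
    doc_ids = []
    total = 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Plain run-length encoding: collapse repeats into (value, count)."""
    runs: List[Tuple[int, int]] = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, count) pairs back into the original sequence."""
    values: List[int] = []
    for value, count in runs:
        values.extend([value] * count)
    return values


def compress_postings(doc_ids: List[int]) -> List[Tuple[int, int]]:
    """Hypothetical pipeline: document numbers -> gaps -> runs."""
    return rle_encode(to_gaps(doc_ids))


def decompress_postings(runs: List[Tuple[int, int]]) -> List[int]:
    """Inverse of compress_postings."""
    return from_gaps(rle_decode(runs))


postings = [3, 4, 5, 6, 7, 10, 12, 14]
runs = compress_postings(postings)
print(runs)  # [(3, 1), (1, 4), (3, 1), (2, 2)]
```

In this toy entry, eight document numbers collapse into four runs because consecutive documents produce repeated gaps of 1. How much a real index shrinks depends on how often such runs occur and on how the pairs are written out as bits, which is where the paper's own code would come in.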
Taxonomy
Topics: Algorithms and Data Compression · Data Management and Algorithms
