DeepMapping: Learned Data Mapping for Lossless Compression and Efficient Lookup

Lixi Zhou; K. Selçuk Candan; Jia Zou

arXiv:2307.05861 · cs.DB · September 27, 2024 · 2 cites

Open Access

TL;DR

DeepMapping leverages neural networks for lossless data compression and fast lookup, offering improved storage, speed, and update capabilities for tabular data, especially on capacity-limited devices.

Contribution

This work introduces DeepMapping, a neural-network-based data mapping abstraction that improves compression and retrieval speed and supports insertions, deletions, and updates without retraining the model.

Findings

01

Outperforms existing methods in compression ratio and retrieval speed.

02

Effectively handles data insertions, deletions, and updates.

03

Demonstrates benefits on real-world and benchmark datasets.

Abstract

Storing tabular data to balance storage and query efficiency is a long-standing research question in the database community. In this work, we argue and show that a novel DeepMapping abstraction, which relies on the impressive memorization capabilities of deep neural networks, can provide better storage cost, better latency, and better run-time memory footprint, all at the same time. Such unique properties may benefit a broad class of use cases in capacity-limited devices. Our proposed DeepMapping abstraction transforms a dataset into multiple key-value mappings and constructs a multi-tasking neural network model that outputs the corresponding values for a given input key. To deal with memorization errors, DeepMapping couples the learned neural network with a lightweight auxiliary data structure capable of correcting mistakes. The auxiliary structure design further enables DeepMapping to…
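The pairing of a learned model with a correcting auxiliary structure, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: `HybridMapping`, `toy_model`, and the sample data are hypothetical names introduced here, the "model" is a plain Python function standing in for a trained neural network, and the key-membership set is an added assumption so that absent keys are not answered with a model guess.

```python
from typing import Callable, Dict, Optional


class HybridMapping:
    """Imperfect predictor plus an exception table that makes lookups exact."""

    def __init__(self, data: Dict[int, str], model: Callable[[int], str]) -> None:
        self.model = model
        # Assumption: a plain set records which keys exist; a real system
        # would want something far more compact.
        self.keys = set(data)
        # Auxiliary structure: store only the entries the model gets wrong.
        self.corrections = {k: v for k, v in data.items() if model(k) != v}

    def lookup(self, key: int) -> Optional[str]:
        if key not in self.keys:
            return None  # key is not in the dataset
        if key in self.corrections:
            return self.corrections[key]  # model was wrong; use stored value
        return self.model(key)  # model memorized this entry correctly


def toy_model(key: int) -> str:
    # Hypothetical stand-in for a trained network: right most of the time.
    return "even" if key % 2 == 0 else "odd"


data = {1: "odd", 2: "even", 3: "odd", 4: "special", 6: "even"}
mapping = HybridMapping(data, toy_model)

for k in (1, 4, 5):
    print(k, mapping.lookup(k))
```

Under these assumptions, every stored key is returned exactly even though the predictor is imperfect, and only the mispredicted entry (key 4) occupies space in the correction table. The better the model memorizes the data, the smaller that table becomes.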

Taxonomy

Topics: Advanced Data Storage Technologies · Data Quality and Management · Algorithms and Data Compression