Scalable Representation Learning for Multimodal Tabular Transactions

Natraj Raman; Sumitra Ganesh; Manuela Veloso

arXiv:2410.07851·cs.LG·October 11, 2024

Open Access

TL;DR

This paper introduces a scalable multimodal representation learning method for tabular transaction data, addressing challenges like high-cardinality fields and numerical reasoning, and enabling effective downstream task performance with LLMs.

Contribution

It proposes a multi-tier partitioning mechanism, an adaptive quantization mechanism, and a parameter-efficient decoder to improve tabular representation learning with LLMs.
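To make the idea of partitioning a large vocabulary concrete, here is a minimal sketch that splits the distinct values of a high-cardinality field into frequency tiers, so that a few common values get a small head tier and rare values fall into a long tail. It is an illustration only and not the paper's algorithm: the function name `partition_vocab`, the `tier_sizes` parameter, and the (tier, position) encoding are all assumptions introduced here.

```python
from collections import Counter


def partition_vocab(values, tier_sizes=(2, 4)):
    """Assign each distinct token a (tier, position) pair by frequency rank.

    Hypothetical helper: tier_sizes gives the capacity of each head tier;
    every remaining token lands in one final tail tier.
    """
    counts = Counter(values)
    # Rank by descending frequency, then by token, so ties are deterministic.
    ranked = sorted(counts, key=lambda tok: (-counts[tok], tok))

    tiers, start = [], 0
    for size in tier_sizes:
        tiers.append(ranked[start:start + size])
        start += size
    tiers.append(ranked[start:])  # long tail of rare tokens

    return {
        tok: (tier_idx, pos)
        for tier_idx, tier in enumerate(tiers)
        for pos, tok in enumerate(tier)
    }


merchants = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"]
mapping = partition_vocab(merchants, tier_sizes=(1, 2))
```

With a skewed (power-law-like) frequency distribution, most occurrences are covered by the small head tiers, which is the intuition such a scheme relies on; how the paper actually sizes and uses its tiers is not stated in this summary.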

Findings

01 Effective handling of high-cardinality fields.

02 Improved numerical reasoning capabilities.

03 Enhanced downstream task performance.

Abstract

Large language models (LLMs) are primarily designed to understand unstructured text. When directly applied to structured formats such as tabular data, they may struggle to discern inherent relationships and overlook critical patterns. While tabular representation learning methods can address some of these limitations, existing efforts still face challenges with sparse high-cardinality fields, precise numerical reasoning, and column-heavy tables. Furthermore, how to leverage these learned representations for downstream tasks through a language-based interface is not apparent. In this paper, we present an innovative and scalable solution to these challenges. Concretely, our approach introduces a multi-tier partitioning mechanism that utilizes power-law dynamics to handle large vocabularies, an adaptive quantization mechanism to impose priors on numerical continuity, and a distinct treatment of…
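As a rough illustration of what quantizing a numerical field can look like, the sketch below fits bin edges from the data's quantiles and maps each amount to an ordinal bin index, so that nearby values share or neighbor the same bin. This is a generic quantile-binning example under stated assumptions, not the adaptive quantization mechanism the abstract refers to; `fit_quantile_edges`, `quantize`, and the sample amounts are hypothetical.

```python
import bisect
import statistics


def fit_quantile_edges(amounts, n_bins=4):
    """Return n_bins - 1 cut points that split the data into roughly equal-mass bins."""
    return statistics.quantiles(amounts, n=n_bins)


def quantize(amount, edges):
    """Map a value to an ordinal bin index in the range 0..len(edges)."""
    return bisect.bisect_right(edges, amount)


# Hypothetical transaction amounts with a heavy right tail.
amounts = [1, 2, 3, 4, 5, 6, 7, 8, 100, 1000]
edges = fit_quantile_edges(amounts, n_bins=4)
bins = [quantize(a, edges) for a in amounts]
```

Because the edges follow the data distribution, dense regions of small amounts get finer bins than the sparse tail, and the bin indices preserve the ordering of the original values.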

Taxonomy

Topics: Semantic Web and Ontologies · Service-Oriented Architecture and Web Services · Data Mining Algorithms and Applications

Methods: Adapter