Tabular Embedding Model (TEM): Finetuning Embedding Models For Tabular RAG Applications
Sujit Khanna, Shishir Subedi

TL;DR
This paper introduces TEM, a novel fine-tuning approach for embedding models tailored to tabular data in RAG workflows, significantly improving performance and efficiency over existing models in domain-specific tasks.
Contribution
The paper proposes a new fine-tuning method for embedding models to enhance their performance on tabular data in RAG applications, addressing scalability and domain-specific challenges.
Findings
TEM outperforms current SOTA embedding models on tabular data tasks.
TEM achieves better efficiency with a smaller model size.
The approach effectively mitigates scalability issues in tabular RAG workflows.
Abstract
In recent times, Large Language Models have exhibited tremendous capabilities, especially in the areas of mathematics, code generation, and general-purpose reasoning. However, for specialized domains, especially in applications that require parsing and analyzing large chunks of numeric or tabular data, even state-of-the-art (SOTA) models struggle. In this paper, we introduce a new approach to solving domain-specific tabular data analysis tasks by presenting a unique RAG workflow that mitigates the scalability issues of existing tabular LLM solutions. Specifically, we present Tabular Embedding Model (TEM), a novel approach to fine-tune embedding models for tabular Retrieval-Augmented Generation (RAG) applications. Embedding models form a crucial component in the RAG workflow, and even current SOTA embedding models struggle as they are predominantly trained on textual datasets and thus…
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Robotics and Automated Systems
Methods: Attention Is All You Need · Weight Decay · Attention Dropout · Dropout · Residual Connection · Softmax · WordPiece · Linear Layer · Byte Pair Encoding
