LLM Embeddings Improve Test-time Adaptation to Tabular $Y|X$-Shifts

Yibo Zeng; Jiashuo Liu; Henry Lam; Hongseok Namkoong

arXiv:2410.07395 · cs.LG · October 11, 2024

TL;DR

This paper explores how leveraging large language model (LLM) embeddings can improve the robustness and adaptability of models to $Y|X$-shifts in tabular data, enabling effective domain adaptation with minimal labeled data.

Contribution

It introduces a method of using LLM embeddings for tabular data to enhance test-time adaptation to $Y|X$-shifts, supported by extensive empirical evaluation.

Findings

1. LLM embeddings alone offer inconsistent robustness improvements.
2. Models trained on LLM embeddings can be effectively fine-tuned with as few as 32 labeled samples.
3. The study covers 7650 source-target pairs and benchmarks against 261,000 model configurations.
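The second finding (adapting a source-trained model with 32 labeled target samples) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: random Gaussian vectors stand in for LLM embeddings, a plain logistic regression stands in for the models studied in the paper, and the $Y|X$-shift is simulated by flipping the sign of half of the true coefficients. It is not the authors' pipeline.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, w=None, lr=0.5, steps=500):
    """Gradient descent on the mean logistic loss, optionally warm-started at w."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def accuracy(w, X, y):
    return float(np.mean((X @ w > 0) == (y == 1)))

rng = np.random.default_rng(0)
dim = 8  # stand-in for the embedding dimension (assumption)

# Same covariate distribution in both domains, different label rule:
# a simulated Y|X-shift that flips the effect of half of the features.
w_true_source = np.ones(dim)
w_true_target = w_true_source.copy()
w_true_target[: dim // 2] *= -1.0

def sample(n, w_true):
    X = rng.normal(size=(n, dim))
    y = (X @ w_true > 0).astype(float)
    return X, y

X_src, y_src = sample(2000, w_true_source)   # plentiful source data
X_few, y_few = sample(32, w_true_target)     # 32 labeled target samples
X_test, y_test = sample(2000, w_true_target) # held-out target data

w_src = fit_logreg(X_src, y_src)                 # source-only model
w_adapted = fit_logreg(X_few, y_few, w=w_src)    # fine-tuned on 32 samples

acc_before = accuracy(w_src, X_test, y_test)
acc_after = accuracy(w_adapted, X_test, y_test)
print(f"target accuracy before: {acc_before:.3f}, after: {acc_after:.3f}")
```

The warm start from `w_src` is the design choice that matters here: the target labels are used only to update an already trained model, which mirrors the fine-tuning setting described in the findings rather than an unsupervised adaptation objective.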

Abstract

For tabular datasets, changes in the relationship between the label and covariates ($Y|X$-shifts) are common due to missing variables (a.k.a. confounders). Since it is impossible to generalize to a completely new and unknown domain, we study models that are easy to adapt to the target domain even with few labeled examples. We focus on building more informative representations of tabular data that can mitigate $Y|X$-shifts, and propose to leverage the prior world knowledge in LLMs by serializing (writing down) the tabular data to encode it. We find that LLM embeddings alone provide inconsistent improvements in robustness, but models trained on them can be well adapted/finetuned to the target domain even using 32 labeled observations. Our finding is based on a comprehensive and systematic study consisting of 7650 source-target pairs and a benchmark against 261,000 model configurations trained by…
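The serialization step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' template: the function name `serialize_row`, the sentence pattern, and the example record with its feature descriptions are all hypothetical. The resulting string is what would then be passed to an LLM encoder to obtain an embedding; that call is omitted here.

```python
from typing import Mapping

def serialize_row(row: Mapping[str, object], descriptions: Mapping[str, str]) -> str:
    """Write one tabular record down as text that a language model can encode."""
    parts = []
    for column, value in row.items():
        # Fall back to the raw column name when no description is available.
        label = descriptions.get(column, column)
        parts.append(f"The {label} is {value}.")
    return " ".join(parts)

# Hypothetical record and feature descriptions, for illustration only.
row = {"AGEP": 42, "SCHL": "Bachelor's degree", "WKHP": 40}
descriptions = {
    "AGEP": "age",
    "SCHL": "educational attainment",
    "WKHP": "usual hours worked per week",
}

text = serialize_row(row, descriptions)
print(text)
```

Using human-readable descriptions instead of raw column codes is what lets the encoder draw on prior knowledge about the features. It is also why, as one reviewer notes, this style of approach depends on descriptions being available.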

Peer Reviews

Decision · Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 4

Strengths

This paper excels in addressing a pressing issue in machine learning: adapting models to distributional shifts in tabular data, specifically Y|X-shifts. The authors take an innovative approach by leveraging LLM embeddings to encode tabular data, which allows the model to incorporate broader contextual knowledge. This strength is significant because it enables the model to adapt to changes in label-covariate relationships, even with minimal labeled data (32 examples) from the target domain. The c

Weaknesses

This method is only applicable when a description is available for every feature of the tabular data. It would be good to extend the method to datasets that include features without any description. One possible solution is to learn embeddings of such features from scratch and use them as is done in TabTransformer. This would provide a comprehensive solution for different types of tabular data.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The motivation is clear and the algorithm is sensible. 2. The proposed method is tested on several benchmarks. 3. The proposed method is easy to understand and simple. It does not require a large number of LLM calls and the computational complexity is not high.

Weaknesses

1. On the adaptation stage: The method requires the target's true labels to guide the model parameter update. This setting differs from the basic test-time adaptation (TTA) setting, which generally uses an unsupervised objective function [1]. The setting in this article is closer to fine-tuning. 2. The experiments in this article are mainly based on the ACS dataset. In recent studies, there are other benchmarks, such as TableShift [2], that contain Y|X-shift datasets. In the report they pro

Reviewer 03 · Rating 3 · Confidence 4

Strengths

1. The paper is well-structured and easy to follow, with the three technical components clearly and accessibly presented. 2. The experiments are comprehensive and support the claims of this paper.

Weaknesses

1. The dataset variety is limited, as experiments are conducted on only three datasets within the same domain, which restricts the generalizability of the results and may affect the robustness of the findings. 2. The proposed method incorporates additional domain knowledge to enhance classification performance. However, it is unclear whether comparisons with tree-based methods are entirely fair, given that such methods may not effectively leverage domain knowledge. A thorough discussion on the s

Code & Models

Repositories

namkoong-lab/llm-tabular-shifts
pytorch · Official

Taxonomy

Topics: Model Reduction and Neural Networks · Computational Physics and Python Applications · Real-time simulation and control systems

Methods: Focus