Rethinking Table Instruction Tuning

Naihao Deng; Rada Mihalcea

arXiv:2501.14693·cs.CL·August 5, 2025

TL;DR

This paper evaluates the impact of hyperparameters on table instruction tuning of LLMs, revealing that careful hyperparameter choices can improve table understanding and out-of-domain generalization, leading to the development of a new effective table LLM.

Contribution

It systematically analyzes hyperparameter effects on table LLMs and introduces TAMA, a tuned LLaMA 3.1 model with superior table understanding and generalization capabilities.

Findings

01

Hyperparameters significantly influence table-specific and general capabilities.

02

Smaller learning rates and fewer training instances can enhance table understanding.

03

TAMA matches or surpasses GPT-3.5 and GPT-4 on table tasks while maintaining out-of-domain generalization and general capabilities.
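
The findings above imply a joint sweep over learning rate and training-set size, with each run scored on both table tasks and general benchmarks. The following is a minimal sketch of how such a sweep could be organized, not the authors' procedure. The sweep values, the function names (build_sweep, pick_best), and the score field names are all hypothetical and chosen for illustration only.

```python
from itertools import product

# Hypothetical sweep values for illustration only; they are NOT taken from the paper.
LEARNING_RATES = [1e-5, 5e-6, 1e-6]
TRAIN_SIZES = [200, 1000, 5000]


def build_sweep(learning_rates, train_sizes):
    """Return one config dict per (learning rate, training-set size) pair."""
    return [
        {"learning_rate": lr, "num_train_instances": n}
        for lr, n in product(learning_rates, train_sizes)
    ]


def pick_best(results, min_general_score):
    """Pick the run with the highest table score among runs whose
    general-benchmark score stays at or above the given floor.

    Each result is a dict with hypothetical keys "table_score" and
    "general_score". Returns None if no run meets the floor.
    """
    eligible = [r for r in results if r["general_score"] >= min_general_score]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r["table_score"])


configs = build_sweep(LEARNING_RATES, TRAIN_SIZES)
print(len(configs))  # 9 configurations (3 learning rates x 3 training-set sizes)
```

Selecting on the table score only among runs that clear a general-capability floor is one simple way to encode the trade-off described above: a run that gains on table tasks but degrades general capabilities is excluded rather than rewarded.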

Abstract

Recent advances in table understanding have focused on instruction-tuning large language models (LLMs) for table-related tasks. However, existing research has overlooked the impact of hyperparameter choices, and also lacks a comprehensive evaluation of the out-of-domain table understanding ability and the general capabilities of these table LLMs. In this paper, we evaluate these abilities in existing table LLMs, and find significant declines in both out-of-domain table understanding and general capabilities as compared to their base models. Through systematic analysis, we show that hyperparameters, such as learning rate, can significantly influence both table-specific and general capabilities. Contrary to the previous table instruction-tuning work, we demonstrate that smaller learning rates and fewer training instances can enhance table understanding while preserving general…

Peer Reviews

Decision · Submitted to ICLR 2025

Reviewer 01 · Rating 8 · Confidence 4

Strengths

- Comprehensive Analysis: The paper provides a thorough analysis of existing table LLMs, highlighting the importance of hyperparameter tuning, which is often neglected in other studies.
- Empirical Findings: The findings are helpful to subsequent research on applying LLMs to tabular tasks. The effect of the learning rate is close to my previous experience in hyperparameter tuning while further training LLMs on tables.
- Based on the finding, the proposed TAMA demonstrates a

Weaknesses

These experiments were only based on Llama models. Further examination is needed for other LLMs, e.g., Qwen-* series, Phi-* series, etc. Whether the findings still hold requires further analysis.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Identification of Over-Specialization Issues in Table LLMs: This work reveals a critical trend in current table LLMs, where excessive fine-tuning for table-specific tasks often compromises the models' generalization capabilities. By shedding light on the potential trade-offs, the study encourages the community to reconsider whether the performance gains claimed in state-of-the-art table LLMs justify the loss of general capabilities. This insight points to a promising direction for future rese

Weaknesses

1. Limited Reproducibility Due to Closed Code and Weights: While the study’s findings are promising, the lack of open-source training and evaluation scripts, data, and TAMA’s model parameters during the review period restricts reproducibility. Although the study’s conclusions align with similar issues I observed in table LLMs like TableLLaMA and TableLLM, having access to TAMA’s scripts and parameters would allow a comprehensive verification of the training process and performance claims of TAMA

Reviewer 03 · Rating 5 · Confidence 4

Strengths

+ The paper illuminates the limitations and weaknesses of existing table LLMs.
+ TAMA performs well and is a good contribution to the open-source LLM community. People can do follow-up work on TAMA to study LLMs' specific capabilities.

Weaknesses

While TAMA is a good contribution to the community, I am personally not sure about the contribution of this paper itself. The takeaway for training TAMA seems to be just a good hyperparameter choice + multi-task training. The latter is a standard way and the former does not provide too much insight - I am not saying hyperparameter tuning/search is meaningless, and my point is that the current Section 3 is more like just an experimental report, where people (at least I) might want to see some fre

Code & Models

Repositories

spursgozmy/tabledreamer
pytorch

Datasets

MichiganNLP/TAMA_Instruct
dataset· 178 dl

Videos

Rethinking Table Instruction Tuning · underline

Taxonomy

Topics: Multimedia Communication and Technology · Video Analysis and Summarization · Experimental Learning in Engineering

Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Weight Decay · Multi-Head Attention · Position-Wise Feed-Forward Layer · Label Smoothing