Robust LLM-based Column Type Annotation via Prompt Augmentation with LoRA Tuning

Hanze Meng; Jianhao Cao; Rachel Pottinger

arXiv:2512.22742 · cs.DB · December 30, 2025

Open Access

TL;DR

This paper introduces a parameter-efficient LoRA-based prompt augmentation method for column type annotation, improving robustness and accuracy across datasets and prompt variations without extensive re-training.

Contribution

The paper proposes an efficient fine-tuning framework that combines LoRA with prompt augmentation to improve robustness and performance in column type annotation tasks.

Findings

01 Achieves higher weighted F1 scores than single-prompt fine-tuning.

02 Maintains stable performance across diverse prompt patterns.

03 Reduces computational costs compared to full model fine-tuning.
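
As a rough illustration of why LoRA tuning is cheaper than full fine-tuning, the sketch below counts trainable parameters for one weight matrix. LoRA freezes the original matrix and trains two low-rank factors, so the trainable count is rank × (d_out + d_in) rather than d_out × d_in. The layer size (4096 × 4096) and rank (8) are illustrative assumptions, not values reported by the paper.

```python
def full_trainable_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_trainable_params(D_OUT, D_IN)
lora = lora_trainable_params(D_OUT, D_IN, RANK)
fraction = lora / full

print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank {RANK}): {lora} parameters")
print(f"fraction trained: {fraction:.4%}")
```

Under these assumed numbers, LoRA trains well under one percent of the parameters of the adapted matrix; the actual savings depend on the model, the rank, and which layers are adapted.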

Abstract

Column Type Annotation (CTA) is a fundamental step towards enabling schema alignment and semantic understanding of tabular data. Existing encoder-only language models achieve high accuracy when fine-tuned on labeled columns, but their applicability is limited to in-domain settings, as distribution shifts in tables or label spaces require costly re-training from scratch. Recent work has explored prompting generative large language models (LLMs) by framing CTA as a multiple-choice task, but these approaches face two key challenges: (1) model performance is highly sensitive to subtle changes in prompt wording and structure, and (2) annotation F1 scores remain modest. A natural extension is to fine-tune large language models. However, fully fine-tuning these models incurs prohibitive computational costs due to their scale, and the sensitivity to prompts is not eliminated. In this paper, we…
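
The multiple-choice framing and the idea of prompt augmentation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the templates, the `augment` function, and the example labels are all hypothetical, and the paper's actual prompt patterns are not shown in this excerpt. The sketch renders the same labeled column under several wordings and option orderings, so that a fine-tuned model sees varied prompts for one training example.

```python
import random

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Hypothetical prompt templates; the paper's real templates are not given here.
TEMPLATES = [
    "Column values: {values}\nOptions:\n{options}\nAnswer with the letter of the best type.",
    "Given the cells [{values}], which semantic type fits this column?\n{options}\nReply with one letter.",
    "Pick the column type.\nCells: {values}\nChoices:\n{options}\nAnswer:",
]


def augment(values, labels, gold, n_variants, seed=0):
    """Build n_variants multiple-choice prompts for one labeled column.

    Each variant uses a randomly chosen template and a shuffled option order;
    the answer letter is recomputed so it always points at the gold label.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n_variants):
        order = list(labels)
        rng.shuffle(order)
        options = "\n".join(f"{LETTERS[i]}. {lab}" for i, lab in enumerate(order))
        prompt = rng.choice(TEMPLATES).format(values=", ".join(values), options=options)
        examples.append({"prompt": prompt, "answer": LETTERS[order.index(gold)]})
    return examples


# Hypothetical column and label space, for illustration only.
examples = augment(
    values=["Vancouver", "Toronto", "Montreal"],
    labels=["city", "country", "person", "date"],
    gold="city",
    n_variants=4,
)
for ex in examples:
    print(ex["prompt"])
    print("->", ex["answer"])
    print()
```

Shuffling the options alongside the wording is one plausible design choice: it keeps the model from associating a label with a fixed answer letter.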

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks