LIFT: Last-Mile Fine-Tuning for Table Explicitation
Divij Khaitan, Ashish Tiwari

TL;DR
This paper introduces Lift, a last-mile fine-tuning pipeline for extracting tables from unstructured text: a pre-trained large language model produces an initial table, and a small fine-tuned language model repairs it. The approach is most useful when training data is limited.
Contribution
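The paper presents a last-mile fine-tuning approach in which a fine-tuned small language model repairs a table first extracted by a pre-trained large language model. It matches the accuracy of end-to-end fine-tuning with less training data and is more robust to input format variability.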
Findings
Lift matches or exceeds end-to-end SLM fine-tuning in TEDS on a benchmark of 2,596 tables from three datasets.
Requires as few as 1,000 training examples for effective performance.
More robust to input format variability than the compared approaches (self-debug and end-to-end fine-tuning).
Abstract
We propose last-mile fine-tuning, or Lift, a pipeline in which a pre-trained large language model extracts an initial table from unstructured clipboard text, and a fine-tuned small language model (SLM, 1B-24B parameters) repairs errors in the extracted table. On a benchmark of 2,596 tables from three datasets, Lift matches or exceeds end-to-end SLM fine-tuning on the tree-edit-distance-based similarity (TEDS) metric while requiring as few as 1,000 training examples, where it outperforms end-to-end fine-tuning by up to 0.144 TEDS points. We term this approach last-mile fine-tuning and show that it is also more robust to input format variability. Comparisons with self-debug and end-to-end fine-tuning approaches show that last-mile fine-tuning provides an attractive option when training data is limited or when robustness to input variation is sought without compromising on accuracy.
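The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `lift_pipeline`, `toy_extract`, and `toy_repair` are hypothetical, the toy functions are plain string operations standing in for the pre-trained LLM and the fine-tuned SLM, and passing the source text to the repair stage is an assumption the abstract does not confirm.

```python
from typing import Callable, List

# A table is modeled here as a list of rows, each a list of cell strings.
Table = List[List[str]]


def lift_pipeline(
    text: str,
    extract: Callable[[str], Table],
    repair: Callable[[str, Table], Table],
) -> Table:
    """Run extraction, then repair (hypothetical two-stage wrapper)."""
    initial = extract(text)        # stage 1: the pre-trained LLM would go here
    return repair(text, initial)   # stage 2: the fine-tuned SLM would go here


def toy_extract(text: str) -> Table:
    # Stand-in for the LLM: split each line on commas, leaving rough edges.
    return [line.split(",") for line in text.strip().splitlines()]


def toy_repair(text: str, table: Table) -> Table:
    # Stand-in for the SLM: trim cell whitespace and pad ragged rows.
    # The source text is accepted but unused in this toy version.
    width = max((len(row) for row in table), default=0)
    return [
        [cell.strip() for cell in row] + [""] * (width - len(row))
        for row in table
    ]


if __name__ == "__main__":
    print(lift_pipeline("name, age\nAda, 36\nBob", toy_extract, toy_repair))
```

Keeping both stages as plain callables means only the repair stage would need training in such a setup, which is the property the abstract credits for the low data requirement.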
