TabReX : Tabular Referenceless eXplainable Evaluation

Tejas Anvekar; Junha Park; Aparna Garimella; Vivek Gupta

arXiv:2512.15907 · cs.CL · April 22, 2026

TL;DR

TabReX is a novel reference-less, graph-based evaluation framework for assessing the quality of tables generated by large language models, emphasizing interpretability and robustness.

Contribution

The paper introduces a property-driven, graph-based metric that aligns source text with generated tables without relying on fixed references, improving robustness and interpretability.

Findings

01

TabReX achieves the highest correlation with expert rankings.

02

It remains stable under challenging perturbations.

03

It enables detailed model and prompt analysis.

Abstract

Evaluating the quality of tables generated by large language models (LLMs) remains an open challenge: existing metrics either flatten tables into text, ignoring structure, or rely on fixed references that limit generalization. We present TabReX, a reference-less, property-driven framework for evaluating tabular generation via graph-based reasoning. TabReX converts both source text and generated tables into canonical knowledge graphs, aligns them through an LLM-guided matching process, and computes interpretable, rubric-aware scores that quantify structural and factual fidelity. The resulting metric provides controllable trade-offs between sensitivity and specificity, yielding human-aligned judgments and cell-level error traces. To systematically assess metric robustness, we introduce TabReX-Bench, a large-scale benchmark spanning six domains and twelve planner-driven perturbation types…
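
To make the graph-comparison idea in the abstract concrete, the following is a minimal illustrative sketch and not the TabReX implementation. It assumes a table can be reduced to (row entity, column attribute, cell value) triples, uses hand-written source triples in place of the text-to-graph conversion, swaps in exact matching after normalization for the LLM-guided alignment, and reports plain precision and recall rather than the paper's rubric-aware scores. All function names here are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumption: one triple per cell, keyed by the row's first column.
Triple = Tuple[str, str, str]  # (row entity, column attribute, cell value)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def table_to_triples(header: List[str], rows: List[List[str]]) -> Set[Triple]:
    """Treat the first column as the row entity and emit one triple per remaining cell."""
    triples: Set[Triple] = set()
    for row in rows:
        entity = normalize(row[0])
        for attribute, value in zip(header[1:], row[1:]):
            triples.add((entity, normalize(attribute), normalize(value)))
    return triples


def score(source: Set[Triple], generated: Set[Triple]) -> Dict[str, object]:
    """Compare triple sets by exact match (a stand-in for LLM-guided alignment)."""
    matched = source & generated
    precision = len(matched) / len(generated) if generated else 0.0
    recall = len(matched) / len(source) if source else 0.0
    return {
        "precision": precision,
        "recall": recall,
        # Cell-level error trace: triples with no counterpart on the other side.
        "unsupported": sorted(generated - source),
        "missing": sorted(source - generated),
    }


# Hypothetical facts, written by hand as if extracted from the source text.
source_triples: Set[Triple] = {
    ("alice", "age", "30"),
    ("alice", "city", "paris"),
    ("bob", "age", "25"),
}

# Hypothetical generated table with one wrong cell and one unsupported cell.
generated_triples = table_to_triples(
    header=["Name", "Age", "City"],
    rows=[["Alice", "30", "Paris"], ["Bob", "26", "Berlin"]],
)

report = score(source_triples, generated_triples)
print(report)
```

In this toy case the `unsupported` and `missing` lists play the role of a cell-level error trace: they name the generated cells that the source does not support and the source facts the table omits.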

Code & Models

Repositories

coral-asu/TabReX
github

Datasets

corallabasu/TabReX_Bench
dataset· 8 dl
