TL;DR
TabXEval is an exhaustive, explainable two-phase framework for table evaluation: it first aligns the reference and predicted tables structurally, then compares them in detail, aiming for more accurate and interpretable results than standard metrics.
Contribution
The paper presents a two-phase evaluation method, structural alignment (TabAlign) followed by granular semantic and syntactic comparison (TabCompare), along with TabXBench, a multi-domain benchmark with realistic table perturbations and human annotations.
Findings
The authors report that TabXEval is more robust and more explainable than existing table evaluation metrics.
The framework provides interpretable feedback for table comparison.
Evaluation on TabXBench demonstrates high sensitivity and specificity.
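For readers unfamiliar with the terms, the sketch below computes sensitivity and specificity using their standard definitions. It is only an illustration: the summary does not say how the paper defines a positive or negative case, so the function name `sensitivity_specificity` and the toy labels (`truth` for whether a table really contains an error, `flagged` for whether a metric reported one) are assumptions.

```python
def sensitivity_specificity(truth, flagged):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth[i]   -- True if item i really contains an error (hypothetical label)
    flagged[i] -- True if the metric reported an error for item i
    """
    if len(truth) != len(flagged):
        raise ValueError("truth and flagged must have the same length")

    pairs = list(zip(truth, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # real errors that were caught
    fn = sum(1 for t, f in pairs if t and not f)      # real errors that were missed
    tn = sum(1 for t, f in pairs if not t and not f)  # clean items left alone
    fp = sum(1 for t, f in pairs if not t and f)      # clean items wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative item")

    sensitivity = tp / (tp + fn)  # share of real errors detected
    specificity = tn / (tn + fp)  # share of clean items correctly passed
    return sensitivity, specificity


# Toy labels, invented for illustration only.
truth = [True, True, True, False, False]
flagged = [True, True, False, False, True]
sens, spec = sensitivity_specificity(truth, flagged)
```

With these toy labels, sensitivity is 2/3 (two of three real errors caught) and specificity is 1/2 (one of two clean items passed).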
Abstract
Evaluating tables qualitatively and quantitatively poses a significant challenge, as standard metrics often overlook subtle structural and content-level discrepancies. To address this, we propose a rubric-based evaluation framework that integrates multi-level structural descriptors with fine-grained contextual signals, enabling more precise and consistent table comparison. Building on this, we introduce TabXEval, an eXhaustive and eXplainable two-phase evaluation framework. TabXEval first aligns reference and predicted tables structurally via TabAlign, then performs semantic and syntactic comparison using TabCompare, offering interpretable and granular feedback. We evaluate TabXEval on TabXBench, a diverse, multi-domain benchmark featuring realistic table perturbations and human annotations. A sensitivity-specificity analysis further demonstrates the robustness and explainability of…
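The align-then-compare flow described in the abstract can be illustrated with a minimal sketch. This is not the paper's TabAlign or TabCompare: the functions `normalize`, `align`, and `compare`, the list-of-dicts table representation, and the matching rule (case-insensitive exact match on headers and on a key column) are all assumptions chosen to keep the example short.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different labels match."""
    return " ".join(str(text).lower().split())


def index_rows(table, key):
    """Map each normalized key value to its row, with normalized headers."""
    rows = {}
    for row in table:
        norm_row = {normalize(header): value for header, value in row.items()}
        rows[normalize(norm_row[normalize(key)])] = norm_row
    return rows


def all_columns(rows):
    """Collect every header that appears in any row."""
    cols = set()
    for row in rows.values():
        cols.update(row)
    return cols


def align(reference, predicted, key):
    """Phase 1 (toy): pair rows by key value and columns by header name."""
    ref_rows = index_rows(reference, key)
    pred_rows = index_rows(predicted, key)
    ref_cols, pred_cols = all_columns(ref_rows), all_columns(pred_rows)
    return {
        "ref_rows": ref_rows,
        "pred_rows": pred_rows,
        "shared_rows": sorted(ref_rows.keys() & pred_rows.keys()),
        "missing_rows": sorted(ref_rows.keys() - pred_rows.keys()),
        "extra_rows": sorted(pred_rows.keys() - ref_rows.keys()),
        "shared_columns": sorted(ref_cols & pred_cols),
        "missing_columns": sorted(ref_cols - pred_cols),
        "extra_columns": sorted(pred_cols - ref_cols),
    }


def compare(alignment):
    """Phase 2 (toy): list cells that differ within the aligned region."""
    mismatches = []
    for row_key in alignment["shared_rows"]:
        for col in alignment["shared_columns"]:
            ref_val = alignment["ref_rows"][row_key].get(col)
            pred_val = alignment["pred_rows"][row_key].get(col)
            if normalize(ref_val) != normalize(pred_val):
                mismatches.append((row_key, col, ref_val, pred_val))
    return mismatches


# Invented example tables, for illustration only.
reference = [
    {"City": "Paris", "Population": "2.1M", "Country": "France"},
    {"City": "Rome", "Population": "2.8M", "Country": "Italy"},
]
predicted = [
    {"city": "paris", "Population": "2.1M"},
    {"city": "Rome", "Population": "3.0M"},
    {"city": "Oslo", "Population": "0.7M"},
]

alignment = align(reference, predicted, key="City")
mismatches = compare(alignment)
```

Separating the two phases keeps structural errors (here, a missing `country` column and an extra `oslo` row) distinct from content errors (the differing population for `rome`), which is what makes the feedback readable. A real system would need fuzzier matching than exact string equality.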
