TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data

Fan Zhou; Mengkang Hu; Haoyu Dong; Zhoujun Cheng; Shi Han; Dongmei Zhang

arXiv:2205.12682·cs.IR·May 26, 2022

TL;DR

TaCube pre-computes aggregation and arithmetic results over tables and supplies them to PLMs as extra input, significantly improving accuracy on numerical-reasoning benchmarks such as TAT-QA and WikiTQ.

Contribution

The paper introduces a pre-computation method that covers a collection of aggregation and arithmetic operations over table segments, boosting PLMs' numerical reasoning performance.

Findings

01. F1 score on TAT-QA improved from 49.6% to 66.2%.

02. Achieved state-of-the-art 59.6% denotation accuracy on WikiTQ.

03. Significant gains in numerical reasoning accuracy, e.g., +39.6% on sum.

Abstract

Existing auto-regressive pre-trained language models (PLMs) like T5 and BART have been applied successfully to table question answering by UNIFIEDSKG and TAPEX, respectively, and have demonstrated state-of-the-art results on multiple benchmarks. However, auto-regressive PLMs are challenged by recently emerging numerical reasoning datasets, such as TAT-QA, because their implicit calculation is error-prone. In this paper, we present TaCube, which pre-computes aggregation/arithmetic results for the table in advance so that they are readily available for PLMs to answer numerical reasoning questions. TaCube systematically and comprehensively covers a collection of computational operations over table segments. Simply concatenating TaCube to the input sequence of PLMs proves highly effective in experiments. TaCube promotes the F1 score from 49.6% to 66.2% on TAT-QA and achieves new…
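The idea of pre-computing results and concatenating them to the model input can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `precompute_cube` and `build_input`, the choice of operations (sum, average, pairwise difference), the separators, and the sample revenue figures are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List


def precompute_cube(header: str, rows: Dict[str, float]) -> List[str]:
    """Pre-compute a few aggregation/arithmetic results over one numeric column.

    Hypothetical helper: the operation set and text format are illustrative only.
    """
    values = list(rows.values())
    if not values:
        return []
    facts = [
        f"sum of {header}: {sum(values):g}",
        f"average of {header}: {sum(values) / len(values):g}",
    ]
    # Pairwise differences between cells, later row minus earlier row.
    for (name_a, a), (name_b, b) in combinations(rows.items(), 2):
        facts.append(f"{header} difference {name_b} - {name_a}: {b - a:g}")
    return facts


def build_input(question: str, flat_table: str, facts: List[str]) -> str:
    """Concatenate question, flattened table, and pre-computed facts into one string."""
    return f"{question} | {flat_table} | " + " ; ".join(facts)


# Made-up sample data, not taken from TAT-QA or WikiTQ.
revenue = {"2019": 120.0, "2020": 150.0, "2021": 180.0}
facts = precompute_cube("revenue", revenue)
model_input = build_input(
    "What is the total revenue?",
    "year : revenue | 2019 : 120 | 2020 : 150 | 2021 : 180",
    facts,
)
print(model_input)
```

With the answer to "What is the total revenue?" already present in the input as `sum of revenue: 450`, the model can copy the value rather than compute it implicitly, which is the kind of error-prone step the abstract describes.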

Code & Models

Repositories

koalazf99/tacube (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Methods: Attention Is All You Need · Linear Layer · Table Pre-training via Execution · Inverse Square Root Schedule · SentencePiece · Attention Dropout · Adafactor · Residual Connection · Gated Linear Unit · Multi-Head Attention