RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations

Yilun Zhao; Chen Zhao; Linyong Nan; Zhenting Qi; Wenlin Zhang; Xiangru Tang; Boyu Mi; Dragomir Radev

arXiv:2306.14321 · cs.CL · June 27, 2023

TL;DR

This paper introduces RobuT, a benchmark for evaluating the robustness of Table QA models against human-annotated adversarial perturbations. It shows that existing models are vulnerable to these perturbations and proposes augmenting training with adversarial examples generated by large language models to improve robustness.

Contribution

The paper presents RobuT, a new benchmark of human-annotated adversarial perturbations for Table QA, and demonstrates that training on adversarial examples generated by large language models improves model robustness.

Findings

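01 State-of-the-art Table QA models, as well as few-shot large language models such as GPT-3, are vulnerable to adversarial perturbations of the table header, table content, and question.

02 Large language models can generate effective adversarial examples.

03 Training on these generated adversarial examples significantly improves the robustness of Table QA models.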

Abstract

Despite significant progress in question answering on tabular data (Table QA), it is unclear whether, and to what extent, existing Table QA models are robust to task-specific perturbations, e.g., replacing key question entities or shuffling table columns. To systematically study the robustness of Table QA models, we propose a benchmark called RobuT, which builds upon existing Table QA datasets (WTQ, WikiSQL-Weak, and SQA) and includes human-annotated adversarial perturbations in terms of table header, table content, and question. Our results indicate that both state-of-the-art Table QA models and large language models (e.g., GPT-3) with few-shot learning falter on these adversarial sets. We propose to address this problem by using large language models to generate adversarial examples to enhance training, which significantly improves the robustness of Table QA models. Our…
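
To make the "shuffling table columns" perturbation mentioned in the abstract concrete, here is a minimal Python sketch. It is an illustration only and not the benchmark's procedure: RobuT's perturbations are human-annotated, whereas this is a simple programmatic shuffle. The table layout (a dict with `header` and `rows`), the helper names `shuffle_columns` and `lookup`, and the toy data are all assumptions made for this example.

```python
import random
from typing import Dict, List, Optional

# Hypothetical table layout for this sketch: {"header": [...], "rows": [[...], ...]}
Table = Dict[str, List]


def shuffle_columns(table: Table, seed: int = 0) -> Table:
    """Return a copy of the table with its columns in a permuted order.

    Header cells and row cells are permuted with the same index order,
    so every cell stays under its original column name.
    """
    header = table["header"]
    order = list(range(len(header)))
    random.Random(seed).shuffle(order)  # seeded for reproducibility
    return {
        "header": [header[i] for i in order],
        "rows": [[row[i] for i in order] for row in table["rows"]],
    }


def lookup(table: Table, key_col: str, key_val: str, target_col: str) -> Optional[str]:
    """Toy stand-in for a Table QA model: find a row by key, return one cell."""
    k = table["header"].index(key_col)
    t = table["header"].index(target_col)
    for row in table["rows"]:
        if row[k] == key_val:
            return row[t]
    return None


# Toy data, invented purely for illustration.
table = {
    "header": ["Name", "Team", "Points"],
    "rows": [["Ana", "Red", "12"], ["Ben", "Blue", "7"]],
}
perturbed = shuffle_columns(table, seed=1)

# The gold answer does not depend on column order, so a robust system
# should return the same answer on the original and perturbed tables.
original_answer = lookup(table, "Name", "Ben", "Points")
perturbed_answer = lookup(perturbed, "Name", "Ben", "Points")
```

Because the lookup resolves columns by name rather than by position, it is unaffected by the shuffle. A model that implicitly relies on column position would not have this guarantee, which is the kind of sensitivity a robustness benchmark is meant to expose.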

Code & Models

Repositories

yilunzhao/robut (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Interpreting and Communication in Healthcare