AIT-QA: Question Answering Dataset over Complex Tables in the Airline Industry
Yannis Katsis, Saneem Chemmengath, Vishwajeet Kumar, Samarth Bharadwaj, Mustafa Canim, Michael Glass, Alfio Gliozzo, Feifei Pan, Jaydeep Sen, Karthik Sankaranarayanan, Soumen Chakrabarti

TL;DR
This paper introduces AIT-QA, a new dataset for question answering over complex airline industry tables, highlighting the limitations of current transformer models on domain-specific, hierarchically structured tables.
Contribution
The paper presents AIT-QA, a domain-specific dataset with complex table layouts and annotations, and evaluates existing models, revealing their limited performance on such data.
Findings
State-of-the-art models achieve only 51.8% accuracy on AIT-QA.
Complex table structures pose significant challenges for current transformer-based QA systems.
Pragmatic preprocessing improves model compatibility with complex table layouts.
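To make the last finding concrete, here is a minimal sketch of one way a hierarchical column header could be collapsed into the single flat header row that Wikipedia-style Table QA models expect. This summary does not describe the paper's actual preprocessing steps, so the function name `flatten_headers`, the separator, and the sample header cells are all illustrative assumptions rather than the authors' method or data.

```python
def flatten_headers(header_rows, sep=" | "):
    """Collapse several header rows into one row of path-style column names.

    header_rows: list of equally long lists of strings, top header row first.
    Spanned (merged) cells are assumed to be repeated in every column they cover.
    """
    n_cols = len(header_rows[0])
    flat = []
    for col in range(n_cols):
        parts = []
        for row in header_rows:
            cell = row[col].strip()
            # Skip empty cells and immediate repeats of the same label.
            if cell and (not parts or parts[-1] != cell):
                parts.append(cell)
        flat.append(sep.join(parts))
    return flat


# Hypothetical two-level header, invented for illustration (not taken from AIT-QA).
header_rows = [
    ["", "Year ended December 31", "Year ended December 31"],
    ["Metric", "2018", "2017"],
]
flat = flatten_headers(header_rows)
print(flat)
```

Each flattened name keeps the full header path, so a model that reads only one header row still sees both the parent and the child label. Hierarchical row headers could be handled the same way by applying the function to the transposed header columns.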
Abstract
Recent advances in transformers have enabled Table Question Answering (Table QA) systems to achieve high accuracy and SOTA results on open-domain datasets like WikiTableQuestions and WikiSQL. Such transformers are frequently pre-trained on open-domain content such as Wikipedia, where they effectively encode questions and corresponding tables from Wikipedia as seen in Table QA datasets. However, web tables in Wikipedia are notably flat in their layout, with the first row as the sole column header. This layout lends itself to a relational view of tables where each row is a tuple. In contrast, tables in domain-specific business or scientific documents often have a much more complex layout, including hierarchical row and column headers, in addition to having specialized vocabulary terms from that domain. To address this problem, we introduce the domain-specific Table QA dataset AIT-QA (Airline…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Data Quality and Management
Methods: TaBERT
