MATE: Multi-view Attention for Table Transformer Efficiency

Julian Martin Eisenschlos; Maharshi Gor; Thomas Müller; William W. Cohen

arXiv:2109.04312 · cs.CL · September 10, 2021

TL;DR

MATE introduces a sparse-attention Transformer architecture optimized for large web tables, enabling efficient modeling of extensive tabular data and achieving state-of-the-art results in table reasoning tasks.

Contribution

The paper presents MATE, a sparse-attention Transformer whose heads attend to either the rows or the columns of a table. It scales linearly in speed and memory, has an inductive bias suited to tabular data, and sets a new state of the art on three table reasoning datasets.

Findings

01 Handles documents containing more than 8000 tokens on current accelerators.

02 Sets a new state of the art on three table reasoning datasets.

03 Improves HybridQA results by 19 points.

Abstract

This work presents a sparse-attention Transformer architecture for modeling documents that contain large tables. Tables are ubiquitous on the web, and are rich in information. However, more than 20% of relational tables on the web have 20 or more rows (Cafarella et al., 2008), and these large tables present a challenge for current Transformer models, which are typically limited to 512 tokens. Here we propose MATE, a novel Transformer architecture designed to model the structure of web tables. MATE uses sparse attention in a way that allows heads to efficiently attend to either rows or columns in a table. This architecture scales linearly with respect to speed and memory, and can handle documents containing more than 8000 tokens with current accelerators. MATE also has a more appropriate inductive bias for tabular data, and sets a new state-of-the-art for three table reasoning datasets.…
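
The row and column attention pattern described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the helper names `build_mask` and `allowed`, the toy 2x3 table, and the rule "a token may attend to another token when they share a row (or column) index" are assumptions made for illustration only. The sketch builds dense masks to show which token pairs a "row head" or a "column head" would see; it does not reproduce the efficient computation that gives MATE its linear scaling.

```python
def build_mask(ids):
    """Return an n x n boolean mask (illustrative helper, not from the paper).

    mask[i][j] is True when token i may attend to token j, which in this
    sketch means both tokens carry the same id (same row or same column).
    """
    n = len(ids)
    return [[ids[i] == ids[j] for j in range(n)] for i in range(n)]


def allowed(mask, i):
    """List the token positions that token i may attend to under the mask."""
    return [j for j, ok in enumerate(mask[i]) if ok]


# Hypothetical 2x3 table flattened row by row, one token per cell.
row_ids = [0, 0, 0, 1, 1, 1]
col_ids = [0, 1, 2, 0, 1, 2]

row_head_mask = build_mask(row_ids)  # pattern for a head restricted to rows
col_head_mask = build_mask(col_ids)  # pattern for a head restricted to columns

print(allowed(row_head_mask, 0))  # [0, 1, 2]: the cells in the first row
print(allowed(col_head_mask, 0))  # [0, 3]: the cells in the first column

# Number of token pairs each head type keeps, versus 36 for full attention.
print(sum(map(sum, row_head_mask)))  # 18
print(sum(map(sum, col_head_mask)))  # 12
```

In this toy case each head type keeps only a fraction of the 36 pairs that full attention would compute, and that fraction shrinks as the table grows, which is the intuition behind restricting heads to rows or columns.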

Code & Models

Repositories

google-research/tapas (TensorFlow, official)

Taxonomy

Topics: Data Quality and Management · Topic Modeling · Machine Learning in Healthcare

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · MATE · Dropout · Layer Normalization · Softmax · Label Smoothing