Syntax-informed Question Answering with Heterogeneous Graph Transformer

Fangyi Zhu; Lok You Tan; See-Kiong Ng; Stéphane Bressan

arXiv:2204.09655 · cs.CL · May 24, 2022

TL;DR

This paper introduces a method to enhance pre-trained transformer models for question answering by incorporating explicit linguistic knowledge through a heterogeneous graph transformer, improving performance without retraining from scratch.

Contribution

It presents a novel approach to integrate symbolic linguistic information into pre-trained models using a heterogeneous graph transformer, extending their capabilities for question answering.

Findings

1. The approach is competitive with baseline models like BERT on SQuAD.
2. Incorporating syntactic structures improves question answering performance.
3. The method is extensible to other linguistic features like semantics and pragmatics.

Abstract

Large neural language models are steadily contributing state-of-the-art performance to question answering and other natural language and information processing tasks. These models are expensive to train. We propose to evaluate whether such pre-trained models can benefit from the addition of explicit linguistics information without requiring retraining from scratch. We present a linguistics-informed question answering approach that extends and fine-tunes a pre-trained transformer-based neural language model with symbolic knowledge encoded with a heterogeneous graph transformer. We illustrate the approach by the addition of syntactic information in the form of dependency and constituency graphic structures connecting tokens and virtual vertices. A comparative empirical performance evaluation with BERT as its baseline and with Stanford Question Answering Dataset demonstrates the…
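The abstract describes syntactic information as dependency and constituency structures that connect tokens and virtual vertices. The following is a minimal sketch of how such a heterogeneous graph might be laid out as typed node and edge lists before being passed to a graph encoder. It is not the authors' implementation: the function name `build_syntax_graph`, the edge-type names `dep` and `parent_of`, and the hand-written parse of the example question are all assumptions made for illustration.

```python
from collections import defaultdict


def build_syntax_graph(tokens, dep_edges, constituents):
    """Assemble a heterogeneous graph from two kinds of syntactic structure.

    tokens:       list of token strings (one "token" node each)
    dep_edges:    list of (head_idx, dependent_idx, relation_label)
    constituents: list of (phrase_label, children), where each child is
                  ("token", i) or ("constituent", j); every entry becomes
                  one virtual "constituent" node
    """
    # Two node types: real tokens and virtual phrase-level vertices.
    nodes = {
        "token": list(tokens),
        "constituent": [label for label, _ in constituents],
    }

    # Edges are grouped by (source type, relation, target type).
    edges = defaultdict(list)
    for head, dep, label in dep_edges:
        edges[("token", "dep", "token")].append((head, dep, label))
    for j, (_, children) in enumerate(constituents):
        for child_type, child_idx in children:
            edges[("constituent", "parent_of", child_type)].append((j, child_idx))
    return nodes, dict(edges)


# Hypothetical hand-written parse of a three-token question.
tokens = ["Who", "wrote", "Hamlet"]
dep_edges = [(1, 0, "nsubj"), (1, 2, "obj")]
constituents = [
    ("VP", [("token", 1), ("token", 2)]),
    ("S", [("token", 0), ("constituent", 0)]),
]

nodes, edges = build_syntax_graph(tokens, dep_edges, constituents)
for edge_type, pairs in edges.items():
    print(edge_type, pairs)
```

Grouping edges by a (source type, relation, target type) triple is a common layout for heterogeneous graph libraries, since it lets an encoder apply different parameters per node and edge type. Whether the paper uses this exact layout is not stated in the abstract.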

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Adam · Multi-Head Attention · Residual Connection · Dense Connections · Attention Dropout · Softmax · Dropout · Linear Warmup With Linear Decay