Syntax-informed Question Answering with Heterogeneous Graph Transformer
Fangyi Zhu, Lok You Tan, See-Kiong Ng, Stéphane Bressan

TL;DR
This paper introduces a method to enhance pre-trained transformer models for question answering by incorporating explicit linguistic knowledge through a heterogeneous graph transformer, improving performance without retraining from scratch.
Contribution
It presents a novel approach to integrate symbolic linguistic information into pre-trained models using a heterogeneous graph transformer, extending their capabilities for question answering.
Findings
The approach is competitive with its BERT baseline on the Stanford Question Answering Dataset (SQuAD).
Incorporating syntactic structures improves question answering performance.
The method is extensible to other linguistic features like semantics and pragmatics.
Abstract
Large neural language models are steadily contributing state-of-the-art performance to question answering and other natural language and information processing tasks. These models are expensive to train. We propose to evaluate whether such pre-trained models can benefit from the addition of explicit linguistics information without requiring retraining from scratch. We present a linguistics-informed question answering approach that extends and fine-tunes a pre-trained transformer-based neural language model with symbolic knowledge encoded with a heterogeneous graph transformer. We illustrate the approach by the addition of syntactic information in the form of dependency and constituency graphic structures connecting tokens and virtual vertices. A comparative empirical performance evaluation with BERT as its baseline and with Stanford Question Answering Dataset demonstrates the…
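As a rough illustration of the kind of structure the abstract describes, the sketch below builds a small heterogeneous graph in which tokens are one vertex type, constituency phrases are virtual vertices of a second type, and dependency and constituency relations are separate edge types. It is not the authors' implementation: the class `HeteroGraph`, the function `build_syntax_graph`, the edge-type names, and the hand-written parse of the toy sentence are all assumptions made for this example, and a real pipeline would take its parses from a syntactic parser.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    """Hypothetical container: typed vertices plus edges grouped by type triple."""
    node_type: dict = field(default_factory=dict)    # vertex id -> "token" | "constituent"
    node_label: dict = field(default_factory=dict)   # vertex id -> word or phrase label
    # (source type, relation, target type) -> list of (source id, target id)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, node_id, ntype, label):
        self.node_type[node_id] = ntype
        self.node_label[node_id] = label

    def add_edge(self, src, relation, dst):
        key = (self.node_type[src], relation, self.node_type[dst])
        self.edges[key].append((src, dst))


def build_syntax_graph(tokens, dependencies, constituents):
    """Combine a dependency parse and a constituency parse into one graph.

    tokens:       list of words
    dependencies: list of (head index, dependent index, relation label)
    constituents: list of (phrase id, phrase label, children), where each
                  child is a token index (int) or another phrase id (str)
    """
    graph = HeteroGraph()
    for i, word in enumerate(tokens):
        graph.add_node(("tok", i), "token", word)
    # One virtual vertex per phrase; added before edges so children resolve.
    for phrase_id, label, _ in constituents:
        graph.add_node(("con", phrase_id), "constituent", label)
    # Dependency arcs connect tokens directly, one edge type per relation.
    for head, dependent, relation in dependencies:
        graph.add_edge(("tok", head), "dep:" + relation, ("tok", dependent))
    # Constituency edges connect each virtual vertex to its children.
    for phrase_id, _, children in constituents:
        for child in children:
            child_id = ("tok", child) if isinstance(child, int) else ("con", child)
            graph.add_edge(("con", phrase_id), "const", child_id)
    return graph


# Toy sentence with a hand-written (assumed) parse.
tokens = ["the", "cat", "sleeps"]
dependencies = [(1, 0, "det"), (2, 1, "nsubj")]
constituents = [
    ("NP0", "NP", [0, 1]),
    ("VP0", "VP", [2]),
    ("S0", "S", ["NP0", "VP0"]),
]
graph = build_syntax_graph(tokens, dependencies, constituents)

for edge_type, pairs in graph.edges.items():
    print(edge_type, len(pairs))
```

Keeping edges keyed by a (source type, relation, target type) triple mirrors how heterogeneous graph models typically give each vertex and edge type its own parameters; how the paper attaches such a graph to the pre-trained model's token representations is not specified in the text above.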
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Adam · Multi-Head Attention · Residual Connection · Dense Connections · Attention Dropout · Softmax · Dropout · Linear Warmup With Linear Decay
