Self-supervised Graph Masking Pre-training for Graph-to-Text Generation
Jiuzhou Han, Ehsan Shareghi

TL;DR
This paper introduces a graph masking pre-training method for Graph-to-Text generation that leverages structural information without needing supervision, improving performance especially in low-resource scenarios.
Contribution
The authors propose a novel graph masking pre-training strategy compatible with existing models like T5, addressing structural information loss and domain mismatch issues.
Findings
Achieves state-of-the-art results on WebNLG+2020 and EventNarrative datasets.
Effective in low-resource settings.
Does not require supervision signals or architecture changes.
Abstract
Large-scale pre-trained language models (PLMs) have advanced Graph-to-Text (G2T) generation by processing the linearised version of a graph. However, linearisation is known to discard structural information. Additionally, PLMs are typically pre-trained on free text, which introduces a domain mismatch between pre-training and downstream G2T generation tasks. To address these shortcomings, we propose graph masking pre-training strategies that neither require supervision signals nor adjust the architecture of the underlying pre-trained encoder-decoder model. When used with a pre-trained T5, our approach achieves new state-of-the-art results on the WebNLG+2020 and EventNarrative G2T generation datasets. Our method also proves very effective in the low-resource setting.
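The abstract does not spell out the masking procedure, so the following is only a minimal sketch of what self-supervised masking over a linearised graph could look like, not the authors' implementation. It assumes the graph is given as (head, relation, tail) triples, that the hypothetical markers `<H>`, `<R>`, `<T>` delimit each component in the linearised string, and that masked components are replaced by T5-style sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...). The function names and the `mask_prob` parameter are illustrative assumptions.

```python
import random

# Hypothetical structural markers; the abstract does not name the ones used.
MARKERS = ("<H>", "<R>", "<T>")


def linearise(triples):
    """Flatten (head, relation, tail) triples into a single token list."""
    tokens = []
    for head, relation, tail in triples:
        tokens += ["<H>", head, "<R>", relation, "<T>", tail]
    return tokens


def mask_graph(triples, mask_prob=0.3, seed=0):
    """Build a (source, target) pair by masking graph components.

    Each entity or relation is replaced by a T5-style sentinel with
    probability `mask_prob`. Structural markers are never masked.
    No labels are needed: the target is derived from the graph itself.
    """
    rng = random.Random(seed)
    source, target = [], []
    sentinel = 0
    for token in linearise(triples):
        if token not in MARKERS and rng.random() < mask_prob:
            placeholder = f"<extra_id_{sentinel}>"
            source.append(placeholder)
            target += [placeholder, token]
            sentinel += 1
        else:
            source.append(token)
    # T5 convention: a final sentinel closes the target sequence.
    target.append(f"<extra_id_{sentinel}>")
    return " ".join(source), " ".join(target)


if __name__ == "__main__":
    graph = [("Alan_Bean", "occupation", "Test_pilot")]
    src, tgt = mask_graph(graph, mask_prob=0.5, seed=1)
    print("source:", src)
    print("target:", tgt)
```

Pairs produced this way could be fed to an unmodified encoder-decoder model as ordinary text-to-text examples, which is consistent with the abstract's claim that neither supervision signals nor architecture changes are required.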
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Residual Connection · Dropout · Adafactor · SentencePiece · Dense Connections
