ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models

Pierre L. Dognin; Inkit Padhi; Igor Melnyk; Payel Das

arXiv:2108.12472·cs.CL·August 31, 2021

TL;DR

ReGen applies reinforcement learning, in the form of Self-Critical Sequence Training (SCST), to bidirectional text and knowledge base generation. It reports state-of-the-art results on WebNLG+ 2020 and shows that SCST also benefits generation on TekGen.

Contribution

The paper presents ReGen, an RL-based framework for bidirectional text and graph generation. Graphs are linearized so that both text-to-graph and graph-to-text become sequence-to-sequence problems, which can then be fine-tuned with self-critical sequence training.
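As an illustration of how a graph can be turned into a flat sequence, the sketch below linearizes a list of (subject, predicate, object) triples into one string and parses it back. The marker tokens `<S>`, `<P>`, `<O>` and the function names are assumptions made for this example; the page does not specify the format ReGen actually uses.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical marker tokens; the real system may use different ones.
SUBJ, PRED, OBJ = "<S>", "<P>", "<O>"


def linearize(triples: List[Triple]) -> str:
    """Flatten a list of triples into a single whitespace-separated sequence."""
    parts: List[str] = []
    for subj, pred, obj in triples:
        parts.extend([SUBJ, subj, PRED, pred, OBJ, obj])
    return " ".join(parts)


def delinearize(sequence: str) -> List[Triple]:
    """Recover triples from a linearized sequence, skipping malformed chunks."""
    triples: List[Triple] = []
    for chunk in sequence.split(SUBJ):
        # Each well-formed chunk looks like: " subj <P> pred <O> obj "
        subj, found_pred, rest = chunk.partition(PRED)
        pred, found_obj, obj = rest.partition(OBJ)
        if not (found_pred and found_obj):
            continue  # marker missing or out of order: drop this chunk
        triples.append((subj.strip(), pred.strip(), obj.strip()))
    return triples


if __name__ == "__main__":
    graph = [("Alan Bean", "occupation", "Test pilot")]
    print(linearize(graph))
    print(delinearize(linearize(graph)))
```

With a representation like this, the same sequence-to-sequence model interface can take text in and emit a linearized graph, or take a linearized graph in and emit text.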

Findings

01. ReGen outperforms previously published results on the WebNLG+ 2020 dataset.

02. RL via SCST improves both text and graph generation quality.

03. State-of-the-art results are achieved for both text-to-graph and graph-to-text tasks.

Abstract

Automatic construction of relevant Knowledge Bases (KBs) from text, and generation of semantically meaningful text from KBs are both long-standing goals in Machine Learning. In this paper, we present ReGen, a bidirectional generation of text and graph leveraging Reinforcement Learning (RL) to improve performance. Graph linearization enables us to re-frame both tasks as a sequence to sequence generation problem regardless of the generative direction, which in turn allows the use of Reinforcement Learning for sequence training where the model itself is employed as its own critic leading to Self-Critical Sequence Training (SCST). We present an extensive investigation demonstrating that the use of RL via SCST benefits graph and text generation on WebNLG+ 2020 and TekGen datasets. Our system provides state-of-the-art results on WebNLG+ 2020 by significantly improving upon published results…
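The self-critical idea in the abstract, where the model acts as its own critic, can be sketched as a per-sequence loss. This is a minimal illustration under assumptions, not the authors' implementation: the name `scst_loss`, its arguments, and the scalar rewards are hypothetical, and the baseline is taken to be the reward of the model's own greedy decode, as in the standard SCST formulation.

```python
from typing import Sequence


def scst_loss(
    sample_logprobs: Sequence[float],
    sample_reward: float,
    greedy_reward: float,
) -> float:
    """Self-critical REINFORCE loss for one sampled sequence.

    sample_logprobs: log-probabilities the model assigned to each token
        of a sequence drawn by sampling from the model.
    sample_reward: sequence-level score of the sampled output
        (any scalar metric; hypothetical here).
    greedy_reward: score of the model's greedy output on the same input,
        used as the baseline (the "self-critic").
    """
    # Positive advantage: the sample beat the greedy decode.
    advantage = sample_reward - greedy_reward
    # Minimizing this raises the likelihood of samples with positive
    # advantage and lowers it for samples with negative advantage.
    return -advantage * sum(sample_logprobs)


if __name__ == "__main__":
    # A sample that scores above the greedy baseline yields a positive loss
    # here, because the summed log-probabilities are negative.
    print(scst_loss([-0.5, -1.0], sample_reward=0.8, greedy_reward=0.6))
```

In an actual training loop the log-probabilities would be differentiable tensors from the language model, and the rewards would be treated as constants when computing gradients.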

Code & Models

Repositories

IBM/regen (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: Self-critical Sequence Training · REINFORCE