Scalable Micro-planned Generation of Discourse from Structured Data

Anirban Laha; Parag Jain; Abhijit Mishra; Karthik Sankaranarayanan

arXiv:1810.02889·cs.CL·October 8, 2019

TL;DR

This paper introduces a scalable, modular pipeline for generating natural language descriptions from structured data, avoiding task-specific training and leveraging basic NLP tools for adaptability across domains.

Contribution

A novel pipeline-based approach that generates coherent paragraphs from structured data without requiring task-specific parallel data, enhancing scalability and domain adaptability.

Findings

01 Outperforms existing data-to-text systems on benchmark datasets.

02 Demonstrates robustness across diverse data types like Knowledge Graphs and Key-Value maps.

03 Operates effectively without task-specific labeled data.

Abstract

We present a framework for generating natural language descriptions from structured data such as tables; the problem falls under the category of data-to-text natural language generation (NLG). Modern data-to-text NLG systems typically employ end-to-end statistical and neural architectures that learn from a limited amount of task-specific labeled data, and therefore exhibit limited scalability, domain-adaptability, and interpretability. Unlike these systems, ours is a modular, pipeline-based approach that does not require task-specific parallel data. Instead, it relies on monolingual corpora and basic off-the-shelf NLP tools. This makes our system more scalable and easily adaptable to newer domains. Our system employs a 3-staged pipeline that: (i) converts entries in the structured data to canonical form, (ii) generates simple sentences for each atomic entry in the canonicalized…
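To make the first two pipeline stages named in the abstract more concrete, here is a minimal sketch in Python. It is a toy illustration under stated assumptions, not the authors' implementation: the triple representation, the fixed sentence template, and the names `canonicalize`, `simple_sentence`, and `describe` are all hypothetical, and the template stands in for whatever sentence generator the paper actually builds from monolingual corpora. The third stage is cut off in the abstract text above, so it is left out here.

```python
from typing import Dict, List, Tuple

# Assumed canonical form: (subject, attribute, value). This is an
# illustrative choice, not necessarily the paper's representation.
Triple = Tuple[str, str, str]


def canonicalize(row: Dict[str, str], subject_key: str) -> List[Triple]:
    """Stage (i), toy version: turn one table row into atomic triples."""
    subject = row[subject_key]
    return [
        (subject, key.replace("_", " "), value)
        for key, value in row.items()
        if key != subject_key and value  # skip the subject column and empty cells
    ]


def simple_sentence(triple: Triple) -> str:
    """Stage (ii), toy version: one simple sentence per atomic triple.

    A fixed template is used purely as a placeholder for a learned generator.
    """
    subject, attribute, value = triple
    return f"The {attribute} of {subject} is {value}."


def describe(row: Dict[str, str], subject_key: str) -> List[str]:
    """Run the two toy stages in sequence and return the simple sentences."""
    return [simple_sentence(t) for t in canonicalize(row, subject_key)]


row = {"name": "Albert Einstein", "birth_place": "Ulm", "field": "physics"}
sentences = describe(row, "name")
for sentence in sentences:
    print(sentence)
```

Keeping each stage as a separate function mirrors the modular design the abstract argues for: any one stage could be swapped out without retraining the others.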

Code & Models

Repositories

parajain/structscribe
pytorch
