Synthetic Text Generation using Hypergraph Representations
Natraj Raman, Sameena Shah

TL;DR
This paper introduces a novel method for synthetic text generation that decomposes documents into semantic frames modeled by hypergraphs, enabling diverse and coherent text creation with controlled perturbations.
Contribution
It presents a new hypergraph-based approach for semantic frame decomposition and perturbation, enhancing diversity and coherence in synthetic text generation.
Findings
Generated documents are diverse and coherent.
Method accommodates complex relationships like hierarchy and temporal dynamics.
Produces variations in style, sentiment, and factual content.
Abstract
Generating synthetic variants of a document is often posed as text-to-text transformation. We propose an alternative LLM-based method that first decomposes a document into semantic frames and then generates text using this interim sparse format. The frames are modeled using a hypergraph, which allows perturbing the frame contents in a principled manner. Specifically, new hyperedges are mined through topological analysis, and complex polyadic relationships, including hierarchy and temporal dynamics, are accommodated. We show that our solution generates documents that are diverse and coherent, and that vary in style, sentiment, format, composition and facts.
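To make the hypergraph idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each semantic frame can be treated as a hyperedge over the entities that fill its roles. The frame names, entities, and the helper functions `incidence`, `propose_hyperedges`, and `swap_node` are all hypothetical, and the overlap-based merge is only a simple stand-in for the topological analysis the abstract mentions.

```python
from itertools import combinations

# Hypothetical toy frames (not from the paper): each hyperedge maps a frame
# name to the set of entity nodes that fill its roles.
frames = {
    "Acquisition": {"AcmeCorp", "BetaLabs", "2021"},
    "Hiring": {"BetaLabs", "Alice"},
    "Lawsuit": {"GammaInc", "Bob"},
}


def incidence(hyperedges):
    """Map each node to the set of hyperedges that contain it."""
    node_to_edges = {}
    for name, nodes in hyperedges.items():
        for node in nodes:
            node_to_edges.setdefault(node, set()).add(name)
    return node_to_edges


def propose_hyperedges(hyperedges):
    """Propose candidate hyperedges by merging pairs that share a node.

    Assumption: shared nodes are used here as a crude structural signal;
    the paper's actual mining procedure is not reproduced.
    """
    candidates = {}
    for (a, nodes_a), (b, nodes_b) in combinations(sorted(hyperedges.items()), 2):
        if nodes_a & nodes_b:  # the two frames overlap in at least one entity
            candidates[f"{a}+{b}"] = nodes_a | nodes_b
    return candidates


def swap_node(hyperedges, edge, old, new):
    """Return a perturbed copy with `old` replaced by `new` in one hyperedge."""
    perturbed = {name: set(nodes) for name, nodes in hyperedges.items()}
    if old not in perturbed[edge]:
        raise ValueError(f"{old!r} is not a member of hyperedge {edge!r}")
    perturbed[edge].remove(old)
    perturbed[edge].add(new)
    return perturbed


if __name__ == "__main__":
    print(incidence(frames)["BetaLabs"])
    print(propose_hyperedges(frames))
    print(swap_node(frames, "Hiring", "Alice", "Carol")["Hiring"])
```

In a pipeline like the one the abstract describes, the perturbed frames would then be handed to an LLM to render text. That generation step is omitted here because the source does not specify its interface.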
Taxonomy
Topics: Natural Language Processing Techniques · Digital Humanities and Scholarship · Web Data Mining and Analysis
