Automatic Document Sketching: Generating Drafts from Analogous Texts
Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Bill Dolan

TL;DR
This paper introduces document sketching, the task of generating draft documents from similar existing texts. It supports the task with a new dataset and weakly supervised learning methods, including a transformer-based mixture of experts and reinforcement learning.
Contribution
The paper proposes the novel task of document sketching, creates a Wikipedia-based dataset of analogous documents, and explores weakly supervised models that combine a transformer-based mixture of experts with reinforcement learning.
Findings
A transformer-based mixture of experts improves draft quality.
Reinforcement learning enhances model performance.
Automated and human evaluations validate the approach.
Abstract
The advent of large pre-trained language models has made it possible to make high-quality predictions on how to add or change a sentence in a document. However, the high branching factor inherent to text generation impedes the ability of even the strongest language models to offer useful editing suggestions at a more global or document level. We introduce a new task, document sketching, which involves generating entire draft documents for the writer to review and revise. These drafts are built from sets of documents that overlap in form - sharing large segments of potentially reusable text - while diverging in content. To support this task, we introduce a Wikipedia-based dataset of analogous documents and investigate the application of weakly supervised methods, including use of a transformer-based mixture of experts, together with reinforcement learning. We report experiments using…
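To make the notion of documents that "overlap in form while diverging in content" concrete, the following minimal sketch finds token runs shared by two analogous texts using Python's standard `difflib`. It is an illustration only, not the paper's method: the helper name `shared_segments`, the `min_tokens` threshold, and the two toy sentences are all assumptions made up for this example.

```python
from difflib import SequenceMatcher


def shared_segments(doc_a: str, doc_b: str, min_tokens: int = 3) -> list[str]:
    """Return token runs of at least `min_tokens` that occur in both texts, in order.

    Hypothetical helper for illustration; whitespace tokenization is an assumption.
    """
    a, b = doc_a.split(), doc_b.split()
    # autojunk=False disables difflib's popularity heuristic so frequent tokens still match.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a : m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_tokens
    ]


# Made-up toy documents: same form (reusable scaffolding), different content (names, dates).
doc_a = "Springfield is a town in Clark County . It was founded in 1850 ."
doc_b = "Riverton is a town in Lake County . It was founded in 1902 ."

segments = shared_segments(doc_a, doc_b)
for segment in segments:
    print(segment)
```

The shared runs correspond to the potentially reusable text, while the gaps between them mark the slots where content diverges and a writer would need to review and revise.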
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Wikis in Education and Collaboration
