Constraints First: A New MDD-based Model to Generate Sentences Under Constraints

Alexandre Bonlarron; Aurélie Calabrèse; Pierre Kornprobst; Jean-Charles Régin

arXiv:2309.12415·cs.AI·September 25, 2023

TL;DR

This paper presents an MDD-based method for generating strongly constrained sentences. Applied to standardized texts for vision screening, it yields hundreds of candidate sentences, where the MNREAD test usually offers only a few dozen.

Contribution

The paper introduces an MDD-based approach to constrained sentence generation that computes the exhaustive set of solutions without performing any search, and details it for both English and French.
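To make "all solutions without search" concrete, here is a minimal sketch of a layered decision diagram for a toy sentence problem. Everything specific in it is an assumption made for illustration and is not taken from the paper: the slot-based lexicon, the single constraint (a fixed total number of letters), and the function names `build_mdd`, `count_solutions`, and `enumerate_sentences`. The idea it shows is that nodes in the same layer are merged when they share a state, dead-end nodes are pruned in a bottom-up pass, and every remaining root-to-terminal path is then a valid sentence.

```python
def build_mdd(slots, target):
    """Build a layered diagram for 'one word per slot, `target` letters in total'.

    layers[i] maps a state (letters used before slot i) to its outgoing
    arcs, each arc being (word, next_state). Illustrative only.
    """
    n = len(slots)
    # Top-down pass: collect the states reachable from the root (state 0).
    reachable = [{0}]
    for words in slots:
        nxt = set()
        for state in reachable[-1]:
            for word in words:
                if state + len(word) <= target:
                    nxt.add(state + len(word))
        reachable.append(nxt)
    # Bottom-up pass: keep only states from which the terminal is reachable.
    alive = [set() for _ in range(n + 1)]
    alive[n] = {target} & reachable[n]
    layers = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for state in reachable[i]:
            arcs = [(w, state + len(w)) for w in slots[i]
                    if state + len(w) in alive[i + 1]]
            if arcs:
                layers[i][state] = arcs
                alive[i].add(state)
    return layers


def count_solutions(layers):
    """Count root-to-terminal paths with one bottom-up sweep (no enumeration)."""
    below = None
    for layer in reversed(layers):
        below = {state: sum(1 if below is None else below[nxt] for _, nxt in arcs)
                 for state, arcs in layer.items()}
    return below.get(0, 0) if below else 0


def enumerate_sentences(layers, i=0, state=0, prefix=()):
    """Walk the pruned diagram; no branch ever fails, so there is no backtracking."""
    if i == len(layers):
        yield " ".join(prefix)
        return
    for word, nxt in layers[i].get(state, []):
        yield from enumerate_sentences(layers, i + 1, nxt, prefix + (word,))


# Hypothetical toy lexicon: one list of words per sentence position.
slots = [["the", "a"], ["cat", "bird", "dog"],
         ["eats", "sees", "ate"], ["fish", "me", "rice"]]
mdd = build_mdd(slots, target=13)
sentences = list(enumerate_sentences(mdd))
print(count_solutions(mdd), sentences[:3])
```

With this toy lexicon the diagram encodes 10 sentences of exactly 13 letters, such as "the cat ate fish". The count comes from a single sweep over the layers, so the size of the solution set is known before any sentence is written out.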

Findings

01. Computing the exhaustive solution set with an MDD improves sentence diversity.

02. GPT-2 is applied to the generated sentences to keep the best ones.

03. The method yields hundreds of candidate sentences, compared with the few dozen usually available in the MNREAD vision screening test.

Abstract

This paper introduces a new approach to generating strongly constrained texts. We consider standardized sentence generation for the typical application of vision screening. To solve this problem, we formalize it as a discrete combinatorial optimization problem and utilize multivalued decision diagrams (MDD), a well-known data structure to deal with constraints. In our context, one key strength of MDD is to compute an exhaustive set of solutions without performing any search. Once the sentences are obtained, we apply a language model (GPT-2) to keep the best ones. We detail this for English and also for French where the agreement and conjugation rules are known to be more complex. Finally, with the help of GPT-2, we get hundreds of bona-fide candidate sentences. When compared with the few dozen sentences usually available in the well-known vision screening test (MNREAD), this brings a…
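The selection step described in the abstract, scoring every generated sentence with a language model and keeping the best ones, can be sketched as follows. This is not the authors' pipeline: to stay self-contained it swaps GPT-2 for a tiny add-one smoothed bigram model, and the corpus, the candidates, and the names `train_bigram`, `perplexity`, and `keep_best` are all hypothetical. Only the shape of the step is meant to carry over: score each candidate, sort, keep the top few.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Collect context counts, bigram counts, and vocabulary size (toy stand-in for GPT-2)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for line in corpus:
        tokens = ["<s>"] + line.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])           # tokens that have a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def perplexity(sentence, contexts, bigrams, vocab_size):
    """Add-one smoothed bigram perplexity; lower means more plausible under the toy model."""
    tokens = ["<s>"] + sentence.lower().split()
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / max(len(tokens) - 1, 1))


def keep_best(candidates, score, k):
    """Return the k candidates with the lowest score."""
    return sorted(candidates, key=score)[:k]


# Hypothetical data: a three-line corpus and two candidate sentences.
model = train_bigram(["the cat ate fish", "the dog ate rice", "a bird sees me"])
candidates = ["rice the ate cat", "the cat ate rice"]
best = keep_best(candidates, lambda s: perplexity(s, *model), k=1)
print(best)
```

Because `keep_best` takes the scoring function as a parameter, a scorer backed by an actual language model can replace the bigram scorer without touching the selection logic.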

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Adam · Residual Connection · Attention Dropout · Layer Normalization · Byte Pair Encoding · Dropout · Linear Warmup With Cosine Annealing