TL;DR
COD3S is a method that increases semantic diversity in seq2seq generation by conditioning on LSH-based semantic codes. Applied to causal generation, it improves output variety without degrading task performance.
Contribution
Introduces COD3S, a two-stage approach that conditions seq2seq generation on LSH-based semantic codes to increase the diversity of generated sentences in one-to-many tasks.
Findings
Automatic and human evaluations both show improved diversity in the generated sentences.
Hamming distances between the LSH-based semantic codes correlate highly with human judgments of semantic textual similarity.
The method maintains task performance while improving output variety.
Abstract
We present COD3S, a novel method for generating semantically diverse sentences using neural sequence-to-sequence (seq2seq) models. Conditioned on an input, seq2seq models typically produce semantically and syntactically homogeneous sets of sentences and thus perform poorly on one-to-many sequence generation tasks. Our two-stage approach improves output diversity by conditioning generation on locality-sensitive hash (LSH)-based semantic sentence codes whose Hamming distances highly correlate with human judgments of semantic textual similarity. Though it is generally applicable, we apply COD3S to causal generation, the task of predicting a proposition's plausible causes or effects. We demonstrate through automatic and human evaluation that responses produced using our method exhibit improved diversity without degrading task performance.
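To make the idea of LSH-based semantic codes concrete, here is a minimal sketch using random-hyperplane (sign) hashing, one common LSH scheme. The abstract does not say which LSH variant or sentence encoder the authors use, so the function names, the 4-dimensional toy vectors, and the 16-bit code length below are all illustrative assumptions, not the paper's implementation.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_code(vec, hyperplanes):
    """Bit i is 1 when vec lies on the non-negative side of hyperplane i."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )

def hamming(a, b):
    """Number of bit positions where two codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy stand-ins for sentence embeddings (assumed; a real system would
# obtain these from a sentence encoder).
planes = make_hyperplanes(dim=4, n_bits=16)
similar_a = [0.9, 0.1, 0.0, 0.2]
similar_b = [0.8, 0.2, 0.1, 0.2]
opposite = [-0.9, -0.1, 0.0, -0.2]  # exact negation of similar_a

code_a = lsh_code(similar_a, planes)
code_b = lsh_code(similar_b, planes)
code_c = lsh_code(opposite, planes)

print(hamming(code_a, code_b), hamming(code_a, code_c))
```

Vectors separated by a small angle tend to fall on the same side of most hyperplanes and so receive codes with a small Hamming distance, while a negated vector flips every bit. In a two-stage setup like the one the abstract describes, a model could first choose a code and then generate a sentence conditioned on it; how COD3S does this in detail is not specified here.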
Taxonomy
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
