TL;DR
ETC-NLG introduces an unsupervised method for topic-conditioned natural language generation that works effectively in low-resource settings by leveraging topic modeling annotations without requiring labeled data.
Contribution
The paper presents an unsupervised approach to topic-conditioned NLG that operates over unlabeled document collections, reducing reliance on large labeled datasets and allowing generation to be conditioned on topics that emerge from topic modeling.
Findings
Effective in low-resource Italian setting
Comparable performance in Italian and English
Automatic evaluation of conditioning effectiveness
Abstract
Plug-and-play language models (PPLMs) enable topic-conditioned natural language generation by pairing large pre-trained generators with attribute models used to steer the predicted token distribution towards the selected topic. Despite their computational efficiency, PPLMs require large amounts of labeled texts to effectively balance generation fluency and proper conditioning, making them unsuitable for low-resource settings. We present ETC-NLG, an approach leveraging topic modeling annotations to enable fully-unsupervised End-to-end Topic-Conditioned Natural Language Generation over emergent topics in unlabeled document collections. We first test the effectiveness of our approach in a low-resource setting for Italian, evaluating the conditioning for both topic models and gold annotations. We then perform a comparative evaluation of ETC-NLG for Italian and English using a parallel…
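The idea of steering a predicted token distribution towards a topic can be illustrated with a toy sketch. This is not the PPLM or ETC-NLG implementation: PPLM-style methods use an attribute model to guide the generator, whereas the example below simply adds a fixed bonus to the logits of tokens in a topic word list. The vocabulary, logits, topic words, and the names `steer_towards_topic` and `strength` are all hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def steer_towards_topic(logits, vocab, topic_words, strength=2.0):
    """Toy steering: add a fixed bonus to the logits of topic tokens.

    Assumption: a simplified stand-in for an attribute model,
    not the mechanism used in the paper.
    """
    boosted = [
        logit + strength if token in topic_words else logit
        for logit, token in zip(logits, vocab)
    ]
    return softmax(boosted)


# Hypothetical next-token logits from a generator over a tiny vocabulary.
vocab = ["the", "goal", "match", "recipe", "oven"]
logits = [2.0, 0.5, 0.4, 0.6, 0.3]

# Hypothetical topic word list, e.g. top words a topic model might assign to a topic.
sports_topic = {"goal", "match"}

base = softmax(logits)
steered = steer_towards_topic(logits, vocab, sports_topic)

for token, p_base, p_steered in zip(vocab, base, steered):
    print(f"{token:>7}  base={p_base:.3f}  steered={p_steered:.3f}")
```

A larger `strength` shifts more probability mass onto topic tokens at the expense of the generator's own preferences, which mirrors the trade-off between conditioning and fluency mentioned in the abstract.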
Taxonomy
Methods: Linear Layer · Cosine Annealing · WordPiece · Linear Warmup With Linear Decay · BERT · RoBERTa · Residual Connection · Attention Dropout · Linear Warmup With Cosine Annealing
