LingGen: Scalable Multi-Attribute Linguistic Control via Power-Law Masking

Mohamed Elgaar; Hadi Amiri

arXiv:2410.24201 · cs.CL · January 27, 2026

TL;DR

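LingGen is a controlled text generation model that gives fine-grained control over a large number of real-valued linguistic attributes. It combines Pareto-sampled attribute masking (P-MASKING) with BOS embedding injection, and reports the lowest average control error among evaluated methods along with the highest human-rated fluency.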

Contribution

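The paper introduces P-MASKING, which samples per-example attribute masking rates from a truncated Pareto distribution during training, and BOS embedding injection, which conditions the language model on the encoded target attribute values.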

Findings

01 Achieves the lowest average control error among evaluated methods across 1-40 control attributes

02 Receives the highest fluency scores in human evaluation

03 Remains efficient at inference while scaling to many control attributes

Abstract

We present LingGen, a controlled text generation model that allows fine-grained control over a large number of real-valued linguistic attributes. It encodes target attribute values with a dedicated linguistic attribute encoder and conditions the language model by injecting the resulting representation through the beginning-of-sequence (BOS) embedding. To improve robustness when controlling different attribute subsets, we introduce P-MASKING, which samples per-example attribute masking rates from a truncated Pareto distribution during training. Across 1-40 control attributes, LingGen achieves the lowest average control error among evaluated methods, while remaining efficient at inference and receiving the highest fluency scores in human evaluation. Ablations show that Pareto-sampled masking and BOS-based injection are effective choices compared to alternative…
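The masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the shape parameter `alpha`, the truncation bounds `low` and `high`, the linear rescaling of the sample to a rate in [0, 1], and the choice to mask a rounded fraction of the attributes are all assumptions made for illustration. In particular, whether small or large masking rates should be the common case is not stated in the abstract; this sketch makes small rates common.

```python
import random


def sample_truncated_pareto(alpha, low, high, rng):
    """Draw one sample from a Pareto(alpha) distribution truncated to [low, high].

    Uses inverse-CDF sampling: u = 0 maps to `low`, and u -> 1 approaches `high`.
    """
    u = rng.random()
    tail = 1.0 - (low / high) ** alpha
    return low * (1.0 - u * tail) ** (-1.0 / alpha)


def sample_attribute_mask(num_attributes, rng, alpha=1.0, low=1.0, high=10.0):
    """Return a per-example boolean mask (True = attribute hidden from the model).

    Assumption: the truncated Pareto sample is rescaled linearly to a masking
    rate in [0, 1], and that fraction of attributes is masked uniformly at random.
    """
    x = sample_truncated_pareto(alpha, low, high, rng)
    rate = min(max((x - low) / (high - low), 0.0), 1.0)
    num_masked = round(rate * num_attributes)
    masked = set(rng.sample(range(num_attributes), num_masked))
    return [i in masked for i in range(num_attributes)]


if __name__ == "__main__":
    rng = random.Random(0)
    # One mask per training example, here for 40 control attributes.
    for _ in range(3):
        mask = sample_attribute_mask(40, rng)
        print(sum(mask), "of", len(mask), "attributes masked")
```

Because a fresh rate is drawn for every example, a model trained this way would see attribute subsets of many different sizes, which is the robustness property the abstract attributes to P-MASKING.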

Taxonomy

Topics: Neural Networks and Reservoir Computing · Power System Optimization and Stability · Neural Networks and Applications

Methods: Balanced Selection