Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation
Joseph Marvin Imperial, Gail Forey, Harish Tayyar Madabushi

TL;DR
This paper introduces Standardize, a framework that uses expert-defined standards to guide large language models toward content that meets domain-specific quality criteria, with reported gains of 45% to 100% in precise accuracy.
Contribution
The paper presents a retrieval-style in-context learning approach that incorporates expert-defined standards into language model content generation, a source of control that prior work in controllable text generation has not explored.
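As a rough illustration of what retrieval-style in-context learning with standards could look like, the sketch below looks up a standard entry for a requested level and places it in the prompt ahead of the task. It is a minimal sketch under stated assumptions, not the authors' implementation: the names StandardEntry, STANDARDS, retrieve_standard, and build_prompt are hypothetical, and the descriptor and exemplar strings are placeholders rather than real CEFR text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardEntry:
    """One entry of an expert-defined standard (hypothetical structure)."""
    level: str
    descriptor: str  # what content at this level should look like
    exemplar: str    # a sample text that meets the level


# Placeholder store. A real system would load the expert-defined
# specifications here; these strings are NOT actual CEFR descriptors.
STANDARDS = {
    "A2": StandardEntry("A2", "<placeholder descriptor for A2>",
                        "<placeholder exemplar for A2>"),
    "B1": StandardEntry("B1", "<placeholder descriptor for B1>",
                        "<placeholder exemplar for B1>"),
}


def retrieve_standard(level, store=STANDARDS):
    """Look up the standard entry for the requested target level."""
    key = level.strip().upper()
    if key not in store:
        raise KeyError(f"no standard entry for level {level!r}")
    return store[key]


def build_prompt(task, level):
    """Assemble a prompt that places the retrieved standard before the task."""
    entry = retrieve_standard(level)
    return "\n".join([
        f"Target level: {entry.level}",
        f"Standard description: {entry.descriptor}",
        f"Example text at this level: {entry.exemplar}",
        f"Task: {task}",
        "Write content that conforms to the standard above.",
    ])


if __name__ == "__main__":
    print(build_prompt("Write a short story about a lost dog.", "b1"))
```

The resulting string would then be sent to whichever language model is being evaluated. Exact-match lookup by level is the simplest retrieval choice for this sketch; a larger standards collection might call for a different retrieval step.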
Findings
Models gained a 45% to 100% increase in precise accuracy when guided by standards.
The gains hold across both the open and the commercial large language models evaluated.
Standards-based guidance enhances content quality and consistency.
Abstract
Domain experts across engineering, healthcare, and education follow strict standards for producing quality content such as technical manuals, medication instructions, and children's reading materials. However, current works in controllable text generation have yet to explore using these standards as references for control. Towards this end, we introduce Standardize, a retrieval-style in-context learning-based framework to guide large language models to align with expert-defined standards. Focusing on English language standards in the education domain as a use case, we consider the Common European Framework of Reference for Languages (CEFR) and Common Core Standards (CCS) for the task of open-ended content generation. Our findings show that models can gain a 45% to 100% increase in precise accuracy across open and commercial LLMs evaluated, demonstrating that the use of knowledge…
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Adam · Softmax · Multi-Head Attention · Layer Normalization · Residual Connection · Absolute Position Encodings
