Limits of n-gram Style Control for LLMs via Logit-Space Injection
Sami-ul Ahmed

TL;DR
This paper explores a lightweight method for controlling the style of large language models by injecting n-gram style priors into logits during decoding. The method is effective only in narrow settings and is outperformed by prompting and LoRA in most scenarios.
Contribution
The paper introduces a logit-space n-gram style injection technique for frozen LLMs and characterizes both its potential and its limitations as a style-control method that needs no retraining or fine-tuning.
Findings
Style control improves only within a narrow lambda regime on the Don Quixote corpus.
Outside the optimal regime, style and fluency deteriorate significantly.
The method is outperformed by prompting and LoRA in most scenarios.
Abstract
Large language models (LLMs) are typically personalized via prompt engineering or parameter-efficient fine-tuning such as LoRA. However, writing style can be difficult to distill into a single prompt, and LoRA fine-tuning requires computationally intensive training and infrastructure. We investigate a possible lightweight alternative: steering a frozen LLM with n-gram style priors injected in logit space at decoding time. We train an n-gram model on stylistically distinct corpora -- including Don Quixote, CNN/DailyMail news headlines, and arXiv abstracts -- constructing an interpolated 1-to-3-gram prior over next-token probabilities. During generation we modify the LLM's logits by adding a weighted sum of style log-probabilities from each n-gram order that matches the current context, scaled by a control parameter lambda in [0, 1]. We sweep lambda and style corpora and report style…
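The decoding-time step described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names (`train_ngram_counts`, `style_bias`, `inject`), the add-alpha smoothing, and the per-order weights are all assumptions, since the abstract does not specify how the orders are weighted or how unseen tokens are handled. Tokens are plain integer ids and logits are a Python list, so the sketch runs without any model.

```python
import math
from collections import Counter, defaultdict


def train_ngram_counts(tokens, max_order=3):
    """Count next-token continuations for every context of length 0..max_order-1."""
    counts = {n: defaultdict(Counter) for n in range(1, max_order + 1)}
    for i, tok in enumerate(tokens):
        for n in range(1, max_order + 1):
            if i - (n - 1) < 0:
                continue  # not enough history for this order
            context = tuple(tokens[i - (n - 1):i])
            counts[n][context][tok] += 1
    return counts


def style_bias(counts, context, vocab_size, order_weights, alpha=1.0):
    """Weighted sum of style log-probabilities over the orders that match.

    Assumption: add-alpha smoothing, so every token has a finite log-probability.
    """
    bias = [0.0] * vocab_size
    for n, weight in order_weights.items():
        ctx = tuple(context[-(n - 1):]) if n > 1 else ()
        if len(ctx) < n - 1 or ctx not in counts[n]:
            continue  # this order has no match for the current context
        seen = counts[n][ctx]
        total = sum(seen.values()) + alpha * vocab_size
        for tok in range(vocab_size):
            bias[tok] += weight * math.log((seen[tok] + alpha) / total)
    return bias


def inject(logits, bias, lam):
    """Add the style bias to the frozen model's logits, scaled by lambda."""
    return [logit + lam * b for logit, b in zip(logits, bias)]


# Toy "style corpus" over a 3-token vocabulary (hypothetical data).
corpus = [0, 1, 2, 0, 1, 2, 0, 1]
counts = train_ngram_counts(corpus)
weights = {1: 0.2, 2: 0.3, 3: 0.5}  # assumed interpolation weights

# In the toy corpus token 2 always follows (0, 1), so it gets the largest bias.
bias = style_bias(counts, context=[0, 1], vocab_size=3, order_weights=weights)
steered = inject([0.0, 0.0, 0.0], bias, lam=1.0)
```

With `lam=0.0` the logits pass through unchanged, which corresponds to the unmodified LLM; larger values push the next-token distribution toward the style corpus. In a real decoding loop the same addition would be applied to the model's logits at every generation step before sampling.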
Taxonomy
Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Topic Modeling
