Steer Model beyond Assistant: Controlling System Prompt Strength via Contrastive Decoding

Yijiang River Dong; Tiancheng Hu; Zheng Hui; Nigel Collier

arXiv:2601.06403 · cs.CL · January 13, 2026

Open Access

TL;DR

This paper presents system prompt strength, a training-free method that uses contrastive decoding between a target and a default system prompt to control how strongly a large language model follows the target prompt, improving adherence across five benchmarks.

Contribution

The paper introduces a contrastive decoding technique that scales the influence of a system prompt by a scalar factor alpha, without retraining, so that prompt adherence becomes a continuous control.

Findings

01 Up to +8.5 strict accuracy on IFEval

02 +45pp refusal rate on OffTopicEval

03 +13% steerability on Prompt-Steering

Abstract

Large language models excel at complex instructions yet struggle to deviate from their helpful assistant persona, as post-training instills strong priors that resist conflicting instructions. We introduce system prompt strength, a training-free method that treats prompt adherence as a continuous control. By contrasting logits from target and default system prompts, we isolate and amplify the behavioral signal unique to the target persona by a scalar factor alpha. Across five diverse benchmarks spanning constraint satisfaction, behavioral control, pluralistic alignment, capability modulation, and stylistic control, our method yields substantial improvements: up to +8.5 strict accuracy on IFEval, +45pp refusal rate on OffTopicEval, and +13% steerability on Prompt-Steering. Our approach enables practitioners to modulate system prompt strength, providing dynamic control over model behavior…
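
The contrast described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the combination rule below (default logits plus alpha times the target-minus-default difference) is an assumption, as are the function names `steer_logits` and `softmax` and the toy three-token vocabulary. Under this assumed rule, alpha = 0 reproduces the default prompt, alpha = 1 reproduces the target prompt, and alpha > 1 amplifies the difference between them.

```python
import math


def steer_logits(target_logits, default_logits, alpha):
    """Combine next-token logits from two system prompts.

    Assumed rule (not taken from the paper): default + alpha * (target - default).
    """
    if len(target_logits) != len(default_logits):
        raise ValueError("logit vectors must cover the same vocabulary")
    return [d + alpha * (t - d) for t, d in zip(target_logits, default_logits)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy next-token logits over a hypothetical 3-token vocabulary.
target = [2.0, 1.0, 0.0]   # logits when conditioning on the target system prompt
default = [1.0, 1.0, 1.0]  # logits when conditioning on the default system prompt

for alpha in (0.0, 1.0, 2.0):
    probs = softmax(steer_logits(target, default, alpha))
    print(alpha, [round(p, 3) for p in probs])
```

In a real decoder the two logit vectors would come from two forward passes of the same model on the same conversation, differing only in the system prompt, and the combined logits would feed the usual sampling step at each token.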

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare