Fusion Steering: Prompt-Specific Activation Control
Waldemar Chang, Alhassan Yasin

TL;DR
Fusion Steering introduces a dynamic, prompt-specific activation control method that significantly enhances factual accuracy in large language models for question-answering tasks by injecting tailored activation deltas across all transformer layers.
Contribution
This work presents a novel, flexible activation steering approach that employs prompt-specific, dynamic injection of activation deltas across the entire network, outperforming traditional fixed-layer methods.
Findings
Segmented steering achieves 25.4% accuracy on SimpleQA prompts, outperforming baseline and full-layer steering.
Steering raises the share of fully correct responses from 0.0% to 13.1% under stricter evaluation.
The results indicate that per-prompt, full-network activation control can improve LLM factual accuracy.
Abstract
We present Fusion Steering, an activation steering methodology that improves factual accuracy in large language models (LLMs) for question-answering (QA) tasks. This approach introduces flexible steering configurations, including full-layer steering and segmented steering. Unlike traditional methods constrained to single-layer or fixed-layer operations, Fusion Steering employs dynamic injection of prompt-specific activation deltas across all transformer layers. These activation deltas are derived from reference completions that combine the ground-truth answer with a model-generated explanation to facilitate semantically enriched, example-specific steering. The injection weights are optimized per prompt using Optuna, targeting a joint objective that balances token overlap (factual alignment) and perplexity (fluency proxy). Evaluation employs a composite score integrating token overlap…
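The mechanics described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the equal-sized contiguous layer segments, the mixing weight `alpha`, and the perplexity cap `ppl_cap` are all assumptions made for illustration, and plain Python lists stand in for real hidden-state tensors. It shows a weighted delta being added at every layer, one weight per segment being expanded to per-layer weights, and a joint score that rewards token overlap while penalizing high perplexity.

```python
import math


def expand_segment_weights(segment_weights, num_layers):
    """Spread one weight per segment over contiguous, near-equal layer groups.

    Assumption: segments are contiguous and roughly equal in size.
    """
    n_seg = len(segment_weights)
    return [segment_weights[min(layer * n_seg // num_layers, n_seg - 1)]
            for layer in range(num_layers)]


def inject_deltas(hidden, deltas, layer_weights):
    """Return h_l + w_l * delta_l for every layer l (lists stand in for tensors)."""
    return [[h + w * d for h, d in zip(h_l, d_l)]
            for h_l, d_l, w in zip(hidden, deltas, layer_weights)]


def token_overlap(candidate, reference):
    """Fraction of reference tokens that also occur in the candidate."""
    ref = set(reference.lower().split())
    cand = set(candidate.lower().split())
    return len(ref & cand) / len(ref) if ref else 0.0


def perplexity(token_logprobs):
    """exp of the mean negative log-probability of the generated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def joint_objective(overlap, ppl, alpha=0.7, ppl_cap=100.0):
    """Hypothetical joint score: higher overlap and lower perplexity are better.

    alpha and ppl_cap are illustrative values, not taken from the paper.
    """
    fluency = 1.0 - min(ppl, ppl_cap) / ppl_cap
    return alpha * overlap + (1.0 - alpha) * fluency
```

In a per-prompt search, an optimizer such as Optuna would propose the segment weights, the model would generate with the injected deltas, and `joint_objective` would score the result; the search loop itself is omitted here.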
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
