GrAInS: Gradient-based Attribution for Inference-Time Steering of LLMs and VLMs

Duy Nguyen; Archiki Prasad; Elias Stengel-Eskin; Mohit Bansal

arXiv:2507.18043 · cs.CL · July 25, 2025

Open Access

TL;DR

GrAInS introduces a gradient-based, inference-time steering method for LLMs and VLMs that enhances control over model outputs by identifying influential tokens and adjusting activations without retraining.

Contribution

It presents a novel, interpretable approach using Integrated Gradients for token attribution to steer models at inference time across multimodal tasks.

Findings

1. Achieves 13.22% accuracy improvement on TruthfulQA with Llama-3.1-8B.
2. Reduces hallucination rates on MMHal-Bench from 0.624 to 0.514.
3. Improves alignment win rates on SPA-VL by 8.11%.

Abstract

Inference-time steering methods offer a lightweight alternative to fine-tuning large language models (LLMs) and vision-language models (VLMs) by modifying internal activations at test time without updating model weights. However, most existing approaches rely on fixed, global intervention vectors, overlook the causal influence of individual input tokens, and fail to leverage informative gradients from the model's logits, particularly in multimodal settings where visual and textual inputs contribute unevenly. To address these limitations, we introduce GrAInS, an inference-time steering approach that operates across both language-only and vision-language models and tasks. GrAInS uses contrastive, gradient-based attribution via Integrated Gradients to identify the top-k most influential tokens, both positively and negatively attributed based on their contribution to preferred versus…
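The attribution step described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the toy scoring function, the weight vectors `w_pref` and `w_disp`, the zero-embedding baseline, and all function names are assumptions chosen so the example runs without a real model. It shows only Integrated Gradients over a contrastive (preferred minus dispreferred) score followed by top-k selection of positively and negatively attributed tokens; the steering step itself is not covered because the abstract text here is truncated.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def contrastive_score(embeddings, w_pref, w_disp):
    # Hypothetical stand-in for "preferred minus dispreferred" model output.
    return sum(math.tanh(dot(w_pref, e)) - math.tanh(dot(w_disp, e))
               for e in embeddings)

def token_gradient(e, w_pref, w_disp):
    # Analytic gradient of one token's contribution with respect to its embedding.
    gp = 1.0 - math.tanh(dot(w_pref, e)) ** 2
    gd = 1.0 - math.tanh(dot(w_disp, e)) ** 2
    return [gp * a - gd * b for a, b in zip(w_pref, w_disp)]

def integrated_gradients(embeddings, w_pref, w_disp, steps=200):
    # Integrated Gradients with an all-zeros baseline (an assumption here),
    # approximated by a midpoint Riemann sum along the straight-line path.
    attributions = []
    for e in embeddings:
        avg_grad = [0.0] * len(e)
        for s in range(steps):
            alpha = (s + 0.5) / steps
            g = token_gradient([alpha * x for x in e], w_pref, w_disp)
            avg_grad = [a + gi / steps for a, gi in zip(avg_grad, g)]
        # (input - baseline) * average gradient, summed over embedding dims.
        attributions.append(sum(x * g for x, g in zip(e, avg_grad)))
    return attributions

def top_k_tokens(attributions, k):
    # Return indices of the k most positive and k most negative tokens.
    order = sorted(range(len(attributions)), key=lambda i: attributions[i])
    positive = [i for i in reversed(order) if attributions[i] > 0][:k]
    negative = [i for i in order if attributions[i] < 0][:k]
    return positive, negative

# Toy input: four "tokens" with 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.3], [-1.0, 0.2]]
w_pref, w_disp = [1.0, 0.0], [0.0, 1.0]

attributions = integrated_gradients(embeddings, w_pref, w_disp)
positive, negative = top_k_tokens(attributions, k=2)
print("attributions:", [round(a, 3) for a in attributions])
print("top positive tokens:", positive, "top negative tokens:", negative)
```

In a real LLM or VLM the gradients would come from automatic differentiation through the model rather than a closed-form expression, and the choice of baseline and number of interpolation steps would need to be tuned; both are left as assumptions in this sketch.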

Taxonomy

Topics: Elevator Systems and Control