vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM
Ching-Yun Ko, Pin-Yu Chen

TL;DR
vLLM Hook is an open-source plug-in that enhances the programmability of internal states in vLLM models, enabling analysis and intervention for improved model alignment, detection, and response control.
Contribution
It introduces a configurable plug-in for vLLM that allows passive and active programming of internal states, facilitating new methods for model analysis and manipulation.
Findings
Enabled prompt injection detection using internal state probing.
Improved retrieval-augmented generation (RAG) with internal state manipulation.
Demonstrated activation steering for response control.
Abstract
Modern artificial intelligence (AI) models are deployed on inference engines to optimize runtime efficiency and resource allocation, particularly for transformer-based large language models (LLMs). The vLLM project is a major open-source library to support model serving and inference. However, the current implementation of vLLM limits programmability of the internal states of deployed models. This prevents the use of popular test-time model alignment and enhancement methods. For example, it prevents the detection of adversarial prompts based on attention patterns or the adjustment of model responses based on activation steering. To bridge this critical gap, we present vLLM Hook, an open-source plug-in to enable the programming of internal states for vLLM models. Based on a configuration file specifying which internal states to capture, vLLM Hook provides seamless integration with vLLM and…
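The distinction between passive programming (capturing internal states) and active programming (modifying them, as in activation steering) can be illustrated with a toy sketch. This is not the vLLM Hook API: `ToyLayer`, the hook factories, and the `config` layout below are hypothetical names invented for illustration, and the "model" is just two scalar-scaling layers in plain Python.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
# A hook receives (layer_name, activation) and returns either None
# (passive: observe only) or a replacement activation (active: intervene).
Hook = Callable[[str, Vector], Optional[Vector]]


class ToyLayer:
    """Stand-in for a transformer block: scales its input by a fixed weight."""

    def __init__(self, name: str, weight: float) -> None:
        self.name = name
        self.weight = weight
        self.hooks: List[Hook] = []

    def forward(self, x: Vector) -> Vector:
        out = [self.weight * v for v in x]
        for hook in self.hooks:
            result = hook(self.name, out)
            if result is not None:  # active hook: replace the activation
                out = result
        return out


def make_capture_hook(store: Dict[str, Vector]) -> Hook:
    """Passive hook: copy the activation into `store` and leave it unchanged."""
    def hook(name: str, activation: Vector) -> Optional[Vector]:
        store[name] = list(activation)
        return None
    return hook


def make_steering_hook(direction: Vector, alpha: float) -> Hook:
    """Active hook: add alpha * direction to the activation."""
    def hook(name: str, activation: Vector) -> Optional[Vector]:
        return [a + alpha * d for a, d in zip(activation, direction)]
    return hook


# Hypothetical configuration: which layers to capture and which to steer.
config = {
    "capture": ["layer0"],
    "steer": {"layer1": {"direction": [1.0, 0.0], "alpha": 0.5}},
}

layers = [ToyLayer("layer0", 2.0), ToyLayer("layer1", 3.0)]
captured: Dict[str, Vector] = {}

# Attach hooks according to the configuration.
for layer in layers:
    if layer.name in config["capture"]:
        layer.hooks.append(make_capture_hook(captured))
    if layer.name in config["steer"]:
        spec = config["steer"][layer.name]
        layer.hooks.append(make_steering_hook(spec["direction"], spec["alpha"]))


def run(x: Vector) -> Vector:
    """Run the input through every layer in order."""
    for layer in layers:
        x = layer.forward(x)
    return x


output = run([1.0, 1.0])
```

In this sketch the capture hook records the output of `layer0` without altering it, while the steering hook shifts the output of `layer1` along a fixed direction. A real inference engine would apply the same idea to attention maps or hidden states rather than to toy vectors.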
Taxonomy
Topics: Machine Learning in Materials Science · Explainable Artificial Intelligence (XAI) · Topic Modeling
