Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Stephen Zhao, Rob Brekelmans, Alireza Makhzani, Roger Grosse

TL;DR
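This paper casts language model tasks such as automated red-teaming and controlled generation as probabilistic inference and tackles them with twisted Sequential Monte Carlo (SMC): learned twist functions focus sampling on promising partial sequences, and SMC-based bounds are used to evaluate how accurate an inference method is.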
Contribution
It develops learned twist functions for SMC in language models, proposes a contrastive learning approach for these functions, and introduces bidirectional SMC bounds for inference evaluation.
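As a reading aid, the setup behind these contributions can be written out as follows. The notation ($p_0$ for the base language model, $\phi$ for the potential, $\sigma$ for the target, $\psi_t$ for the twists, $\hat{Z}$ for the SMC estimate) is assumed here for illustration and may differ from the paper's. The last line is the generic lower bound that holds for any unbiased estimator of $Z$, not the paper's full bidirectional construction.

```latex
% Target distribution over full sequences s_{1:T} (assumed notation)
\sigma(s_{1:T}) = \frac{p_0(s_{1:T})\,\phi(s_{1:T})}{Z},
\qquad
Z = \sum_{s_{1:T}} p_0(s_{1:T})\,\phi(s_{1:T})

% A twist estimates the expected future value of the potential given a prefix
\psi_t^{*}(s_{1:t}) \;\propto\; \sum_{s_{t+1:T}} p_0(s_{t+1:T} \mid s_{1:t})\,\phi(s_{1:T})

% Since \mathbb{E}[\hat{Z}] = Z for SMC, Jensen's inequality gives a lower bound on \log Z
\mathbb{E}\big[\log \hat{Z}\big] \;\le\; \log \mathbb{E}\big[\hat{Z}\big] \;=\; \log Z
```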
Findings
Twisted SMC effectively samples undesirable outputs for red-teaming.
It generates reviews with varied sentiment.
Its bidirectional SMC bounds make it possible to estimate the KL divergence between an inference method's sampling distribution and the target distribution.
Abstract
Numerous capability and safety techniques of Large Language Models (LLMs), including RLHF, automated red-teaming, prompt engineering, and infilling, can be cast as sampling from an unnormalized target distribution defined by a given reward or potential function over the full sequence. In this work, we leverage the rich toolkit of Sequential Monte Carlo (SMC) for these probabilistic inference problems. In particular, we use learned twist functions to estimate the expected future value of the potential at each timestep, which enables us to focus inference-time computation on promising partial sequences. We propose a novel contrastive method for learning the twist functions, and establish connections with the rich literature of soft reinforcement learning. As a complementary application of our twisted SMC framework, we present methods for evaluating the accuracy of language model inference…
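The idea of twists steering sampling toward promising prefixes can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the three-token vocabulary, the `base_probs` model, the `potential`, and all function names are hypothetical, and the twists are computed exactly by enumeration rather than learned. It assumes the twist-induced proposal q(s_t | prefix) ∝ p0(s_t | prefix) ψ_t(prefix, s_t), with the final twist equal to the potential and resampling at every step.

```python
import functools
import numpy as np

V, T = 3, 4  # toy vocabulary size and sequence length (assumed values)

def base_probs(prefix):
    """Hypothetical base model p0(s_t | prefix): favors repeating the last token."""
    logits = np.zeros(V)
    if prefix:
        logits[prefix[-1]] += 1.0
    p = np.exp(logits)
    return p / p.sum()

def potential(seq):
    """Hypothetical potential phi(s_{1:T}): rewards occurrences of token 0."""
    return float(np.exp(2.0 * seq.count(0)))

@functools.lru_cache(maxsize=None)
def exact_twist(prefix):
    """Expected future potential under p0 given a prefix (brute-force recursion)."""
    if len(prefix) == T:
        return potential(prefix)
    p = base_probs(prefix)
    return float(sum(p[v] * exact_twist(prefix + (v,)) for v in range(V)))

def naive_twist(prefix):
    """Uninformative twist: 1 for partial sequences, phi at the final step."""
    return potential(prefix) if len(prefix) == T else 1.0

def twisted_smc(twist, num_particles, rng):
    """Twisted SMC with the twist-induced proposal and multinomial resampling.

    Returns an estimate of log Z and the final (resampled) particles.
    """
    particles = [() for _ in range(num_particles)]
    log_z_hat = 0.0
    for t in range(T):
        extended = []
        weights = np.empty(num_particles)
        for k, prefix in enumerate(particles):
            # Unnormalized proposal: p0(v | prefix) * psi_t(prefix + v)
            psi_next = np.array([twist(prefix + (v,)) for v in range(V)])
            unnorm = base_probs(prefix) * psi_next
            prev = twist(prefix) if t > 0 else 1.0  # convention: psi_0 = 1
            # Incremental weight; it does not depend on the sampled token
            weights[k] = unnorm.sum() / prev
            token = int(rng.choice(V, p=unnorm / unnorm.sum()))
            extended.append(prefix + (token,))
        log_z_hat += float(np.log(weights.mean()))
        # Resample particles in proportion to their incremental weights
        idx = rng.choice(num_particles, size=num_particles, p=weights / weights.sum())
        particles = [extended[i] for i in idx]
    return log_z_hat, particles
```

Calling `twisted_smc(exact_twist, 50, np.random.default_rng(0))` should recover `np.log(exact_twist(()))` up to floating-point error, because with exact twists every incremental weight after the first step equals 1. Swapping in `naive_twist` gives a noisier estimate, which is the gap that learned twists are meant to close on real models where enumeration is impossible.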
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
