SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization
Maksim Afanasyev, Illarion Iov

TL;DR
SLIME is a reference-free alignment method for large language models that stabilizes preference learning, prevents unlearning, and improves generation quality by combining anchoring, stabilization, and dual-margin constraints.
Contribution
The paper proposes a new reference-free objective that decouples preference optimization from generation quality, addressing the unlearning and formatting collapse seen in existing methods.
Findings
SLIME outperforms state-of-the-art baselines in alignment tasks.
It maintains higher generation stability than previous methods.
The approach effectively prevents unlearning and formatting collapse.
Abstract
Direct preference optimization methods have emerged as a computationally efficient alternative to Reinforcement Learning from Human Feedback (RLHF) for aligning Large Language Models (LLMs). Recent approaches have streamlined the alignment process by deriving implicit reward functions, yet they often suffer from a critical objective mismatch: optimizing the relative margin between chosen and rejected responses does not guarantee that the chosen response's absolute likelihood is preserved. This can lead to unlearning, where the model lowers the probability of high-quality outputs to satisfy margin constraints, and to formatting collapse caused by over-penalizing rejected sequences. In this work, we introduce SLIME (Stabilized Likelihood Implicit Margin Enforcement), a reference-free alignment objective designed to decouple preference learning from generation quality. SLIME…
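The objective mismatch described in the abstract can be illustrated with a small numeric sketch. This is not the SLIME objective, whose definition is not given here; it assumes a generic reference-free pairwise loss of the form -log(sigmoid(beta * (logp_chosen - logp_rejected))), and the function name `margin_loss`, the `beta` parameter, and all log-likelihood values are hypothetical and chosen only for illustration.

```python
import math


def margin_loss(logp_chosen: float, logp_rejected: float, beta: float = 1.0) -> float:
    """Generic pairwise margin loss (an assumption, not SLIME): -log(sigmoid(beta * margin))."""
    x = beta * (logp_chosen - logp_rejected)
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow for large |x|.
    return math.log1p(math.exp(-abs(x))) + max(-x, 0.0)


# Hypothetical sequence log-likelihoods before and after a training update.
before = {"chosen": -10.0, "rejected": -12.0}  # margin = 2
after = {"chosen": -15.0, "rejected": -25.0}   # margin = 10

loss_before = margin_loss(before["chosen"], before["rejected"])
loss_after = margin_loss(after["chosen"], after["rejected"])

print(f"loss before: {loss_before:.6f}, chosen log-likelihood: {before['chosen']}")
print(f"loss after:  {loss_after:.6f}, chosen log-likelihood: {after['chosen']}")
```

In this toy case the loss falls because the margin widens, even though the chosen response has become less likely in absolute terms. That is the unlearning behavior the abstract attributes to purely margin-based objectives.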
Taxonomy
Topics: Machine Learning and Data Classification · Recommender Systems and Techniques · Explainable Artificial Intelligence (XAI)
