The Viscosity of Logic: Phase Transitions and Hysteresis in DPO Alignment

Marco Pollanen

arXiv:2601.17260·cs.LG·January 27, 2026


TL;DR

This paper sweeps the DPO alignment parameter $β$ across three 7B open-weight model families and finds non-monotonic capability responses, phase-transition-like changes, and hysteresis. Because the DPO preference margin can anticorrelate with reasoning capability, it argues that margin-based model selection is unreliable.

Contribution

It treats $β$ as a control parameter under a fixed DPO recipe and maps how capability responds to alignment pressure across architectures, documenting phase-transition-like behavior and hysteresis.

Findings

01

Capability peaks sharply near a specific $β$ value

02

Different architectures exhibit distinct response modes

03

Training exposure to high $β$ causes persistent capability loss

Abstract

Direct Preference Optimization (DPO) is often tuned as if increasing alignment pressure (controlled by $β$) yields progressively "better" behavior. We instead treat $β$ as a control parameter and densely sweep it for three 7B open-weight families under a fixed DPO recipe. In Mistral, capability is sharply non-monotonic: aggregated logic-probe margins become positive only in a narrow band near $β \approx 10^{-2}$ and revert outside it, with boundary points that are seed-sensitive. Across architectures under the same sweep, we observe qualitatively different response modes: sharp reorganization in Mistral, selective changes in Llama, and smooth trade-offs in Qwen. Critically, the DPO preference margin can anticorrelate with reasoning capability (Pearson $r = -0.91$ for Llama logic), so margin-based selection can prefer capability-impaired models. Training path also matters:…
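The quantities named in the abstract can be sketched in a few lines. The block below is a minimal illustration, not the paper's code: it assumes the standard DPO objective, in which $β$ scales the difference of policy-to-reference log-ratios, and the helper names (`dpo_loss`, `log_beta_grid`, `pearson_r`) and the grid bounds are hypothetical choices made for this example. The abstract does not state the actual sweep range or density.

```python
import math


def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta):
    """Return (loss, margin) for one preference pair under the standard DPO objective.

    Inputs are summed log-probabilities of the chosen and rejected responses
    under the policy (pi_*) and the frozen reference model (ref_*).
    """
    # Preference margin: beta times the difference of policy/reference log-ratios.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return loss, margin


def log_beta_grid(lo_exp, hi_exp, points_per_decade):
    """Log-spaced beta values from 10**lo_exp to 10**hi_exp, inclusive.

    The bounds and density are assumptions for illustration only.
    """
    n = (hi_exp - lo_exp) * points_per_decade
    return [10 ** (lo_exp + i / points_per_decade) for i in range(n + 1)]


def pearson_r(xs, ys):
    """Pearson correlation, e.g. between per-checkpoint margin and a capability score."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy log-probabilities, made up for illustration (not data from the paper).
    for beta in log_beta_grid(-3, 0, 1):
        loss, margin = dpo_loss(-12.0, -15.0, -13.0, -14.0, beta)
        print(f"beta={beta:.3g}  margin={margin:.4f}  loss={loss:.4f}")
```

In a sweep of this kind one would train one model per $β$, record its mean preference margin and a separate capability score, and pass the two series to `pearson_r`. A strongly negative value, like the one the abstract reports for Llama logic, would indicate that picking the checkpoint with the largest margin favors lower capability.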


Taxonomy

Topics: VLSI and FPGA Design Techniques · Advanced Multi-Objective Optimization Algorithms · Constraint Satisfaction and Optimization