TAB-PO: Preference Optimization with a Token-Level Adaptive Barrier for Token-Critical Structured Generation