Sem-DPO: Mitigating Semantic Inconsistency in Preference Optimization for Prompt Engineering

Anas Mohamed; Azal Ahmad Khan; Xinran Wang; Ahmad Faraz Khan; Shuwen Ge; Saman Bahzad Khan; Ayaan Ahmad; Ali Anwar

arXiv:2507.20133 · cs.CL · July 30, 2025

TL;DR

Sem-DPO enhances preference optimization in prompt engineering by maintaining semantic consistency, leading to improved image and language model outputs while providing theoretical bounds on semantic drift.

Contribution

Introduces Sem-DPO, a novel variant of DPO that incorporates semantic-aware weighting to reduce semantic drift in prompt optimization.

Findings

01 Sem-DPO achieves 8-12% higher CLIP similarity than DPO.

02 Sem-DPO attains 5-9% higher human-preference scores.

03 Sem-DPO outperforms state-of-the-art baselines on standard benchmarks.

Abstract

Generative AI can now synthesize strikingly realistic images from text, yet output quality remains highly sensitive to how prompts are phrased. Direct Preference Optimization (DPO) offers a lightweight, off-policy alternative to RL for automatic prompt engineering, but its token-level regularization leaves semantic inconsistency unchecked: prompts that win higher preference scores can still drift away from the user's intended meaning. We introduce Sem-DPO, a variant of DPO that preserves semantic consistency while retaining DPO's simplicity and efficiency. Sem-DPO scales the DPO loss by a weight based on how different the winning prompt is from the original, reducing the impact of training examples that are semantically misaligned. We provide the first analytical bound on semantic drift for preference-tuned prompt generators, showing that Sem-DPO keeps learned prompts within a…
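The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weight, so the exponential form `exp(-alpha * distance)`, the cosine distance between embeddings, the `alpha` and `beta` defaults, and all function names below are assumptions made for illustration only; the DPO term itself is the standard negative log-sigmoid of the scaled log-ratio margin.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def semantic_weight(original_emb, winner_emb, alpha=1.0):
    """Hypothetical weight: decays as the winning prompt drifts from the original.

    The exponential form and alpha are assumptions, not taken from the abstract.
    """
    return math.exp(-alpha * cosine_distance(original_emb, winner_emb))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard per-example DPO loss: -log(sigmoid(beta * margin))."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return softplus(-margin)  # identical to -log(sigmoid(margin))


def sem_dpo_loss(original_emb, winner_emb, logp_w, logp_l,
                 ref_logp_w, ref_logp_l, alpha=1.0, beta=0.1):
    """Sketch of a semantically weighted DPO loss for one preference pair."""
    weight = semantic_weight(original_emb, winner_emb, alpha)
    return weight * dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta)


if __name__ == "__main__":
    original = [1.0, 0.0]   # toy embedding of the user's original prompt
    aligned = [0.9, 0.1]    # winning prompt that stays close in meaning
    drifted = [0.1, 0.9]    # winning prompt that drifted away
    log_probs = dict(logp_w=-5.0, logp_l=-6.0, ref_logp_w=-5.5, ref_logp_l=-5.5)
    print(sem_dpo_loss(original, aligned, **log_probs))
    print(sem_dpo_loss(original, drifted, **log_probs))
```

With identical log-probabilities, the pair whose winning prompt drifted contributes a smaller loss than the aligned pair, so it pulls less on the gradient. In practice the embeddings would come from a text encoder and the loss would be computed over batches of tensors; both are left out to keep the sketch dependency-free.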
