Causal Direct Preference Optimization for Distributionally Robust Generative Recommendation

Chu Zhao; Enneng Yang; Jianzhe Zhao; Guibing Guo

arXiv:2603.22335 · cs.IR · March 25, 2026

TL;DR

This paper introduces CausalDPO, a causal extension of DPO, to improve the out-of-distribution robustness of LLM-based recommendation systems by mitigating environmental confounders through invariance learning.

Contribution

We propose CausalDPO, which incorporates causal invariance learning and backdoor adjustment to enhance the generalization of recommendation models across diverse environments.

Findings

1. CausalDPO outperforms DPO in OOD scenarios with a 17.17% average improvement.

2. Theoretical analysis confirms CausalDPO's ability to capture stable user preferences.

3. Extensive experiments validate the effectiveness of the proposed causal approach.

Abstract

Direct Preference Optimization (DPO) guides large language models (LLMs) to generate recommendations aligned with users' historical behavior distributions by minimizing a preference alignment loss. However, our systematic empirical study and theoretical analysis reveal that DPO tends to amplify spurious correlations caused by environmental confounders during the alignment process, significantly undermining the generalization capability of LLM-based generative recommendation methods in out-of-distribution (OOD) scenarios. To mitigate this issue, we propose CausalDPO, an extension of DPO that incorporates a causal invariance learning mechanism. This method introduces a backdoor adjustment strategy during the preference alignment phase to eliminate interference from environmental confounders, explicitly models the latent environmental distribution using a soft clustering approach, and…
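
For readers unfamiliar with the preference alignment loss that the abstract refers to, the following is a minimal sketch of the standard per-example DPO loss, not of CausalDPO itself. The function names, argument names, and the default `beta=0.1` are illustrative assumptions; the inputs are taken to be sequence log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") recommendation under the trained policy and a frozen reference model.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls deviation from the reference
) -> float:
    """Standard DPO loss for one (chosen, rejected) preference pair.

    This is the vanilla DPO objective only. The backdoor adjustment and
    soft clustering described in the abstract are NOT modeled here.
    """
    # Implicit reward margins: how much more the policy favors each
    # response than the reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the chosen margin exceeds the rejected margin.
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))


if __name__ == "__main__":
    # When the policy equals the reference, the loss is log(2).
    print(dpo_loss(-1.0, -2.0, -1.0, -2.0))
```

Because the loss depends only on observed preference pairs, any correlation between the environment and the chosen items is rewarded along with genuine preference, which is the failure mode the abstract attributes to environmental confounders.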

Taxonomy

Topics: Recommender Systems and Techniques · Constraint Satisfaction and Optimization · Mobile Crowdsensing and Crowdsourcing