KARMA: Knowledge-Action Regularized Multimodal Alignment for Personalized Search at Taobao

Zhi Sun; Wenming Zhang; Yi Wei; Liren Yu; Zhixuan Zhang; Dan Ou; Haihong Tang

arXiv:2603.22779·cs.IR·April 1, 2026

TL;DR

KARMA is a framework for personalized search at Taobao that balances preserving pre-trained semantic knowledge against aligning with user actions, improving both retrieval and ranking performance.

Contribution

KARMA introduces a regularization approach that mitigates semantic collapse when fine-tuning LLMs for personalized search, improving HR@200 by up to 22.5 points and increasing GMV by 0.9% in online deployment.

Findings

01 Semantic collapse is mitigated by KARMA's regularization.

02 KARMA improves HR@200 by up to 22.5 points.

03 Online deployment increases GMV by 0.9%.
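
For readers unfamiliar with the HR@200 metric cited in the findings, the following is a minimal sketch of how a hit rate at K is commonly computed for next-item prediction. It assumes one ground-truth item per query and a ranked candidate list per query; the function name `hit_rate_at_k` and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import Hashable, Sequence


def hit_rate_at_k(
    ranked_lists: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of queries whose target item appears in the top-k results.

    Assumption: each query has exactly one ground-truth target item,
    as in a next-item prediction setup.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(ranked_lists) != len(targets):
        raise ValueError("ranked_lists and targets must have the same length")
    if not targets:
        return 0.0
    # A query counts as a hit when its target is among the first k candidates.
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Toy data (hypothetical): three queries with three ranked candidates each.
ranked = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
targets = ["b", "f", "z"]

hr_at_2 = hit_rate_at_k(ranked, targets, k=2)  # only the first query hits
hr_at_3 = hit_rate_at_k(ranked, targets, k=3)  # first and second queries hit
```

Under this definition, HR@200 would use `k=200`. If "points" in the finding means percentage points, which is an assumption here, a gain of 22.5 points corresponds to a difference of 0.225 between two such fractions.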

Abstract

Large Language Models (LLMs) carry rich semantic knowledge, making them a natural choice for injecting semantic generalization into personalized search systems. In practice, however, we find that directly fine-tuning LLMs on industrial personalization tasks (e.g., next-item prediction) often yields suboptimal results. We attribute this bottleneck to a critical Knowledge-Action Gap: the inherent conflict between preserving pre-trained semantic knowledge and aligning with specific personalized actions through discriminative objectives. Empirically, action-only training objectives induce Semantic Collapse, which manifests in symptoms such as attention "sinks". This degradation severely cripples the LLM's generalization, so the fine-tuned model fails to improve personalized search systems. We propose KARMA (Knowledge-Action Regularized Multimodal Alignment), a unified framework that treats semantic reconstruction as…
