A Survey on Large Language Models in Multimodal Recommender Systems
Alejo Lopez-Avila, Jinhua Du

TL;DR
This survey reviews how large language models are transforming multimodal recommender systems by enabling semantic reasoning and flexible data handling, highlighting recent techniques, challenges, and future directions.
Contribution
It provides a comprehensive taxonomy, summarizes recent methods, and identifies future research directions for integrating LLMs into multimodal recommender systems.
Findings
LLMs enhance semantic understanding in multimodal recommender systems (MRS).
Prompting and fine-tuning are key strategies.
Future research should address scalability and model accessibility.
Abstract
Multimodal recommender systems (MRS) integrate heterogeneous user and item data, such as text, images, and structured information, to enhance recommendation performance. The emergence of large language models (LLMs) introduces new opportunities for MRS by enabling semantic reasoning, in-context learning, and dynamic input handling. Compared to earlier pre-trained language models (PLMs), LLMs offer greater flexibility and generalisation capabilities but also introduce challenges related to scalability and model accessibility. This survey presents a comprehensive review of recent work at the intersection of LLMs and MRS, focusing on prompting strategies, fine-tuning methods, and data adaptation techniques. We propose a novel taxonomy to characterise integration patterns, identify transferable techniques from related recommendation domains, provide an overview of evaluation metrics and…
Taxonomy
Topics: Topic Modeling · Recommender Systems and Techniques · Text and Document Classification Technologies
