Evaluation on Entity Matching in Recommender Systems

Zihan Huang; Rohan Surana; Zhouhang Xie; Junda Wu; Yu Xia; Julian McAuley

arXiv:2601.17218·cs.IR·February 3, 2026

Open Access

TL;DR

This paper introduces a new dataset for cross-dataset entity matching in recommender systems, evaluates various matching methods, and provides a gold standard to facilitate future research in this area.

Contribution

The paper presents Reddit-Amazon-EM, a manually annotated dataset for entity matching, and offers a comprehensive evaluation of multiple matching techniques, including LLM-based approaches.

Findings

01. LLM-based methods perform competitively in entity matching.

02. The dataset enables reproducible evaluation of matching algorithms.

03. The best method achieves high accuracy in cross-dataset entity alignment.

Abstract

Entity matching is a crucial component in various recommender systems, including conversational recommender systems (CRS) and knowledge-based recommender systems. However, the lack of rigorous evaluation frameworks for cross-dataset entity matching impedes progress in areas such as LLM-driven conversational recommendations and knowledge-grounded dataset construction. In this paper, we introduce Reddit-Amazon-EM, a novel dataset comprising naturally occurring items from Reddit and the Amazon'23 dataset. Through careful manual annotation, we identify corresponding movies across Reddit-Movies and Amazon'23, two existing recommender system datasets with inherently overlapping catalogs. Leveraging Reddit-Amazon-EM, we conduct a comprehensive evaluation of state-of-the-art entity matching methods, including rule-based, graph-based, lexical-based, embedding-based, and LLM-based approaches.…
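To make the evaluation setup concrete, here is a minimal sketch of how a lexical matcher could be scored against a manually annotated gold standard. It is not the paper's method or data: the similarity measure (Python's standard-library `difflib.SequenceMatcher`), the helper names `normalize`, `best_match`, and `accuracy`, and the toy titles and ids are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase the title and replace non-alphanumerics with single spaces."""
    kept = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(kept.split())


def best_match(mention: str, catalog: dict[str, str]) -> str:
    """Return the catalog id whose normalized title is most similar to the mention."""
    query = normalize(mention)
    return max(
        catalog,
        key=lambda item_id: SequenceMatcher(
            None, query, normalize(catalog[item_id])
        ).ratio(),
    )


def accuracy(
    mentions: dict[str, str], catalog: dict[str, str], gold: dict[str, str]
) -> float:
    """Fraction of mentions whose predicted catalog id equals the gold id."""
    correct = sum(
        best_match(text, catalog) == gold[mention_id]
        for mention_id, text in mentions.items()
    )
    return correct / len(mentions)


# Toy data invented for this sketch; not taken from Reddit-Amazon-EM.
catalog = {
    "A1": "The Matrix (1999) [Blu-ray]",
    "A2": "Inception",
    "A3": "The Godfather: Part II",
}
mentions = {"r1": "the matrix", "r2": "inception 2010", "r3": "Godfather part 2"}
gold = {"r1": "A1", "r2": "A2", "r3": "A3"}

score = accuracy(mentions, catalog, gold)
print(f"accuracy: {score:.2f}")
```

Any other matcher (rule-based, embedding-based, LLM-based) could be swapped in for `best_match` and scored by the same `accuracy` function, which is what a shared gold standard makes possible.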

Taxonomy

Topics: Topic Modeling · Recommender Systems and Techniques · Machine Learning in Healthcare