TRACE: Tourism Recommendation with Accountable Citation Evidence
Zixu Zhao, Sijin Wang, Yu Hou, Yuanyuan Xu, Yufan Sheng, Xike Xie, Wenjie Zhang, Won-Yong Shin, Xin Cao

TL;DR
TRACE introduces a multi-turn tourism recommendation benchmark with verifiable evidence and rejection recovery, addressing trustworthiness and adaptiveness gaps in existing conversational recommender systems.
Contribution
TRACE provides a new dataset and evaluation framework for trustworthy, verifiable, and adaptive tourism recommendation, built on multi-turn dialogues with citation evidence.
Findings
LLMs excel in recall and rejection recovery but cite less densely.
Retrievers achieve better grounding but lower accuracy.
Grounding Score correlates strongly with human citation precision.
Abstract
Tourism is a high-stakes setting for conversational recommender systems (CRS): a plausible-sounding suggestion can waste real money and trip time once a traveler acts on it. Existing CRS benchmarks primarily evaluate systems with a single Recall@k score over entity mentions, and tourism-specific resources add spatial or knowledge-graph context, yet none of them couple multi-turn recommendation with verbatim review-span evidence and rejection recovery. This leaves an evaluation gap for tourism recommendation that is simultaneously trustworthy, verifiable, and adaptive: recommend the right point of interest (POI) for multi-aspect preferences (such as cuisine, price, atmosphere, walking distance), justify each suggestion with verifiable evidence from prior visitors so the traveler can act without trial and error, and recover when the first recommendation is rejected mid-dialogue. We…
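For readers unfamiliar with the Recall@k score mentioned in the abstract, the following is a minimal sketch of the standard definition: the fraction of ground-truth items that appear among the top k ranked recommendations. The function name `recall_at_k` and the example POI names are hypothetical and are not taken from the TRACE paper or its code; the benchmark's exact evaluation protocol may differ.

```python
from typing import Hashable, Iterable, Sequence


def recall_at_k(ranked: Sequence[Hashable], relevant: Iterable[Hashable], k: int) -> float:
    """Fraction of relevant items found in the top-k of a ranked list.

    `ranked` is the system's recommendation list, best first.
    `relevant` is the set of ground-truth items for this dialogue turn.
    """
    relevant_set = set(relevant)
    if not relevant_set or k <= 0:
        # No ground truth (or an empty cutoff) gives nothing to recall.
        return 0.0
    # Deduplicate the top-k so a repeated item is not counted twice.
    top_k = set(ranked[:k])
    hits = len(top_k & relevant_set)
    return hits / len(relevant_set)


# Hypothetical example: three ground-truth POIs, one of which is ranked in the top 2.
ranked_pois = ["Cafe Aroma", "Harbor Grill", "Noodle House", "Sky Lounge"]
ground_truth = {"Harbor Grill", "Noodle House", "Old Town Bistro"}
score = recall_at_k(ranked_pois, ground_truth, k=2)
```

Because this score only checks whether the right entity was named, it says nothing about whether the accompanying justification is supported by evidence, which is the gap the abstract describes.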
