Synthetic Data and Simulators for Recommendation Systems: Current State and Future Directions
Adam Lesnikowski, Gabriel de Souza Pereira Moreira, Sara Rabhi, Karl Byleen-Higley

TL;DR
This paper reviews the current state and future prospects of using synthetic data and simulators to enhance recommendation systems, focusing on data fidelity, privacy, and integration of real and synthetic data.
Contribution
It provides a comprehensive overview of synthetic data and simulators in recommendation systems, highlighting key trade-offs, current successes, limitations, and future research directions.
Findings
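Synthetic data and simulators have the potential to improve recommendation performance and robustness.
Past work faces a key trade-off between data fidelity and privacy.
Future directions include mixing real and synthetic data, feedback in dataset generation, robust simulations, and privacy-preserving methods.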
Abstract
Synthetic data and simulators have the potential to markedly improve the performance and robustness of recommendation systems. These approaches have already had a beneficial impact in other machine-learning-driven fields. We identify and discuss a key trade-off between data fidelity and privacy in past work on synthetic data and simulators for recommendation systems. For the important use case of predicting algorithm rankings on real data from synthetic data, we provide motivation and contrast current successes with limitations. Finally, we outline a number of exciting future directions for recommendation systems that we believe deserve further attention and work, including mixing real and synthetic data, feedback in dataset generation, robust simulations, and privacy-preserving methods.
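The use case of predicting algorithm rankings on real data from synthetic data can be made concrete with a rank-agreement check. The following is a minimal sketch, not the authors' method: it assumes each algorithm has already been scored with one offline metric on both a synthetic and a real dataset, and it measures agreement with a simple Kendall tau (tau-a, no tie correction). The function name `kendall_tau`, the algorithm names, and all metric values are hypothetical placeholders chosen for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two {algorithm: score} dicts with the same keys.

    Returns 1.0 when both datasets order the algorithms identically and
    -1.0 when the orders are exactly reversed. Tied pairs count as neither
    concordant nor discordant (no tie correction is applied).
    """
    names = sorted(scores_a)
    concordant = discordant = 0
    for x, y in combinations(names, 2):
        # Same sign of the two score differences means the pair is ordered
        # the same way on both datasets.
        product = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(names) * (len(names) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical metric values (made up for illustration, not results from the paper).
synthetic_scores = {"popularity": 0.10, "item_knn": 0.18, "matrix_fact": 0.22, "neural": 0.20}
real_scores = {"popularity": 0.08, "item_knn": 0.15, "matrix_fact": 0.19, "neural": 0.21}

tau = kendall_tau(synthetic_scores, real_scores)
print(f"Rank agreement (Kendall tau): {tau:.3f}")
```

In this toy input, five of the six algorithm pairs are ordered the same way on both datasets and one pair is swapped, so the agreement is high but not perfect. A value near 1 would suggest the synthetic data preserves the ranking of algorithms; a value near 0 would suggest it does not.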
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques · Traffic Prediction and Management Techniques
