REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim, Vasily Ilin, Ben Caffee, Dongping Chen, Mohammadreza Salehi, Cheng-Yu Hsieh, Ranjay Krishna

TL;DR
REALEDIT is a large-scale dataset of authentic user edit requests and human-made edits sourced from Reddit. It supports more realistic training and evaluation of image editing models, and a model trained on it outperforms existing models on real user requests.
Contribution
The paper introduces REALEDIT, the first large-scale dataset with authentic user edits, and demonstrates its effectiveness in training models that outperform existing ones on real-world tasks.
Findings
Existing models perform poorly on real user requests.
Training on REALEDIT improves model performance significantly.
REALEDIT enhances deepfake detection capabilities.
Abstract
Existing image editing models struggle to meet real-world demands. Despite excelling in academic benchmarks, they have yet to be widely adopted for real user needs. Datasets that power these models use artificial edits, lacking the scale and ecological validity necessary to address the true diversity of user requests. We introduce REALEDIT, a large-scale image editing dataset with authentic user requests and human-made edits sourced from Reddit. REALEDIT includes a test set of 9300 examples to evaluate models on real user requests. Our results show that existing models fall short on these tasks, highlighting the need for realistic training data. To address this, we introduce 48K training examples and train our REALEDIT model, achieving substantial gains: it outperforms competitors by up to 165 Elo points in human judgment and shows a 92 percent relative improvement on the automated VIEScore…
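To give a rough sense of the two headline numbers in the abstract, the sketch below converts an Elo gap into an expected head-to-head preference rate and shows what a relative improvement means arithmetically. It assumes the conventional Elo formulation (base 10, scale 400); the abstract does not state which variant the authors used, and the function names and the baseline score of 2.0 are hypothetical, chosen only for illustration.

```python
def expected_win_rate(elo_gap: float, scale: float = 400.0) -> float:
    """Expected probability that the higher-rated model is preferred.

    Assumes the conventional Elo logistic curve (base 10, scale 400);
    the paper's exact rating setup is not given in the abstract.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / scale))


def apply_relative_improvement(baseline: float, relative_gain: float) -> float:
    """Score after a relative gain, e.g. relative_gain=0.92 for 92 percent."""
    return baseline * (1.0 + relative_gain)


if __name__ == "__main__":
    # A 165-point gap corresponds to being preferred in roughly 72% of
    # pairwise comparisons under the standard formula.
    print(f"win rate at +165 Elo: {expected_win_rate(165):.3f}")

    # Hypothetical baseline score of 2.0 (not a figure from the paper).
    print(f"score after +92% relative gain: {apply_relative_improvement(2.0, 0.92):.2f}")
```

Under these assumptions, a 165-point Elo lead means human raters would be expected to prefer the higher-rated model in a bit under three out of four pairwise comparisons, and a 92 percent relative improvement means the score is 1.92 times the baseline.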
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Image Retrieval and Classification Techniques
Methods: Sparse Evolutionary Training
