Generalization and Memorization in Rectified Flow
Mingxing Rao, Daniel Moyer

TL;DR
This paper investigates how Rectified Flow models memorize training data, introduces new metrics to measure memorization, and proposes a temporal regularization method to reduce memorization without sacrificing image quality.
Contribution
The paper develops a complexity-calibrated metric for memorization detection, uncovers temporal patterns in memorization susceptibility, and introduces a sampling strategy to mitigate memorization in Rectified Flow (RF) models.
Findings
Memorization peaks at the training midpoint under standard methods.
The new metric $T_{\text{mc\_cal}}$ improves attack detection performance.
Temporal regularization reduces memorization while maintaining image quality.
Abstract
Generative models based on the Flow Matching objective, particularly Rectified Flow (RF), have emerged as a dominant paradigm for efficient, high-fidelity image synthesis. However, while existing research heavily prioritizes generation quality and architectural scaling, the underlying dynamics of how RF models memorize training data remain largely underexplored. In this paper, we systematically investigate the memorization behaviors of RF through the test statistics of Membership Inference Attacks (MIA). We progressively formulate three test statistics, culminating in a complexity-calibrated metric ($T_{\text{mc\_cal}}$) that successfully decouples intrinsic image spatial complexity from genuine memorization signals. This calibration yields a significant performance surge (boosting attack AUC by up to 15% and the privacy-critical TPR@1%FPR metric by up to 45%), establishing the first…
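The two attack metrics named in the abstract, AUC and TPR@1%FPR, can be computed from any per-image test statistic. The sketch below is a minimal illustration, not the authors' evaluation code: the function names (`roc_points`, `attack_auc`, `tpr_at_fpr`), the convention that a higher score means "more likely a training member", and the toy scores are all assumptions made for this example.

```python
import numpy as np

def roc_points(member_scores, nonmember_scores):
    """ROC curve of a membership test statistic.

    Assumed convention: higher score = more likely a training member.
    """
    member_scores = np.asarray(member_scores, dtype=float)
    nonmember_scores = np.asarray(nonmember_scores, dtype=float)
    scores = np.concatenate([member_scores, nonmember_scores])
    labels = np.concatenate([np.ones(len(member_scores)),
                             np.zeros(len(nonmember_scores))])

    # Sweep the decision threshold from the highest score downward.
    order = np.argsort(-scores, kind="stable")
    sorted_scores, labels = scores[order], labels[order]

    # Keep one ROC point per distinct score so ties are handled correctly.
    last_of_run = np.r_[np.nonzero(np.diff(sorted_scores))[0],
                        len(sorted_scores) - 1]
    true_pos = np.cumsum(labels)[last_of_run]
    false_pos = np.cumsum(1.0 - labels)[last_of_run]

    tpr = np.r_[0.0, true_pos / len(member_scores)]
    fpr = np.r_[0.0, false_pos / len(nonmember_scores)]
    return fpr, tpr

def attack_auc(member_scores, nonmember_scores):
    """Area under the ROC curve, by the trapezoidal rule."""
    fpr, tpr = roc_points(member_scores, nonmember_scores)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def tpr_at_fpr(member_scores, nonmember_scores, max_fpr=0.01):
    """Best true-positive rate among thresholds with FPR <= max_fpr."""
    fpr, tpr = roc_points(member_scores, nonmember_scores)
    return float(tpr[fpr <= max_fpr].max())

# Toy scores, made up purely for illustration.
members = [0.9, 0.8, 0.7, 0.4]
nonmembers = [0.6, 0.5, 0.3, 0.2]
print(attack_auc(members, nonmembers))   # 0.875
print(tpr_at_fpr(members, nonmembers))   # 0.75
```

TPR@1%FPR is the stricter of the two: it only rewards an attack for members it identifies while almost never flagging a non-member, which is why it is commonly treated as the privacy-critical number.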
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security
