Learning Parametric Distributions from Samples and Preferences
Marc Jourdan, Gizem Yüce, Nicolas Flammarion

TL;DR
This paper demonstrates that preference feedback can significantly improve parameter estimation in continuous distributions, achieving faster convergence rates than sample-only methods under certain conditions.
Contribution
The paper introduces preference-based estimators that outperform sample-only estimators. With deterministic preferences, the estimation error is of order 1/n, and the paper proves lower bounds that match this rate.
Findings
Preference-based M-estimators have lower asymptotic variance than sample-only M-estimators.
Deterministic preferences lead to an estimation error of order 1/n.
Lower bounds match the accelerated convergence rate.
Abstract
Recent advances in language modeling have underscored the role of preference feedback in enhancing model performance. This paper investigates the conditions under which preference feedback improves parameter estimation in classes of continuous parametric distributions. In our framework, the learner observes pairs of samples from an unknown distribution along with their relative preferences, which depend on the same unknown parameter. We show that preference-based M-estimators achieve a better asymptotic variance than sample-only M-estimators, further improved by deterministic preferences. Leveraging the hard constraints revealed by deterministic preferences, we propose an estimator achieving an estimation error of order 1/n, a significant improvement over the 1/√n rate attainable with samples alone. Next, we establish a lower bound that matches this…
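To make the "hard constraints revealed by deterministic preferences" concrete, here is a minimal sketch under assumptions that are not taken from the paper: a unit-variance Gaussian with unknown mean theta, and a preference that always favors the sample with the higher likelihood (the one closer to theta). Under that assumed rule, each labeled pair tells the learner which side of the pair's midpoint theta lies on, so the constraints intersect to an interval around theta. All function names are hypothetical, and the interval-midpoint estimator is only an illustration of the idea, not necessarily the estimator the authors propose.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Draw n pairs from N(theta, 1) and label each with a deterministic preference."""
    data = []
    for _ in range(n):
        x, y = rng.gauss(theta, 1.0), rng.gauss(theta, 1.0)
        # Assumed preference rule: x wins iff it is strictly closer to theta,
        # i.e. it has the higher likelihood under N(theta, 1).
        prefers_x = abs(x - theta) < abs(y - theta)
        data.append((x, y, prefers_x))
    return data


def sample_mean(data):
    """Sample-only baseline: average all 2n observations, ignoring preferences."""
    return statistics.fmean(v for x, y, _ in data for v in (x, y))


def feasible_interval(data):
    """Intersect the half-line constraints implied by each labeled pair."""
    lo, hi = float("-inf"), float("inf")
    for x, y, prefers_x in data:
        if x == y:
            continue  # a tie carries no information about theta
        mid = 0.5 * (x + y)
        winner = x if prefers_x else y
        # theta lies on the same side of the midpoint as the preferred sample.
        if winner < mid:
            hi = min(hi, mid)
        else:
            lo = max(lo, mid)
    return lo, hi


def preference_estimate(data):
    """Return the center of the feasible interval, or the sample mean if unbounded."""
    lo, hi = feasible_interval(data)
    if lo == float("-inf") or hi == float("inf"):
        return sample_mean(data)
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 0.7
    for n in (100, 1000, 10000):
        data = simulate(theta, n, rng)
        lo, hi = feasible_interval(data)
        print(n, abs(sample_mean(data) - theta), abs(preference_estimate(data) - theta), hi - lo)
```

In this toy model the interval endpoints are the pair midpoints closest to theta on either side, so the interval width, and hence the error of any point inside it, shrinks at the order of 1/n, whereas the sample mean's error shrinks at the order of 1/√n.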
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Bayesian Modeling and Causal Inference
