Loading paper
Why Self-Rewarding Works: Theoretical Guarantees for Iterative Alignment of Language Models | Tomesphere