SPARTA ALIGNMENT: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng, Greg Durrett, Yulia Tsvetkov

TL;DR
SPARTA ALIGNMENT introduces a collective, competitive approach to aligning multiple language models: the models compete against and evaluate each other, which improves both performance and output diversity.
Contribution
It presents a novel iterative alignment method where multiple LLMs compete and learn from each other using an Elo-based evaluation system, enhancing alignment and diversity.
Findings
Outperforms initial models and baselines on 10 of 12 tasks, with a 7.0% average improvement.
Generalizes well to unseen tasks, leveraging model diversity.
Produces more logical, direct, and informative outputs.
Abstract
We propose SPARTA ALIGNMENT, an algorithm to collectively align multiple LLMs through competition and combat. To complement a single model's lack of diversity in generation and biases in evaluation, multiple LLMs form a "sparta tribe" to compete against each other in fulfilling instructions while serving as judges for the competition of others. For each iteration, one instruction and two models are selected for a duel, the other models evaluate the two responses, and their evaluation scores are aggregated through an adapted Elo-ranking-based reputation system, where winners/losers of combat gain/lose weight in evaluating others. The peer-evaluated combat results then become preference pairs where the winning response is preferred over the losing one, and all models learn from these preferences at the end of each iteration. SPARTA ALIGNMENT enables the self-evolution of multiple LLMs in…
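The duel step described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not the paper's formulation: the standard Elo expected-score formula with a 400-point scale, a K-factor of 32, judge weights proportional to current ratings, and the hypothetical names `expected_score`, `aggregate`, and `duel`. The paper's "adapted" Elo system may differ in all of these details.

```python
def expected_score(r_a, r_b, scale=400.0):
    """Standard Elo expected score of A against B (assumed, not from the paper)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def aggregate(scores, ratings, judges):
    """Reputation-weighted mean of judge scores.

    Assumption: each judge's weight is proportional to its current rating.
    """
    total = sum(ratings[j] for j in judges)
    return sum(ratings[j] * scores[j] for j in judges) / total


def duel(ratings, a, b, scores_a, scores_b, k=32.0):
    """Run one duel between models a and b.

    scores_a / scores_b map each judge to the score it gave a's / b's response.
    Returns the updated ratings and a (winner, loser) pair, or None on a tie.
    The winner's response would be the preferred one in the preference pair.
    """
    # Every model not in the duel acts as a judge.
    judges = [m for m in ratings if m not in (a, b)]
    agg_a = aggregate(scores_a, ratings, judges)
    agg_b = aggregate(scores_b, ratings, judges)

    if agg_a > agg_b:
        outcome, pair = 1.0, (a, b)
    elif agg_a < agg_b:
        outcome, pair = 0.0, (b, a)
    else:
        outcome, pair = 0.5, None

    # Elo update: the winner gains rating (and thus judging weight), the loser loses it.
    exp_a = expected_score(ratings[a], ratings[b])
    new_ratings = dict(ratings)
    new_ratings[a] = ratings[a] + k * (outcome - exp_a)
    new_ratings[b] = ratings[b] + k * ((1.0 - outcome) - (1.0 - exp_a))
    return new_ratings, pair


ratings = {"m1": 1000, "m2": 1000, "m3": 1200, "m4": 800}
new_ratings, pair = duel(
    ratings, "m1", "m2",
    scores_a={"m3": 8, "m4": 4},
    scores_b={"m3": 5, "m4": 8},
)
print(pair, new_ratings["m1"], new_ratings["m2"])  # ('m1', 'm2') 1016.0 984.0
```

In this toy run an unweighted average would favor m2 (6.5 vs. 6.0), but the higher-rated judge m3 prefers m1, so the reputation-weighted scores (6.4 vs. 6.2) make m1 the winner. That is the intended effect of letting combat results change how much each model's judgment counts.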
Taxonomy
Topics: Ethics and Social Impacts of AI · Artificial Intelligence in Games · Mobile Crowdsensing and Crowdsourcing
Methods: ALIGN
