Dynamic Alignment for Collective Agency: Toward a Scalable Self-Improving Framework for Open-Ended LLM Alignment
Panatchakorn Anantaprayoon, Nataliia Babina, Jad Tarifi, Nima Asgharbeygi

TL;DR
This paper proposes a scalable, self-improving framework for aligning large language models with a holistic value called Collective Agency. The framework relies on iterative self-assessment and dataset generation, and it aims to go beyond conventional alignment values.
Contribution
The paper introduces Dynamic Alignment, a framework in which an LLM aligns itself with Collective Agency through iterative self-evaluation and data generation. This reduces reliance on human feedback and widens the alignment objective beyond conventional values.
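The summary above does not say how self-evaluation and data generation fit together, so the following is only a minimal sketch of one round of a generic self-labeling loop, not the authors' implementation. Everything in it is an assumption made for illustration: the names `build_self_labeled_pairs`, `toy_generate`, and `toy_score` are hypothetical, the two callables stand in for a real model, and the choice to store the data as preference pairs is not taken from the paper.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Generate = Callable[[str, int], str]  # (prompt, sample index) -> response
Score = Callable[[str, str], float]   # (prompt, response) -> self-assessed score
Pair = Tuple[str, str, str]           # (prompt, preferred, dispreferred)


def build_self_labeled_pairs(
    prompts: List[str],
    generate: Generate,
    score: Score,
    n_candidates: int = 4,
) -> List[Pair]:
    """Run one round: sample candidates, self-score them, keep best vs. worst."""
    if n_candidates < 2:
        raise ValueError("need at least two candidates to form a pair")
    pairs: List[Pair] = []
    for prompt in prompts:
        candidates = [generate(prompt, i) for i in range(n_candidates)]
        # Sort by the model's own score, lowest first.
        scored = sorted(
            ((score(prompt, c), c) for c in candidates),
            key=lambda item: item[0],
        )
        (low_score, worst), (high_score, best) = scored[0], scored[-1]
        # Skip prompts where self-assessment cannot separate the candidates.
        if high_score > low_score:
            pairs.append((prompt, best, worst))
    return pairs


# Toy stand-ins for a real model (assumptions, for demonstration only).
def toy_generate(prompt: str, i: int) -> str:
    return f"{prompt} [draft {i}]"


def toy_score(prompt: str, response: str) -> float:
    # Pretend that later drafts are better: score equals the draft index.
    return float(response.rsplit(" ", 1)[-1].rstrip("]"))


pairs = build_self_labeled_pairs(["a", "b"], toy_generate, toy_score, n_candidates=3)
print(pairs)
```

In an iterative setting, the resulting pairs would be used to update the model before the next round is run with the updated model as both generator and scorer. How that update is performed is left open here because the summary does not describe it.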
Findings
Successfully aligns LLMs to Collective Agency values.
Preserves general NLP capabilities during alignment.
Scales more readily than conventional human-feedback-based alignment methods.
Abstract
Large Language Models (LLMs) are typically aligned with human values using preference data or predefined principles such as helpfulness, honesty, and harmlessness. However, as AI systems progress toward Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI), such value systems may become insufficient. In addition, human-feedback-based alignment remains resource-intensive and difficult to scale. While AI-feedback-based self-improving alignment methods have been explored as a scalable alternative, they have largely remained constrained to conventional alignment values. In this work, we explore both a more holistic alignment objective and a scalable, self-improving alignment approach. Aiming to transcend conventional alignment norms, we introduce Collective Agency (CA), a unified and open-ended alignment value that encourages integrated agentic capabilities. We also…
Taxonomy
Topics: Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education · Topic Modeling
