Inference-Aware Meta-Alignment of LLMs via Non-Linear GRPO

Shokichi Takakura; Akifumi Wachi; Rei Higuchi; Kohei Miyaguchi; Taiji Suzuki

arXiv:2602.01603·stat.ML·February 3, 2026

TL;DR

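This paper introduces inference-aware meta-alignment (IAMA), which trains a base LLM so that it can be aligned to multiple, possibly conflicting criteria by different inference-time alignment algorithms under a limited computational budget. The non-linear optimization problems this involves are solved with a proposed non-linear GRPO algorithm.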

Contribution

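The paper proposes the IAMA training framework together with a non-linear GRPO algorithm for solving its non-linear optimization problems, so that a single base model can be aligned to multiple criteria at inference time with a limited computational budget.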
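As background only, the sketch below shows the group-relative advantage used in standard GRPO (commonly expanded as Group Relative Policy Optimization): each sampled response is scored against the mean and standard deviation of its own group. This is not the paper's non-linear variant, which this summary does not describe. The function name, the use of the population standard deviation, and the `eps` stabilizer are assumptions made for illustration.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantages for one group of sampled responses.

    Assumption: population standard deviation and a small eps to avoid
    division by zero; implementations differ on both details.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Each response is judged relative to the other responses in its group.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four responses to the same prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that score above the group mean receive positive advantages and the rest receive negative ones, so no separate value model is needed.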

Findings

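01. IAMA allows a model to be aligned to multiple criteria under a limited inference-time computational budget.

02. Non-linear GRPO provably converges to an optimal solution in the space of probability measures.

03. A single trained base model can be aligned to different criteria by applying different inference-time alignment algorithms.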

Abstract

Aligning large language models (LLMs) to diverse human preferences is fundamentally challenging since criteria can often conflict with each other. Inference-time alignment methods have recently gained popularity as they allow LLMs to be aligned to multiple criteria via different alignment algorithms at inference time. However, inference-time alignment is computationally expensive since it often requires multiple forward passes of the base model. In this work, we propose inference-aware meta-alignment (IAMA), a novel approach that enables LLMs to be aligned to multiple criteria with a limited computational budget at inference time. IAMA trains a base model such that it can be effectively aligned to multiple tasks via different inference-time alignment algorithms. To solve the non-linear optimization problems involved in IAMA, we propose non-linear GRPO, which provably converges to the…
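To make the cost and the non-linearity concrete, the sketch below uses best-of-N sampling as an example inference-time alignment algorithm: N responses are drawn from the base model and the highest-reward one is kept, which costs N generations per prompt. Best-of-N is an assumption chosen for illustration, since the abstract does not name specific algorithms. The function name, the toy discrete policy, and the rewards are hypothetical.

```python
from itertools import accumulate

def best_of_n_expected_reward(policy, rewards, n):
    """Exact expected reward of best-of-n sampling from a discrete policy.

    policy[i] is the probability of response i and rewards[i] is its reward.
    Illustrative toy setting only, not the paper's formulation.
    """
    # Sort responses by reward so a running sum of probabilities is the CDF.
    pairs = sorted(zip(rewards, policy))
    cdf = list(accumulate(p for _, p in pairs))
    expected, prev = 0.0, 0.0
    for (r, _), f in zip(pairs, cdf):
        # P(best of n draws is this response) = F(k)^n - F(k-1)^n
        expected += (f ** n - prev ** n) * r
        prev = f
    return expected

rewards = [0.0, 1.0]
mixed = best_of_n_expected_reward([0.5, 0.5], rewards, n=2)      # 0.75
only_bad = best_of_n_expected_reward([1.0, 0.0], rewards, n=2)   # 0.0
only_good = best_of_n_expected_reward([0.0, 1.0], rewards, n=2)  # 1.0
```

In this toy case the equal mixture of the two deterministic policies scores 0.75, while the average of their individual scores is 0.5. The post-alignment reward is therefore not linear in the base policy, which is the kind of objective an inference-aware training method has to optimize.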

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)