Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models

Yang Zhang; Yu Yu; Bo Tang; Yu Zhu; Chuxiong Sun; Wenqiang Wei; Jie Hu; Zipeng Xie; Zhiyu Li; Feiyu Xiong; Edward Chung

arXiv:2505.19743·cs.CL·August 19, 2025

TL;DR

This paper introduces MARA, a token-level alignment method for LLMs that improves alignment accuracy and efficiency without requiring model fine-tuning, by classifying tokens as accepted or rejected.

Contribution

MARA is a novel, model-independent approach that simplifies alignment by decomposing sentence preferences into token-level decisions using a lightweight classifier.

Findings

01. MARA outperforms existing methods in alignment accuracy across multiple LLMs.

02. MARA reduces computational costs compared to traditional fine-tuning methods.

03. MARA is effective across diverse datasets and models.

Abstract

With the rapid development of Large Language Models (LLMs), aligning these models with human preferences and values is critical to ensuring ethical and safe applications. However, existing alignment techniques such as RLHF or DPO often require direct fine-tuning of LLMs with billions of parameters, resulting in substantial computational costs and inefficiencies. To address this, we propose the Micro token-level Accept-Reject Aligning (MARA) approach, which is designed to operate independently of the language models. MARA simplifies the alignment process by decomposing sentence-level preference learning into token-level binary classification, where a compact three-layer fully-connected network determines whether candidate tokens are "Accepted" or "Rejected" as part of the response. Extensive experiments across seven different LLMs and three open-source datasets show that MARA achieves significant…
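The token-level accept/reject idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract only states that a compact three-layer fully-connected network labels candidate tokens as "Accepted" or "Rejected"; everything else here is an assumption made for illustration: the input features (a state vector, a token feature vector, and the LM probability), the order in which candidates are visited, the 0.5 threshold, the fallback when every candidate is rejected, and the names make_mlp, accept_probability, and choose_token. The weights are random placeholders, not trained parameters.

```python
import math
import random


def make_mlp(sizes, seed=0):
    """Build random (untrained) weights for a fully connected network.

    sizes = [n_in, h1, h2, 1] yields three weight layers, matching the
    "three-layer" description; the concrete sizes are assumptions.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def accept_probability(layers, features):
    """Forward pass: ReLU on hidden layers, sigmoid on the single output."""
    x = features
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return 1.0 / (1.0 + math.exp(-x[0]))


def choose_token(candidates, state, layers, threshold=0.5):
    """Pick the next token by binary accept/reject decisions.

    candidates: list of (token, lm_prob, token_features) proposed by a
    frozen language model (hypothetical interface). Candidates are
    visited in descending LM probability and the first accepted one is
    returned. If all are rejected, fall back to the top LM candidate
    (an assumed policy, not taken from the paper).
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    for token, lm_prob, feats in ranked:
        p = accept_probability(layers, state + feats + [lm_prob])
        if p >= threshold:
            return token, "Accepted"
    return ranked[0][0], "Fallback"
```

For example, with a 4-dimensional state and 3-dimensional token features, make_mlp([8, 16, 16, 1]) builds a matching network, and choose_token(candidates, state, layers) returns the selected token together with a label saying whether it was accepted or chosen by fallback. The language model itself is never updated in this sketch, which mirrors the abstract's point that the classifier operates independently of the LLM.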

Code & Models

Repositories

iaar-shanghai/mara (PyTorch, official)

Models

IAAR-Shanghai/MARA_AGENTS (Hugging Face model)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Direct Preference Optimization