Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Ba-Hien Tran; Van Minh Nguyen

arXiv:2505.22811 · stat.ML · April 22, 2026


TL;DR

This paper introduces a novel multi-Boolean architecture for large language models that allows direct finetuning in the Boolean domain, significantly reducing complexity and outperforming existing low-bit methods.

Contribution

It proposes a new framework for LLMs using multi-kernel Boolean parameters, enabling direct Boolean finetuning without latent weights, enhancing efficiency and capacity.

Findings

01. Outperforms recent ultra low-bit quantization techniques

02. Enables direct finetuning in the Boolean domain

03. Reduces complexity during finetuning and inference

Abstract

Weight binarization has emerged as a promising strategy to reduce the complexity of large language models (LLMs). Existing approaches fall into two groups: post-training binarization, which is simple but causes severe performance loss, and training-aware methods, which depend on full-precision latent weights and so add complexity and limit efficiency. We propose a novel framework that represents LLMs with multi-kernel Boolean parameters and, for the first time, enables direct finetuning of LLMs in the Boolean domain, eliminating the need for latent weights. This enhances representational capacity and dramatically reduces complexity during both finetuning and inference. Extensive experiments across diverse LLMs show that our method outperforms recent ultra low-bit quantization and binarization techniques.
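To make the idea of multi-kernel Boolean parameters more concrete, the sketch below approximates a full-precision weight matrix as a weighted sum of several {-1, +1} kernels using a generic greedy residual scheme. This is an illustration under stated assumptions, not the paper's algorithm: the function names `residual_binarize` and `reconstruct`, the single scalar scale per kernel, and the random test matrix are all hypothetical choices, and the sketch does not cover finetuning in the Boolean domain.

```python
import numpy as np

def residual_binarize(w, num_kernels):
    """Greedy approximation w ~ sum_k alpha_k * b_k with b_k in {-1, +1}.

    Illustrative only: one scalar scale per kernel is an assumption made
    here for simplicity, not a detail taken from the paper.
    """
    residual = w.astype(np.float64).copy()
    scales, kernels = [], []
    for _ in range(num_kernels):
        # Boolean kernel: the sign pattern of whatever is left to explain.
        b = np.where(residual >= 0, 1.0, -1.0)
        # For b = sign(residual), the least-squares scale is mean(|residual|).
        alpha = np.abs(residual).mean()
        scales.append(alpha)
        kernels.append(b)
        residual -= alpha * b
    return np.array(scales), np.stack(kernels)

def reconstruct(scales, kernels):
    # Weighted sum over the kernel axis: (K,) x (K, m, n) -> (m, n).
    return np.tensordot(scales, kernels, axes=1)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))  # stand-in for one weight matrix

errors = []
for k in (1, 2, 3, 4):
    scales, kernels = residual_binarize(w, k)
    rel_err = np.linalg.norm(w - reconstruct(scales, kernels)) / np.linalg.norm(w)
    errors.append(rel_err)
    print(f"kernels={k}  relative error={rel_err:.3f}")
```

Each extra kernel fits the residual left by the previous ones, so the reconstruction error shrinks as kernels are added. This gives an intuition for why several Boolean kernels can offer more representational capacity than a single binarized weight matrix.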


Videos

Highly Efficient and Effective LLMs with Multi-Boolean Architectures · SlidesLive