BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment

Md. Ashiq Ul Islam Sajid; Mohammad Sakib Mahmood; Md. Tareq Hasan; Md Abdur Rahim; Rafat Ara; Md. Arafat Hossain

arXiv:2604.24273·cs.LG·April 28, 2026

TL;DR

BitRL builds reinforcement learning agents on 1-bit quantized language models, enabling on-device decision-making with 10-16x less memory and 3-5x better energy efficiency while retaining 85-98% of task performance.

Contribution

The paper presents a framework that combines 1-bit quantized language models with reinforcement learning for resource-constrained edge deployment, supported by a theoretical analysis of quantized policy gradient convergence and a practical on-device implementation.

Findings

01. Achieves 10-16x memory reduction and 3-5x energy efficiency improvements over full-precision baselines.

02. Maintains 85-98% of task performance across benchmarks.

03. Provides theoretical bounds for quantized policy gradient convergence.

Abstract

The deployment of intelligent reinforcement learning (RL) agents on resource-constrained edge devices remains a fundamental challenge due to the substantial memory, computational, and energy requirements of modern deep learning systems. While large language models (LLMs) have emerged as powerful architectures for decision-making agents, their multi-billion parameter scale confines them to cloud-based deployment, raising concerns about latency, privacy, and connectivity dependence. We introduce BitRL, a framework for building RL agents using 1-bit quantized language models that enables practical on-device learning and inference under severe resource constraints. Leveraging the BitNet b1.58 architecture with ternary weights (-1, 0, +1) and an optimized inference stack, BitRL achieves 10-16x memory reduction and 3-5x energy efficiency improvements over full-precision baselines while…
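The ternary weights mentioned in the abstract can be illustrated with a minimal sketch of absmean quantization, the scheme described for BitNet b1.58: divide each weight by the mean absolute weight, round, and clip to [-1, 1]. The function names, the sample weights, and the per-tensor scale are assumptions made for illustration; they are not taken from the BitRL implementation. The final ratio is only an information-theoretic estimate (log2(3) bits per ternary weight against 16-bit floats), not the paper's own memory accounting, which this excerpt does not specify.

```python
import math


def absmean_ternary_quantize(weights, eps=1e-8):
    """Map a flat list of floats to values in {-1, 0, +1} plus one shared scale.

    Hypothetical helper: scale by the mean absolute weight, round to the
    nearest integer, then clip to [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale


def dequantize(ternary, scale):
    """Approximate the original weights from ternary codes and the scale."""
    return [t * scale for t in ternary]


# Illustrative weights only; not data from the paper.
weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]
ternary, scale = absmean_ternary_quantize(weights)
approx = dequantize(ternary, scale)

# Ideal storage cost of one ternary weight, compared with a 16-bit float.
ideal_bits_per_weight = math.log2(3)
ratio_vs_fp16 = 16 / ideal_bits_per_weight

print(ternary)                   # ternary codes
print(round(scale, 3))           # shared scale
print(round(ratio_vs_fp16, 2))   # idealized compression ratio vs. 16-bit
```

A single scale per tensor keeps the sketch short. Real inference stacks typically pack several ternary values into each byte, so the memory savings achieved in practice depend on the packing format and on the precision of the baseline.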
