LM-Fix: Lightweight Bit-Flip Detection and Rapid Recovery Framework for Language Models

Ahmad Tahmasivand; Noureldin Zahran; Saba Al-Sayouri; Mohammed Fouda; Khaled N. Khasawneh

arXiv:2511.02866·cs.SE·February 25, 2026


TL;DR

LM-Fix is a lightweight framework that detects bit-flip faults in large language models and repairs them locally, adding roughly 1% to 7.7% runtime overhead and avoiding a full model reload.

Contribution

Introduces LM-Fix, a novel lightweight detection and recovery framework for fault tolerance in large language models that is faster and less resource-intensive than existing methods.

Findings

01. Detects over 94% of single-bit flips at TVL=200

02. Nearly 100% detection of multi-bit flips

03. Recovery is over 100x faster than reloading the model

Abstract

This paper presents LM-Fix, a lightweight detection and rapid recovery framework for faults in large language models (LLMs). Existing integrity approaches are often heavy or slow for modern LLMs. LM-Fix runs a short test-vector pass and uses hash-guided checks to detect bit-flip faults, then repairs them locally without a full reload. Across multiple models, it detects over 94% of single-bit flips at TVL=200 and nearly 100% of multi-bit flips with approximately 1% to 7.7% runtime overhead; recovery is more than 100x faster than reloading. These results show a practical, low-overhead solution to keep LLMs reliable in production.
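The detect-then-repair flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "model" here is a single dot product, and the block size, the per-block SHA-256 hashes, the clean reference copy used for repair, and every function name are assumptions made purely for illustration. The sketch only shows the general idea of comparing a test-vector output against a stored hash, then using finer-grained hashes to find and restore the damaged region instead of reloading everything.

```python
import hashlib
import struct

BLOCK = 4  # weights per hashed block (hypothetical granularity)


def pack(weights):
    """Serialize a list of weights as little-endian float32 bytes."""
    return struct.pack(f"<{len(weights)}f", *weights)


def block_hashes(weights, block=BLOCK):
    """SHA-256 digest of each consecutive block of weights."""
    return [hashlib.sha256(pack(weights[i:i + block])).digest()
            for i in range(0, len(weights), block)]


def flip_bit(weights, index, bit):
    """Flip one bit in the float32 encoding of weights[index], in place."""
    (raw,) = struct.unpack("<I", struct.pack("<f", weights[index]))
    (weights[index],) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))


def output_hash(weights, test_vector):
    """Hash of the toy model's output (a dot product) on a fixed test vector."""
    out = sum(w * x for w, x in zip(weights, test_vector))
    return hashlib.sha256(struct.pack("<d", out)).digest()


def repair_blocks(weights, golden_blocks, reference, block=BLOCK):
    """Restore only the blocks whose hash differs; return their indices."""
    repaired = []
    for b, digest in enumerate(block_hashes(weights, block)):
        if digest != golden_blocks[b]:
            lo = b * block
            weights[lo:lo + block] = reference[lo:lo + block]
            repaired.append(b)
    return repaired


# Clean state: values chosen to be exactly representable in float32.
reference = [0.5, -1.25, 2.0, 0.75, 0.5, 1.5, -0.25, 3.0]
test_vector = [1.0] * len(reference)
golden_output = output_hash(reference, test_vector)
golden_blocks = block_hashes(reference)

# Inject a single-bit fault into an exponent bit of weight 5.
weights = list(reference)
flip_bit(weights, index=5, bit=30)

# Detection: the test-vector output no longer matches its stored hash.
fault_detected = output_hash(weights, test_vector) != golden_output

# Recovery: locate and restore only the damaged block.
repaired = repair_blocks(weights, golden_blocks, reference) if fault_detected else []
```

In this toy setup the flipped exponent bit changes the output enough for the output hash to differ, and only the second block is restored. A flip that leaves the test-vector output unchanged would go unnoticed by the output check in this sketch.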
