ReTern: Exploiting Natural Redundancy and Sign Transformations for Enhanced Fault Tolerance in Compute-in-Memory based Ternary LLMs

Akul Malhotra; Sumeet Kumar Gupta

arXiv:2506.01140·cs.AR·June 3, 2025

TL;DR

ReTern enhances fault tolerance in ternary LLMs on ternary computing-in-memory (TCiM) accelerators by using fault-aware sign transformations and exploiting the natural redundancy of TCiM bit-cells, significantly reducing perplexity under faults with minimal overhead.

Contribution

The paper introduces ReTern, a novel method combining fault-aware sign transformations and bit-cell reprogramming to improve fault tolerance in ternary LLMs on TCiM hardware.

Findings

01. 35% reduction in perplexity under faults
02. Less than 3% energy overhead
03. Less than 7% latency overhead

Abstract

Ternary large language models (LLMs), which use ternary-precision weights and 8-bit activations, have demonstrated competitive performance while significantly reducing the high computational and memory requirements of full-precision LLMs. The energy efficiency and performance of ternary LLMs can be further improved by deploying them on ternary computing-in-memory (TCiM) accelerators, thereby alleviating the von Neumann bottleneck. However, TCiM accelerators are prone to memory stuck-at faults (SAFs), which degrade model accuracy. This is particularly severe for LLMs due to their low weight sparsity. To boost the SAF tolerance of TCiM accelerators, we propose ReTern, which is based on (i) fault-aware sign transformations (FAST) and (ii) TCiM bit-cell reprogramming that exploits the bit-cells' natural redundancy. The key idea is to utilize FAST to minimize computation errors due to…
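The abstract is cut off before it describes the mechanism, so the following is only an illustrative sketch of how sign transformations and redundant encodings could mask stuck-at faults, not the authors' implementation. Everything in it is an assumption: the two-bit encoding `w = a - b` (so that both `(0, 0)` and `(1, 1)` read as 0), the per-column sign choice, and the hypothetical helper names `program_cell`, `column_error`, and `choose_sign`.

```python
# Assumed encoding (not taken from the paper): a ternary weight w in {-1, 0, +1}
# is held in two bits (a, b) and read back as w = a - b.
# Under this assumption 0 has two encodings, (0, 0) and (1, 1).

def program_cell(target, fault):
    """Return the value closest to `target` that a cell can actually read back.

    fault is (stuck_a, stuck_b); each entry is None (healthy), 0, or 1.
    All four bit patterns are tried, so a redundant encoding of 0 is used
    automatically when one bit is stuck at 1.
    """
    best = None
    for a in (0, 1):
        for b in (0, 1):
            read_a = a if fault[0] is None else fault[0]
            read_b = b if fault[1] is None else fault[1]
            got = read_a - read_b
            if best is None or abs(got - target) < abs(best - target):
                best = got
    return best


def column_error(weights, faults, sign):
    """Total absolute weight error if the column is stored as sign * weights."""
    return sum(abs(program_cell(sign * w, f) - sign * w)
               for w, f in zip(weights, faults))


def choose_sign(weights, faults):
    """Pick the storage sign (+1 or -1) with the smaller error; ties keep +1."""
    return min((+1, -1), key=lambda s: column_error(weights, faults, s))


# Hypothetical 4-cell column with two faulty cells.
weights = [1, 0, -1, 1]
faults = [(0, None), (1, None), (None, None), (None, None)]
sign = choose_sign(weights, faults)
stored = [program_cell(sign * w, f) for w, f in zip(weights, faults)]
```

In this toy column, storing the weights as-is cannot represent the first `+1` because its `a` bit is stuck at 0, whereas storing the negated column can. The consumer of the column output would then multiply the result by `sign` to undo the transformation.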

Taxonomy

Topics: Fuel Cells and Related Materials · Fault Detection and Control Systems · Brain Tumor Detection and Classification