PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization
Xinhai Wang, Shu Yang, Liangyu Wang, Lin Zhang, Huanyi Xie, Lijie Hu, Di Wang

TL;DR
PAHQ significantly accelerates automated circuit discovery in large language models by optimizing mixed-precision quantization, reducing runtime and memory use while maintaining interpretability accuracy.
Contribution
PAHQ introduces a novel mixed-precision quantization approach that enhances the efficiency of automated circuit discovery without sacrificing analytical faithfulness.
Findings
Reduces runtime by up to 80%
Lowers memory consumption by up to 30%
Maintains high interpretability accuracy
Abstract
Circuit discovery, which involves identifying sparse and task-relevant subnetworks in pre-trained language models, is a cornerstone of mechanistic interpretability. Automated Circuit Discovery (ACDC) has emerged as a pivotal methodology in circuit discovery, but its application to large language models is severely limited by computational inefficiency and prohibitively high memory requirements. Although several accelerated approaches have been proposed, they primarily rely on linear approximations to ACDC, which significantly compromises analytical faithfulness. Our proposed method for accelerating automated circuit discovery, Per Attention Head Quantization (PAHQ), takes a fundamentally different approach by optimizing the efficiency of each individual patching operation. PAHQ leverages a fundamental alignment between activation patching and mixed-precision quantization (MPQ):…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
