Matrix-Driven Identification and Reconstruction of LLM Weight Homology

Ruichong Zhang; Daniel Goldstein

arXiv:2508.06309·cs.CL·February 2, 2026

TL;DR

The paper introduces MDIR, a novel matrix-based method for identifying and reconstructing weight homology in large language models, achieving state-of-the-art accuracy without requiring model inference.

Contribution

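MDIR detects weight correspondences between models and estimates their statistical significance as a $p$-value, using matrix analysis, polar decomposition, and Large Deviation Theory. It is the first method to achieve perfect AUC and accuracy scores across different source models on LeaFBench.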

Findings

01. Achieves perfect AUC and accuracy scores across different source models on LeaFBench.

02. Requires no model inference and compares only a single pair of matrices at a time, which makes it suitable for low-resource devices.

03. Detects unattributed reuse or replication of model weights.

Abstract

We propose Matrix-Driven Identification and Reconstruction (MDIR), a state-of-the-art method for large language model weight homology that accurately detects weight correspondences between models and provides rigorous $p$-value estimation of the statistical significance of these correspondences. Our method does not require model inference, and because it compares only a single pair of matrices at a time, it allows the detection of unattributed reuse or replication of model weights even on low-resource devices. We leverage matrix analysis, polar decomposition, and Large Deviation Theory (LDT) to achieve accurate reconstruction of weight relationships between models. Notably, MDIR is the first method to achieve perfect scores on both Area-Under-Curve (AUC) and accuracy metrics across different source models on LeaFBench.
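To make the role of polar decomposition concrete, here is a minimal sketch of how the orthogonal factor of a polar decomposition can align one weight matrix to another without running either model. It is not the paper's algorithm: it assumes, purely for illustration, that a derived model's matrix equals the source matrix under a hidden orthogonal change of basis plus small noise, and it reports a relative residual rather than the LDT-based $p$-value described in the abstract. The names `orthogonal_polar_factor` and `align_weights`, the matrix sizes, and the noise level are all hypothetical.

```python
import numpy as np


def orthogonal_polar_factor(m):
    """Return the orthogonal factor Q of the polar decomposition m = Q P."""
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt


def align_weights(a, b):
    """Find the orthogonal r minimizing ||b - r @ a||_F (orthogonal Procrustes).

    Returns r and the relative residual ||b - r @ a||_F / ||b||_F.
    """
    r = orthogonal_polar_factor(b @ a.T)
    residual = np.linalg.norm(b - r @ a) / np.linalg.norm(b)
    return r, residual


rng = np.random.default_rng(0)
d, n = 32, 128  # hypothetical matrix shape, far smaller than real LLM weights

# Source matrix.
a = rng.standard_normal((d, n))

# Assumed "derived" matrix: source rotated by a hidden orthogonal map, plus noise.
r_true = orthogonal_polar_factor(rng.standard_normal((d, d)))
b_related = r_true @ a + 0.01 * rng.standard_normal((d, n))

# Independent matrix with no relationship to the source.
b_unrelated = rng.standard_normal((d, n))

r_est, res_related = align_weights(a, b_related)
_, res_unrelated = align_weights(a, b_unrelated)

print(f"relative residual, related pair:   {res_related:.4f}")
print(f"relative residual, unrelated pair: {res_unrelated:.4f}")
```

In this toy setting the related pair aligns with a small residual while the unrelated pair does not, which is the intuition behind comparing a single pair of matrices at a time. Turning such a score into a calibrated significance level is the part the paper addresses with Large Deviation Theory, and it is not reproduced here.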

Taxonomy

Topics: Academic integrity and plagiarism