DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models

Zeping Min; Xinshang Wang

arXiv:2501.16650·cs.CL·January 29, 2025

TL;DR

This paper introduces DOCS, a new index for measuring weight matrix similarity in large language models, revealing layer clustering and functional patterns to enhance understanding and interpretability of LLM architectures.

Contribution

The paper presents DOCS, a novel similarity index for LLM weights, and demonstrates its effectiveness in analyzing layer relationships and functional specialization.

Findings

01

Adjacent layers often have high weight similarity.

02

Adjacent layers tend to form clusters, suggesting depth-wise functional specialization.

03

DOCS is proven to be theoretically effective at quantifying similarity between orthogonal matrices.

Abstract

We introduce a novel index, the Distribution of Cosine Similarity (DOCS), for quantitatively assessing the similarity between weight matrices in Large Language Models (LLMs), aiming to facilitate the analysis of their complex architectures. Leveraging DOCS, our analysis uncovers intriguing patterns in the latest open-source LLMs: adjacent layers frequently exhibit high weight similarity and tend to form clusters, suggesting depth-wise functional specialization. Additionally, we prove that DOCS is theoretically effective in quantifying similarity for orthogonal matrices, a crucial aspect given the prevalence of orthogonal initializations in LLMs. This research contributes to a deeper understanding of LLM architecture and behavior, offering tools with potential implications for developing more efficient and interpretable models.
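The abstract does not spell out how DOCS is computed, so the following is only an illustrative sketch of the general idea of scoring two weight matrices by the distribution of cosine similarities between their columns. The function names `max_abs_cosine` and `cosine_distribution_score` are hypothetical, and summarizing the distribution by its mean is an assumption made here for simplicity; the paper's actual definition may differ.

```python
import numpy as np


def max_abs_cosine(x, y, eps=1e-12):
    """For each column of x, return the largest |cosine similarity| with any column of y."""
    # Normalize columns to unit length (eps guards against zero columns).
    xn = x / (np.linalg.norm(x, axis=0, keepdims=True) + eps)
    yn = y / (np.linalg.norm(y, axis=0, keepdims=True) + eps)
    cos = xn.T @ yn  # shape: (columns of x, columns of y)
    return np.abs(cos).max(axis=1)


def cosine_distribution_score(x, y):
    """Hypothetical symmetric summary (an assumption, not the paper's formula):
    mean of the max-|cos| values, averaged over both directions."""
    return 0.5 * (max_abs_cosine(x, y).mean() + max_abs_cosine(y, x).mean())


rng = np.random.default_rng(0)

# Two independent random orthogonal matrices obtained via QR decomposition.
q1, _ = np.linalg.qr(rng.standard_normal((64, 64)))
q2, _ = np.linalg.qr(rng.standard_normal((64, 64)))

# A column-permuted copy of q1 holds the same vectors in a different order.
perm = rng.permutation(64)
same = cosine_distribution_score(q1, q1[:, perm])
diff = cosine_distribution_score(q1, q2)

print(f"permuted copy: {same:.3f}, independent orthogonal matrix: {diff:.3f}")
```

In this sketch the permuted copy scores close to 1 because every column has an exact match, while the independent orthogonal matrix scores lower. That is the kind of distinction between orthogonal matrices that the abstract says DOCS is designed to capture.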

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Biomedical Text Mining and Ontologies