Unveiling the Mystery of Weight in Large Foundation Models: Gaussian Distribution Never Fades

Chongjie Si; Jingjing Jiang; Wei Shen

arXiv:2501.10661·cs.LG·January 22, 2025

TL;DR

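This paper investigates weight distributions in large foundation models (LFMs). It finds that the weights predominantly follow a Gaussian distribution, and that transformation weights mainly increase the standard deviation of the pre-trained weights, which broadens the acceptable deviation from the optimal weights and eases adaptation to downstream tasks.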

Contribution

The paper shows that LFM weights are predominantly Gaussian, relates them to Gaussian noise through their shared i.i.d. properties, and explains how transformation weights facilitate adaptation to downstream tasks.

Findings

01

Weights predominantly follow a Gaussian distribution regardless of initialization strategy

02

Transformation weights increase the standard deviation of pre-trained weights, and their own standard deviation grows with layer depth

03

Effective in LFM adaptation and editing tasks
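
One rough way to probe the first finding is to compare a weight sample's shape moments with those of a Gaussian, whose skewness and excess kurtosis are both zero. The sketch below is illustrative only and is not the paper's analysis procedure: the helper names, the tolerances, and the synthetic samples are all assumptions, and the samples are random numbers rather than real model weights. The uniform sample is a contrast case showing that the check can fail; it says nothing about trained weights.

```python
import math
import random


def moment_summary(values):
    """Return (mean, std, skewness, excess kurtosis) of a sequence of floats."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n
    std = math.sqrt(var)
    skew = sum(c ** 3 for c in centered) / n / std ** 3
    excess_kurt = sum(c ** 4 for c in centered) / n / var ** 2 - 3.0
    return mean, std, skew, excess_kurt


def looks_gaussian(values, skew_tol=0.1, kurt_tol=0.2):
    """Crude check that both shape moments are near zero (tolerances are assumed)."""
    _, _, skew, excess_kurt = moment_summary(values)
    return abs(skew) < skew_tol and abs(excess_kurt) < kurt_tol


rng = random.Random(0)
# Synthetic stand-ins for flattened weight matrices (NOT real model weights).
gaussian_like = [rng.gauss(0.0, 0.02) for _ in range(20000)]
uniform_like = [rng.uniform(-0.05, 0.05) for _ in range(20000)]

print("gaussian sample passes:", looks_gaussian(gaussian_like))
print("uniform sample passes:", looks_gaussian(uniform_like))
```

To try this on an actual model, one would replace the synthetic lists with the flattened entries of a real weight matrix. A moment check is deliberately simple; a formal normality test or a histogram overlay would give a fuller picture.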

Abstract

This paper presents a pioneering exploration of the mechanisms underlying large foundation models' (LFMs) weights, aiming to simplify AI research. Through extensive observation and analysis on prevailing LFMs, we find that regardless of initialization strategies, their weights predominantly follow a Gaussian distribution, with occasional sharp, inverted T-shaped, or linear patterns. We further discover that the weights share the i.i.d. properties of Gaussian noise, and explore their direct relationship. We find that transformation weights can be derived from Gaussian noise, and they primarily serve to increase the standard deviation of pre-trained weights, with their standard deviation growing with layer depth. In other words, transformation weights broaden the acceptable deviation from the optimal weights, facilitating adaptation to downstream tasks. Building upon the above…
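
The claim that transformation weights raise the standard deviation of pre-trained weights can be illustrated with a small simulation. The sketch below makes several assumptions that the abstract does not state: that transformation weights act as an additive update dW to the pre-trained weights W, that W and dW are independent zero-mean Gaussians, and that the standard deviation of dW grows linearly with depth (the `delta_std` schedule and all constants are hypothetical). Under these assumptions the variances add, so std(W + dW) = sqrt(std(W)^2 + std(dW)^2).

```python
import math
import random
import statistics

rng = random.Random(0)
N = 20000          # entries per synthetic "layer"
BASE_STD = 0.02    # assumed std of the pre-trained weights (illustrative)


def delta_std(depth):
    """Hypothetical schedule: transformation-weight std grows linearly with depth."""
    return 0.005 * (depth + 1)


results = []
for depth in range(4):
    # Synthetic pre-trained weights and an independent additive update.
    w = [rng.gauss(0.0, BASE_STD) for _ in range(N)]
    dw = [rng.gauss(0.0, delta_std(depth)) for _ in range(N)]
    adapted = [a + b for a, b in zip(w, dw)]
    # Variances of independent variables add, so the stds combine in quadrature.
    predicted = math.sqrt(BASE_STD ** 2 + delta_std(depth) ** 2)
    results.append(
        (depth, statistics.pstdev(w), statistics.pstdev(adapted), predicted)
    )

for depth, std_w, std_adapted, predicted in results:
    print(
        f"layer {depth}: std(W)={std_w:.4f}  "
        f"std(W+dW)={std_adapted:.4f}  predicted={predicted:.4f}"
    )
```

In this toy setting the adapted weights are always more spread out than the pre-trained ones, and the gap widens in deeper layers because the assumed update std is larger there. Whether real transformation weights are independent of the pre-trained weights is an empirical question that this sketch does not address.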

Taxonomy

Topics: Groundwater flow and contamination studies