Towards a Data-Parameter Correspondence for LLMs: A Preliminary Discussion
Ou Wu

TL;DR
This paper introduces a unified geometric framework linking data-centric and model-centric methods in large language model optimization, revealing their underlying mathematical correspondence.
Contribution
It establishes a formal data-parameter correspondence grounded in information geometry, unifying diverse optimization techniques in LLMs.
Findings
Data pruning and parameter sparsification reduce model manifold volume similarly.
In-context learning and LoRA explore identical subspaces on the Grassmannian.
Adversarial attacks and defenses exhibit dual behaviors in data and parameter spaces.
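One way to make the "identical subspaces on the Grassmannian" finding concrete is to compare two subspaces by their principal angles, which all vanish exactly when the subspaces coincide. The sketch below is illustrative only and is not the paper's procedure: the function name `principal_angles` and the random matrices standing in for a LoRA update's column space are assumptions.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    # Orthonormal bases for each column space (reduced QR).
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 4))        # hypothetical rank-4 factor
M = np.triu(np.ones((4, 4)))            # invertible change of basis
B = A @ M                               # same column space, different basis
C = rng.standard_normal((16, 4))        # unrelated rank-4 subspace

same = principal_angles(A, B)           # all angles numerically zero
other = principal_angles(A, C)          # at least one clearly nonzero angle
```

Two factor matrices that differ only by an invertible change of basis are the same point on the Grassmannian, which is why the comparison uses subspaces rather than the raw matrices.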
Abstract
Large language model optimization has historically bifurcated into isolated data-centric and model-centric paradigms: the former manipulates involved samples through selection, augmentation, or poisoning, while the latter tunes model weights via masking, quantization, or low-rank adaptation. This paper establishes a unified data-parameter correspondence, revealing these seemingly disparate operations as dual manifestations of the same geometric structure on the statistical manifold. Grounded in the Fisher-Rao metric and Legendre duality between natural and expectation parameters, we identify three fundamental correspondences spanning the model lifecycle: 1. Geometric correspondence: data pruning and parameter sparsification equivalently reduce manifold volume via dual coordinate constraints; 2. Low-rank correspondence: in-context…
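For readers unfamiliar with the Legendre duality the abstract invokes, the standard exponential-family version can be written out as follows. This is textbook information geometry rather than the paper's own derivation, and the symbols $\theta$, $\eta$, $\psi$, $\varphi$, $T$, and $h$ are conventional choices assumed here, not taken from the paper.

```latex
% Exponential family with natural parameter \theta and log-partition \psi
p_\theta(x) = h(x)\,\exp\!\big(\langle \theta, T(x) \rangle - \psi(\theta)\big)

% Expectation parameter: gradient of the log-partition function
\eta = \nabla \psi(\theta) = \mathbb{E}_\theta[T(x)]

% Legendre dual potential and the inverse coordinate map
\varphi(\eta) = \sup_{\theta}\,\big\{ \langle \theta, \eta \rangle - \psi(\theta) \big\},
\qquad \theta = \nabla \varphi(\eta)

% Fisher-Rao metric in each coordinate system
g(\theta) = \nabla^2 \psi(\theta) = \operatorname{Cov}_\theta[T(x)],
\qquad
g(\eta) = \nabla^2 \varphi(\eta) = \big(\nabla^2 \psi(\theta)\big)^{-1}
```

In this setting a constraint on $\theta$ and a constraint on $\eta$ are two coordinate descriptions of restrictions on the same manifold, which is the kind of duality the abstract appeals to when pairing parameter-side and data-side operations.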
