PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference
Burc Gokden

TL;DR
This paper introduces PLDR-LLM, a foundational model whose learned energy-curvature tensor, once inferred, can replace the model's own power law graph attention (PLGA) network at inference, improving inference time while leaving deductive outputs and zero-shot benchmark scores unchanged.
Contribution
The paper shows that PLDR-LLM learns a tensor operator that can replace its own deep neural network at inference with high fidelity, and that a G-cache and a KV-cache can be implemented in a straightforward manner to improve inference time.
Findings
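Deductive outputs are invariant tensors up to a small perturbation; their RMSE and determinant values match to 15 decimal places after caching.
Caching the energy-curvature tensor (G-cache), alongside a KV-cache, improves inference time.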
Invariance persists across different training conditions and model initializations.
Abstract
We show that the Large Language Model from Power Law Decoder Representations (PLDR-LLM) is a foundational model whose deductive outputs are invariant tensors up to a small perturbation. PLDR-LLM learns a singularity condition for the deductive outputs that enables the once-inferred energy-curvature tensor to replace the deep neural network of power law graph attention (PLGA) generating the deductive outputs at inference. We demonstrate that a cache for the energy-curvature tensor (G-cache) and a KV-cache can be implemented in a straightforward manner to improve the inference time. The invariance and generalizable nature of the deductive outputs holds at very high fidelity: the deductive outputs have the same RMSE and determinant values up to 15 decimal places after caching, and zero-shot benchmark scores remain unchanged. Ablation studies show that learned deductive outputs have distinct loss…
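The caching idea in the abstract can be illustrated with a minimal sketch. This is not the paper's PLGA network or its G-cache implementation: `toy_network`, `CachedOperator`, and `apply_operator` are hypothetical names, and the toy network is invariant by construction (exactly, rather than up to a small perturbation). The sketch only shows the general pattern: infer a tensor once, reuse it in place of the network that produced it, and check the reuse against a fresh computation with RMSE.

```python
import math


def rmse(a, b):
    """Root-mean-square error between two equally shaped 2-D lists."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))


def toy_network(x):
    """Hypothetical stand-in for an expensive network.

    Assumption: its output tensor does not depend on the input x,
    which is what makes caching it safe in this toy example.
    """
    return [[1.0, 0.5], [0.5, 2.0]]


class CachedOperator:
    """Hypothetical wrapper: run compute_fn once, then reuse its output."""

    def __init__(self, compute_fn):
        self.compute_fn = compute_fn
        self.cache = None
        self.calls = 0  # number of times the expensive function actually ran

    def get(self, x):
        if self.cache is None:
            self.cache = self.compute_fn(x)
            self.calls += 1
        return self.cache


def apply_operator(G, v):
    """Apply the cached tensor G to a vector v (plain matrix-vector product)."""
    return [sum(g * vi for g, vi in zip(row, v)) for row in G]


op = CachedOperator(toy_network)
G_first = op.get([0.1, 0.2])    # first call runs the network
G_again = op.get([0.9, -0.3])   # later calls reuse the cached tensor

print(op.calls)                                   # network ran only once
print(rmse(G_again, toy_network([0.9, -0.3])))    # cached vs. recomputed
print(apply_operator(G_again, [1.0, 1.0]))
```

In this toy setting the RMSE between the cached and recomputed tensor is exactly zero because the network ignores its input; for a real model the comparison would be the empirical check that caching is safe.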
Code & Models
- 🤗 fromthesky/PLDR-LLM-v51-104M · model · 2 downloads
- 🤗 fromthesky/PLDR-LLM-v51-110M-1 · model · 6 downloads
- 🤗 fromthesky/PLDR-LLM-v51-110M-2 · model · 3 downloads
- 🤗 fromthesky/PLDR-LLM-v51-110M-3 · model · 6 downloads
- 🤗 fromthesky/PLDR-LLM-v51-110M-4 · model · 5 downloads
- 🤗 fromthesky/PLDR-LLM-v51-110M-5 · model · 2 downloads
- 🤗 fromthesky/PLDR-LLM-v51-DAG-110M · model · 1 download
- 🤗 fromthesky/PLDR-LLM-v51G-106M-1 · model · 2 downloads
- 🤗 fromthesky/PLDR-LLM-v51G-106M-2 · model · 5 downloads
- 🤗 fromthesky/PLDR-LLM-v51G-106M-3 · model · 5 downloads
Taxonomy
Topics: Computational Physics and Python Applications
Methods: Softmax · Attention Is All You Need
