Large Language Models as Attribution Regularizers for Efficient Model Training
Davor Vukadin, Marin Šilić, Goran Delač

TL;DR
This paper introduces a simple regularization technique that uses Large Language Model-generated feature attributions to improve training efficiency and robustness of smaller models, especially in few-shot and imbalanced data scenarios.
Contribution
The paper presents a novel attribution-matching regularization method that leverages black-box LLMs to enhance small model training without significant computational costs.
Findings
Improves small model performance in few-shot learning
Enhances robustness against data skewness and bias
Requires only black-box API access to LLMs
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across diverse domains. However, effectively leveraging their vast knowledge for training smaller downstream models remains an open challenge, especially in domains like tabular data learning, where simpler models are often preferred due to interpretability and efficiency. In this paper, we introduce a novel yet straightforward method for incorporating LLM-generated global task feature attributions into the training process of smaller networks. Specifically, we propose an attribution-matching regularization term that aligns the training dynamics of the smaller model with the insights provided by the LLM. By doing so, our approach yields superior performance in few-shot learning scenarios. Notably, our method requires only black-box API access to the LLM, making it easy to integrate into existing training pipelines…
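The abstract does not spell out the form of the regularizer or the attribution method, so the following is only a minimal sketch of what an attribution-matching term could look like, not the authors' implementation. It assumes a logistic-regression model, a global attribution defined as the normalized mean of |w_j * x_j| per feature, a squared-error penalty against normalized LLM-provided scores, and finite-difference gradients for brevity. All names (`global_attributions`, `attribution_gap`, `train`, `llm_scores`) and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def normalize(scores):
    """Scale absolute scores so they sum to 1 (all zeros stay zeros)."""
    total = sum(abs(s) for s in scores)
    return [abs(s) / total if total else 0.0 for s in scores]

def global_attributions(w, X):
    """Assumed attribution: mean |w_j * x_j| per feature, normalized."""
    raw = [sum(abs(wj * x[j]) for x in X) / len(X) for j, wj in enumerate(w)]
    return normalize(raw)

def attribution_gap(w, X, llm_scores):
    """Squared distance between model attributions and LLM-provided scores."""
    target = normalize(llm_scores)
    return sum((a - t) ** 2 for a, t in zip(global_attributions(w, X), target))

def loss(params, X, y, llm_scores, lam):
    """Cross-entropy plus lam times the attribution-matching penalty."""
    w, b = params[:-1], params[-1]
    eps = 1e-12
    ce = 0.0
    for x, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
        ce -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return ce / len(X) + lam * attribution_gap(w, X, llm_scores)

def train(X, y, llm_scores, lam, lr=0.1, steps=300, h=1e-5):
    """Gradient descent with central finite differences (sketch only)."""
    params = [0.5] * len(X[0]) + [0.0]  # non-zero start so attributions are defined
    for _ in range(steps):
        grad = []
        for i in range(len(params)):
            up = params[:i] + [params[i] + h] + params[i + 1:]
            down = params[:i] + [params[i] - h] + params[i + 1:]
            grad.append((loss(up, X, y, llm_scores, lam)
                         - loss(down, X, y, llm_scores, lam)) / (2 * h))
        params = [p - lr * g for p, g in zip(params, grad)]
    return params[:-1], params[-1]

# Toy few-shot set: feature 1 is (by assumption) spuriously correlated with the label.
X_few = [[1.0, 1.0], [0.8, 0.9], [-1.0, -1.0], [-0.9, -0.8]]
y_few = [1, 1, 0, 0]
llm_scores = [0.9, 0.1]  # hypothetical scores, as if parsed from an LLM response

w_plain, _ = train(X_few, y_few, llm_scores, lam=0.0)
w_reg, _ = train(X_few, y_few, llm_scores, lam=1.0)
gap_plain = attribution_gap(w_plain, X_few, llm_scores)
gap_reg = attribution_gap(w_reg, X_few, llm_scores)
```

Comparing `gap_plain` with `gap_reg` shows how far each model's attributions sit from the LLM-provided scores. In a real pipeline the scores would come from a black-box API call, and the finite-difference gradients would be replaced by automatic differentiation.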
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Domain Adaptation and Few-Shot Learning
