Relational Knowledge Distillation Using Fine-tuned Function Vectors
Andrea Kang, Yingnian Wu, Hongjing Lu

TL;DR
This paper shows that fine-tuning function vectors on a small set of examples (about 20 word pairs) improves relational understanding and reasoning in language models, with gains in interpretability and in performance on analogy tasks.
Contribution
It introduces a method for fine-tuning function vectors to better encode relational knowledge and proposes the composite function vector for improved analogical reasoning.
Findings
Fine-tuning with ~20 examples improves relation-based word completion.
Fine-tuned vectors outperform original vectors in decoding relation words.
Composite function vectors enhance analogy reasoning performance.
Abstract
Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of attention heads encodes task representation in in-context learning, captured in a compact representation known as the function vector. We show that fine-tuning function vectors with only a small set of examples (about 20 word pairs) yields better performance on relation-based word-completion tasks than using the original vectors derived from causal mediation analysis. These improvements hold for both small and large language models. Moreover, the fine-tuned function vectors yield improved decoding performance for relation words and show stronger alignment with human similarity judgments of semantic relations. Next, we introduce the composite function vector - a weighted combination of fine-tuned…
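As a rough illustration of the "weighted combination" idea mentioned in the abstract, the sketch below forms a composite vector from several per-relation vectors. The function name composite_function_vector, the softmax normalization of the weights, and the toy dimensions are all assumptions made for illustration. The abstract is truncated here and does not specify how the paper computes or learns the weights.

```python
import numpy as np

def composite_function_vector(vectors, logits):
    """Hypothetical helper: weighted sum of relation vectors.

    vectors: array of shape (num_relations, hidden_dim)
    logits:  array of shape (num_relations,), unnormalized weights
    The softmax normalization is an assumption, not taken from the paper.
    """
    vectors = np.asarray(vectors, dtype=float)
    logits = np.asarray(logits, dtype=float)
    # Numerically stable softmax so the weights are positive and sum to 1.
    shifted = logits - logits.max()
    weights = np.exp(shifted) / np.exp(shifted).sum()
    # Weighted combination: sum_i w_i * v_i, shape (hidden_dim,)
    return weights @ vectors, weights

# Toy stand-ins for fine-tuned function vectors (random, not real data).
rng = np.random.default_rng(0)
relation_vectors = rng.normal(size=(3, 8))
composite, weights = composite_function_vector(relation_vectors, [0.0, 0.0, 0.0])
```

With equal logits, as in the toy call above, the composite reduces to the plain average of the input vectors.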
Taxonomy
Topics: Topic Modeling · Child and Animal Learning Development · Advanced Graph Neural Networks
