Enhancing Japanese Large Language Models with Reasoning Vectors
Carolina Minami Oguchi, Leo Wei, Koyo Kobayashi, Hsin-Tai Wu, Dipak Ghosal

TL;DR
This paper improves Japanese large language models by applying reasoning vectors extracted from reasoning LLMs, reporting significant performance gains without resource-intensive post-training.
Contribution
The paper proposes a novel approach of applying reasoning vectors to Japanese LLMs, addressing resource limitations and enhancing reasoning capabilities.
Findings
Significant performance improvements in Japanese LLMs
Effective use of reasoning vectors from reasoning LLMs
Resource-efficient method for language-specific model enhancement
Abstract
Post-training methods have improved the performance and reasoning capability of mainstream large language models (LLMs), but achieving the same for Japanese LLMs is challenging because of the resources required. Inspired by task vectors, which capture the change in weights before and after training on a specific task, we obtain reasoning vectors from reasoning LLMs and apply them to Japanese LLMs to boost their performance. Although limited resources make it challenging to improve Japanese LLMs, we present a simple and effective way to obtain substantial improvements and hope to inspire similar work for other languages.
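The task-vector idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which models are used, how the vector is scaled, or how mismatched parameters are handled. The names `extract_vector`, `apply_vector`, and `scale`, the toy weights, and the assumption that the Japanese model shares parameter names and shapes with the reasoning model's base are all hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def extract_vector(base: Weights, tuned: Weights) -> Weights:
    """Task-vector style difference: tuned weights minus base weights."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in base
    }


def apply_vector(target: Weights, vector: Weights, scale: float = 1.0) -> Weights:
    """Add the scaled vector to a target model.

    Assumption: parameters missing from the vector, or with a different
    size, are copied unchanged (the abstract does not specify this).
    """
    merged: Weights = {}
    for name, values in target.items():
        delta = vector.get(name)
        if delta is None or len(delta) != len(values):
            merged[name] = list(values)
        else:
            merged[name] = [v + scale * d for v, d in zip(values, delta)]
    return merged


# Hypothetical toy weights, chosen only to make the arithmetic visible.
base_model = {"layer.w": [1.0, 2.0]}
reasoning_model = {"layer.w": [1.5, 1.0]}
japanese_model = {"layer.w": [0.0, 4.0], "embed": [3.0]}

reasoning_vector = extract_vector(base_model, reasoning_model)
enhanced_japanese = apply_vector(japanese_model, reasoning_vector, scale=0.5)
```

With real checkpoints the same arithmetic would run over tensors rather than Python lists, and the scale would be tuned on a validation set. The point of the sketch is only that no gradient-based training of the Japanese model is involved.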
