Self-Training Large Language Models for Tool-Use Without Demonstrations
Ne Luo, Aryo Pradipta Gema, Xuanli He, Emile van Krieken, Pietro Lesci, Pasquale Minervini

TL;DR
This paper investigates whether large language models can learn to use tools without curated demonstrations. It analyses zero-shot prompting strategies and a self-training method, with the aim of reducing factual and computational errors.
Contribution
It analyses zero-shot prompting strategies for tool use, introduces a self-training method in which the LLM synthesises its own tool-use traces, and compares supervised fine-tuning with preference fine-tuning on the resulting data.
Findings
Tool use improves performance on a long-tail knowledge task (PopQA, used solely for evaluation) by 3.7%.
Mixed results on standard QA datasets indicate challenges in generalization.
Self-training shows potential but requires further refinement.
Abstract
Large language models (LLMs) remain prone to factual inaccuracies and computational errors, including hallucinations and mistakes in mathematical reasoning. Recent work augmented LLMs with tools to mitigate these shortcomings, but often requires curated gold tool-use demonstrations. In this paper, we investigate whether LLMs can learn to use tools without demonstrations. First, we analyse zero-shot prompting strategies to guide LLMs in tool utilisation. Second, we propose a self-training method to synthesise tool-use traces using the LLM itself. We compare supervised fine-tuning and preference fine-tuning techniques for fine-tuning the model on datasets constructed using existing Question Answering (QA) datasets, i.e., TriviaQA and GSM8K. Experiments show that tool-use enhances performance on a long-tail knowledge task: 3.7% on PopQA, which is used solely for evaluation, but leads to…
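The data-construction step described in the abstract (self-generated tool-use traces turned into supervised and preference fine-tuning sets) can be pictured with a minimal sketch. This is not the authors' implementation: the abstract does not say how traces are filtered or paired, so the sketch assumes that a trace is kept when its final answer matches the gold QA answer, and that preference pairs match a correct trace against an incorrect one for the same question. The `Trace` class, the helper names, the tool-call syntax inside the trace text, and the dictionary field names are all hypothetical.

```python
import re
import string
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Trace:
    """One self-generated attempt (hypothetical structure)."""
    text: str    # full generation, including any tool calls
    answer: str  # final answer extracted from the generation


def normalize(answer: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    answer = answer.lower()
    answer = "".join(ch for ch in answer if ch not in string.punctuation)
    answer = re.sub(r"\b(a|an|the)\b", " ", answer)
    return " ".join(answer.split())


def is_correct(trace: Trace, gold_answers: list[str]) -> bool:
    """Assumed filter: normalized exact match against any gold answer."""
    return normalize(trace.answer) in {normalize(g) for g in gold_answers}


def build_datasets(question: str, gold_answers: list[str], traces: list[Trace]):
    """Split traces by correctness, then build SFT and preference examples."""
    correct = [t for t in traces if is_correct(t, gold_answers)]
    wrong = [t for t in traces if not is_correct(t, gold_answers)]

    # Supervised fine-tuning set: keep only traces that reach the gold answer.
    sft = [{"prompt": question, "completion": t.text} for t in correct]

    # Preference set: every (correct, incorrect) pair for the same question.
    prefs = [
        {"prompt": question, "chosen": c.text, "rejected": w.text}
        for c, w in product(correct, wrong)
    ]
    return sft, prefs


if __name__ == "__main__":
    # The tool-call syntax in these strings is invented for illustration.
    traces = [
        Trace("search('capital of France') -> Paris. Answer: Paris", "Paris."),
        Trace("Answer: The Paris", "The Paris"),
        Trace("Answer: Lyon", "Lyon"),
    ]
    sft, prefs = build_datasets("What is the capital of France?", ["Paris"], traces)
    print(len(sft), len(prefs))
```

Questions for which every sampled trace is correct, or every one is wrong, contribute no preference pairs under this assumption, which is one reason a real pipeline might sample several traces per question.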
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification
