Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning
Polaris Jhandi, Owais Kazi, Shreyas Subramanian, Neel Sendas

TL;DR
This paper demonstrates that small language models, when properly fine-tuned, can outperform larger models in specific agentic tool calling tasks, offering a cost-effective alternative for enterprise AI applications.
Contribution
The study shows that targeted fine-tuning of a 350M-parameter model can surpass larger models on tool-use performance while reducing cost and infrastructure requirements.
Findings
The fine-tuned SLM achieved a 77.55% pass rate on ToolBench
It outperformed larger baselines such as ChatGPT-CoT and ToolLLaMA variants
The lower cost of this approach makes scaled enterprise AI deployment more practical
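To make the pass-rate figure above concrete, here is a minimal sketch of how such a metric is typically computed: the share of evaluation tasks judged successful, expressed as a percentage. The function name `pass_rate` and the example outcomes are hypothetical and are not taken from the paper or from the ToolBench codebase, whose evaluator decides what counts as a pass.

```python
from typing import Iterable


def pass_rate(outcomes: Iterable[bool]) -> float:
    """Return the percentage of tasks marked as passed.

    `outcomes` holds one boolean per evaluated task; how a task is judged
    (for example, by an automatic evaluator) is outside this sketch.
    """
    results = list(outcomes)
    if not results:
        raise ValueError("pass rate is undefined for an empty evaluation set")
    # True counts as 1 and False as 0, so sum() gives the number of passes.
    return 100.0 * sum(results) / len(results)


# Hypothetical outcomes for four tasks (illustrative only, not paper data).
example_outcomes = [True, True, False, True]
print(f"{pass_rate(example_outcomes):.2f}%")  # prints "75.00%"
```

Because the metric is a simple proportion, comparisons between models are only meaningful when both are scored on the same task set with the same pass criterion.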
Abstract
As organizations scale adoption of generative AI, model cost optimization and operational efficiency have emerged as critical factors determining sustainability and accessibility. While Large Language Models (LLMs) demonstrate impressive capabilities across diverse tasks, their extensive computational requirements make them cost-prohibitive for routine enterprise use. This limitation motivates the exploration of Small Language Models (SLMs), which can deliver comparable performance in targeted applications while drastically reducing infrastructure overhead (Irugalbandara et al., 2023). In this work, we investigate the feasibility of replacing LLM-driven workflows with optimized SLMs. We trained a domain-adapted SLM to execute representative tasks traditionally handled by LLMs, such as document summarization, query answering, and structured data interpretation. As part of the experiment,…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Materials Science · Topic Modeling
