Integrating External Tools with Large Language Models to Improve Accuracy

Nripesh Niketan; Hadj Batatia

arXiv:2507.08034·cs.CL·July 14, 2025

TL;DR

This paper introduces Athena, a framework that gives large language models access to external APIs and computational tools, improving their accuracy on educational and reasoning tasks.

Contribution

The paper presents a novel framework for integrating external APIs and tools with LLMs, enhancing their reasoning accuracy in educational contexts.

Findings

01. Achieves 83% accuracy in mathematical reasoning

02. Achieves 88% accuracy in scientific reasoning

03. Outperforms state-of-the-art models like GPT-4o and LLaMA-Large

Abstract

This paper addresses how to improve the querying of large language models (LLMs). It is well known that, without relevant contextual information, LLMs can give poor-quality responses or hallucinate. Several initiatives have proposed integrating LLMs with external tools that supply up-to-date data to improve accuracy. In this paper, we propose a framework that integrates external tools to enhance the capabilities of LLMs in answering queries in educational settings. Specifically, we develop a framework that allows the model to access external APIs to request additional relevant information. Integrated tools can also provide computational capabilities, such as calculators or calendars. The proposed framework has been evaluated using datasets from the Massive Multitask Language Understanding (MMLU) collection. The data consists of questions on mathematical and scientific reasoning. Results compared to…
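The tool integration described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the registry `TOOLS`, the call format `{"tool": ..., "args": [...]}`, and the functions `calculator`, `days_between`, `run_tool_call`, and `augment_prompt` are all hypothetical names chosen for illustration. The sketch assumes the model emits a structured tool request, which the framework executes and feeds back into the prompt as context.

```python
import ast
import operator
from datetime import date

# Arithmetic operators the hypothetical calculator tool accepts.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval_node(node):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    raise ValueError("unsupported expression")


def calculator(expression: str):
    """Calculator tool: evaluates +, -, *, / over numeric literals."""
    return _eval_node(ast.parse(expression, mode="eval").body)


def days_between(start: str, end: str) -> int:
    """Calendar tool: number of days from start to end (ISO dates)."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


# Assumed registry mapping tool names to callables.
TOOLS = {"calculator": calculator, "days_between": days_between}


def run_tool_call(call: dict) -> str:
    """Execute one tool request and return its result (or an error) as text."""
    name = call.get("tool")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return str(TOOLS[name](*call.get("args", [])))
    except (ValueError, TypeError, SyntaxError, ZeroDivisionError) as exc:
        return f"error: {exc}"


def augment_prompt(question: str, call: dict) -> str:
    """Attach the tool output to the question as extra context for the LLM."""
    return f"Question: {question}\nTool result: {run_tool_call(call)}\nAnswer:"
```

For example, `run_tool_call({"tool": "calculator", "args": ["2 + 3 * 4"]})` returns the string `"14"`, which `augment_prompt` places alongside the question before the model is queried again. Returning errors as text rather than raising lets the model see a failed call and try a different request.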
