ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use