Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents

Chenyang Shao; Xinyuan Hu; Yutang Lin; Fengli Xu

arXiv:2502.04392·cs.CL·February 10, 2025

TL;DR

Division-of-Thoughts (DoT) is a hybrid reasoning framework that combines local smaller language models with cloud-based large models, significantly reducing costs and reasoning time while maintaining accuracy for on-device AI assistants.

Contribution

We introduce DoT, a novel collaborative reasoning framework that decomposes tasks, schedules sub-tasks, and allocates models efficiently, enabling cost-effective on-device AI reasoning.

Findings

01. Reduces reasoning time by 66.12%.

02. Cuts API costs by 83.57%.

03. Maintains competitive reasoning accuracy.

Abstract

The rapid expansion of web content has made on-device AI assistants indispensable for helping users manage the increasing complexity of online tasks. The emergent reasoning ability of large language models offers a promising path toward next-generation on-device AI agents. However, deploying full-scale Large Language Models (LLMs) on resource-limited local devices is challenging. In this paper, we propose Division-of-Thoughts (DoT), a collaborative reasoning framework that leverages the synergy between locally deployed Smaller-scale Language Models (SLMs) and cloud-based LLMs. DoT uses a Task Decomposer to elicit the inherent planning abilities of language models and decompose user queries into smaller sub-tasks, which allows the hybrid language models to fully exploit their respective strengths. In addition, DoT employs a Task Scheduler to analyze the pair-wise dependency of sub-tasks and create a…
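The decompose, schedule, and allocate flow described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `SubTask` class, the `difficulty` score, the 0.5 threshold, and the `local_slm` / `cloud_llm` labels are all hypothetical names introduced here for illustration. The sketch assumes the sub-tasks and their pair-wise dependencies are already known, groups them into levels with a standard topological sort (sub-tasks in the same level have no dependency on each other), and routes each sub-task to a local or cloud model with a simple threshold rule.

```python
from dataclasses import dataclass, field


@dataclass
class SubTask:
    name: str
    difficulty: float                      # hypothetical score in [0, 1], not from the paper
    depends_on: list = field(default_factory=list)


def schedule_levels(tasks):
    """Group sub-tasks into dependency levels using Kahn's topological sort."""
    indegree = {t.name: len(t.depends_on) for t in tasks}
    children = {t.name: [] for t in tasks}
    for t in tasks:
        for dep in t.depends_on:
            children[dep].append(t.name)

    # Sub-tasks with no unmet dependencies form the first level.
    level = sorted(name for name, deg in indegree.items() if deg == 0)
    levels = []
    while level:
        levels.append(level)
        next_level = []
        for name in level:
            for child in children[name]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_level.append(child)
        level = sorted(next_level)

    if sum(len(lv) for lv in levels) != len(tasks):
        raise ValueError("dependency cycle detected")
    return levels


def allocate(tasks, threshold=0.5):
    """Assumed routing rule: easy sub-tasks stay on-device, hard ones go to the cloud."""
    return {
        t.name: "local_slm" if t.difficulty < threshold else "cloud_llm"
        for t in tasks
    }


# Hypothetical decomposition of one user query into four sub-tasks.
tasks = [
    SubTask("parse_question", 0.2),
    SubTask("lookup_fact", 0.3),
    SubTask("derive_equation", 0.8, depends_on=["parse_question"]),
    SubTask("final_answer", 0.4, depends_on=["derive_equation", "lookup_fact"]),
]

levels = schedule_levels(tasks)
assignment = allocate(tasks)
for i, lv in enumerate(levels):
    print(i, [(name, assignment[name]) for name in lv])
```

In this toy example only `derive_equation` is sent to the cloud model, and `parse_question` and `lookup_fact` could run at the same time because neither depends on the other. How DoT actually scores difficulty and selects a model is not specified in the text above, so the threshold rule here is only a stand-in.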

Code & Models

Repositories

tsinghua-fib-lab/DoT

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Business Process Modeling and Analysis

Methods: Adapter