Octopus v4: Graph of language models
Wei Chen, Zhiyuan Li

TL;DR
Octopus v4 introduces a novel graph-based approach using functional tokens to coordinate multiple open-source language models, achieving state-of-the-art performance on MMLU with models under 10B parameters.
Contribution
The paper presents Octopus v4, a new model that employs functional tokens and graph structures to effectively coordinate multiple open-source models for improved task performance.
Findings
Achieved SOTA MMLU score of 74.8 with models under 10B parameters.
Demonstrated effective coordination of multiple models via graph data structures.
Enhanced model selection and query reformulation capabilities.
Abstract
Language models have been effective in a wide range of applications, yet the most sophisticated models are often proprietary. For example, GPT-4 by OpenAI and various models by Anthropic are expensive and consume substantial energy. In contrast, the open-source community has produced competitive models, like Llama3. Furthermore, niche-specific smaller language models, such as those tailored for legal, medical or financial tasks, have outperformed their proprietary counterparts. This paper introduces a novel approach that employs functional tokens to integrate multiple open-source models, each optimized for particular tasks. Our newly developed Octopus v4 model leverages functional tokens to intelligently direct user queries to the most appropriate vertical model and reformat the query to achieve the best performance. Octopus v4, an evolution of the Octopus v1,…
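The routing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the token names (`<fn_law>`, `<fn_med>`, `<fn_general>`), the `ModelNode` class, and the keyword-overlap scoring are all hypothetical stand-ins. In the paper a trained language model emits the functional token; here a simple keyword match plays that role so the example stays self-contained.

```python
from dataclasses import dataclass


@dataclass
class ModelNode:
    """One worker vertex in the graph: a specialized (vertical) model."""
    name: str
    keywords: frozenset  # hypothetical routing hints, not from the paper

    def answer(self, query: str) -> str:
        # Stand-in for real model inference.
        return f"[{self.name}] {query}"


# Star-shaped graph: the router is the hub, and each hypothetical
# functional token labels the edge to one worker node.
GRAPH = {
    "<fn_law>": ModelNode("law-model", frozenset({"contract", "liability", "statute"})),
    "<fn_med>": ModelNode("medical-model", frozenset({"symptom", "dosage", "diagnosis"})),
    "<fn_general>": ModelNode("general-model", frozenset()),
}


def route(query: str) -> tuple:
    """Pick a functional token and return it with a reformatted query."""
    words = set(query.lower().replace("?", " ").split())
    best_token, best_overlap = "<fn_general>", 0
    for token, node in GRAPH.items():
        overlap = len(words & node.keywords)
        if overlap > best_overlap:
            best_token, best_overlap = token, overlap
    # "Reformatting" here only normalizes whitespace; a real system
    # would rewrite the query for the chosen model.
    reformatted = " ".join(query.split())
    return best_token, reformatted


def ask(query: str) -> str:
    """Route the query along the selected edge and return the worker's reply."""
    token, reformatted = route(query)
    return GRAPH[token].answer(reformatted)


if __name__ == "__main__":
    print(ask("  Is this   contract valid? "))
    print(ask("Tell me a joke"))
```

Queries with no keyword match fall through to the general node, which mirrors the idea of a default edge in the graph. Swapping the keyword scorer for a classifier or a fine-tuned model would leave the rest of the structure unchanged.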
Code & Models
- 🤗 NexaAI/Octopus-v2 · model · 487 dl · ♡ 890
- 🤗 NexaAI/octo-net · model · 36 dl · ♡ 144
- 🤗 NexaAI/octo-net-gguf · model · 872 dl · ♡ 42
- 🤗 QuantFactory/Octopus-v2-GGUF · model · 165 dl · ♡ 2
- 🤗 RichardErkhov/NexaAIDev_-_octo-net-4bits · model · 2 dl
- 🤗 RichardErkhov/NexaAIDev_-_octo-net-8bits · model · 1 dl
- 🤗 RichardErkhov/NexaAIDev_-_octo-net-awq · model · 1 dl
- 🤗 RichardErkhov/NexaAIDev_-_Octopus-v2-8bits · model · 5 dl
- 🤗 RichardErkhov/NexaAIDev_-_Octopus-v2-awq · model · 10 dl
Taxonomy
Topics: Species Distribution and Climate Change
Methods: Attention Is All You Need · Dropout · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer · Dense Connections · Label Smoothing · Residual Connection
