TroL: Traversal of Layers for Large Language and Vision Models

Byung-Kwan Lee; Sangyun Chung; Chae Won Kim; Beomchan Park; Yong Man Ro

arXiv:2406.12246·cs.LG·September 26, 2024

TL;DR

TroL is a family of efficient large language and vision models (1.8B, 3.8B, and 7B LLM sizes) built on layer traversal: layers are reused in a token-wise manner, so smaller models can reach performance comparable to much larger ones.

Contribution

The paper proposes a layer traversal method that lets smaller LLVMs approach the performance of larger models without increasing model size.

Findings

1. TroL outperforms larger open-source LLVMs on various tasks.
2. TroL rivals much larger closed-source LLVMs.
3. Layer traversal improves the efficiency and performance of small models.

Abstract

Large language and vision models (LLVMs) have been driven by the generalization power of large language models (LLMs) and the advent of visual instruction tuning. Along with scaling them up directly, these models enable LLVMs to showcase powerful vision language (VL) performances by covering diverse tasks via natural language instructions. However, existing open-source LLVMs that perform comparably to closed-source LLVMs such as GPT-4V are often considered too large (e.g., 26B, 34B, and 110B parameters), having a larger number of layers. These large models demand costly, high-end resources for both training and inference. To address this issue, we present a new efficient LLVM family with 1.8B, 3.8B, and 7B LLM model sizes, Traversal of Layers (TroL), which enables the reuse of layers in a token-wise manner. This layer traversing technique simulates the effect of looking back and…
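To make "reuse of layers in a token-wise manner" more concrete, here is a minimal toy sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract only says that layers are reused token-wise, so the toy `layer`, the sigmoid `token_gate`, and the blending rule in `traverse_layer` are all hypothetical names and choices made for this example. The idea shown is that one set of layer weights is applied twice, and each token gets its own weight deciding how much of the second pass to keep.

```python
import math
import random

def layer(tokens, weight):
    """Toy stand-in for one model layer: a residual tanh projection per token."""
    out = []
    for vec in tokens:
        projected = [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weight]
        out.append([x + p for x, p in zip(vec, projected)])
    return out

def token_gate(once, twice, gate_weight):
    """Hypothetical gate: one scalar in (0, 1) per token, computed from both outputs."""
    gates = []
    for a, b in zip(once, twice):
        # a + b concatenates the two per-token vectors (length 2 * dim)
        score = sum(w * x for w, x in zip(gate_weight, a + b))
        gates.append(1.0 / (1.0 + math.exp(-score)))
    return gates

def traverse_layer(tokens, weight, gate_weight):
    """Run the same layer twice and blend the two outputs token by token."""
    once = layer(tokens, weight)   # ordinary forward pass
    twice = layer(once, weight)    # same weights applied again (layer reuse)
    gates = token_gate(once, twice, gate_weight)
    mixed = [
        [g * t + (1.0 - g) * o for o, t in zip(a, b)]
        for g, a, b in zip(gates, once, twice)
    ]
    return mixed, gates

# Small random example: 3 tokens with 4-dimensional hidden states.
random.seed(0)
dim, num_tokens = 4, 3
weight = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
gate_weight = [random.uniform(-0.5, 0.5) for _ in range(2 * dim)]
tokens = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]

mixed, gates = traverse_layer(tokens, weight, gate_weight)
```

In this sketch the second pass adds no parameters beyond the small gate, which is the sense in which a model could gain effective depth without growing in size. How the actual model computes and trains its per-token mixing is described in the paper and the official repository, not here.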

Code & Models

Repositories

byungkwanlee/trol (PyTorch, official)

Videos

TroL: Traversal of Layers for Large Language and Vision Models

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques