Direct Behavior Optimization: Unlocking the Potential of Lightweight LLMs

Hongming Yang; Shi Lin; Jun Shao; Changting Lin; Donghai Zhu; Meng Han; Qinglei Kong

arXiv:2506.06401 · cs.CL · June 10, 2025

Open Access

TL;DR

This paper introduces DeBoP, an automatic behavior optimization method for lightweight LLMs that enhances their reasoning capabilities and performance on complex tasks by optimizing behavior directly through a gradient-free search technique.

Contribution

DeBoP is a novel, automatic optimization paradigm that improves lightweight LLM performance by directly optimizing their behavior using Monte Carlo Tree Search, reducing manual effort and computational costs.
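To make the idea of a gradient-free tree search concrete, here is a minimal sketch of generic Monte Carlo Tree Search with the common UCT selection rule. It is not DeBoP's implementation: the names `mcts`, `Node`, `uct_score`, and `toy_reward`, the exploration constant, and the toy task (choosing a short sequence of discrete options) are all assumptions made for illustration. In a behavior-optimization setting the reward would presumably come from scoring model outputs; here a hand-written function stands in for it.

```python
import math
import random


class Node:
    """One node of the search tree; `prefix` is the sequence of choices so far."""

    def __init__(self, prefix, parent=None):
        self.prefix = prefix
        self.parent = parent
        self.children = {}        # choice index -> child Node
        self.visits = 0
        self.total_reward = 0.0


def uct_score(child, parent_visits, c):
    """UCT: mean reward plus an exploration bonus for rarely visited children."""
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def mcts(reward_fn, n_steps, n_choices, iterations=500, c=1.4, seed=0):
    """Search sequences of length `n_steps` over `n_choices` options per step."""
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is non-terminal and fully expanded.
        while len(node.prefix) < n_steps and len(node.children) == n_choices:
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: uct_score(ch, parent.visits, c))
        # Expansion: add one untried child unless the node is terminal.
        if len(node.prefix) < n_steps:
            untried = [a for a in range(n_choices) if a not in node.children]
            choice = rng.choice(untried)
            child = Node(node.prefix + (choice,), parent=node)
            node.children[choice] = child
            node = child
        # Rollout: complete the sequence with random choices and score it.
        seq = list(node.prefix)
        while len(seq) < n_steps:
            seq.append(rng.randrange(n_choices))
        reward = reward_fn(tuple(seq))
        # Backpropagation: update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # Read out the answer by following the most visited child at each level.
    best, node = [], root
    while node.children:
        choice, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        best.append(choice)
    return tuple(best)


def toy_reward(seq, target=(2, 0, 1)):
    """Hypothetical stand-in reward: fraction of positions matching a target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


best = mcts(toy_reward, n_steps=3, n_choices=3)
print(best, toy_reward(best))
```

The search needs only reward values, never gradients, which is what makes this family of methods usable when the thing being optimized is a discrete choice rather than a set of model weights.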

Findings

01. DeBoP significantly outperforms recent prompt optimization methods.

02. DeBoP-optimized LwLLMs surpass GPT-3.5 on most tasks.

03. DeBoP reduces computational time by approximately 60%.

Abstract

Lightweight Large Language Models (LwLLMs) are reduced-parameter, optimized models designed to run efficiently on consumer-grade hardware, offering significant advantages in resource efficiency, cost-effectiveness, and data privacy. However, these models often have limited inference and reasoning capabilities, which restricts their performance on complex tasks and limits their practical applicability. Moreover, existing prompt optimization methods typically rely on extensive manual effort or on the meta-cognitive abilities of state-of-the-art LLMs, making them less effective for LwLLMs. To address these challenges, we introduce DeBoP, a new Direct Behavior Optimization Paradigm derived from the Chain-of-Thought (CoT) prompting technique. Unlike CoT prompting, DeBoP is an automatic optimization method that directly optimizes the behavior of LwLLMs. In…

Taxonomy

Topics: Machine Learning and Data Classification · Big Data and Digital Economy · Explainable Artificial Intelligence (XAI)

Methods: Linear Layer · Attention Is All You Need · Cosine Annealing · Multi-Head Attention · Dropout · Dense Connections · Residual Connection