System Message Generation for User Preferences using Open-Source Models

Minbyul Jeong; Jungho Cho; Minsoo Khang; Dawoon Jung; Teakgyu Hong

arXiv:2502.11330·cs.CL·May 26, 2025

System Message Generation for User Preferences using Open-Source Models

Minbyul Jeong, Jungho Cho, Minsoo Khang, Dawoon Jung, Teakgyu Hong

PDF

Open Access

TL;DR

This paper introduces SysGen, a pipeline that generates effective system messages for LLMs using existing datasets, significantly improving conversation quality and adaptability in user interactions.

Contribution

The paper presents a novel method for generating system messages from existing datasets, enhancing LLM responses without requiring manually annotated system prompts.

Findings

01

Improved performance on Multifacet and SysBench benchmarks.

02

Significant gains in short, early-stage conversations.

03

Enhanced diversity and structure in system messages improve adaptability.

Abstract

System messages play a crucial role in interactions with large language models (LLMs), often serving as prompts to initiate conversations. Through system messages, users can assign specific roles, perform intended tasks, incorporate background information, and specify various output formats and communication styles. Despite such versatility, publicly available datasets often lack system messages and are subject to strict license constraints in industrial applications. Moreover, manually annotating system messages that align with user instructions is resource-intensive. In light of these challenges, we introduce SysGen, a pipeline for generating system messages that better align assistant responses with user instructions using existing supervised fine-tuning datasets that lack system messages. Training open-source models on SysGen data yields substantial improvements in both single-turn…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsContext-Aware Activity Recognition Systems · Speech and dialogue systems · Service-Oriented Architecture and Web Services

MethodsALIGN