Task Facet Learning: A Structured Approach to Prompt Optimization
Gurusha Juneja, Gautam Jajoo, Nagarajan Natarajan, Hua Li, Jian Jiao, Amit Sharma

TL;DR
This paper introduces UniPrompt, a structured algorithm for prompt optimization that learns multiple task facets from training data, resulting in more accurate and complex prompts than existing methods.
Contribution
UniPrompt is a novel algorithm that leverages task structure and clustering to generate comprehensive prompts by capturing multiple facets, outperforming prior prompt tuning approaches.
Findings
UniPrompt achieves higher accuracy than human-tuned prompts.
It can generate longer, more complex prompts.
It outperforms state-of-the-art prompt optimization methods.
Abstract
Given a task in the form of a basic description and its training examples, prompt optimization is the problem of synthesizing the given information into a text prompt for a large language model. Humans solve this problem by also considering the different facets that define a task (e.g., counter-examples, explanations, analogies) and including them in the prompt. However, it is unclear whether existing algorithmic approaches, based on iteratively editing a given prompt or automatically selecting a few in-context examples, can cover the multiple facets required to solve a complex task. In this work, we view prompt optimization as that of learning multiple facets of a task from a set of training examples. We exploit structure in the prompt optimization problem and break down a prompt into loosely coupled semantic sections. The proposed algorithm, UniPrompt, (1) clusters the input space and…
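The two ideas the abstract names, a prompt split into loosely coupled semantic sections and a clustered input space, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `SectionedPrompt` class, the section names, and `cluster_examples` are hypothetical, and grouping by a caller-supplied key is a simple stand-in for whatever clustering method UniPrompt actually uses, which this excerpt does not state.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Hashable, List


@dataclass
class SectionedPrompt:
    """Hypothetical container: a prompt stored as named, loosely coupled sections."""

    sections: Dict[str, str] = field(default_factory=dict)

    def edit(self, name: str, text: str) -> None:
        # Replacing one section leaves every other section untouched,
        # which is what "loosely coupled" is taken to mean in this sketch.
        self.sections[name] = text

    def render(self) -> str:
        # Join the sections, in insertion order, into one text prompt.
        return "\n\n".join(f"{name}:\n{body}" for name, body in self.sections.items())


def cluster_examples(
    examples: List[Any], key: Callable[[Any], Hashable]
) -> Dict[Hashable, List[Any]]:
    """Group examples by a key function (a stand-in for real clustering)."""
    clusters: Dict[Hashable, List[Any]] = defaultdict(list)
    for example in examples:
        clusters[key(example)].append(example)
    return dict(clusters)


if __name__ == "__main__":
    prompt = SectionedPrompt({"Task": "Classify the text.", "Counter-examples": ""})
    prompt.edit("Counter-examples", "Quoted slurs in news reports are not hate speech.")
    print(prompt.render())

    # Toy (text, label) pairs, grouped by label purely for illustration.
    data = [("text a", 1), ("text b", 0), ("text c", 1)]
    print(cluster_examples(data, key=lambda ex: ex[1]))
```

Keeping sections in a dictionary means an edit derived from one cluster of examples can target a single section, such as the counter-examples, without rewriting the rest of the prompt.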
Peer Reviews
Decision·Submitted to ICLR 2025
The strategy of creating mini-batches to generate edits and then aggregating them at the batch level to yield the final edit, as a feedback mechanism, is interesting. The experimental results achieve SOTA.
1. The paper contains some minor issues, e.g., $U_{ni}P_{rompt}$ in lines 100-101 and an extra space before "So" in lines 197-198. 2. The experiments should be reorganized and polished carefully before submission, especially in Sections 4.1, 4.2, and 4.5. 3. Some qualitative experiments should be added to validate the differences in reported accuracies under various mini-batch sizes.
- UNIPROMPT clusters examples to capture task-specific facets effectively, with a two-tier feedback mechanism ensuring that prompt modifications are generalizable and not overly specific to individual examples. - The authors provide strong empirical results, with UNIPROMPT outperforming state-of-the-art methods on multiple datasets, notably achieving significant improvements on hate speech classification. - By focusing on automatic prompt generation without relying heavily on human-engineered p
- All experiments are conducted on powerful proprietary LLMs; I wonder whether the proposed method remains effective for open-source LLMs such as LLaMA or Qwen. - The proposed method seems to underperform DSPy on MedQA while surpassing it on other benchmarks; what might be the reason? - How is the number of clusters determined when constructing the mini-batches, and how does this affect the method's effectiveness?
(1) The paper introduces a new method for prompt optimization that considers multiple facets of a task and leverages the structure in the prompt. (2) The proposed clustering approach and two-tier feedback mechanism provide a systematic way to generate prompt edits that capture generalizable concepts. (3) The experimental results on several datasets show that UNIPROMPT outperforms human-tuned prompts and SOTA methods.
(1) I am skeptical about the current importance of prompt optimization. Prompt optimization was more of a need when large models were less powerful (for example, last year). Nowadays, instruction-tuned models follow human instructions very well, and many of them are 0-shot; see the llama-3.1 model card. I would like to hear the authors' opinion on this. (2) For the results in Table 1, it is recommended to add an average column to make the presentation clearer. (3) I found that on the ARC task,
Taxonomy
Topics: Reservoir Engineering and Simulation Methods · AI-based Problem Solving and Planning · Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training
