Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO
Xin Yang, Letian Li, Abudukelimu Wuerkaixi, Xuxin Cheng, Cao Liu, Ke Zeng, Xunliang Cai, Wenyuan Jiang

TL;DR
This paper introduces CoIPO, a contrastive learning method that enhances LLM robustness against prompt noise by aligning logits from clean and noisy prompts, improving performance without external preprocessing.
Contribution
The paper proposes a novel intrinsic robustness training method for LLMs using contrastive learning on paired prompts, reducing reliance on external prompt refinement tools.
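To make the idea of paired clean and noisy prompts concrete, here is a minimal sketch. It is not the paper's implementation. It assumes adjacent-character swaps as the noise model and uses a symmetric KL divergence between next-token distributions as a stand-in measure of how far the noisy-prompt logits drift from the clean-prompt logits. The names `add_char_noise` and `symmetric_kl` and the toy logits are hypothetical; CoIPO's actual objective is defined in the paper and is not reproduced here.

```python
import math
import random


def add_char_noise(prompt: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap adjacent letters with probability `rate` (assumed noise model)."""
    rng = random.Random(seed)
    chars = list(prompt)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so each character moves at most once
        else:
            i += 1
    return "".join(chars)


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def symmetric_kl(logits_clean, logits_noisy):
    """Average of KL(p||q) and KL(q||p); a stand-in alignment measure."""
    lp = log_softmax(logits_clean)
    lq = log_softmax(logits_noisy)
    kl_pq = sum(math.exp(a) * (a - b) for a, b in zip(lp, lq))
    kl_qp = sum(math.exp(b) * (b - a) for a, b in zip(lp, lq))
    return 0.5 * (kl_pq + kl_qp)


clean = "Classify the sentiment of this review."
noisy = add_char_noise(clean, rate=0.3, seed=1)
print(noisy)

# Toy next-token logits for the clean and noisy prompt (made-up numbers).
print(symmetric_kl([2.0, 0.5, -1.0], [1.2, 0.9, -0.3]))
```

A value near zero would mean the model's output distribution barely changes under the perturbation. A training objective in this spirit would push that gap down for matched pairs.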
Findings
Significant accuracy improvements on NoisyPromptBench
Effective reduction of prompt sensitivity in LLMs
Outperforms existing robustness approaches
Abstract
Large language models (LLMs) have demonstrated remarkable and steadily improving performance across a wide range of tasks. However, LLM performance may be highly sensitive to prompt variations, especially in scenarios with limited openness or strict output formatting requirements, indicating insufficient robustness. In real-world applications, user prompts provided to LLMs often contain imperfections, which may undermine the quality of the model's responses. To address this issue, previous work has primarily focused on preprocessing prompts, employing external tools or even LLMs to refine prompt formulations in advance. However, these approaches overlook the intrinsic robustness of LLMs, and their reliance on external components introduces additional computational overhead and uncertainty. In this work, we propose a Contrastive Learning-based Inverse Direct Preference Optimization…
Peer Reviews
Decision: ICLR 2026 Poster
Well-motivated problem: The paper clearly articulates limitations of external preprocessing approaches and makes a strong case for intrinsic robustness.
Solid theoretical foundation: The mutual information analysis (Equations 9-16) provides principled justification for the method.
Comprehensive evaluation: NoisyPromptBench with multiple perturbation types and the decoding radius analysis (Section 4.1, Figure 5) provide thorough robustness assessment.
Ablation studies: Table 3 effectively demonst
Limited scope of evaluation:
● Only 7B parameter models tested; unclear if findings generalize to larger models (13B, 70B+)
● Only 5 datasets from GLUE-style tasks; robustness on generation tasks, reasoning, or code generation is unexplored
● Training data limited to 25 FLAN subsets; impact of training data scale not studied
Insufficient baseline comparisons:
● Only compares to COIN for intrinsic robustness methods
● Missing comparisons to recent prompt optimization methods (e.g., PromptAg
* CoIPO tackles a significant issue: handling imperfect user input in everyday use of LLMs.
* The CoIPO algorithm leverages contrastive learning and direct alignment, backed by theoretical insights from the perspective of relative entropy gap maximization.
* Comprehensive experiments are conducted for different architectures.
* Though CoIPO alone is a novel and effective algorithm for increasing LLM robustness, my concern is whether this procedure will hurt the model's performance on tasks like math reasoning and coding. It would appear costly to me if we sacrifice these reasoning capabilities to replace pre-processing tools for imperfect input.
* No algorithm-specific hyper-parameter is presented in the paper. Could the authors elaborate a bit more on the hyper-parameter details, if they are included in the algor
1. This paper proposes a principled intrinsic robustness enhancement method (CoIPO) for LLMs based on contrastive learning and inverse DPO. The formulation is clear and easy to understand.
2. The authors offer an information-theoretic perspective (mutual information) to justify the approach, which is interesting.
1. The research problem of prompt optimization is important, but the scope of this work is somewhat limited: the noisy prompts in this paper primarily refer to typos (character level, word level, etc.), while real-world prompt imperfections can be more varied than those benchmarked here, such as semantic ambiguity and non-standard grammar.
2. More experimental baselines should be incorporated. For example, DPO should be considered as a baseline, since CoIPO uses both contrastive training
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
