Conformal Structured Prediction
Botong Zhang, Shuo Li, Osbert Bastani

TL;DR
This paper introduces a general conformal prediction framework for structured outputs, enabling uncertainty quantification in complex prediction tasks like text generation and hierarchical classification.
Contribution
It extends conformal prediction to structured outputs, allowing implicit representation of label sets and application to domains with graph-structured labels.
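To make the idea of an implicitly represented label set concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a small hypothetical label DAG (`children`), hypothetical leaf probabilities, and a threshold `tau` that would in practice come from a conformal calibration step. Each node stands for the set of leaf labels reachable from it, and its probability is the sum over those leaves.

```python
# Hypothetical label DAG: each internal node maps to its children; leaves are labels.
children = {
    "animal": ["mammal", "bird"],
    "mammal": ["cat", "dog"],
    "bird": ["sparrow"],
    "pet": ["cat", "dog"],  # a second parent for cat and dog makes this a DAG, not a tree
}

def leaves_under(node):
    """Set of leaf labels reachable from node (a leaf covers only itself)."""
    if node not in children:
        return {node}
    reachable = set()
    for child in children[node]:
        reachable |= leaves_under(child)
    return reachable

def node_probability(node, leaf_probs):
    """Sum leaf probabilities over distinct reachable leaves, so shared leaves count once."""
    return sum(leaf_probs[leaf] for leaf in leaves_under(node))

def smallest_covering_node(leaf_probs, tau):
    """Return the node with the fewest leaves whose probability mass is at least tau."""
    nodes = set(children) | {c for cs in children.values() for c in cs}
    candidates = [n for n in nodes if node_probability(n, leaf_probs) >= tau]
    if not candidates:
        return None
    # Ties on leaf count are broken alphabetically to keep the result deterministic.
    return min(candidates, key=lambda n: (len(leaves_under(n)), n))

# Hypothetical model output over the leaf labels.
leaf_probs = {"cat": 0.5, "dog": 0.3, "sparrow": 0.2}
chosen = smallest_covering_node(leaf_probs, tau=0.75)
print(chosen, sorted(leaves_under(chosen)))
```

Returning a single node such as `mammal` gives the user one interpretable answer in place of an explicit list of leaf labels, which is the motivation the contribution statement describes.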
Findings
Effective in hierarchical label prediction
Guarantees high coverage probability
Applicable to complex structured prediction tasks
Abstract
Conformal prediction has recently emerged as a promising strategy for quantifying the uncertainty of a predictive model; these algorithms modify the model to output sets of labels that are guaranteed to contain the true label with high probability. However, existing conformal prediction algorithms have largely targeted classification and regression settings, where the prediction set has a simple form as a level set of the scoring function. For complex structured outputs such as text generation, these prediction sets might include a large number of labels and therefore be hard for users to interpret. In this paper, we propose a general framework for conformal prediction in the structured prediction setting that modifies existing conformal prediction algorithms to output structured prediction sets that implicitly represent sets of labels. In addition, we…
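For readers unfamiliar with the classification setting the abstract contrasts against, here is a minimal sketch of standard split conformal prediction, where the prediction set is a level set of the scoring function. It is not code from the paper: the calibration values, the score `1 - p(y|x)`, and the function names are all illustrative assumptions.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label must be included.
        return float("inf")
    return sorted(cal_scores)[k - 1]

def prediction_set(probs, threshold):
    """Level set of the score 1 - p(y|x): all labels whose score is at most the threshold."""
    return {label for label, p in probs.items() if 1 - p <= threshold}

# Hypothetical calibration data: model probability assigned to the true label.
cal_true_probs = [0.9, 0.8, 0.75, 0.6, 0.95, 0.7, 0.85, 0.5, 0.65]
cal_scores = [1 - p for p in cal_true_probs]

# alpha = 0.2 targets at least 80% marginal coverage.
tau = conformal_threshold(cal_scores, alpha=0.2)

# Hypothetical model output for a new input.
test_probs = {"cat": 0.65, "dog": 0.25, "fox": 0.10}
print(tau, prediction_set(test_probs, tau))
```

With many labels, the set returned by `prediction_set` is an explicit list that can grow large, which is the interpretability problem the abstract raises for structured outputs.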
Peer Reviews
Decision: ICLR 2025 Poster
1. The paper is well-organized for the most part.
2. The paper is technically sound in its description of problem formulation and the marginal and PAC guarantees.
3. Construction of prediction sets in the structured prediction setting and in the context of nodes in a directed acyclic graph is an important problem.
1. Missing discussion of important related work [1, 2]: I believe the paper misses citing and comparison with important related work on conformal risk control [1]. [1] considers hierarchical image classification in ImageNet similar to the paper and controls the graph distance between nodes. Additionally, RAPS method in [2] is a conformal prediction method that introduces regularization to encourage smaller and stable sets, and is worth comparing to given the focus of the paper on reducing averag
* It is an interesting problem, particularly how best to use the external structure of the labels to generate a better 'curve', i.e. recall at a given output size.
* The experimental setups were quite interesting, e.g. MNIST with number ranges.
* The proposed method seems to extend well to DAG spaces (beyond trees), though I suppose it is still restricted to DAGs rather than general graphs so that the probabilities of the final leaf nodes can be summed.
* I would love to see a baseline where we don't use the structure at all and instead rely on regular P/R curve characteristics. Does the AUC of this model behave better? It is not clear to me.
* Even if we do use the external structure and are forced to predict only internal nodes in the DAG (as opposed to an arbitrary set of leaf nodes), it would still be useful to understand whether the P/R curve looks significantly different with the proposed models. There are plenty of baselines where we can do pred
1. Extension of conformal prediction to structured outputs using DAGs, combining conformal prediction with hierarchical representations.
2. Rigorous theoretical development with both marginal and PAC coverage guarantees, validated through experiments in diverse domains.
3. Generally well-organized with clear explanations and helpful visual aids, making complex concepts accessible.
4. Addresses an important gap, potentially impacting applications that require structu
1. While the paper’s application of conformal prediction to structured outputs is valuable, similar approaches have been explored in hierarchical classification and structured prediction. For instance, previous works have used hierarchical structures (e.g., DAGs or trees) to improve interpretability in label prediction. The paper could benefit from a more thorough comparison to these existing methods, as well as a deeper explanation of what sets its approach apart. Highlighting any unique techni
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training
