Non-Exchangeable Conformal Risk Control
António Farinhas, Chrysoula Zerva, Dennis Ulmer, André F. T. Martins

TL;DR
This paper introduces a flexible conformal risk control framework that provides statistical guarantees for non-exchangeable data, accommodating distribution shifts and offering tighter bounds through data weighting.
Contribution
It extends conformal risk control to non-exchangeable data, controlling the expected value of any monotone loss function under minimal assumptions and allowing the calibration data to be weighted.
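As a rough sketch of what this means in symbols (the notation here is an assumption, chosen to match the $\sum w_i TV(Z, Z^i)$ term quoted in the reviews): let $L_i(\lambda)$ be the loss of calibration point $i$ at threshold $\lambda$, non-increasing in $\lambda$ and bounded in $[A, B]$, let $w_i \in [0, 1]$ be data-independent weights, and let $Z^i$ denote the data sequence with point $i$ swapped with the test point. The weighted threshold and the resulting guarantee then take approximately the following form.

```latex
\tilde{w}_i = \frac{w_i}{1 + \sum_{j=1}^{n} w_j}, \qquad
\tilde{w}_{n+1} = \frac{1}{1 + \sum_{j=1}^{n} w_j}

\hat{\lambda} = \inf\Big\{ \lambda : \sum_{i=1}^{n} \tilde{w}_i L_i(\lambda) + \tilde{w}_{n+1} B \le \alpha \Big\}

\mathbb{E}\big[ L_{n+1}(\hat{\lambda}) \big] \le \alpha + (B - A) \sum_{i=1}^{n} \tilde{w}_i \, d_{\mathrm{TV}}(Z, Z^i)
```

When the data are exchangeable, every total-variation term is zero and the bound reduces to the usual level $\alpha$. Under distribution shift, the slack is small only if large weights go to calibration points that resemble the test point.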
Findings
Effective in handling distribution drift and change points.
Provides tighter bounds with appropriate data weighting.
Validated on synthetic and real-world datasets.
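The weighting idea behind these findings can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `decay_weights` and `weighted_crc_threshold`, the grid search over `lambdas`, the fallback to the largest threshold, and the geometric decay weights are all assumptions made for illustration. It picks the smallest threshold whose weighted calibration risk, plus a worst-case term for the unseen test point, stays below the target level `alpha`.

```python
import numpy as np


def decay_weights(n, rho=0.99):
    """Hypothetical weighting scheme: w_i = rho**(n + 1 - i) for i = 1..n,
    so the most recent calibration point receives the largest weight."""
    return rho ** np.arange(n, 0, -1)


def weighted_crc_threshold(losses, weights, lambdas, alpha, loss_max=1.0):
    """Pick the smallest threshold whose weighted risk bound is at most alpha.

    losses[i, j] is the loss of calibration point i at lambdas[j]; it is
    assumed to be non-increasing in j and bounded above by loss_max.
    lambdas is assumed to be sorted in increasing order.
    """
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Normalise so the calibration weights and the test point's unit weight sum to 1.
    norm = 1.0 + weights.sum()
    w_cal = weights / norm
    w_test = 1.0 / norm

    # Weighted empirical risk at each threshold, charging the test point loss_max.
    risk = w_cal @ losses + w_test * loss_max

    feasible = np.flatnonzero(risk <= alpha)
    if feasible.size == 0:
        # Assumption: fall back to the most conservative threshold on the grid.
        return lambdas[-1]
    return lambdas[feasible[0]]
```

With uniform weights this reduces to a grid version of standard conformal risk control. Weights that favour recent points behave as if the calibration set were smaller, so the chosen threshold is typically more conservative, but the total-variation penalty in the guarantee shrinks under drift.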
Abstract
Split conformal prediction has recently sparked great interest due to its ability to provide formally guaranteed uncertainty sets or intervals for predictions made by black-box neural models, ensuring a predefined probability of containing the actual ground truth. While the original formulation assumes data exchangeability, some extensions handle non-exchangeable data, which is often the case in many real-world scenarios. In parallel, some progress has been made in conformal methods that provide statistical guarantees for a broader range of objectives, such as bounding the best $F_1$-score or minimizing the false negative rate in expectation. In this paper, we leverage and extend these two lines of work by proposing non-exchangeable conformal risk control, which allows controlling the expected value of any monotone loss function when the data is not exchangeable. Our framework is…
Peer Reviews
Decision: ICLR 2024 poster
The paper connects two modern techniques in conformal prediction, and is thus very relevant for the community. It is well-written and easy to follow as an expert. The writing is simple, and I expect the paper will also be easy-to-follow for readers unfamiliar with conformal prediction.
The method combines previous work in a relatively straightforward way. The proof of Theorem 1 does not introduce new techniques. The experiments follow settings proposed in previous works. The paper is solving a completely new problem, and naturally, there are no baselines for it. Thus, while the proposed method is novel and useful, the paper would be strengthened with a more in-depth theoretical/experimental study. Some suggestions are, - Writing down full-conformal and cross-conformal versio
This paper is clear, and does a good job at deriving bounds for how weighted conformal risk control performs under non-exchangeability. Non-exchangeability will happen often in practice, so it is impactful to explore more robust weighting schemes and their implications. The empirical results are encouraging. It's a bit unclear as to how _useful_ the guarantees are, in the sense that they can be too loose if $\sum w_i TV(Z, Z^i)$ is very large, or more likely yet, simply unknown. When some practi
While, again, the paper is nicely written, it is a somewhat incremental step from previous work in Barber et al. and Angelopoulos et al. It's also a bit of an over-claim to say that risk is _controlled_ in a non-exchangeable setting. Rather, what the paper does is develop a conservative upper bound for the risk under non-exchangeability that depends on quantities that we cannot realistically know, i.e., $TV(Z_i, Z_{n+1})$.
Originality: This paper combines the results by Barber et al. (2023) and Angelopoulos et al. (2023a), leading to a new result. Quality: the claim is well-justified via Theorem 1 and its proof. Clarity: the paper is mostly well-written. Significance: given that conformal prediction was already extended to the non-exchangeable setup by Barber et al. (2023), it is not surprising that conformal risk control can be extended in a similar way. But it is still a new result.
The following includes my concerns. 1. Under the non-exchangeable setup, the CRC should be broken, and this is why we need non-exchangeable extension of CRC. But, I cannot see the trend in Setting 3 in Figure 1 and Figure 3, which is unsatisfactory. In particular, I’m not convinced why open-domain QA experiments (related to Figure 3) fit the non-exchangeable setup – the concrete scenario on why we need to consider the non-exchangeable setup here is required. Moreover, the way to generate w_i i
Taxonomy
Topics: Fault Detection and Control Systems · Advanced Control Systems Optimization · Formal Methods in Verification
