Explicit Abstention Knobs for Predictable Reliability in Video Question Answering

Jorge Ortiz

arXiv:2601.00138 · cs.AI · January 16, 2026

Open Access

TL;DR

This paper explores confidence-based abstention in video question answering systems, demonstrating its effectiveness in controlling error rates in-distribution and under distribution shifts, with implications for high-stakes applications.

Contribution

The paper introduces and evaluates confidence thresholding as a method for predictable reliability in video question answering models, especially under distribution shifts.

Findings

01. Confidence thresholding offers mechanistic control over error rates.

02. Smooth risk-coverage tradeoffs are achievable by adjusting thresholds.

03. Control remains robust under distribution shifts.
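
The thresholding idea behind these findings can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes each answer comes with a scalar confidence in [0, 1] (how the paper obtains confidence is not stated on this page), that the system abstains whenever confidence falls below a threshold epsilon, and that risk is measured as the error rate over answered questions. The function names `selective_answer`, `risk_and_coverage`, and `sweep` are hypothetical, and the data in the usage example are made up for illustration only.

```python
from typing import List, Optional, Sequence, Tuple


def selective_answer(answer: str, confidence: float, epsilon: float) -> Optional[str]:
    """Return the answer when confidence >= epsilon, otherwise abstain (None).

    Assumption: higher confidence means "more likely correct"; the exact
    abstention rule used in the paper is not given on this page.
    """
    return answer if confidence >= epsilon else None


def risk_and_coverage(
    confidences: Sequence[float],
    correct: Sequence[bool],
    epsilon: float,
) -> Tuple[float, float]:
    """Compute (risk, coverage) at one threshold.

    coverage = fraction of questions answered (confidence >= epsilon)
    risk     = error rate among the answered questions
               (defined as 0.0 here when nothing is answered)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one example")

    # Keep the correctness flags of the questions the system would answer.
    kept = [ok for c, ok in zip(confidences, correct) if c >= epsilon]
    coverage = len(kept) / len(confidences)
    errors = sum(1 for ok in kept if not ok)
    risk = errors / len(kept) if kept else 0.0
    return risk, coverage


def sweep(
    confidences: Sequence[float],
    correct: Sequence[bool],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (epsilon, risk, coverage) for each threshold in the sweep."""
    return [(eps, *risk_and_coverage(confidences, correct, eps)) for eps in thresholds]


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the paper.
    toy_confidences = [0.95, 0.90, 0.80, 0.60, 0.55, 0.30]
    toy_correct = [True, True, False, True, False, False]
    for eps, risk, coverage in sweep(toy_confidences, toy_correct, [0.0, 0.5, 0.85]):
        print(f"epsilon={eps:.2f}  risk={risk:.2f}  coverage={coverage:.2f}")
```

Raising epsilon in this sketch trades coverage for lower risk whenever confidence is informative about correctness. Plotting the `(coverage, risk)` pairs from `sweep` gives a risk-coverage curve of the kind the findings refer to.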

Abstract

High-stakes deployment of vision-language models (VLMs) requires selective prediction, where systems abstain when uncertain rather than risk costly errors. We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. Using NExT-QA and Gemini 2.0 Flash, we establish two findings. First, confidence thresholding provides mechanistic control in-distribution. Sweeping threshold epsilon produces smooth risk-coverage tradeoffs, reducing error rates f

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning