Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles

Zacharie Bugaud

arXiv:2603.17111 · cs.CV · March 19, 2026

TL;DR

This paper identifies family bias in vision-language model ensembles, showing correlated errors reduce ensemble effectiveness, and proposes three family-aware methods to improve accuracy and robustness across multiple benchmarks.

Contribution

It introduces three novel family-aware ensemble methods that mitigate correlated errors and improve performance on vision-language benchmarks.

Findings

01

Family-correlated errors reduce effective ensemble diversity.

02

Hierarchical Family Voting (HFV) recovers 18-26 percentage points of accuracy on the Misleading tier.

03

Learned Candidate Scoring achieves significant gains and never degrades performance.

Abstract

Ensembling Vision-Language Models (VLMs) from different providers maximizes benchmark accuracy, yet models from the same architectural family share correlated errors that standard voting ignores. We study this structure across 17 VLMs from 8 families on VQAv2, TextVQA, and GQA. Family-correlated errors reduce effective ensemble dimensionality to 2.5-3.6 independent voters and create a Misleading tier (1.5-6.5% of questions) where correlated majority errors drive ensemble accuracy to 0% even though the best model is correct. We propose three family-aware methods. Hierarchical Family Voting (HFV) aggregates within families before voting across them, recovering +18-26 pp on the Misleading tier. QualRCCV, a training-free method weighting models by calibration, family quality, and inverse family size, is the first to beat calibrated voting on all three benchmarks (p<0.05). Learned Candidate…
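
The two-stage idea behind HFV, as the abstract describes it (aggregate within families, then vote across them), can be illustrated with a minimal sketch. This is not the paper's implementation: the plurality rule at both stages, the tie-breaking behavior, and the model and family names below are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter, defaultdict


def plurality(votes):
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(votes).most_common(1)[0][0]


def flat_vote(answers):
    """Standard majority voting: every model counts as one voter."""
    return plurality(list(answers.values()))


def hierarchical_family_vote(answers, family_of):
    """Two-stage vote (assumed plurality at both stages).

    Stage 1: each family collapses its members' answers into one answer.
    Stage 2: the families vote, one vote per family.
    """
    by_family = defaultdict(list)
    for model, answer in answers.items():
        by_family[family_of[model]].append(answer)
    family_answers = [plurality(votes) for votes in by_family.values()]
    return plurality(family_answers)


# Hypothetical example: three models from one family share the same error.
family_of = {
    "a-small": "A", "a-medium": "A", "a-large": "A",
    "b-base": "B",
    "c-base": "C",
}
answers = {
    "a-small": "cat", "a-medium": "cat", "a-large": "cat",  # correlated error
    "b-base": "dog",
    "c-base": "dog",
}

flat_result = flat_vote(answers)                        # family A outvotes the rest
hfv_result = hierarchical_family_vote(answers, family_of)  # one vote per family
print(flat_result, hfv_result)
```

In this toy case the three same-family models dominate the flat vote, while the hierarchical vote gives each family a single voice, so the two families that agree prevail. Whether that outcome is the correct one on real data depends on which families are reliable, which is presumably why the abstract's other methods also weight by calibration and family quality.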

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks