Product-of-Experts Training Reduces Dataset Artifacts in Natural Language Inference
Aby Mammen Mathew

TL;DR
This paper introduces Product-of-Experts training to reduce dataset artifacts in neural natural language inference models, maintaining accuracy while decreasing reliance on spurious correlations.
Contribution
The proposed PoE training method reduces the reliance of NLI models on dataset artifacts without significantly sacrificing accuracy.
Findings
PoE reduces bias agreement from 49.85% to roughly 45%.
PoE nearly preserves baseline accuracy (89.10% vs. 89.30%).
An ablation finds that lambda = 1.5 best balances debiasing and accuracy.
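The summary does not define "bias agreement". One plausible reading is the percentage of examples on which a model predicts the same label as the hypothesis-only (biased) model. The sketch below computes the metric under that assumption; the function name `bias_agreement` and the label strings are hypothetical and do not come from the paper.

```python
def bias_agreement(main_preds, bias_preds):
    """Percentage of examples where two models predict the same label.

    Assumption: "bias agreement" means label-level agreement between the
    evaluated model and a hypothesis-only model. The paper may define it
    differently.
    """
    if len(main_preds) != len(bias_preds):
        raise ValueError("prediction lists must have the same length")
    if not main_preds:
        raise ValueError("prediction lists must not be empty")
    matches = sum(m == b for m, b in zip(main_preds, bias_preds))
    return 100.0 * matches / len(main_preds)


# Toy example with made-up predictions (not data from the paper).
main = ["entailment", "neutral", "contradiction", "entailment"]
bias = ["entailment", "contradiction", "contradiction", "neutral"]
agreement = bias_agreement(main, bias)  # two of the four labels match
```

Under this reading, a lower value means the model's predictions track the hypothesis-only model less closely.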
Abstract
Neural NLI models overfit to dataset artifacts instead of truly reasoning. A hypothesis-only model achieves 57.7% accuracy on SNLI, showing strong spurious correlations, and 38.6% of the baseline's errors result from these artifacts. We propose Product-of-Experts (PoE) training, which downweights examples where biased models are overconfident. PoE nearly preserves accuracy (89.10% vs. 89.30%) while cutting bias reliance by 4.71 percentage points (bias agreement falls from 49.85% to roughly 45%). An ablation finds that lambda = 1.5 best balances debiasing and accuracy. Behavioral tests still reveal issues with negation and numerical reasoning.
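To illustrate how a Product-of-Experts objective can downweight examples that a biased model already gets right with high confidence, here is a minimal single-example sketch. It follows the common PoE formulation, in which the log-probabilities of the main model and a frozen bias model are summed and renormalized before the cross-entropy is taken. The abstract does not say how lambda enters the loss; scaling the bias expert's log-probabilities by `lam` is an assumption here, and the function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def poe_loss(main_logits, bias_logits, label, lam=1.5):
    """Cross-entropy of the combined distribution p_main * p_bias**lam.

    Assumption: lam scales the bias expert's log-probabilities. The paper
    reports lambda = 1.5 but this summary does not state its exact role.
    """
    main_lp = log_softmax(main_logits)
    bias_lp = log_softmax(bias_logits)
    # Product of experts in log space: add the log-probabilities, then renormalize.
    combined = [m + lam * b for m, b in zip(main_lp, bias_lp)]
    return -log_softmax(combined)[label]


# Toy 3-class example (made-up logits): the main model is undecided.
undecided = [0.0, 0.0, 0.0]
loss_confident_bias = poe_loss(undecided, [5.0, 0.0, 0.0], label=0)
loss_uniform_bias = poe_loss(undecided, [0.0, 0.0, 0.0], label=0)
```

When the bias expert is confident and correct, the combined loss is already small, so the main model receives little gradient from that example. With an uninformative bias expert the loss reduces to ordinary cross-entropy on the main model.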
