Product-of-Experts Training Reduces Dataset Artifacts in Natural Language Inference

Aby Mammen Mathew

arXiv:2604.19069·cs.CL·April 22, 2026

TL;DR

This paper introduces Product-of-Experts training to reduce dataset artifacts in neural natural language inference models, maintaining accuracy while decreasing reliance on spurious correlations.

Contribution

The proposed PoE training method effectively diminishes dataset artifact reliance in NLI models without significantly sacrificing accuracy.

Findings

01 PoE reduces bias agreement from 49.85% to roughly 45%.

02 PoE maintains nearly the same accuracy as the baseline model (89.10% vs. 89.30%).

03 An ablation finds that lambda = 1.5 best balances debiasing and accuracy.
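The page does not define "bias agreement". One plausible reading, assumed here, is the fraction of examples on which the main model predicts the same label as the hypothesis-only (biased) model. The sketch below computes that quantity. The function name `bias_agreement` and the toy predictions are hypothetical and are not taken from the paper.

```python
def bias_agreement(main_preds, bias_preds):
    """Fraction of examples where the main model and the biased model
    predict the same label (assumed definition, not confirmed by the paper)."""
    if len(main_preds) != len(bias_preds):
        raise ValueError("prediction lists must have the same length")
    if not main_preds:
        raise ValueError("prediction lists must not be empty")
    matches = sum(m == b for m, b in zip(main_preds, bias_preds))
    return matches / len(main_preds)


# Toy predictions for four examples (illustrative only).
main_preds = ["entailment", "neutral", "contradiction", "neutral"]
bias_preds = ["entailment", "contradiction", "contradiction", "entailment"]

agreement = bias_agreement(main_preds, bias_preds)  # 2 of 4 labels match
```

Under this reading, a lower value means the main model's predictions follow the hypothesis-only model less often.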

Abstract

Neural NLI models overfit to dataset artifacts instead of truly reasoning. A hypothesis-only model reaches 57.7% accuracy on SNLI, showing strong spurious correlations, and 38.6% of baseline errors result from these artifacts. We propose Product-of-Experts (PoE) training, which downweights examples where biased models are overconfident. PoE nearly preserves accuracy (89.10% vs. 89.30%) while cutting bias reliance by 4.71 percentage points (bias agreement from 49.85% to roughly 45%). An ablation finds that lambda = 1.5 best balances debiasing and accuracy. Behavioral tests still reveal issues with negation and numerical reasoning.
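To make the downweighting idea concrete, here is a minimal sketch of a common Product-of-Experts loss for a single example. It assumes that lambda scales the biased model's log-probabilities before they are added to the main model's log-probabilities. The abstract does not say exactly where lambda enters, so this is one reading and may differ from the paper's formulation. The names `log_softmax`, `poe_loss`, and `lam`, and the toy logits, are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def poe_loss(main_logits, bias_logits, label, lam=1.5):
    """Cross-entropy of the combined distribution for one example.

    Assumption: experts are combined in log space as
    log p_main + lam * log p_bias, then renormalized.
    """
    main_lp = log_softmax(main_logits)
    bias_lp = log_softmax(bias_logits)
    combined = [m + lam * b for m, b in zip(main_lp, bias_lp)]
    return -log_softmax(combined)[label]


# Toy logits over (entailment, neutral, contradiction).
main = [0.2, 0.1, 0.0]
confident_bias = [4.0, 0.0, 0.0]  # biased model is sure of the gold label
uniform_bias = [0.0, 0.0, 0.0]    # biased model has no preference

loss_biased = poe_loss(main, confident_bias, label=0)
loss_plain = poe_loss(main, uniform_bias, label=0)
```

With a uniform biased model, the combined loss reduces to the main model's ordinary cross-entropy. When the biased model is already confident in the gold label, the combined loss is much smaller, so the example contributes less to the main model's training signal. In this setup, only the main model would be used at test time.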
