LPF: A Language-Prior Feedback Objective Function for De-biased Visual Question Answering

Zujie Liang; Haifeng Hu; Jiaying Zhu

arXiv:2105.14300 · cs.CV · June 24, 2021

TL;DR

This paper introduces LPF, an objective function that reduces language bias in VQA systems by adaptively reweighting training samples according to a question-only branch, improving visual reasoning and performance on the bias-sensitive VQA-CP v2 benchmark.

Contribution

The novel LPF method dynamically adjusts sample weights based on language bias, enhancing VQA models' ability to reason from visual clues.

Findings

01 Significant performance improvements across various VQA models

02 Effective reduction of language bias in training

03 Competitive results on VQA-CP v2 benchmark

Abstract

Most existing Visual Question Answering (VQA) systems tend to rely too heavily on language bias and hence fail to reason from visual clues. To address this issue, we propose a novel Language-Prior Feedback (LPF) objective function that re-balances the proportion of each answer's loss value in the total VQA loss. The LPF first calculates a modulating factor that measures the language bias using a question-only branch. Then, the LPF assigns a self-adaptive weight to each training sample during training. With this reweighting mechanism, the LPF ensures that the total VQA loss is reshaped into a more balanced form. In this way, the samples that require visual information to predict are used efficiently during training. Our method is simple to implement, model-agnostic, and end-to-end trainable. We conduct extensive experiments and the results show that the LPF (1)…
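The reweighting described in the abstract can be sketched for a single training sample as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the formula, so the focal-loss-style factor `(1 - prior) ** gamma`, the `gamma` parameter, and the function name `lpf_style_loss` are assumptions made here for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lpf_style_loss(vqa_logits, question_only_logits, target, gamma=2.0):
    """Reweighted cross-entropy for one sample (hypothetical sketch).

    vqa_logits:            answer scores from the full VQA model
    question_only_logits:  answer scores from a question-only branch
    target:                index of the ground-truth answer
    gamma:                 assumed exponent controlling how sharply
                           language-biased samples are down-weighted
    """
    # Probability the question-only branch gives the correct answer:
    # a high value means the sample is answerable from language priors alone.
    prior = softmax(question_only_logits)[target]
    # Assumed modulating factor: near 0 for biased samples, near 1 otherwise.
    weight = (1.0 - prior) ** gamma
    # Standard cross-entropy of the full model on the ground-truth answer.
    cross_entropy = -math.log(softmax(vqa_logits)[target])
    return weight * cross_entropy, weight


# Same VQA prediction, two different question-only predictions.
vqa_logits = [2.0, 0.5, 0.1]

# The question alone already points at answer 0: the sample is down-weighted.
biased_loss, biased_weight = lpf_style_loss(vqa_logits, [4.0, 0.0, 0.0], target=0)

# The question alone points elsewhere: the sample keeps most of its loss.
visual_loss, visual_weight = lpf_style_loss(vqa_logits, [0.0, 4.0, 0.0], target=0)
```

Under these assumptions, a sample that the question-only branch already answers confidently contributes little to the total loss, while a sample that needs visual information keeps a weight close to 1. Setting `gamma=0` recovers plain cross-entropy.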

Code & Models

Repositories

jokieleung/LPF-VQA (PyTorch, official)
