An Efficient Modern Baseline for FloodNet VQA
Aditya Kane, Sahil Khose

TL;DR
This paper introduces a simple, efficient VQA system for flood disaster management that outperforms existing methods on FloodNet with less training and inference time.
Contribution
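The paper revisits basic feature-combination methods (concatenation, addition, and element-wise multiplication) on top of modern image and text feature extractors to create a lightweight, high-performing VQA baseline for flood-related visual question answering.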
Findings
Outperforms previous methods on the FloodNet dataset
Requires significantly less training and inference time than modern VQA architectures
Reports consolidated results across various backbone architectures
Abstract
Designing efficient and reliable VQA systems remains a challenging problem, more so in the case of disaster management and response systems. In this work, we revisit fundamental combination methods like concatenation, addition and element-wise multiplication with modern image and text feature abstraction models. We design a simple and efficient system which outperforms pre-existing methods on the FloodNet dataset and achieves state-of-the-art performance. This simplified system requires significantly less training and inference time than modern VQA architectures. We also study the performance of various backbones and report their consolidated results. Code is available at https://github.com/sahilkhose/floodnet_vqa.
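The three combination methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); the function name `fuse`, the method labels, and the toy feature vectors are hypothetical, and real image and text features would come from pretrained backbones rather than hand-written lists.

```python
from typing import List

Vector = List[float]

# Hypothetical labels for the three fusion methods named in the abstract.
METHODS = ("concat", "add", "multiply")


def fuse(image_feat: Vector, text_feat: Vector, method: str = "concat") -> Vector:
    """Combine an image feature vector and a text feature vector.

    Illustrative only: the actual system operates on backbone outputs,
    and its exact fusion code may differ from this sketch.
    """
    if method not in METHODS:
        raise ValueError(f"unknown fusion method: {method!r}")

    if method == "concat":
        # Concatenation keeps both vectors intact; output size is the sum.
        return image_feat + text_feat

    # Addition and multiplication are element-wise, so sizes must match.
    if len(image_feat) != len(text_feat):
        raise ValueError("element-wise fusion needs equal-length vectors")

    if method == "add":
        return [i + t for i, t in zip(image_feat, text_feat)]

    # method == "multiply"
    return [i * t for i, t in zip(image_feat, text_feat)]


if __name__ == "__main__":
    image_feat = [0.5, 1.0, 2.0]  # toy stand-in for an image embedding
    text_feat = [2.0, 3.0, 4.0]   # toy stand-in for a question embedding
    for name in METHODS:
        print(name, fuse(image_feat, text_feat, name))
```

In a full VQA model, the fused vector would typically be passed to a classifier over the answer vocabulary. Note that concatenation doubles the input width of that classifier, while addition and multiplication require the two feature vectors to share a dimension.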
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · COVID-19 diagnosis using AI
