# Revisiting Modality Imbalance In Multimodal Pedestrian Detection

**Authors:** Arindam Das, Sudip Das, Ganesh Sistu, Jonathan Horgan, Ujjwal Bhattacharya, Edward Jones, Martin Glavin, and Ciarán Eising

arXiv: 2302.12589 · 2023-07-10

## TL;DR

This paper addresses modality imbalance in multimodal pedestrian detection by introducing a training regularizer and a decoupled output stream, which make feature fusion more robust and improve on the state of the art on the KAIST and UTokyo datasets.

## Contribution

The paper proposes a novel regularizer and an output-stream decoupling approach to mitigate modality bias and strengthen feature fusion in multimodal pedestrian detection.
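This summary does not give the exact form of the regularizer, so the following is only a minimal sketch of the general idea of penalizing imbalance between two feature extractors. It is not the authors' formulation. The function names, the norm-based penalty, the weight `lam`, and the `rgb_feat` / `thermal_feat` inputs are all assumptions made for illustration.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat feature vector (a list of floats)."""
    return math.sqrt(sum(x * x for x in vec))


def balance_penalty(rgb_feat, thermal_feat, eps=1e-8):
    """Hypothetical imbalance penalty (an assumption, not the paper's term).

    Returns 0 when both branches produce features of equal norm and
    approaches 1 when one branch dominates the other.
    """
    n_rgb = l2_norm(rgb_feat)
    n_thermal = l2_norm(thermal_feat)
    return abs(n_rgb - n_thermal) / (n_rgb + n_thermal + eps)


def total_loss(detection_loss, rgb_feat, thermal_feat, lam=0.1):
    """Detection loss plus the weighted imbalance penalty.

    `lam` is an illustrative weight, not a value taken from the paper.
    """
    return detection_loss + lam * balance_penalty(rgb_feat, thermal_feat)
```

In this sketch, a training step in which the thermal branch contributes almost nothing would incur a penalty close to `lam`, nudging the optimizer to use both extractors. The paper's actual regularizer may act on different quantities entirely.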

## Key findings

- Improved detection performance on KAIST and UTokyo datasets.
- Enhanced robustness of feature fusion against modality imbalance.
- State-of-the-art results achieved with the proposed method.

## Abstract

Multimodal learning, particularly for pedestrian detection, has recently received emphasis due to its capability to function equally well in several critical autonomous driving scenarios such as low-light, night-time, and adverse weather conditions. However, in most cases, the training distribution largely emphasizes the contribution of one specific input, which biases the network towards one modality. Hence, the generalization of such models becomes a significant problem when the input modality that was non-dominant during training contributes more during inference. Here, we introduce a novel training setup with a regularizer in the multimodal architecture to resolve this disparity between the modalities. Specifically, our regularizer term makes the feature fusion method more robust by treating both feature extractors as equally important during training when extracting the multimodal distribution, which is referred to as removing the imbalance problem. Furthermore, our concept of decoupling the output stream helps the detection task by mutually sharing spatially sensitive information. Extensive experiments with the proposed method on the KAIST and UTokyo datasets show improvement over the respective state-of-the-art performance.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12589/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12589/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/2302.12589/full.md

---
Source: https://tomesphere.com/paper/2302.12589