Video Waterdrop Removal via Spatio-Temporal Fusion in Driving Scenes
Qiang Wen, Yue Wu, Qifeng Chen

TL;DR
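This paper proposes an attention-based framework that fuses spatio-temporal representations from multiple frames to remove waterdrops from driving videos. It is trained on a large-scale synthetic dataset with a cross-modality strategy that combines synthetic videos and real-world images, with the aim of improving generalization to real-world scenes.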
Contribution
The paper presents a novel attention-based method for video waterdrop removal, along with a large synthetic dataset and a cross-modality training strategy to enhance generalization.
Findings
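The authors report the best waterdrop removal performance among the compared methods in complex real-world driving scenes
The method is reported to generalize well to complex real-world scenarios
Restoration of visual information occluded by waterdrops is reported to improve over existing methods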
Abstract
The waterdrops on windshields during driving can cause severe visual obstructions, which may lead to car accidents. Meanwhile, the waterdrops can also degrade the performance of a computer vision system in autonomous driving. To address these issues, we propose an attention-based framework that fuses the spatio-temporal representations from multiple frames to restore visual information occluded by waterdrops. Due to the lack of training data for video waterdrop removal, we propose a large-scale synthetic dataset with simulated waterdrops in complex driving scenes on rainy days. To improve the generality of our proposed method, we adopt a cross-modality training strategy that combines synthetic videos and real-world images. Extensive experiments show that our proposed method can generalize well and achieve the best waterdrop removal performance in complex real-world driving scenes.
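To make the idea of fusing representations from multiple frames more concrete, here is a minimal sketch of attention-based temporal fusion. It is not the authors' architecture, which the abstract does not specify. It assumes each frame has already been encoded into per-position feature vectors, and it uses plain scaled dot-product attention across frames at each spatial position. The names `fuse_frames`, `softmax`, and `ref_index` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frames(features, ref_index):
    """Fuse per-position features across frames with dot-product attention.

    features[t][p] is the feature vector (list of floats) of frame t at
    spatial position p. The reference frame supplies the query; every
    frame (including the reference) supplies a key and a value.
    This layout is an assumption made for the sketch.
    """
    ref = features[ref_index]
    num_positions = len(ref)
    channels = len(ref[0])
    scale = math.sqrt(channels)

    fused = []
    for p in range(num_positions):
        query = ref[p]
        # Similarity between the reference feature and each frame's
        # feature at the same position.
        scores = [
            sum(q * k for q, k in zip(query, frame[p])) / scale
            for frame in features
        ]
        weights = softmax(scores)
        # Attention-weighted sum over frames, channel by channel.
        fused.append([
            sum(w * frame[p][c] for w, frame in zip(weights, features))
            for c in range(channels)
        ])
    return fused


if __name__ == "__main__":
    # Three frames, two positions, two channels each (toy values).
    frames = [
        [[1.0, 0.0], [0.5, 0.5]],
        [[0.9, 0.1], [0.4, 0.6]],
        [[1.0, 0.0], [0.5, 0.5]],
    ]
    print(fuse_frames(frames, ref_index=1))
```

Because the reference frame's own features act as the query, an occluded reference location tends to attend mostly to itself in this naive form. A practical system would presumably need learned projections or some other mechanism to favor unoccluded frames; that part is left out of the sketch.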
Taxonomy
Topics: Image Enhancement Techniques · Advanced Neural Network Applications · Advanced Vision and Imaging
