Learning to Exploit Multiple Vision Modalities by Using Grafted Networks
Yuhuang Hu, Tobi Delbruck, Shih-Chii Liu

TL;DR
This paper introduces a Network Grafting Algorithm (NGA) that lets novel vision sensors drive pretrained deep neural networks, improving performance without additional inference costs or extensive labeled data.
Contribution
The proposed NGA method allows new sensor modalities to leverage existing pretrained models through self-supervised training, reducing data requirements and training time.
Findings
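Grafted networks reach AP50 scores competitive with the pretrained network on object detection with thermal and event camera datasets.
The thermal-driven grafted network shows a 49.11% relative improvement over the use of intensity frames.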
Training is efficient, taking only hours on a single GPU.
Abstract
Novel vision sensors such as thermal, hyperspectral, polarization, and event cameras provide information that is not available from conventional intensity cameras. An obstacle to using these sensors with current powerful deep neural networks is the lack of large labeled training datasets. This paper proposes a Network Grafting Algorithm (NGA), where a new front end network driven by unconventional visual inputs replaces the front end network of a pretrained deep network that processes intensity frames. The self-supervised training uses only synchronously-recorded intensity frames and novel sensor data to maximize feature similarity between the pretrained network and the grafted network. We show that the enhanced grafted network reaches competitive average precision (AP50) scores to the pretrained network on an object detection task using thermal and event camera datasets, with no…
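The training idea described in the abstract can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes linear front ends, a plain mean squared error as the feature similarity measure, and synthetic paired data. All names (`w_pre`, `w_gft`, `mix`, `feature_mse`) are hypothetical and chosen only for illustration.

```python
import random

random.seed(0)
DIM, FEAT, N = 4, 3, 64  # input size, feature size, number of pairs (all assumed)

def matvec(w, x):
    """Apply a linear front end: one output feature per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Synthetic paired recordings: every novel-sensor sample has a
# synchronized intensity sample (linked here by a made-up linear map).
mix = rand_matrix(DIM, DIM)
novel = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N)]
intensity = [matvec(mix, x) for x in novel]

w_pre = rand_matrix(FEAT, DIM)              # pretrained front end, frozen
w_gft = [[0.0] * DIM for _ in range(FEAT)]  # grafted front end, trainable

# Targets come from the frozen network itself, so no labels are needed.
targets = [matvec(w_pre, x) for x in intensity]

def feature_mse(w):
    """Mean squared error between grafted and pretrained features."""
    total = 0.0
    for x, t in zip(novel, targets):
        total += sum((p - q) ** 2 for p, q in zip(matvec(w, x), t))
    return total / (N * FEAT)

initial_loss = feature_mse(w_gft)
lr = 0.5
for _ in range(300):
    # Gradient of the feature MSE with respect to the grafted weights only.
    grad = [[0.0] * DIM for _ in range(FEAT)]
    for x, t in zip(novel, targets):
        err = [p - q for p, q in zip(matvec(w_gft, x), t)]
        for i in range(FEAT):
            for j in range(DIM):
                grad[i][j] += 2.0 * err[i] * x[j] / (N * FEAT)
    for i in range(FEAT):
        for j in range(DIM):
            w_gft[i][j] -= lr * grad[i][j]
final_loss = feature_mse(w_gft)

print(f"feature MSE: {initial_loss:.4f} -> {final_loss:.2e}")
```

In this sketch only `w_gft` is updated, while `w_pre` and everything downstream of it would stay untouched. After training, the grafted front end would simply stand in for the original one, which is in line with the summary's claim of no additional inference cost.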
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Machine Learning and ELM
