TL;DR
The paper introduces FUN (Focal U-shaped Network), an end-to-end neural network that jointly reconstructs hyperspectral images and detects objects in real time, using focal modulation in place of self-attention to reduce computational cost.
Contribution
It proposes a new multi-task learning framework with a shared U-shaped backbone and focal modulation, enabling efficient joint hyperspectral image reconstruction and object detection.
Findings
Achieves state-of-the-art performance on reconstruction and detection tasks.
Uses 40% fewer parameters and 30% less computation than recent methods.
Provides a new annotated dataset for hyperspectral object detection.
Abstract
Conventional push-broom hyperspectral imaging suffers from slow acquisition speeds, which precludes real-time object detection. In contrast, snapshot spectral imaging captures hyperspectral images (HSIs) instantaneously, making real-time object detection feasible, yet its potential is often compromised by time-consuming post-capture reconstruction. To address this issue, we propose the Focal U-shaped Network (FUN), a novel end-to-end framework that jointly performs HSI reconstruction and object detection via multi-task learning. FUN employs a shared U-shaped backbone, where reconstruction provides the underlying spectral information while detection guides the learning of semantic-aware priors, so that the two tasks benefit each other. Crucially, we introduce focal modulation, an efficient alternative to self-attention that modulates spatial and spectral features while reducing quadratic…
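To make the efficiency argument concrete, the sketch below compares rough multiply-accumulate (MAC) counts for self-attention against a focal-modulation-style block. It is only an illustration under stated assumptions: the abstract excerpt is truncated and does not specify FUN's layer design, so the cost model follows the general focal modulation recipe (a few depthwise convolutions, element-wise gating, and an element-wise query-modulator product). The function names, kernel sizes, and channel width are hypothetical, and the linear projections, which scale linearly with token count in both designs, are left out.

```python
def self_attention_macs(num_tokens: int, channels: int) -> int:
    """Rough MAC count for the two N x N matrix products in self-attention."""
    # Q @ K^T costs N*N*C, and (attention weights) @ V costs another N*N*C.
    return 2 * num_tokens * num_tokens * channels


def focal_modulation_macs(num_tokens: int, channels: int, kernel_sizes=(3, 5, 7)) -> int:
    """Rough MAC count for a focal-modulation-style block.

    The kernel sizes are hypothetical; they are not taken from the paper.
    """
    # One depthwise k x k convolution per focal level: N * C * k * k each.
    context = sum(num_tokens * channels * k * k for k in kernel_sizes)
    # Element-wise gating of each level, plus the final query-modulator product.
    gating = num_tokens * channels * (len(kernel_sizes) + 1)
    return context + gating


if __name__ == "__main__":
    channels = 32  # assumed feature width, for illustration only
    for side in (32, 64, 128):
        n = side * side  # one token per spatial position
        ratio = self_attention_macs(n, channels) / focal_modulation_macs(n, channels)
        print(f"{side}x{side} tokens: attention/focal MAC ratio ~ {ratio:.0f}")
```

Under this model, doubling the number of tokens quadruples the attention cost but only doubles the modulation cost, so the gap widens as spatial resolution grows.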
