LUM-ViT: Learnable Under-sampling Mask Vision Transformer for Bandwidth Limited Optical Signal Acquisition
Lingfeng Liu, Dong Ni, Hangjie Yuan

TL;DR
LUM-ViT introduces a learnable under-sampling mask within a Vision Transformer framework to enable efficient optical signal acquisition with minimal data, maintaining high accuracy even with significant data reduction.
Contribution
The paper presents a novel learnable under-sampling mask integrated into a Vision Transformer for pre-acquisition modulation, optimized for optical hardware implementation.
Findings
Sampling only 10% of the original image pixels keeps the accuracy loss within 1.8% on ImageNet classification.
Maintains near-original accuracy on real-world optical hardware.
Proposes kernel-level weight binarization and a three-stage fine-tuning strategy.
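To make "kernel-level weight binarization" concrete, here is a minimal sketch in plain Python. It assumes each kernel is replaced by the signs of its weights plus one shared scale equal to the mean absolute weight. That scaling rule is a common choice in binary-network work, not something this summary confirms for LUM-ViT, and the names `binarize_kernel` and `binarize_layer` are hypothetical.

```python
def binarize_kernel(kernel):
    """Binarize one 2-D kernel: weights become +1/-1 with a single shared scale.

    Assumption: the scale is the mean absolute weight of the kernel.
    The paper's exact rule may differ.
    """
    flat = [w for row in kernel for w in row]
    scale = sum(abs(w) for w in flat) / len(flat)
    signs = [[1 if w >= 0 else -1 for w in row] for row in kernel]
    return signs, scale


def binarize_layer(kernels):
    """Binarize kernel by kernel, so every kernel keeps its own scale."""
    return [binarize_kernel(k) for k in kernels]


# Two toy 2x2 kernels with very different magnitudes.
kernels = [
    [[0.5, -0.25], [0.75, -0.5]],
    [[-0.1, 0.1], [0.1, -0.1]],
]
binary = binarize_layer(kernels)
for signs, scale in binary:
    print(signs, scale)
```

Keeping one scale per kernel, rather than one per layer, lets kernels with small and large weights coexist after binarization, which is the usual motivation for working at kernel granularity.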
Abstract
Bandwidth constraints during signal acquisition frequently impede real-time detection applications. Hyperspectral data is a notable example, whose vast volume compromises real-time hyperspectral detection. To tackle this hurdle, we introduce a novel approach leveraging pre-acquisition modulation to reduce the acquisition volume. This modulation process is governed by a deep learning model, utilizing prior information. Central to our approach is LUM-ViT, a Vision Transformer variant. Uniquely, LUM-ViT incorporates a learnable under-sampling mask tailored for pre-acquisition modulation. To further optimize for optical calculations, we propose a kernel-level weight binarization technique and a three-stage fine-tuning strategy. Our evaluations reveal that, by sampling a mere 10% of the original image pixels, LUM-ViT maintains the accuracy loss within 1.8% on the ImageNet classification…
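The idea of an under-sampling mask applied before acquisition can be sketched as follows. This is an illustration only: it assumes the mask is derived from one learnable score per measurement position and that the highest-scoring fraction is kept at inference time. The abstract does not describe the selection rule or training procedure, and `topk_mask`, `acquire`, and the score values are hypothetical.

```python
def topk_mask(scores, keep_ratio):
    """Return a 0/1 mask that keeps the highest-scoring fraction of positions.

    Assumption: scores were learned elsewhere; here they are fixed numbers.
    """
    n_keep = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]


def acquire(measurements, mask):
    """Simulate acquisition: only masked-in positions are read out."""
    return [(i, m) for i, (m, keep) in enumerate(zip(measurements, mask)) if keep]


# Ten candidate measurement positions with made-up learned scores.
scores = [0.9, 0.1, 0.4, 0.8, 0.2, 0.3, 0.7, 0.05, 0.6, 0.5]
mask = topk_mask(scores, keep_ratio=0.1)  # keep 10% of positions

measurements = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]
acquired = acquire(measurements, mask)
print(mask)
print(acquired)
```

Because the mask is fixed before any signal is read out, the skipped positions never consume acquisition bandwidth, which is the point of modulating before acquisition rather than compressing afterwards.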
Taxonomy
Topics: Photonic and Optical Devices · Advanced optical system design · Optical Systems and Laser Technology
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Dropout · Multi-Head Attention · Softmax · Dense Connections · Label Smoothing · Adam
