LUMEN: Low-light Unified Multi-stage Enhancement Network using depth-guided flash, clustering, and attention-based Transformers
Bibhabasu Debnath, Sahana Ray, and Sanjay Ghosh

TL;DR
LUMEN is a multi-stage low-light image enhancement framework that uses depth estimation, clustering, and attention-based transformers to improve visual quality and detail preservation.
Contribution
It introduces a novel depth-guided multi-stage enhancement method combining virtual flash simulation and transformer-based feature fusion.
Findings
Achieves state-of-the-art performance on LOL-v1 and LOL-v2 benchmarks.
Produces visually natural and detailed enhanced images.
Effectively preserves structural and color fidelity in low-light conditions.
Abstract
Low-light image enhancement remains a challenging problem due to severe noise, color distortion, contrast degradation, and loss of structural details under insufficient illumination. Existing methods typically apply uniform enhancement without considering the depth-dependent nature of light attenuation and sensor noise in real-world scenes. To address this limitation, we propose LUMEN, a multi-stage enhancement framework that integrates virtual flash simulation with transformer-based feature fusion. The proposed framework first estimates scene depth from low-light inputs using a dedicated encoder-decoder network, after which a soft clustering module partitions pixels into depth-aware regions, enabling depth-dependent flash simulation. The simulated flash features, together with depth representations, are fused with image features through efficient attention-based fusion blocks to…
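The soft clustering and depth-dependent flash steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The abstract does not specify the clustering rule or the flash model, so the sketch assumes a softmax over squared distance to fixed depth centers and an inverse-square falloff for the flash gain. The function names (`soft_depth_clusters`, `virtual_flash`) and the parameters (`centers`, `temperature`, `strength`, `eps`) are hypothetical, and the depth map is taken as given rather than predicted by an encoder-decoder network.

```python
import numpy as np


def soft_depth_clusters(depth, centers, temperature=0.05):
    """Soft-assign each pixel to depth clusters (assumed rule, not from the paper).

    depth:   (H, W) array of per-pixel depth values.
    centers: (K,) array of cluster center depths.
    Returns an (H, W, K) array of weights that sum to 1 over the last axis.
    """
    d2 = (depth[..., None] - centers[None, None, :]) ** 2
    logits = -d2 / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # stabilise the softmax
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)


def virtual_flash(image, depth, centers, strength=1.0, eps=1e-3):
    """Brighten an image with a per-cluster gain that decays with depth.

    The inverse-square falloff is an assumption made for illustration.
    image: (H, W, 3) array with values in [0, 1].
    """
    weights = soft_depth_clusters(depth, centers)      # (H, W, K)
    gains = strength / (centers + eps) ** 2            # (K,) one gain per cluster
    gain_map = (weights * gains).sum(axis=-1)          # (H, W) blended gain
    flashed = image * (1.0 + gain_map[..., None])
    return np.clip(flashed, 0.0, 1.0), weights


# Toy scene: a dim image whose left half is near (depth 1) and right half is far (depth 4).
image = np.full((4, 4, 3), 0.1)
depth = np.where(np.arange(4)[None, :] < 2, 1.0, 4.0) * np.ones((4, 1))
centers = np.array([1.0, 2.0, 4.0])
flashed, weights = virtual_flash(image, depth, centers)
```

In this toy scene the near half receives a larger gain than the far half, which is the depth-dependent behaviour the abstract contrasts with uniform enhancement. In the framework as described, the simulated flash features would then be fused with image and depth features by attention-based blocks, a step this sketch leaves out.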
