EfficientPENet: Real-Time Depth Completion from Sparse LiDAR via Lightweight Multi-Modal Fusion

Johny J. Lopez; Md Meftahul Ferdaus; Mahdi Abdelguerfi; Anton Netchaev; Steven Sloan; Ken Pathak; Kendall N. Niles

arXiv:2604.18790·cs.CV·April 22, 2026

TL;DR

EfficientPENet is a lightweight depth completion network that combines two-branch RGB and LiDAR fusion, a ConvNeXt backbone, and test-time augmentation to deliver high accuracy at real-time speeds suitable for embedded hardware.

Contribution

The paper introduces EfficientPENet, a novel efficient architecture with sparsity-invariant convolutions and position-aware augmentation for real-time depth completion.
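To illustrate what a sparsity-invariant convolution does, here is a minimal single-channel NumPy sketch. It assumes the commonly used normalized-convolution formulation (weighted sum over valid pixels divided by the count of valid pixels, with the validity mask propagated by a max over the window); the paper's exact layer may differ, and the function name `sparse_conv2d` and its parameters are hypothetical.

```python
import numpy as np

def sparse_conv2d(depth, mask, weight, bias=0.0, eps=1e-8):
    """Illustrative sparsity-invariant convolution (stride 1, zero 'same' padding).

    depth  : (H, W) sparse depth map; values at invalid pixels are ignored
    mask   : (H, W) binary validity mask (1 = LiDAR return present)
    weight : (k, k) kernel with odd k
    Returns the normalized response and the propagated validity mask.
    """
    k = weight.shape[0]
    pad = k // 2
    d = np.pad(depth * mask, pad)   # zero out invalid pixels before filtering
    m = np.pad(mask, pad)
    h, w = depth.shape
    out = np.zeros((h, w), dtype=np.float64)
    new_mask = np.zeros((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            d_win = d[i:i + k, j:j + k]
            m_win = m[i:i + k, j:j + k]
            # Normalize by the number of valid inputs so the response does
            # not depend on how many pixels in the window were observed.
            out[i, j] = (weight * d_win).sum() / (m_win.sum() + eps) + bias
            # A location becomes valid if any input in its window was valid.
            new_mask[i, j] = m_win.max()
    return out, new_mask

# Usage: a constant 5 m scene observed at only two pixels.
depth = np.full((6, 6), 5.0)
mask = np.zeros((6, 6))
mask[1, 1] = 1.0
mask[4, 3] = 1.0
out, new_mask = sparse_conv2d(depth, mask, np.ones((3, 3)))
```

With an all-ones kernel, every location that sees at least one valid measurement recovers the scene depth regardless of how sparse the input is, which is the property the normalization is meant to provide.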

Findings

01

Achieves 631.94 mm RMSE on the KITTI benchmark.

02

Operates at 48.76 FPS with 36.24M parameters.

03

Uses 3.7x fewer parameters and runs 23x faster than BP-Net.
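The figures above can be turned into a per-frame latency and a rough picture of the baseline they imply. The short calculation below is only back-of-envelope arithmetic on the numbers quoted here; the BP-Net values it derives are implied by the rounded ratios, not figures reported on this page, and all variable names are illustrative.

```python
# Numbers quoted in the findings above.
fps = 48.76          # frames per second
params_m = 36.24     # parameters, in millions
param_ratio = 3.7    # stated parameter reduction relative to BP-Net
speed_ratio = 23.0   # stated speedup relative to BP-Net

# Per-frame latency in milliseconds.
latency_ms = 1000.0 / fps

# Baseline values IMPLIED by the rounded ratios (assumption: both ratios
# are measured against the same BP-Net configuration).
implied_baseline_params_m = params_m * param_ratio
implied_baseline_fps = fps / speed_ratio

print(f"latency per frame: {latency_ms:.2f} ms")
print(f"implied BP-Net parameters: ~{implied_baseline_params_m:.0f}M")
print(f"implied BP-Net throughput: ~{implied_baseline_fps:.2f} FPS")
```

At roughly 20.5 ms per frame, the network fits within the 33 ms budget of a 30 Hz sensor stream with some headroom.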

Abstract

Depth completion from sparse LiDAR measurements and corresponding RGB images is a prerequisite for accurate 3D perception in robotic systems. Existing methods achieve high accuracy on standard benchmarks but rely on heavy backbone architectures that preclude real-time deployment on embedded hardware. We present EfficientPENet, a two-branch depth completion network that replaces the conventional ResNet encoder with a modernized ConvNeXt backbone, introduces sparsity-invariant convolutions for the depth stream, and refines predictions through a Convolutional Spatial Propagation Network (CSPN). The RGB branch leverages ImageNet-pretrained ConvNeXt blocks with Layer Normalization, 7x7 depthwise convolutions, and stochastic depth regularization. Features from both branches are merged via late fusion and decoded through a multi-scale deep supervision strategy. We further introduce a…
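The CSPN refinement mentioned in the abstract can be pictured as repeated local averaging guided by learned affinities. The NumPy sketch below shows one propagation step in the style of the original CSPN formulation (3x3 neighbourhood, affinities normalized by their absolute sum, centre weight set so all weights sum to one, valid sparse measurements re-injected after the update). It is an assumption about the general technique, not the paper's implementation; `cspn_step` and its arguments are hypothetical names.

```python
import numpy as np

# Offsets of the 8 neighbours in a 3x3 window (centre excluded).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def cspn_step(d_t, d_0, affinity, sparse=None, mask=None, eps=1e-8):
    """One illustrative CSPN propagation step.

    d_t      : (H, W) current depth estimate
    d_0      : (H, W) initial depth estimate from the decoder
    affinity : (8, H, W) raw affinities, one map per neighbour in OFFSETS
    sparse   : optional (H, W) sparse LiDAR depth
    mask     : optional (H, W) binary validity mask for `sparse`
    """
    h, w = d_t.shape
    # Normalize so the absolute neighbour weights sum to at most 1.
    kappa = affinity / (np.abs(affinity).sum(axis=0) + eps)
    # Centre weight makes all nine weights sum to exactly 1.
    centre = 1.0 - kappa.sum(axis=0)
    padded = np.pad(d_t, 1, mode="edge")
    out = centre * d_0
    for k, (dy, dx) in enumerate(OFFSETS):
        out = out + kappa[k] * padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    # Keep measured depths fixed where LiDAR returns exist.
    if sparse is not None and mask is not None:
        out = (1.0 - mask) * out + mask * sparse
    return out

# Usage: a few refinement steps on a toy 4x5 map with one LiDAR anchor.
rng = np.random.default_rng(0)
aff = rng.uniform(0.1, 1.0, size=(8, 4, 5))
d0 = np.full((4, 5), 3.0)
sparse = np.zeros((4, 5))
mask = np.zeros((4, 5))
sparse[2, 2], mask[2, 2] = 7.0, 1.0
d = d0.copy()
for _ in range(3):
    d = cspn_step(d, d0, aff, sparse, mask)
```

Because the weights at each pixel sum to one, a constant depth map is left unchanged by the step, while the anchored pixel keeps its measured value and gradually influences its neighbours over successive iterations.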
