LiteViLNet: Lightweight Vision-LiDAR Fusion Network for Efficient Road Segmentation

Daojie Peng; Bingtao Wang; Fulong Ma; Liang Zhang; Jun Ma

arXiv:2605.21007 · cs.CV · May 21, 2026

TL;DR

LiteViLNet is a lightweight multi-modal network combining RGB and LiDAR data, achieving high accuracy and real-time performance for road segmentation on resource-limited devices.

Contribution

The paper introduces a novel lightweight architecture with efficient feature fusion and long-range dependency capture, enabling real-time road segmentation on embedded platforms.

Findings

01 Achieves a MaxF score of 96.36% with only 14.04M parameters.

02 Runs at 163.79 FPS on an RTX 4060 Ti and 22.18 FPS on a Jetson Orin NX.

03 Outperforms many heavyweight methods in inference speed while maintaining competitive accuracy.
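To put the reported throughput in terms of a per-frame time budget, the short sketch below converts the two FPS figures from the findings into latency in milliseconds. It only assumes that FPS is the reciprocal of mean per-frame inference time; the helper name `fps_to_latency_ms` is illustrative and does not come from the paper.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert frames per second to mean per-frame latency in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures as reported in the findings above.
reported_fps = {
    "RTX 4060 Ti": 163.79,
    "Jetson Orin NX": 22.18,
}

latency_ms = {device: fps_to_latency_ms(fps) for device, fps in reported_fps.items()}

# How many times faster the desktop GPU is than the embedded board.
speed_ratio = reported_fps["RTX 4060 Ti"] / reported_fps["Jetson Orin NX"]

for device, ms in latency_ms.items():
    print(f"{device}: {ms:.2f} ms per frame")
print(f"Desktop-to-embedded speed ratio: {speed_ratio:.2f}x")
```

Under that assumption, the figures correspond to roughly 6.1 ms per frame on the RTX 4060 Ti and roughly 45.1 ms per frame on the Jetson Orin NX.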

Abstract

Road segmentation is a fundamental perception task for autonomous driving and intelligent robotic systems, requiring both high accuracy and real-time inference, especially for deployment on resource-constrained edge devices. Existing multi-modal road segmentation methods often rely on heavy transformer-based encoders to achieve state-of-the-art performance, but their enormous computational cost prohibits real-time deployment on embedded platforms. To address this dilemma, we propose LiteViLNet, a lightweight multi-modal network that fuses RGB texture information and LiDAR geometric information for efficient road segmentation. Specifically, we design a dual-stream lightweight encoder and depth-wise separable convolutions to extract hierarchical features from both modalities with minimal parameters. We further propose a Multi-Scale Feature Fusion Module (MSFM) to facilitate…
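The abstract credits depth-wise separable convolutions for keeping the parameter count small. As a rough illustration of why, the sketch below compares the weight count of a standard convolution with that of a depth-wise separable one (a per-channel k x k filter followed by a 1 x 1 pointwise convolution). The layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are hypothetical values chosen for illustration, not taken from LiteViLNet, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depth-wise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, for illustration only (not from the paper).
C_IN, C_OUT, K = 64, 128, 3

standard = standard_conv_params(C_IN, C_OUT, K)
separable = depthwise_separable_params(C_IN, C_OUT, K)
reduction = standard / separable

print(f"standard conv:        {standard} weights")
print(f"depth-wise separable: {separable} weights")
print(f"reduction factor:     {reduction:.2f}x")
```

For this hypothetical layer the separable form needs 8,768 weights instead of 73,728, roughly 8.4 times fewer. The actual savings in LiteViLNet depend on its real layer shapes, which this page does not give.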
