Depth Completion as Parameter-Efficient Test-Time Adaptation

Bingxin Ke; Qunjie Zhou; Jiahui Huang; Xuanchi Ren; Tianchang Shen; Konrad Schindler; Laura Leal-Taixé; Shengyu Huang

arXiv:2602.14751 · cs.CV · February 17, 2026

Open Access

TL;DR

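CAPA is a parameter-efficient test-time adaptation method for depth completion. It keeps a pre-trained 3D foundation model frozen and updates only a small set of parameters (e.g. LoRA or VPT) from the sparse depth cues available at inference time, improving accuracy and robustness while avoiding the overfitting seen with task-specific encoders.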

Contribution

CAPA introduces a model-agnostic framework that updates a minimal set of parameters at inference time, guided by sparse geometric cues, and, for videos, shares the adapted parameters across all frames of a sequence.

Findings

01. Achieves state-of-the-art results on indoor and outdoor datasets.

02. Grounds the foundation model's geometric prior in scene-specific measurements while updating only a minimal set of parameters.

03. Improves robustness and multi-frame consistency on videos.

Abstract

We introduce CAPA, a parameter-efficient test-time optimization framework that adapts pre-trained 3D foundation models (FMs) for depth completion, using sparse geometric cues. Unlike prior methods that train task-specific encoders for auxiliary inputs, which often overfit and generalize poorly, CAPA freezes the FM backbone. Instead, it updates only a minimal set of parameters using Parameter-Efficient Fine-Tuning (e.g. LoRA or VPT), guided by gradients calculated directly from the sparse observations available at inference time. This approach effectively grounds the foundation model's geometric prior in the scene-specific measurements, correcting distortions and misplaced structures. For videos, CAPA introduces sequence-level parameter sharing, jointly adapting all frames to exploit temporal correlations, improve robustness, and enforce multi-frame consistency. CAPA is model-agnostic,…
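The adaptation loop described in the abstract can be illustrated with a deliberately tiny sketch. This is not CAPA's implementation: a linear model stands in for the 3D foundation model, a rank-1 LoRA-style adapter is the only trainable part, and every name here (`W_FROZEN`, `V_FROZEN`, `predict`, `loss`, `adapt`, `make_frame`) as well as the synthetic data is an assumption made for illustration. The adapter is shared by all frames, which loosely mirrors the sequence-level parameter sharing mentioned above.

```python
# Toy sketch only: a linear "model" stands in for a 3D foundation model.
# All names and data below are hypothetical, not taken from the paper.

W_FROZEN = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen layer
V_FROZEN = [2.0, 0.5, 0.0]                                       # frozen readout


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def predict(f, A, B):
    """Depth for one pixel feature f, using the adapted layer W + B A (rank 1)."""
    h = [dot(row, f) for row in W_FROZEN]             # frozen path: W f
    u = dot(A, f)                                     # adapter down-projection: A f
    h = [h_m + B[m] * u for m, h_m in enumerate(h)]   # add low-rank update: B (A f)
    return dot(V_FROZEN, h)


def loss(frames, A, B):
    """Mean squared error over the sparse observations of all frames."""
    obs = [o for frame in frames for o in frame]
    return sum((predict(f, A, B) - z) ** 2 for f, z in obs) / len(obs)


def adapt(frames, steps=500, lr=0.05):
    """Gradient descent on the adapter only; one adapter shared by all frames."""
    A = [0.1, 0.1, 0.1]  # small non-zero init
    B = [0.0, 0.0, 0.0]  # zero init: the starting point equals the frozen model
    obs = [o for frame in frames for o in frame]
    for _ in range(steps):
        grad_A = [0.0] * 3
        grad_B = [0.0] * 3
        c = dot(V_FROZEN, B)  # prediction = frozen prediction + c * (A . f)
        for f, z in obs:
            e = 2.0 * (predict(f, A, B) - z) / len(obs)  # d(loss)/d(prediction)
            u = dot(A, f)
            for j in range(3):
                grad_A[j] += e * c * f[j]         # d(prediction)/dA_j = c * f_j
                grad_B[j] += e * u * V_FROZEN[j]  # d(prediction)/dB_j = v_j * u
        A = [a - lr * g for a, g in zip(A, grad_A)]
        B = [b - lr * g for b, g in zip(B, grad_B)]
    return A, B


def make_frame(pixels):
    # Feature is [1, x, y]; the synthetic "measured" depth has an offset and a
    # tilt in y that the frozen model (2.0 + 0.5 * x) does not capture.
    return [([1.0, x, y], 2.5 + 0.5 * x + 0.3 * y) for x, y in pixels]


# Two frames, three sparse depth measurements each.
frames = [
    make_frame([(0.0, 0.0), (1.0, 0.2), (0.3, 1.0)]),
    make_frame([(0.6, 0.5), (0.1, 0.8), (0.9, 0.9)]),
]

before = loss(frames, [0.1, 0.1, 0.1], [0.0, 0.0, 0.0])  # frozen-model error
A, B = adapt(frames)
after = loss(frames, A, B)
print(f"sparse-point MSE before: {before:.4f}, after: {after:.4f}")
```

Only `A` and `B` change during the loop; `W_FROZEN` and `V_FROZEN` are never written to, which is the sense in which the backbone stays frozen. In a real setting the adapter would typically be attached to a deep network and the gradients computed by automatic differentiation rather than by hand.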

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Human Pose and Action Recognition