PoRF: Pose Residual Field for Accurate Neural Surface Reconstruction
Jia-Wang Bian, Wenjing Bian, Victor Adrian Prisacariu, Philip Torr

TL;DR
PoRF introduces an implicit pose residual field with an epipolar geometry loss to robustly refine camera poses, significantly improving neural surface reconstruction accuracy in real-world datasets.
Contribution
The paper proposes PoRF, a novel implicit pose refinement method using an MLP and epipolar geometry loss, enhancing pose accuracy for neural surface reconstruction.
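As a rough illustration of what "an MLP regressing pose updates" can look like, here is a minimal sketch in plain Python. It is not the authors' implementation. The input encoding (normalised frame index plus a 6-D initial pose), the 6-D residual layout, the hidden width, the zero-initialised output layer, and the names `PoseResidualMLP` and `refine_pose` are all assumptions made for illustration. The factor `alpha` scales the MLP output, matching the reviewers' reading of Eq. 6 quoted under Peer Reviews. Training is omitted; in a joint optimisation the weights would be updated together with the reconstruction network.

```python
import random


class PoseResidualMLP:
    """Hypothetical shared MLP: (frame index, initial 6-D pose) -> 6-D pose residual."""

    def __init__(self, in_dim=7, hidden=16, out_dim=6, alpha=1e-2, seed=0):
        rng = random.Random(seed)
        self.alpha = alpha  # assumed fixed scale applied to the predicted residual
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        # Assumption: zero-initialised output layer, so every residual starts at
        # zero and refinement begins exactly at the initial poses.
        self.w2 = [[0.0] * hidden for _ in range(out_dim)]
        self.b2 = [0.0] * out_dim

    def residual(self, frame_idx, num_frames, init_pose):
        # One set of weights serves every frame; only the input changes.
        x = [frame_idx / max(num_frames - 1, 1)] + list(init_pose)
        h = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU hidden layer
             for row, b in zip(self.w1, self.b1)]
        out = [sum(w * v for w, v in zip(row, h)) + b
               for row, b in zip(self.w2, self.b2)]
        return [self.alpha * o for o in out]


def refine_pose(mlp, frame_idx, num_frames, init_pose):
    """Refined pose = initial pose + alpha-scaled MLP residual (6-D vectors)."""
    delta = mlp.residual(frame_idx, num_frames, init_pose)
    return [p + d for p, d in zip(init_pose, delta)]
```

Because every frame is routed through the same weights, a gradient from any one frame changes the residual predicted for all frames. Per-frame pose parameters, which are optimised independently, do not have this property.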
Findings
Reduced rotation error by 78% on DTU dataset
Improved reconstruction F1 score from 69.18 to 75.67 on MobileBrick dataset
In some scenarios, reconstruction with the refined poses outperformed reconstruction with the dataset-provided ground-truth poses
Abstract
Neural surface reconstruction is sensitive to the camera pose noise, even if state-of-the-art pose estimators like COLMAP or ARKit are used. More importantly, existing Pose-NeRF joint optimisation methods have struggled to improve pose accuracy in challenging real-world scenarios. To overcome the challenges, we introduce the pose residual field (PoRF), a novel implicit representation that uses an MLP for regressing pose updates. This is more robust than the conventional pose parameter optimisation due to parameter sharing that leverages global information over the entire sequence. Furthermore, we propose an epipolar geometry loss to enhance the supervision that leverages the correspondences exported from COLMAP results without the extra computational overhead. Our method yields promising results. On the DTU dataset, we reduce the rotation error by 78% for COLMAP poses, leading to the…
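The epipolar geometry loss mentioned in the abstract can be sketched as follows. This is a generic illustration and not the paper's code. It assumes the Sampson distance as the per-match error, correspondences already converted to normalised homogeneous camera coordinates `(x, y, 1)`, and a relative pose `(R, t)` that maps camera-1 coordinates to camera-2 coordinates. All function names are hypothetical.

```python
def skew(t):
    """Cross-product matrix [t]_x of a 3-vector."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def essential_matrix(R, t):
    """E = [t]_x R, so that x2^T E x1 = 0 for a perfect match."""
    return matmul(skew(t), R)


def sampson_distance(E, x1, x2):
    """First-order approximation of the squared geometric epipolar error."""
    Ex1 = matvec(E, x1)
    Etx2 = matvec(transpose(E), x2)
    num = sum(a * b for a, b in zip(x2, Ex1)) ** 2
    den = Ex1[0] ** 2 + Ex1[1] ** 2 + Etx2[0] ** 2 + Etx2[1] ** 2
    return num / den if den > 0.0 else 0.0  # guard against a degenerate match


def epipolar_loss(R, t, matches):
    """Mean Sampson distance over (x1, x2) correspondences for one image pair."""
    E = essential_matrix(R, t)
    return sum(sampson_distance(E, x1, x2) for x1, x2 in matches) / len(matches)
```

With the correct relative pose the loss is zero for noise-free matches, and it grows as the pose drifts. That is why sparse matches alone can supervise pose refinement without rendering any pixels.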
Peer Reviews
Decision: ICLR 2024 poster
- The utilization of global data across all frames by PoRF marks a notable advancement compared to techniques that individually optimize each image. This approach significantly enhances the refinement of camera poses, thereby achieving greater accuracy and efficiency in reconstructions.
- Additionally, the method proves to be highly effective in adjusting poses from various sources, such as COLMAP and ARKit, demonstrating top-tier performance in practical, real-world dataset applications.
I've noticed that the concept of applying Epipolar Constraints in situations involving inaccurate or absent poses has been previously examined in [1]. It would be interesting to see if there's any discussion or comparison of this aspect in the context of PoRF's approach. [1] Chen, S., Zhang, Y., Xu, Y., & Zou, B. (2022). Structure-Aware NeRF without Posed Camera via Epipolar Constraint. arXiv preprint arXiv:2210.00183.
The addition of pose residual learning with a small MLP is a straightforward addition to any neural representation learning method. The fact that the sparse correspondences needed for the epipolar loss are available from COLMAP without additional cost is an added benefit that makes it easy to incorporate into existing methods. The quantitative experiments are high quality and compare the proposed approach against a diverse set of existing methods. They clearly show the improvements in reconst
The use of L1-L6 is unclear and not well introduced. To my knowledge, this notation is not commonly used to denote different experiments or ablations. I highly recommend introducing it upfront or using short acronyms to denote the different configs. This would improve the readability of the manuscript. The effects of training the additional MLP and computing the additional epipolar loss are likely not significant when compared to training without them, but it would still be good to quantify any difference in ter
- The proposed method is simple yet effective for joint optimization of camera pose and neural surface reconstruction.
- The results are impressive. Tabs. 1-4 show the proposed method achieves high-quality camera pose estimation and neural surface reconstruction. Moreover, Figs. 3-5 show the proposed method is robust to noise.
- The paper is well organized and easy to follow.
[Fig.2] Based on Eq.6, $\alpha$ should multiply the residuals. However, Fig.2 shows $\alpha$ multiplying the input.
[Ablation of PoRF]
- [Shared MLP] The paper claims that the shared MLP captures the underlying global information and therefore boosts performance. However, this is not discussed in Fig.S5 or the ablation study.
- [$\alpha$] It would be better to discuss the use of $\alpha$, e.g. with and without the fixed factor $\alpha$, and its value. This is not discussed in Fig.S5.
[Design of PoRF] The Po
Taxonomy
Topics: Advanced X-ray Imaging Techniques · Medical Imaging Techniques and Applications · Digital Image Processing Techniques
