Learning Unorthogonalized Matrices for Rotation Estimation

Kerui Gu; Zhihao Li; Shiyong Liu; Jianzhuang Liu; Songcen Xu; Youliang Yan; Michael Bi Mi; Kenji Kawaguchi; Angela Yao

arXiv:2312.00462 · cs.CV · December 4, 2023 · 1 citation

Open Access · 3 Reviews

TL;DR

This paper proposes learning unorthogonalized pseudo rotation matrices (PRoM) for 3D rotation estimation, demonstrating faster convergence and improved accuracy over traditional orthogonalization methods in pose estimation tasks.

Contribution

It introduces a novel approach to learn unorthogonalized matrices for rotation estimation, removing orthogonalization steps to enhance training efficiency and performance.

Findings

01. PRoM converges faster than traditional methods.

02. PRoM achieves state-of-the-art results on pose estimation benchmarks.

03. Removing orthogonalization improves training efficiency and solution quality.

Abstract

Estimating 3D rotations is a common procedure in 3D computer vision. The accuracy depends heavily on the rotation representation. One form of representation, rotation matrices, is popular due to its continuity, especially for pose estimation tasks. The learning process usually incorporates orthogonalization to ensure orthonormal matrices. Our work reveals, through gradient analysis, that common orthogonalization procedures based on the Gram-Schmidt process and singular value decomposition reduce training efficiency. To this end, we advocate removing orthogonalization from the learning process and learning unorthogonalized 'Pseudo' Rotation Matrices (PRoM). An optimization analysis shows that PRoM converges faster and to a better solution. By replacing the orthogonalization incorporated representation with our proposed PRoM in various rotation-related tasks, we achieve…
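To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares a training loss computed after Gram-Schmidt orthogonalization with a loss computed directly on the raw, unorthogonalized 3x3 output. The squared-error loss, the helper names (`gram_schmidt`, `squared_error`), and the example values in `raw` are all assumptions made for illustration; this summary does not state the paper's exact loss.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def gram_schmidt(a1, a2):
    """Map two raw 3-vectors to the three columns of a rotation matrix."""
    b1 = normalize(a1)
    proj = dot(b1, a2)
    # Remove the component of a2 along b1, then normalize.
    b2 = normalize([y - proj * x for x, y in zip(b1, a2)])
    # The third column is fixed by the first two.
    b3 = cross(b1, b2)
    return [b1, b2, b3]

def squared_error(cols_pred, cols_gt):
    """Sum of squared element-wise differences (an assumed loss)."""
    return sum((p - g) ** 2
               for cp, cg in zip(cols_pred, cols_gt)
               for p, g in zip(cp, cg))

# Hypothetical raw network output, stored as a list of columns.
raw = [[0.9, 0.1, 0.0], [0.2, 1.1, 0.1], [0.0, -0.1, 0.8]]
# Ground-truth rotation (identity), also as columns.
target = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Orthogonalization inside the loss: gradients would pass through gram_schmidt.
loss_orthogonalized = squared_error(gram_schmidt(raw[0], raw[1]), target)

# Unorthogonalized variant: the loss acts on the raw matrix entries directly.
loss_unorthogonalized = squared_error(raw, target)
```

In the second variant every entry of the raw matrix receives a direct error signal, while in the first the signal is routed through the normalization and projection steps. If a valid rotation matrix is needed after training, an orthogonalization such as `gram_schmidt` can still be applied to the raw output as a post-processing step.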

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 3 (reject, not good enough) · Confidence 5

Strengths

The topic of the paper (rotation estimation in deep learning) has some importance in the field.

Weaknesses

- To tell the truth, I was one of the reviewers of this paper in a previous venue. I see that the authors have removed some wrong derivations, but there are still many vague and wrong claims in the paper.
- Most claims in the paper are either vague or not surprising (not showing what the authors intended to show). Most importantly, the proposed method (using a plain 3x3 matrix without any orthogonalization and instead guiding it to a rotation matrix using an additional loss, i.e., soft constrai

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

1. The work focuses on the choice of rotation representation, a very basic but vital ingredient of the rotation estimation task. This is in contrast to most work in 3D pose estimation, and has the potential for larger impact in the field.
2. The method is simple yet effective. The motivation for the method is made clear: the gradient update is affected by the extra step of orthogonalization. By simply removing it, the baseline method could be improved.
3. The evaluation tasks are very extensive. These incl

Weaknesses

1. The evaluations consider two baselines (Zhou et al., 2019; Levinson et al., 2020). Both methods use 6D representations with differences in normalization. A comparison to more recent methods [1,2] would be more convincing.
2. The proposed method is built on CLIFF (Li et al., 2022). Since the proposed method is not a novel framework, it is generally applicable to any existing method for pose estimation. Hence, I assume that PRoM could be used together with baselines to also improve lea

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

S1. I think the paper does a good job analyzing the problems that orthogonalization methods introduce when used in the learning frameworks.
S2. I find the experiments interesting because they show the improvements PRoM can bring to human pose estimation, cloud pose estimation, among other applications.
S3. The paper is well written and is easy to follow and understand. Clarity thus is high and should be easy to replicate.

Weaknesses

While I think the paper touches a very interesting topic (i.e., rotation estimation in deep learning frameworks) and shows interesting results, I have several concerns that make me a bit skeptical about the approach:
1. Lack of discussion about other possible ways to constrain orthonormality in learning problems. While the problems of orthogonalization in the learning problem may bring issues as described in the paper, there are other possible solutions that the paper does not discuss. For exam

Taxonomy

Topics: Robotics and Sensor-Based Localization · Human Pose and Action Recognition · Advanced Vision and Imaging