A Representation Optimization Dichotomy, Lie-Algebraic Policy Optimization

Sooraj KC; Vivek Mishra

arXiv:2603.25525·math.OC·March 27, 2026

Open Access

TL;DR

This paper establishes a dichotomy in the smoothness of Lie-algebra-parameterized policy objectives in reinforcement learning: the gradient Lipschitz constant is bounded for compact algebras and can grow exponentially otherwise, so the algebraic type alone shapes optimization complexity and points toward more efficient algorithms.

Contribution

It introduces a representation-optimization dichotomy for Lie-algebra-parameterized policies, linking algebra type to gradient Lipschitz constants and proposing scalable optimization methods.

Findings

01

Compact algebras (e.g., so(n), su(n)) have an O(1) gradient Lipschitz constant, enabling faster convergence.

02

The gradient Lipschitz constant grows exponentially for non-compact algebras such as se(3), the Lie algebra of SE(3).

03

Projection-based algorithms outperform Fisher inversion in experiments.

Abstract

Structured reinforcement learning and stochastic optimization often involve parameters evolving on matrix Lie groups such as rotations and rigid-body transformations. We establish a representation-optimization dichotomy for Lie-algebra-parameterized Gaussian policy objectives in the Lie Group MDP class: the gradient Lipschitz constant L(R), governing step size, convergence, and sample complexity of first-order methods, depends only on the algebraic type of g, uniformly over all objectives, independent of reward or transition structure. Specifically, L = O(1) for compact g (e.g., so(n), su(n)), and L = Theta(exp(2R)) for g = gl(n), with O(exp(2R)) for all algebras with a hyperbolic element. A key lower bound shows this exponential growth cannot be canceled by interaction between the exponential map and the objective, making the dichotomy intrinsic to the algebra. This yields an…
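
As a rough illustration of the dichotomy stated in the abstract, the Python sketch below compares how curvature scales with R along a compact and a hyperbolic direction. It is not the paper's Lie Group MDP setting or its Gaussian policy objective: the toy objective f(t) = 0.5 * ||exp(t*G) v - w||^2, the one-parameter restriction to a single generator G, and all function names (`rot_apply`, `hyp_apply`, `objective`, `curvature_bound`) are assumptions made here for illustration. The largest |f''(t)| on [-R, R] stands in for the gradient Lipschitz constant L(R).

```python
import math

def rot_apply(t, v):
    # exp(t*J) with J = [[0, -1], [1, 0]] in so(2): rotation by angle t.
    c, s = math.cos(t), math.sin(t)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def hyp_apply(t, v):
    # exp(t*H) with H = diag(1, -1), a hyperbolic element of gl(2).
    return (math.exp(t) * v[0], math.exp(-t) * v[1])

def objective(apply, t, v, w):
    # Toy objective f(t) = 0.5 * ||exp(t*G) v - w||^2 (an assumption, not the paper's).
    x = apply(t, v)
    return 0.5 * ((x[0] - w[0]) ** 2 + (x[1] - w[1]) ** 2)

def curvature_bound(apply, R, v=(1.0, 1.0), w=(1.0, 1.0), n=2001, h=1e-4):
    # Largest |f''(t)| on a grid over [-R, R], estimated by central differences.
    best = 0.0
    for i in range(n):
        t = -R + 2.0 * R * i / (n - 1)
        d2 = (objective(apply, t + h, v, w)
              - 2.0 * objective(apply, t, v, w)
              + objective(apply, t - h, v, w)) / h ** 2
        best = max(best, abs(d2))
    return best

if __name__ == "__main__":
    # Columns: R, compact (so(2)) bound, hyperbolic (gl(2)) bound.
    for R in (1.0, 2.0, 3.0, 4.0):
        print(R, curvature_bound(rot_apply, R), curvature_bound(hyp_apply, R))
```

In this toy setting the rotation case has f''(t) = 2 cos(t), so the bound stays at about 2 for every R, while the hyperbolic case has a leading term 2 exp(2R), so each unit increase in R multiplies the bound by roughly exp(2). This mirrors the O(1) versus exp(2R) scaling in the abstract only qualitatively.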

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics