Stabilizing Backpropagation Through Time to Learn Complex Physics

Patrick Schnell; Nils Thuerey

arXiv:2405.02041 · cs.LG · May 6, 2024

TL;DR

This paper introduces a modified backpropagation through time method tailored for physics simulations, which stabilizes gradient flow and improves learning in complex, long-horizon control tasks.

Contribution

It proposes a novel stabilization technique for backpropagation through time that maintains minima positions and enhances training stability in physics-based recurrent models.

Findings

1. Improved control performance on complex physics tasks
2. Stabilized gradient flow reduces exploding and vanishing updates
3. Method scales well with task complexity

Abstract

Of all the vector fields surrounding the minima of recurrent learning setups, the gradient field with its exploding and vanishing updates appears a poor choice for optimization, offering little beyond efficient computability. We seek to improve this suboptimal practice in the context of physics simulations, where backpropagating feedback through many unrolled time steps is considered crucial to acquiring temporally coherent behavior. The alternative vector field we propose follows from two principles: physics simulators, unlike neural networks, have a balanced gradient flow, and certain modifications to the backpropagation pass leave the positions of the original minima unchanged. As any modification of backpropagation decouples forward and backward pass, the rotation-free character of the gradient field is lost. Therefore, we discuss the negative implications of using such a rotational…
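The "exploding and vanishing updates" mentioned in the abstract can be seen in a toy setting. The following is a minimal sketch, not taken from the paper: it assumes a scalar linear recurrence x_{t+1} = j * x_t, where `j` stands in for the per-step Jacobian, and the function name `unrolled_gradient` is hypothetical. Backpropagating through `steps` unrolled steps multiplies the feedback by `j` once per step.

```python
def unrolled_gradient(jacobian: float, steps: int) -> float:
    """Return d x_T / d x_0 for the toy recurrence x_{t+1} = jacobian * x_t.

    Assumption for illustration: a scalar, constant per-step Jacobian.
    """
    grad = 1.0
    for _ in range(steps):
        # Chain rule: each unrolled step contributes one factor of the Jacobian.
        grad *= jacobian
    return grad


if __name__ == "__main__":
    for j in (0.9, 1.0, 1.1):
        # Factors below 1 shrink the feedback, factors above 1 amplify it.
        print(f"jacobian={j}: gradient over 100 steps = {unrolled_gradient(j, 100):.3e}")
```

Only a per-step factor of exactly 1 keeps the feedback balanced in this toy case; any deviation compounds geometrically with the number of unrolled steps.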

Peer Reviews

Decision: ICLR 2024 poster

Reviewer 01 · Rating: 8 (accept, good paper) · Confidence: 3

Strengths

- Very well written. Excellent presentation with nice examples.
- The toy examples and figures really helped with illustrating the points.
- Experiments are clear and support the claims.
- Differentiable control of simulators is a topic of interest. This could have high impact.

Weaknesses

- I think the paper could have benefited from having more discussion of the chosen experiment applications. Knowing how these applications compare to potential real-world applications in terms of complexity would have been valuable context for the section on computational cost. It also would have given some context on how well this method might scale to complicated simulations.

Reviewer 02 · Rating: 8 (accept, good paper) · Confidence: 2

Strengths

- The paper is well written. The toy example and visuals are very helpful in terms of understanding the problem and solution. The choice of the example is well explained.
- The proposed method is conceptually simple (gradient stopping and sign check) but sheds light on using gradient modification to stabilize optimization.
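The "gradient stopping and sign check" summary above can be made concrete with a toy sketch. This is not the authors' implementation (the official code is in tum-pbs/stablebptt). Everything here is an assumption for illustration: the scalar dynamics x_{t+1} = a * x_t + c_t, the linear controller c_t = w * x_t + b, the loss, and all function names. "Gradient stopping" is read as dropping the controller's state derivative (the $\partial_x N$ term, here `w`) from the backward pass. "Sign check" is read as keeping a component of the modified gradient only where its sign agrees with the regular gradient.

```python
def rollout(x0, w, b, a, steps):
    """Unroll the toy system: controller c = w*x + b, physics x' = a*x + c."""
    xs = [x0]
    for _ in range(steps):
        control = w * xs[-1] + b
        xs.append(a * xs[-1] + control)
    return xs


def loss(xs):
    """Toy objective: drive all states after x0 toward zero."""
    return 0.5 * sum(x * x for x in xs[1:])


def bptt(xs, w, a, stop_feedback):
    """Backpropagate through time, returning [dL/dw, dL/db].

    With stop_feedback=True the controller's state derivative (w) is
    removed from the backward pass, so feedback flows only through the
    physics Jacobian (a). The forward pass is unchanged.
    """
    grad_w = grad_b = 0.0
    adjoint = xs[-1]  # dL/dx_T
    for t in range(len(xs) - 2, -1, -1):
        grad_w += adjoint * xs[t]  # direct effect of w on x_{t+1}
        grad_b += adjoint          # direct effect of b on x_{t+1}
        state_jacobian = a if stop_feedback else a + w
        direct = xs[t] if t >= 1 else 0.0  # x_0 is not part of the loss
        adjoint = direct + adjoint * state_jacobian
    return [grad_w, grad_b]


def sign_checked(modified, regular):
    """Zero every component whose sign disagrees with the regular gradient."""
    return [m if m * r > 0 else 0.0 for m, r in zip(modified, regular)]


if __name__ == "__main__":
    w, b, a = -0.5, 0.1, 1.0
    xs = rollout(1.0, w, b, a, steps=5)
    regular = bptt(xs, w, a, stop_feedback=False)
    modified = bptt(xs, w, a, stop_feedback=True)
    print("regular :", regular)
    print("modified:", modified)
    print("combined:", sign_checked(modified, regular))
```

Because only the backward pass changes, the modified update is no longer the gradient of any loss, which is the decoupling of forward and backward pass that the abstract mentions.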

Weaknesses

**Experiments**

- The visualization of results could be improved. For example, in Fig. 3, multiple curves have the same color and overlap each other, making it difficult to draw a clear conclusion. It might be better to draw a mean-std plot for each method where the std (computed over trials) is shaded.

Reviewer 03 · Rating: 8 (accept, good paper) · Confidence: 4

Strengths

1. The visualizations of the problems are well done and the problems that the authors are trying to convey are clearly communicated to the reader.
2. The method shows convincing improvement over competitors on simple experiments.

Weaknesses

(1) I have spent some time trying to understand the authors' argument on why it would be beneficial to stop the gradient of the policy with respect to the current state (i.e. $\partial_x N$ in the authors' notation). But I still find that the motivations and the justifications for this are quite weak. The fundamental theorem of calculus (generalized Stokes) tells us a nice connection about the loss and its gradient, so for smooth systems that the authors are considering, if we integrate back the gradie

Code & Models

Repositories

tum-pbs/stablebptt · tf · Official

Taxonomy

Topics: Experimental Learning in Engineering