Stabilizing Backpropagation Through Time to Learn Complex Physics
Patrick Schnell, Nils Thuerey

TL;DR
This paper introduces a modified backpropagation through time method tailored for physics simulations, which stabilizes gradient flow and improves learning in complex, long-horizon control tasks.
Contribution
It proposes a novel stabilization technique for backpropagation through time that maintains minima positions and enhances training stability in physics-based recurrent models.
Findings
Improved control performance on complex physics tasks
Stabilized gradient flow reduces exploding and vanishing updates
Method scales well with task complexity
Abstract
Of all the vector fields surrounding the minima of recurrent learning setups, the gradient field with its exploding and vanishing updates appears a poor choice for optimization, offering little beyond efficient computability. We seek to improve this suboptimal practice in the context of physics simulations, where backpropagating feedback through many unrolled time steps is considered crucial to acquiring temporally coherent behavior. The alternative vector field we propose follows from two principles: physics simulators, unlike neural networks, have a balanced gradient flow, and certain modifications to the backpropagation pass leave the positions of the original minima unchanged. As any modification of backpropagation decouples forward and backward pass, the rotation-free character of the gradient field is lost. Therefore, we discuss the negative implications of using such a rotational…
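To make the idea of a modified backward pass concrete, here is a minimal sketch on a scalar toy system. It is not the authors' implementation. The linear simulator step `a * x + c`, the linear controller `theta * x`, and the function names `rollout`, `bptt_grad`, and `sign_check` are all hypothetical stand-ins chosen for illustration. The reviews on this page describe the method as "gradient stopping and sign check", with the stopped term being the controller's derivative with respect to the current state; the specific rule used below (zero the update when the modified and regular gradients disagree in sign) is an assumption about what the sign check does.

```python
def rollout(theta, x0, a, steps):
    """Unroll x_{t+1} = a * x_t + c_t with c_t = theta * x_t (toy stand-ins)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        c = theta * x         # hypothetical linear controller N(x; theta)
        xs.append(a * x + c)  # hypothetical linear simulator step S(x, c)
    return xs


def bptt_grad(theta, x0, a, steps, stop_policy_state_grad):
    """Manual backward pass for the loss L = 0.5 * x_T**2.

    With stop_policy_state_grad=True, the controller's derivative with
    respect to the state is dropped, so feedback only flows backward
    through the simulator path (factor a per step).
    """
    xs = rollout(theta, x0, a, steps)
    lam = xs[-1]  # dL/dx_T
    grad = 0.0
    for t in reversed(range(steps)):
        grad += lam * xs[t]  # direct dependence of x_{t+1} on theta is x_t
        # Regular BPTT multiplies by the full state Jacobian (a + theta);
        # the modified pass keeps only the simulator part (a).
        jac = a if stop_policy_state_grad else a + theta
        lam *= jac
    return grad


def sign_check(modified, regular):
    """Assumed rule: keep the modified update only if both signs agree."""
    return modified if modified * regular > 0 else 0.0


# Usage: a = 1.0 plays the role of a simulator with balanced gradient flow.
regular = bptt_grad(-0.5, 1.0, 1.0, 10, stop_policy_state_grad=False)
modified = bptt_grad(-0.5, 1.0, 1.0, 10, stop_policy_state_grad=True)
update = sign_check(modified, regular)
```

In this toy setting the regular gradient shrinks with every unrolled step because each backward step multiplies by `a + theta = 0.5`, while the modified pass multiplies by `a = 1.0` and keeps a larger signal. Both passes return zero at `theta = -1.0`, where the final state is exactly zero, which illustrates the abstract's point that the modification leaves the position of the minimum unchanged.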
Peer Reviews
Decision: ICLR 2024 poster
- Very well written. Excellent presentation with nice examples.
- The toy examples and figures really helped with illustrating the points.
- Experiments are clear and support the claims.
- Differentiable control of simulators is a topic of interest. This could have high impact.
- I think the paper could have benefited from having more discussion of the chosen experiment applications. Knowing how these applications compare to potential real-world applications in terms of complexity would have been valuable context for the section on computational cost. It also would have given some context on how well this method might scale to complicated simulations.
- The paper is well written. The toy example and visuals are very helpful in terms of understanding the problem and solution. The choice of the example is well explained.
- The proposed method is conceptually simple (gradient stopping and sign check) but sheds light on using gradient modification to stabilize optimization.
**Experiments** - The visualization of results could be improved. For example, in Fig. 3, multiple curves have the same color and overlap each other, making it difficult to draw a clear conclusion. It might be better to draw a mean-std plot for each method, where the std (computed over trials) is shaded.
1. The visualizations of the problems are well done and the problems that the authors are trying to convey are clearly communicated to the reader.
2. The method shows convincing improvement over competitors on simple experiments.
(1) I have spent some time trying to understand the authors' argument on why it would be beneficial to stop the gradient of the policy with respect to the current state (i.e. $\partial_x N$ in the authors' notation). But I still find that the motivations and the justifications for this are quite weak. The fundamental theorem of calculus (generalized Stokes) gives us a nice connection between the loss and its gradient, so for smooth systems that the authors are considering, if we integrate back the gradie
Taxonomy
Topics: Experimental Learning in Engineering
