Is the reconstruction loss culprit? An attempt to outperform JEPA
Alexey Potapov, Oleg Shcherbakov, Ivan Kravchenko

TL;DR
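This paper compares JEPA-style predictive representation learning with reconstruction-based autoencoders on a controlled linear dynamical system. It finds that autoencoder failures stem largely from asymmetries in the objectives and from bottleneck/component-selection effects, and it proposes gated predictive autoencoders that are stable across noise levels and match or outperform JEPA on this testbed.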
Contribution
It introduces gated predictive autoencoders that learn to select predictable components, matching or outperforming JEPA on a controlled toy testbed.
Findings
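Gated predictive autoencoders are stable across noise levels on the toy testbed.
Gated predictive autoencoders match or outperform JEPA on the same testbed.
Autoencoder failures are strongly influenced by asymmetries in objectives and by bottleneck/component-selection effects, as confirmed by PCA baselines.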
Abstract
We evaluate JEPA-style predictive representation learning versus reconstruction-based autoencoders on a controlled "TV-series" linear dynamical system with known latent state and a single noise parameter. While an initial comparison suggests JEPA is markedly more robust to noise, further diagnostics show that autoencoder failures are strongly influenced by asymmetries in objectives and by bottleneck/component-selection effects (confirmed by PCA baselines). Motivated by these findings, we introduce gated predictive autoencoders that learn to select predictable components, mimicking the beneficial feature-selection behavior observed in over-parameterized PCA. On this toy testbed, the proposed gated model is stable across noise levels and matches or outperforms JEPA.
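The component-selection effect mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' "TV-series" system or their gated model. It assumes a hypothetical 4-dimensional observation made of two predictable coordinates (a slow rotation) and two i.i.d. high-variance noise coordinates. The function names, dimensions, noise scale, and the hard threshold used as a "gate" are all illustrative assumptions. Under these assumptions, a variance-based bottleneck (PCA with two components) keeps mostly the noise directions, while a one-step linear predictability score singles out the predictable coordinates.

```python
import numpy as np


def make_sequence(T=2000, noise_std=3.0, seed=0):
    """Hypothetical toy data (an assumption, not the paper's system):
    columns 0-1 follow a deterministic slow rotation, columns 2-3 are i.i.d. noise."""
    rng = np.random.default_rng(seed)
    theta = 0.05
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])
    z = np.zeros((T, 2))
    z[0] = [1.0, 0.0]
    for t in range(1, T):
        z[t] = A @ z[t - 1]
    noise = noise_std * rng.standard_normal((T, 2))
    return np.hstack([z, noise])


def pca_top_components(x, k):
    """Return the top-k principal directions (rows) of the centered data."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return vt[:k]


def predictability(x):
    """Per-coordinate R^2 of a linear one-step prediction x_t -> x_{t+1}."""
    xc = x - x.mean(axis=0)
    past, future = xc[:-1], xc[1:]
    W, *_ = np.linalg.lstsq(past, future, rcond=None)
    resid = future - past @ W
    return 1.0 - resid.var(axis=0) / future.var(axis=0)


x = make_sequence()

# Variance-based selection: fraction of the top-2 PCA directions' energy
# that falls on the two noise coordinates.
comps = pca_top_components(x, 2)
noise_share = (comps[:, 2:] ** 2).sum() / (comps ** 2).sum()

# Predictability-based selection: a hard threshold stands in for a learned gate
# (illustrative only; the threshold 0.5 is an arbitrary assumption).
r2 = predictability(x)
gate = r2 > 0.5

print("share of PCA energy on noise coordinates:", round(float(noise_share), 3))
print("one-step R^2 per coordinate:", np.round(r2, 3))
print("coordinates kept by the gate:", np.flatnonzero(gate))
```

The sketch only contrasts two selection criteria on synthetic data. A learned gate trained jointly with an encoder and predictor, as the abstract describes, would replace the fixed threshold used here.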
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
