BiJEPA: Bi-directional Joint Embedding Predictive Architecture for Symmetric Representation Learning

Yongchao Huang

arXiv:2603.00049·cs.LG·March 3, 2026

TL;DR

BiJEPA introduces a bi-directional, cycle-consistent SSL architecture with norm regularization, improving stability and representation quality across diverse data modalities.

Contribution

The paper proposes a bi-directional predictive framework that combines cycle consistency with norm regularization to improve training stability and semantic capture in SSL.

Findings

01. Achieves stable convergence without collapse.

02. Captures semantic structure of chaotic systems.

03. Learns robust representations for generation and generalization.

Abstract

Self-Supervised Learning (SSL) has shifted from pixel-level reconstruction to latent-space prediction, spearheaded by the Joint Embedding Predictive Architecture (JEPA). While effective, standard JEPA models typically rely on a uni-directional prediction mechanism (e.g. Context $\to$ Target), which can neglect the informative signal in the inverse relationship and degrade performance. In this work, we propose BiJEPA, a Bi-Directional Joint Embedding Predictive Architecture that enforces cycle-consistent predictability between data segments. Symmetric prediction is inherently unstable (representation explosion); we address this by introducing a critical norm regularization mechanism on the representation vectors. We evaluate BiJEPA on three distinct modalities: synthetic periodic signals, chaotic Lorenz attractor trajectories, and high-dimensional image…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Face Recognition and Analysis