Loading paper
From Pixels to Tokens: A Systematic Study of Latent Action Supervision for Vision-Language-Action Models | Tomesphere