Video Prediction Models as General Visual Encoders