Vision encoders should be image size agnostic and task driven | Tomesphere