Interpreting the structure of multi-object representations in vision encoders