Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning