Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents