Enhancing image captioning with depth information using a Transformer-based framework