Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning