GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features