Align before Attend: Aligning Visual and Textual Features for Multimodal Hateful Content Detection