Loading paper
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet | Tomesphere