SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks