An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training